llvm-project

Commit Graph

Author	SHA1	Message	Date
Abinav Puthan Purayil	485dd0b752	[GlobalISel] Handle constant splat in funnel shift combine This change adds the constant splat versions of m_ICst() (by using getBuildVectorConstantSplat()) and uses it in matchOrShiftToFunnelShift(). The getBuildVectorConstantSplat() name is shortened to getIConstantSplatVal() so that the *SExtVal() version would have a more compact name. Differential Revision: https://reviews.llvm.org/D125516	2022-05-16 16:03:30 +05:30
Eli Friedman	96c2a0c9ff	[GlobalIsel] Fix fallback if stack protector isn't supported. When GlobalISel fails, we need to report the error, and we need to set the FailedISel property. We skipped those steps if stack protector insertion failed, which led to a very strange miscompile. Differential Revision: https://reviews.llvm.org/D125584	2022-05-13 14:17:27 -07:00
Jay Foad	26e1ebd3ea	[GlobalISel] Change ConstantFoldVectorBinop to return vector of APInt Previously it built MIR for the results and returned a Register. This avoids building constants for earlier elements of the vector if later elements will fail to fold, and allows CSEMIRBuilder::buildInstr to avoid unconditionally building a copy from the result. Use a new helper function MachineIRBuilder::buildBuildVectorConstant to build a G_BUILD_VECTOR of G_CONSTANTs. Differential Revision: https://reviews.llvm.org/D117758	2022-05-13 09:33:07 +01:00
Jon Roelofs	e1c808b36e	Fix zero-width bitfield extracts to emit 0 Fixes #55129	2022-05-03 14:46:42 -07:00
Matt Arsenault	40bc9112c0	GlobalISel: Relax handling of G_ASSERT_* with source register classes The most common situation where G_ASSERT_ZEXT appears for AMDGPU is a copy from a physical register, which happens to use set the actual register class on the virtual register. After copy coalescing, the assert's source operand had a vreg with a set class. The verifier was strictly rejecting cases where the set class/bank weren't an exact match. Additionally, RegBankSelect was also expecting a register bank to be set on the register, not a class. This is much stricter than regular copies so relax this behavior. This now allows these 2 cases: 1. Source register has either class or bank, and the result does not 2. Source register has a register class, and the result is a register with a matching bank. This should avoid needing some kind of special handling to avoid violating this constraint when folding copies.	2022-04-22 10:49:50 -04:00
Matt Arsenault	507259820a	GlobalISel: Add LegalizeMutations to help use More/FewerElements	2022-04-19 21:04:32 -04:00
Jonas Paulsson	46f83caebc	[InlineAsm] Add support for address operands ("p"). This patch adds support for inline assembly address operands using the "p" constraint on X86 and SystemZ. This was in fact broken on X86 (see example at https://reviews.llvm.org/D110267, Nov 23). These operands should probably be treated the same as memory operands by CodeGenPrepare, which have been commented with "TODO" there. Review: Xiang Zhang and Ulrich Weigand Differential Revision: https://reviews.llvm.org/D122220	2022-04-13 12:50:21 +02:00
Matt Arsenault	3754f60112	GlobalISel: Implement MoreElements for select of vector conditions	2022-04-12 16:54:04 -04:00
Matt Arsenault	3f2cc7cc2b	GlobalISel: Fix lowerSelect handling of boolean high bits This was making several invalid assumptions about the incoming select. First, it was assuming the incoming condition was either s1 or already sign extended, not accounting for different boolean high bits behavior between scalar and vector conditions. We only had a vector boolean due to the intermediate step vector select, which is now avoided. Second, it was assuming it can use the result vector type as a boolean mask. These types don't have anything to do with other, and only makes sense in the context of the expansion to bit operations. Since these logically are part of the same lowering, do the complete expansion in a single step. The added select_v4s1_s1 test does fail to legalize, since it seems AArch64's vector legalization support is pretty incomplete.	2022-04-12 16:54:03 -04:00
Matt Arsenault	0e489926be	GlobalISel: Handle widening addo/subo booleans This will be tested in a future patch	2022-04-12 16:54:03 -04:00
Matt Arsenault	95c2bcbf8b	GlobalISel: Handle widening umulo/smulo condition outputs	2022-04-12 16:54:03 -04:00
Matt Arsenault	abe171df06	GlobalISel: Update mutationIsSane assert for scalable vectors	2022-04-12 16:54:03 -04:00
Matt Arsenault	d1f97a3419	GlobalISel: Add memSizeNotByteSizePow2 legality helper This is really a replacement for memSizeInBytesNotPow2 that actually does what most every target wants. In particular, since s1 rounds to 1 byte, it wasn't lowered by this predicate. This results in targets needing to think harder and add more matchers to catch all the degenerate cases. Also small bug fix that prevented the correct insertion of G_ASSERT_ZEXT in the AArch64 use case.	2022-04-11 19:43:37 -04:00
Matt Arsenault	1416744f84	GlobalISel: Implement computeKnownBits for overflow bool results	2022-04-11 19:43:37 -04:00
Daniel Sanders	93977f37e6	Check if register class was changed in constrainOperandRegClass() NFC When no actual change happens there's no need to notify the observers about the fact the register class is being constrained. So we should avoid notifying observers when no change has happened, because this can dramatically affect compile time for particular test cases. Reviewed By: dsanders, arsenm Differential Revision: https://reviews.llvm.org/D122615	2022-04-05 11:55:07 -07:00
Abinav Puthan Purayil	898d5776ec	[AMDGPU][GlobalISel] Scalarize add/sub with overflow ops in the legalizer Differential Revision: https://reviews.llvm.org/D122803	2022-03-31 21:46:34 +05:30
serge-sans-paille	60ca256953	Cleanup include: Add missing header Should fix https://lab.llvm.org/buildbot#builders/57/builds/16192 introduced by `02c28970b2`	2022-03-23 15:15:56 +01:00
Benjamin Kramer	9a6e0afac5	Unbreak the build after `02c28970b2`	2022-03-23 14:38:13 +01:00
serge-sans-paille	02c28970b2	Cleanup include: codegen second round Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D122180	2022-03-23 13:54:00 +01:00
Kazu Hirata	1eada2adda	[CodeGen] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)	2022-03-20 23:11:06 -07:00
Shengchen Kan	37b378386e	[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments	2022-03-16 20:25:42 +08:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Amara Emerson	8cbf18cb04	[GlobalISel] Fix store merging incorrectly merging volatile stores. The existing volatile checks only handle aliasing hazards between stores, but that isn't enough since by that point volatile stores may have already been added to the current candidate group.	2022-03-14 13:48:51 -07:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit `7f230feeea`. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Paul Robinson	7b85f0f32f	[PS4] isPS4 and isPS4CPU are not meaningfully different	2022-03-03 11:36:59 -05:00
Mircea Trofin	cb2160760e	[nfc][codegen] Move RegisterBank[Info].h under CodeGen This wraps up from D119053. The 2 headers are moved as described, fixed file headers and include guards, updated all files where the old paths were detected (simple grep through the repo), and `clang-format`-ed it all. Differential Revision: https://reviews.llvm.org/D119876	2022-03-01 21:53:25 -08:00
Nikita Popov	87ebd9a36f	[IR] Use CallBase::getParamElementType() (NFC) As this method now exists on CallBase, use it rather than the one on AttributeList.	2022-02-25 10:01:58 +01:00
Amara Emerson	b09e63bad1	[AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops. Differential Revision: https://reviews.llvm.org/D117160	2022-02-20 00:53:09 -08:00
Mircea Trofin	c62eefb886	[nfc][codegen] Move RegisterBank[Info].cpp under CodeGen Layering-wise, it seems RegisterBank stuff fits under CodeGen, like other target abstraction. In particular, TargetSubtargetInfo has a getRegBankInfo member, but using that object requires making sure GlobalISel is linked, which is not always the case (e.g. llvm-jitlink doesn't). Differential Revision: https://reviews.llvm.org/D119053	2022-02-15 11:27:15 -08:00
Julien Pages	dcb2da13f1	[AMDGPU] Add a new intrinsic to control fp_trunc rounding mode Add a new llvm.fptrunc.round intrinsic to precisely control the rounding mode when converting from f32 to f16. Differential Revision: https://reviews.llvm.org/D110579	2022-02-11 12:08:23 -05:00
Jay Foad	abda8d2229	[GlobalISel] CSE FP constants at -O0 At -O0 we claim to CSE constants only. I think this should apply to G_FCONSTANT as well as G_CONSTANT. Differential Revision: https://reviews.llvm.org/D119344	2022-02-10 09:17:11 +00:00
Matt Arsenault	5af0f097ba	GlobalISel: Constant fold G_PTR_ADD Some globals lower to literal addresses on AMDGPU. This may be wrong for non-integral address spaces. I'm wondering if we should just allow regular G_ADD to use pointer types, and reserve G_PTR_ADD for non-integral address spaces.	2022-02-08 19:21:06 -05:00
Matt Arsenault	2af4a554fe	GlobalISel: Constant fold FP bin ops in MIRBuilder Might as well handle these if we're going to handle the integer ops here.	2022-02-08 18:51:10 -05:00
Matt Arsenault	930f2498d4	GlobalISel: Constant fold integer min/max opcodes	2022-02-08 18:50:35 -05:00
Matt Arsenault	0877fbcc16	GlobalISel: Add FoldBinOpIntoSelect combine This will do the combine in cases that should fold, but don't now. e.g. we're relying on the CSEMIRBuilder's incomplete constant folding. For instance it doesn't handle FP operations or vectors (and we don't have separate constant folding combines either to catch them).	2022-02-08 18:17:21 -05:00
Sheng	76c83e747f	[GlobalISel] Add big endian support in CallLowering When splitting values, CallLowering assumes Lo part goes first. But in big endian ISA such as M68k, Hi part goes first. This patch fixes this. Differential Revision: https://reviews.llvm.org/D116877	2022-02-08 14:43:38 +00:00
Sheng	146c7820d9	[GlobalISel][Legalizer] Support reducing load/store width in big endian order	2022-02-07 20:06:17 -05:00
Simon Pilgrim	5d3a86489f	[GlobalISel] Move getOpcode() calls inside assert() to avoid (void)s. NFC. Tidier solution to the unused variable warnings - we already do this in other places in this file.	2022-02-07 09:50:27 +00:00
Djordje Todorovic	def10a2895	[GlobalIsel] Fix another "unused variable" warning	2022-02-07 09:32:22 +01:00
Djordje Todorovic	eab395fa40	Fix the warning after D118805 A variable was used within assert() only.	2022-02-07 09:25:02 +01:00
Kazu Hirata	3a8c51480f	[CodeGen] Use = default (NFC) Identified with modernize-use-equals-default	2022-02-06 10:54:44 -08:00
Róbert Ágoston	cd4ed08b5a	[GlobalISel] Don't combine instructions which are fed by memory instructions using different size Memory instructions like extending loads from the same address are not equal if their size is not equal. This fixes https://github.com/llvm/llvm-project/issues/53524. Differential Revision: https://reviews.llvm.org/D118805	2022-02-04 15:00:47 -08:00
Jessica Paquette	9a61e731ff	[GlobalISel] Combine (G_ADDO x, 0) -> x + no carry out Similar to the G_MULO change. The code for checking if a constant is legal/pre-legalize is shared between these, and is kind of hairy. So, factor it out into a new function: `isConstantLegalOrBeforeLegalizer`. To make the refactoring clean, further refactor `isLegalOrBeforeLegalizer` into a wrapper for two functions: - `isPreLegalize` - `isLegal` This is a bit easier to read in general. https://godbolt.org/z/KW7oszP1o Differential Revision: https://reviews.llvm.org/D118655	2022-02-03 14:25:15 -08:00
Jessica Paquette	c636899dc1	[GlobalISel] Combine: (G_MULO x, 0) -> 0 + no carry out Similar to the following combine in `DAGCombiner::visitMULO`: ``` // fold (mulo x, 0) -> 0 + no carry out if (isNullOrNullSplat(N1)) return CombineTo(N, DAG.getConstant(0, DL, VT), DAG.getConstant(0, DL, CarryVT)); ``` This fixes some generally poor codegen for `mulo`: https://godbolt.org/z/eTxYsvz8f Differential Revision: https://reviews.llvm.org/D118635	2022-02-03 14:23:58 -08:00
Kazu Hirata	2bea207d26	[CodeGen] Use default member initialization (NFC) Identified with modernize-use-default-member-init.	2022-01-30 12:32:51 -08:00
Matt Arsenault	2d670de84c	GlobalISel: Avoid crash on asm with lying result types The physical register in the asm has the wrong type for the declared IR. It seems to work in the DAG by extracting the 4 elements that are defined in the IR from the register, but that isn't handled here. This doesn't seem to be a well tested path since other mismatched cases are crashing the DAG asm handling.	2022-01-26 15:23:59 -05:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit `ef82063207`. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
Sebastian Neubauer	4723f3cf03	[AMDGPU][GlobalISel] Combine unmerge of undef Fold (unmerge undef) -> undef, undef, ... Differential Revision: https://reviews.llvm.org/D118138	2022-01-26 12:30:36 +01:00
Nikita Popov	a3a2239aaa	[GlobalISel] Avoid pointer element type access during InlineAsm lowering Same change as has been made for the SDAG lowering.	2022-01-25 14:26:47 +01:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
Matt Arsenault	99e8e17313	Reapply "Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction" This reverts commit `a97e20a3a8`.	2022-01-24 09:26:52 -05:00
Nikita Popov	0d1308a7b7	[AArch64][GlobalISel] Support returned argument with multiple registers The call lowering code assumed that a returned argument could only consist of one register. Pass an ArrayRef<Register> instead of Register to make sure that all parts get assigned. Fixes https://github.com/llvm/llvm-project/issues/53315. Differential Revision: https://reviews.llvm.org/D117866	2022-01-24 10:55:28 +01:00
Abinav Puthan Purayil	68b70d17d8	[GlobalISel] Fold or of shifts with constant amount to funnel shift. This change folds (or (shl x, C0), (lshr y, C1)) to funnel shift iff C0 and C1 are constants where C0 + C1 is the bit-width of the shift instructions. Differential Revision: https://reviews.llvm.org/D116529	2022-01-24 10:43:32 +05:30
Lucas Prates	283f5a198a	[GlobalISel] Fix incorrect sign extension when combining G_INTTOPTR and G_PTR_ADD The GlobalISel combiner currently uses sign extension when manipulating the LHS constant when combining a sequence of the following sequence of machine instructions into a single constant: ``` %0:_(s32) = G_CONSTANT i32 <CONSTANT> %1:_(p0) = G_INTTOPTR %0:_(s32) %2:_(s64) = G_CONSTANT i64 <CONSTANT> %3:_(p0) = G_PTR_ADD %1:_, %2:_(s64) ``` This causes an issue when the bit width of the first contant and the target pointer size are different, as G_INTTOPTR has no sign extension semantics. This patch fixes this by capture an arbitrary precision in when matching the constant, allowing the matching function to correctly zero extend it. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D116941	2022-01-20 17:02:52 +00:00
Daniel Thornburgh	2e2999cd44	[NFC] Test commit to verify commit access.	2022-01-18 18:03:26 -08:00
Matt Arsenault	5599c43124	GlobalISel: Swap order of operand checks in ConstantFoldVectorBinop Since constants are canonicalized to the RHS, this is more likely to exit early.	2022-01-18 17:21:02 -05:00
Matt Arsenault	da72822763	GlobalISel: Fix CSEMIRBuilder mishandling constant folds of vectors This was ignoring the requested result register, resulting in a missing def when this happened in the IRTranslator. Fixes some crashes and verifier errors at -O0. Alternatively we could pass DstOps to the constant fold functions.	2022-01-18 17:21:02 -05:00
Nikita Popov	c63a3175c2	[AttrBuilder] Remove ctor accepting AttributeList and Index Use the AttributeSet constructor instead. There's no good reason why AttrBuilder itself should exact the AttributeSet from the AttributeList. Moving this out of the AttrBuilder generally results in cleaner code.	2022-01-15 22:39:31 +01:00
James Y Knight	a97e20a3a8	Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction" This commit sometimes causes a crash when compiling a vtable thunk. E.g.: clang '--target=aarch64-grtev4-linux-gnu' -xc++ - -c -o /dev/null <<EOF struct a { virtual int f(); }; struct c { virtual int &g() const; }; struct d : a, c { int &g() const; }; int &d::g() const {} EOF Some follow-up commits have been reverted as well: Revert "IR: Make getRetAlign check callee function attributes" Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC." Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC." This reverts commit `4f414af6a7`. This reverts commit `a5507d2e25`. This reverts commit `3d2d208f6a`. This reverts commit `07ddfa95e3`.	2022-01-14 04:50:07 +00:00
Simon Pilgrim	4f414af6a7	Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC.	2022-01-13 11:10:50 +00:00
Matt Arsenault	5a16306c09	GlobalISel: Always enable GISelKnownBits for InstructionSelect This wasn't running at -O0, and causing crashes for AMDGPU. AMDGPU needs this to match the addressing modes of stack access instructions, which is even more important at -O0 than with optimizations. It currently costs nothing to run ahead of time, so just always enable it.	2022-01-12 18:57:24 -05:00
Matt Arsenault	07ddfa95e3	GlobalISel: Add G_ASSERT_ALIGN hint instruction Insert it for call return values only for now, which is the only case the DAG handles also.	2022-01-12 18:20:58 -05:00
Matt Arsenault	8a16201a0b	GlobalISel: Fix insert point in localizer This was inserting the new G_CONSTANT after the use, and the later block scan would run off the end. Fix calling SkipPHIsAndLabels for no apparent reason.	2022-01-12 13:44:05 -05:00
Petar Avramovic	c8c5dc766b	GlobalIsel: Fix fma combine when one of the operands comes from unmerge Fma combine assumes that MRI.getVRegDef(Reg)->getOperand(0).getReg() = Reg which is not true when Reg is defined by instruction with multiple defs e.g. G_UNMERGE_VALUES. Fix is to keep register and the instruction that defines register in DefinitionAndSourceRegister and use when needed. Differential Revision: https://reviews.llvm.org/D117032	2022-01-12 17:47:25 +01:00
Matt Arsenault	5a434ceafb	GlobalISel: Use cloneVirtualRegister in localizer	2022-01-11 16:10:12 -05:00
Matt Arsenault	0ba4e4b500	GlobalISel: Pass DebugLoc to getFunctionLiveInPhysReg Fixes crash in assertion about dropping debug info.	2022-01-10 13:50:52 -05:00
Serge Guelton	d2cc6c2d0c	Use a sorted array instead of a map to store AttrBuilder string attributes Using and std::map<SmallString, SmallString> for target dependent attributes is inefficient: it makes its constructor slightly heavier, and involves extra allocation for each new string attribute. Storing the attribute key/value as strings implies extra allocation/copy step. Use a sorted vector instead. Given the low number of attributes generally involved, this is cheaper, as showcased by https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions Differential Revision: https://reviews.llvm.org/D116599	2022-01-10 14:49:53 +01:00
Jay Foad	50fb44eebb	[GlobalISel] Use getPreferredShiftAmountTy in one more G_UBFX combine Change CombinerHelper::matchBitfieldExtractFromShrAnd to use getPreferredShiftAmountTy for the shift-amount-like operands of G_UBFX just like all the other G_[SU]BFX combines do. This better matches the AMDGPU legality rules for these instructions. Differential Revision: https://reviews.llvm.org/D116803	2022-01-08 09:20:44 +00:00
Jay Foad	ff971873b3	[GlobalISel] Fix legality checks for G_UBFX combines 1. Fix CombinerHelper::matchBitfieldExtractFromAnd to check legality with the correct types for the G_UBFX that it builds. 2. Fix AMDGPUTargetLowering::isConstantUnsignedBitfieldExtractLegal to match the legality rules: result and first operand can be s32 or s64 but the "shift amount" operands are always s32. 3. Add AMDGPU tests where the post-legalizer combiner would create illegal MIR without the above fixes. Differential Revision: https://reviews.llvm.org/D116802	2022-01-08 09:20:44 +00:00
Kazu Hirata	b932bdf59f	[llvm] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-07 17:45:09 -08:00
Jay Foad	3f3fe4a5cf	[GlobalISel] Fix typo Extact to Extract in function name. NFC.	2022-01-07 11:13:35 +00:00
Nikita Popov	e4d1779990	[IR] Add ConstraintInfo::hasArg() helper (NFC) Checking whether a constraint corresponds to an argument is a recurring pattern.	2022-01-07 10:44:38 +01:00
Kazu Hirata	e5947760c2	Revert "[llvm] Remove redundant member initialization (NFC)" This reverts commit `fd4808887e`. This patch causes gcc to issue a lot of warnings like: warning: base class ‘class llvm::MCParsedAsmOperand’ should be explicitly initialized in the copy constructor [-Wextra]	2022-01-03 11:28:47 -08:00
Kazu Hirata	fd4808887e	[llvm] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-01 16:18:18 -08:00
Petar Avramovic	508e39afe0	GlobalISel: remove redundant line added in D114198. NFC	2021-12-27 12:14:13 +01:00
Kazu Hirata	2d303e6781	Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-12-24 23:17:54 -08:00
Fangrui Song	ea2d4c5881	[GlobalISel] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off builds after D114198	2021-12-24 00:55:54 -08:00
Petar Avramovic	29f88b93fd	[GlobalISel] Rework more/fewer elements for vectors Artifact combiner is not able to access individual elements after using LCMTy style merge/unmerge, extract and insert to change vector number of elements (pad with undef or split to sub-vector instructions). Use unmerge to individual elements instead and then merge elements into requested types. Change argument lowering for vectors and moreElementsVector to use buildPadVectorWithUndefElements and buildDeleteTrailingVectorElements. FewerElementsVector had a few helpers that had different behavior, introduce new helper for most of the opcodes. FewerElementsVector helper is more flexible since it can create leftover instruction smaller then requested type (useful in case target wants to avoid pad with undef and use fewer registers). If target does not want leftover of different type it should call more elements first. Some helpers were performing more elements first to have split without leftover. Opcodes that used this helper use clampMaxNumElementsStrict (does more elements first) in LegalizerInfo to avoid test changes. Fixes failures caused by failing to combine artifacts created during more/fewer elements vector. Differential Revision: https://reviews.llvm.org/D114198	2021-12-23 14:30:02 +01:00
Konstantin Schwarz	a344653725	[GlobalISel] Fix IRTranslator for constexpr fcmp The existing code assumed fcmp to always be an Instruction, but it can also be a ConstExpr. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D115450	2021-12-10 08:49:12 +01:00
Mircea Trofin	91a0da0142	[NFC] Rename MachineFunction::DeleteMachineBasicBlock Renamed to conform to coding style	2021-12-08 18:12:51 -08:00
Jack Andersen	f108c7f59d	[GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues. Expanding on D109750. Since `DBG_VALUE` instructions have final register validity determined in `LDVImpl::handleDebugValue`, there is no apparent reason to immediately prune unused register operands as their defs are erased. Consequently, this renders `MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval` moot; gaining a substantial performance improvement. The only necessary changes involve making relevant passes consider invalid DBG_VALUE vregs uses as valid. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D112852	2021-12-05 15:55:59 -05:00
Kazu Hirata	3aed282257	[CodeGen] Use range-based for loops (NFC)	2021-12-03 20:45:59 -08:00
Abinav Puthan Purayil	bc5dbb0bae	[GlobalISel] Add matchers for constant splat. This change exposes isBuildVectorConstantSplat() to the llvm namespace and uses it to implement the constant splat versions of m_SpecificICst(). CombinerHelper::matchOrShiftToFunnelShift() can now work with vector types and CombinerHelper::matchMulOBy2()'s match for a constant splat is simplified. Differential Revision: https://reviews.llvm.org/D114625	2021-11-30 15:18:50 +05:30
Mirko Brkusanin	0dd570ff56	[AMDGPU][GlobalISel] Transform (fsub (fpext (fneg (fmul x, y))), z) -> (fneg (fma (fpext x), (fpext y), z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D98050	2021-11-29 16:27:22 +01:00
Mirko Brkusanin	37c2a2201d	[AMDGPU][GlobalISel] Transform (fsub (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), (fneg z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D98049	2021-11-29 16:27:22 +01:00
Mirko Brkusanin	5fe7fcd28e	[AMDGPU][GlobalISel] Transform (fsub (fneg (fmul, x, y)), z) -> (fma (fneg x), y, (fneg z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D98048	2021-11-29 16:27:22 +01:00
Mirko Brkusanin	a782169270	[AMDGPU][GlobalISel] Transform (fsub (fmul x, y), z) -> (fma x, y, -z) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D96614	2021-11-29 16:27:22 +01:00
Mirko Brkusanin	e5e49a08f1	[AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fpext (fmul u, v))), z) -> (fma x, y, (fma (fpext u), (fpext v), z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D98047	2021-11-29 16:27:21 +01:00
Mirko Brkusanin	f732292536	[AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fmul u, v)), z) -> (fma x, y, (fma u, v, z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D97938	2021-11-29 16:27:21 +01:00
Mirko Brkusanin	8951136216	[AMDGPU][GlobalISel] Transform (fadd (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), z) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D97937	2021-11-29 16:27:21 +01:00
Mirko Brkusanin	881840fc26	[AMDGPU][GlobalISel] Transform (fadd (fmul x, y), z) -> (fma x, y, z) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D93305	2021-11-29 16:27:21 +01:00
Abinav Puthan Purayil	4af45f10cc	[GlobalISel] Fold or of shifts to funnel shift. This change folds a basic funnel shift idiom: - (or (shl x, amt), (lshr y, sub(bw, amt))) -> fshl(x, y, amt) - (or (shl x, sub(bw, amt)), (lshr y, amt)) -> fshr(x, y, amt) This also helps in folding to rotate shift if x and y are equal since we already have a funnel shift to rotate combine. Differential Revision: https://reviews.llvm.org/D114499	2021-11-26 17:05:29 +05:30
Kazu Hirata	259cd6f893	[llvm] Use range-based for loops (NFC)	2021-11-25 22:17:10 -08:00
Kazu Hirata	bfd5dd1568	[llvm] Use range-based for loops (NFC)	2021-11-25 08:55:16 -08:00
Jameson Nash	0332d105b9	GlobalISel: remove assert that memcpy Src and Dst addrspace must be identical The LangRef does not require these arguments to have the same type. Differential Revision: https://reviews.llvm.org/D93154	2021-11-24 20:23:05 -05:00
Zarko Todorovski	95875d246a	[LLVM][NFC]Inclusive language: remove occurances of sanity check/test from llvm Part of work to use more inclusive language in clang/llvm. Rewording some comments and change function and variable names.	2021-11-24 17:29:55 -05:00
Kazu Hirata	d45cb1d7ea	[llvm] Use range-based for loops (NFC)	2021-11-23 08:54:48 -08:00

1 2 3 4 5 ...

1915 Commits