llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	8dfb9627b7	[X86] Make v32i16/v64i8 legal types without avx512bw. Use custom splitting instead. This moves v32i16/v64i8 to a model consistent with how we treat integer types with avx1. This does change the ABI for types vXi16/vXi8 vectors larger than 512 bits to pass in multiple zmms instead of multiple ymms. We'd already hacked some code to make v64i8/v32i16 pass in zmm. Cost model is still a bit of a mess. In some place I tried to match existing behavior. But really we need to account for splitting and concating costs. Cost model for shuffles is especially pessimistic. Differential Revision: https://reviews.llvm.org/D76212	2020-04-15 12:17:18 -07:00
Craig Topper	d1da1b53ff	[X86] Cleanup ISD::BRIND handling code in X86DAGToDAGISel::Select. NFC -Drop llvm:: on MVT::i32 -Use getValueType instead of getSimpleValueType for an equality check just because it's shorter and doesn't matter. -Don't create a const SDValue & since it's cheap to copy. -Remove explicit cast from MVT enum to EVT. -Add message to assert.	2020-04-11 15:01:05 -07:00
Craig Topper	21a7d08e72	[X86] Move code that replaces ISD::VSELECT with X86ISD::BLENDV from X86DAGToDAGISel::Select to PreprocessISelDAG	2020-04-11 15:01:05 -07:00
Scott Constable	71e8021d82	[X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810	2020-04-02 21:55:13 -07:00
Guillaume Chatelet	c7468c1696	[Alignment][NFC] Use Align in SelectionDAG::getMemIntrinsicNode Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, nemanjai, hiraditya, kbarton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77149	2020-04-01 09:32:05 +00:00
Craig Topper	cdd1cd7120	[X86] Don't form masked instructions if the operation has an additional user. This will cause the operation to be repeated in both a mask and another masked or unmasked form. This can be a waste of execution resources. Differential Revision: https://reviews.llvm.org/D60940	2020-03-27 10:44:22 -07:00
Simon Pilgrim	e20e6f26fa	Fix shadow variable warning. NFC.	2020-03-02 18:53:19 +00:00
Simon Pilgrim	2b624e04c7	Fix 'unsigned variable can never be negative' cppcheck warning. NFCI.	2020-03-02 18:53:18 +00:00
Craig Topper	14306ce80c	[X86] Add proper MachinePointerInfo to the loads/stores created for moving data between SSE and X87 in X86DAGToDAGISel::PreprocessISelDAG	2020-02-26 14:45:37 -08:00
Craig Topper	89ba4acad6	[X86] Pass parameters into selectVectorAddr to remove dependency on X86MaskedGatherScatterSDNode. Might be able to get rid of X86ISD::SCATTER and some uses of X86ISD::GATHER. Which require isel to use ISD::SCATTER and ISD::GATHER as well.	2020-02-24 23:56:34 -08:00
Craig Topper	9238dfb4d8	[X86] Remove mask output from X86 gather/scatter ISD opcodes. Instead add it when we make the machine nodes during instruction selections. This makes this ISD node closer to ISD::MGATHER. Trying to see if we remove the X86 specific ones.	2020-02-24 23:56:28 -08:00
Craig Topper	7a7146cf72	[X86] When creating X86ISD::MGATHER nodes from AVX2 gather intrinsics, cast the mask to integer type. The gather intrinsics use a floating point mask when the result type is FP. But we call DemandedBits on the mask assuming it's an integer type. We also use integer types when we create it from generic IR. So add a bitcast to the intrinsic path to guarantee the integer type.	2020-02-23 23:00:41 -08:00
Craig Topper	f1b8ec3398	[X86] Use custom isel for gather/scatter instructions. The type profile we use for the isel patterns lied about how many operands the gather/scatter node has to skip the index and scale operands. This allowed us to expand the baseptr operand into base, displacement, and segment and then merge the index and scale with them in the final instruction during isel. This is kind of a hack that relies on isel not checking the number of operands at all. This commit switches to custom isel where we can manage this directly without relying on holes in the isel checking.	2020-02-23 22:33:06 -08:00
Craig Topper	dd0b18e1ec	[X86] Disable load folding for X86ISD::ADD with 128 as an immediate. It can be turned into a sub with -128 instead as long as the carry flag isn't used.	2020-02-16 20:52:51 -08:00
Craig Topper	d26f11108b	[X86] Split X86ISD::CMP into an integer and FP opcode.	2020-02-16 10:10:19 -08:00
Craig Topper	e5b3ae4b34	[X86] Merge two switches together to simplify some code. NFC	2020-02-15 12:55:51 -08:00
Craig Topper	3f7649799b	[X86] Move combineIncDecVector logic from Select to PreprocessISelDAG. This allows it to work properly with masked inc/dec for avx512. Those would have a vselect as the root node so didn't get a chance to call combineIncDecVector. This also simplifies the logic because we don't have to manage the topological ordering.	2020-02-15 09:59:12 -08:00
Fangrui Song	0dce409cee	[AsmPrinter] De-capitalize Emit{Function,BasicBlock}* and Emit{Start,End}OfAsmFile	2020-02-13 13:22:49 -08:00
Craig Topper	656d66f5fc	[X86] Use custom isel for (X86sbb_flag 0, 0) so we can use 32-bit SBB for i8/i16. We were using MOV32r0 and an extract_subreg as an input. By using custom isel we can move the extract_subreg to after the SBB instead of on the input.	2020-02-09 13:19:35 -08:00
Craig Topper	e1cbfecdb8	[X86] Add flag result VT to a MOV32r0 created in X86DAGToDAGISel::Select The flag isn't used, but I believe this matches the MOV32r0 that would be created by the table emitter. This should allow this node to be CSEed with any others created by the table.	2020-02-09 13:19:21 -08:00
Craig Topper	dd262222b4	[X86] Use MVT::i32 for the type of a MOV32r0 created in X86DAGToDAGISel::Select. Not sure if this really matters. The VT isn't really used after this point. At best it might affect CSE.	2020-02-09 11:57:42 -08:00
Craig Topper	dbcc1392b3	[X86] Remove isel patterns that include a vselect/X86selects and a strict FP node. A vselect+strictfp node is not equivalent to a masked operation. The exceptions of the strictfp node are not masked by a vselect after it so we can't match it to a masked operation. We already had a hack in IsLegalToFold to prevent these patterns from matching. This patch removes that hack and removes the patterns.	2020-02-09 11:45:54 -08:00
Craig Topper	ae4e49868a	[X86] Turn vXi1 any_extends into sign_extends in PreprocessISelDAG and remove some isel patterns. Similar to what we do for other vector any_extends, but instead of zero_extend we need to use sign_extend.	2020-02-06 21:32:53 -08:00
Craig Topper	4175d7e22e	[X86] Custom isel floating point X86ISD::CMP on pre-CMOV targets. Eliminate ConvertCmpIfNecessary If we don't have cmov, X87 compares write to FPSW and we need to move the bits to EFLAGS to use as JCC/SETCC/CMOV conditions. Previously this was done by calling ConvertCmpIfNecessary in multiple places which would emit the extra code for the FNSTSW, a shift, a truncate, and a SAHF instruction. Isel would then select trunc+X86ISD::CMP to a FUCOM instruction that produces FPSW. This patch centralizes all of the handling into a single custom isel handler. This allows us to remove ConvertCmpIfNecessary and a couple target specific ISD opcodes. Differential Revision: https://reviews.llvm.org/D73863	2020-02-06 10:43:06 -08:00
Craig Topper	600f2e1c4d	[X86] Remove SETB_C8r/SETB_C16r pseudo instructions. Use SETB_C32r and EXTRACT_SUBREG instead. Only 32 and 64 bit SBB are dependency breaking instructions on some CPUs. The 8 and 16 bit forms have to preserve upper bits of the GPR. This patch removes the smaller forms and selects the wider form instead. I had to do this with custom code as the tblgen generated code glued the eflags copytoreg to the extract_subreg instead of to the SETB pseudo. Longer term I think we can remove X86ISD::SETCC_CARRY and use (X86ISD::SBB zero, zero). We'll want to keep the pseudo and select (X86ISD::SBB zero, zero) to either a MOV32r0+SBB for targets where there is no dependency break and SETB_C32/SETB_C64 for targets that have a dependency break. May want some way to avoid the MOV32r0 if the instruction that produced the carry flag happened to def a register that we can use for the dependency. I think the flag copy lowering should be using NEG instead of SUB to handle SETB. That would avoid the MOV32r0 there. Or maybe it should use an ADC with -1 to recreate the carry flag and keep the SETB? That would avoid a MOVZX on the input of the SUB. Differential Revision: https://reviews.llvm.org/D74024	2020-02-06 10:22:24 -08:00
Craig Topper	d975910c50	[X86] Don't exit from foldOffsetIntoAddress if the Offset is 0, but AM.Disp is non-zero. This is an alternate fix for the issue D73606 was trying to solve. The main issue here is that we bailed out of foldOffsetIntoAddress if Offset is 0. But if we just found a symbolic displacement and AM.Disp became non-zero earlier, we still need to validate that AM.Disp with the symbolic displacement. This is my second attempt at committing this after failing build bots previously. One thing I realized about the previous attempt is that it's possible that AM.Disp is already non-zero and the new Offset changes it back to zero. In that case my previous attempt failed to update AM.Disp to zero. So this patch removes the early out for 0 and appropriately handles the 0 case in each check so we still update AM.Disp at the end.	2020-02-01 11:26:17 -08:00
Craig Topper	007a6a155c	Revert "[X86] Don't exit from foldOffsetIntoAddress if the Offset is 0, but AM.Disp is non-zero." Possibly causing build bot failures.	2020-01-29 22:59:05 -08:00
Craig Topper	1ef8e8b414	[X86] Don't exit from foldOffsetIntoAddress if the Offset is 0, but AM.Disp is non-zero. This is an alternate fix for the issue D73606 was trying to solve. The main issue here is that we bailed out of foldOffsetIntoAddress if Offset is 0. But if we just found a symbolic displacement and AM.Disp became non-zero earlier, we still need to validate that AM.Disp with the symbolic displacement. This passes fold-add-pcrel.ll. Differential Revision: https://reviews.llvm.org/D73608	2020-01-29 21:32:16 -08:00
Fangrui Song	bc15bf66dc	[X86] matchAdd: don't fold a large offset into a %rip relative address For `ret i64 add (i64 ptrtoint (i32* @foo to i64), i64 1701208431)`, ``` X86DAGToDAGISel::matchAdd ... // AM.setBaseReg(CurDAG->getRegister(X86::RIP, MVT::i64)); if (!matchAddressRecursively(N.getOperand(0), AM, Depth+1) && // Try folding offset but fail; there is a symbolic displacement, so offset cannot be too large !matchAddressRecursively(Handle.getValue().getOperand(1), AM, Depth+1)) return false; ... // Try again after commuting the operands. // AM.Disp = Val; foldOffsetIntoAddress() does not know there will be a symbolic displacement if (!matchAddressRecursively(Handle.getValue().getOperand(1), AM, Depth+1) && // AM.setBaseReg(CurDAG->getRegister(X86::RIP, MVT::i64)); !matchAddressRecursively(Handle.getValue().getOperand(0), AM, Depth+1)) // Succeeded! Produced leaq sym+disp(%rip),... return false; ``` `foldOffsetIntoAddress()` currently does not know there is a symbolic displacement and can fold a large offset. The produced `leaq sym+disp(%rip), %rax` instruction is relocated by an R_X86_64_PC32. If disp is large and sym+disp-rip>=2**31, there will be a relocation overflow. This approach is still not elegant. Unfortunately the isRIPRelative interface is a bit clumsy. I tried several solutions and eventually picked this one. Differential Revision: https://reviews.llvm.org/D73606	2020-01-28 22:30:52 -08:00
Craig Topper	9cc9120969	[X86] Turn FP_ROUND/STRICT_FP_ROUND into X86ISD::VFPROUND/STRICT_VFPROUND during PreprocessISelDAG to remove some duplicate isel patterns.	2020-01-11 11:06:52 -08:00
Craig Topper	81a3d987ce	[X86] Remove dead code from X86DAGToDAGISel::Select that is no longer needed now that we don't mutate strict fp nodes. NFC	2020-01-11 00:27:14 -08:00
Craig Topper	c2ddfa876f	[X86] Simplify code by removing an unreachable condition. NFCI For X87<->SSE conversions, the SSE type is always smaller than the X87 type. So we can always use the smallest type for the memory type.	2020-01-10 23:41:06 -08:00
Craig Topper	5fe5c0a60f	[X86] Preserve fpexcept property when turning strict_fp_extend and strict_fp_round into stack operations. We use the stack for X87 fp_round and for moving from SSE f32/f64 to X87 f64/f80. Or from X87 f64/f80 to SSE f32/f64. Note for the SSE<->X87 conversions the conversion always happens in the X87 domain. The load/store ops in the X87 instructions are able to signal exceptions.	2020-01-10 23:41:06 -08:00
Craig Topper	69806808b9	[X86] Use ReplaceAllUsesWith instead of ReplaceAllUsesOfValueWith to simplify some code. NFCI	2020-01-10 20:31:21 -08:00
Amara Emerson	df3f4e0d77	[X86] Fix an 8 bit testb being selected when folding a volatile i32 load pattern. Differential Revision: https://reviews.llvm.org/D71581	2020-01-06 11:46:42 -08:00
Liu, Chen3	8af492ade1	add strict float for round operation Differential Revision: https://reviews.llvm.org/D72026	2020-01-01 20:42:12 +08:00
Fangrui Song	5edb40c022	[SelectionDAG] Disallow indirect "i" constraint This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.	2019-12-29 16:50:42 -08:00
Craig Topper	a21beccea2	[X86] Add STRICT versions of CVTTP2SI, CVTTP2UI, CMPM, and CMPP. Differential Revision: https://reviews.llvm.org/D71850	2019-12-24 10:07:04 -08:00
Ulrich Weigand	0d3f782e41	[FPEnv][X86] More strict int <-> FP conversion fixes Fix several additional problems with the int <-> FP conversion logic both in common code and in the X86 target. In particular: - The STRICT_FP_TO_UINT expansion emits a floating-point compare. This compare can raise exceptions and therefore needs to be a strict compare. I've made it signaling (even though quiet would also be correct) as signaling is the more usual default for an LT. This code exists both in common code and in the X86 target. - The STRICT_UINT_TO_FP expansion algorithm was incorrect for strict mode: it emitted two STRICT_SINT_TO_FP nodes and then used a select to choose one of the results. This can cause spurious exceptions by the STRICT_SINT_TO_FP that ends up not chosen. I've fixed the algorithm to use only a single STRICT_SINT_TO_FP instead. - The !isStrictFPEnabled logic in DoInstructionSelection would sometimes do the wrong thing because it calls getOperationAction using the result VT. But for some opcodes, including [SU]INT_TO_FP, getOperationAction needs to be called using the operand VT. - Remove some (obsolete) code in X86DAGToDAGISel::Select that would mutate STRICT_FP_TO_[SU]INT to non-strict versions unnecessarily. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71840	2019-12-23 21:11:45 +01:00
Liu, Chen3	2f932b5729	Enable STRICT_FP_TO_SINT/UINT on X86 backend This patch is mainly for custom lowering the vector operation. Differential Revision: https://reviews.llvm.org/D71592	2019-12-19 14:49:13 +08:00
Craig Topper	f0df4218b6	[X86] Add a simple hack to IsProfitableToFold to prevent vselect+strict fp operations from being folded into masked instructions. We really need to update the isel patterns to prevent this, but that requires some tablegen de-tangling. So this hack will work for correctness in the short term.	2019-12-18 14:42:56 -08:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00
Wang, Pengfei	21bc8631fe	[FPEnv][X86] Constrained FCmp intrinsics enabling on X86 Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparison. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582	2019-12-11 08:23:09 +08:00
Liu, Chen3	bbf7860b93	add support for strict operation fpextend/fpround/fsqrt on X86 backend Differential Revision: https://reviews.llvm.org/D71184	2019-12-10 09:04:28 +08:00
Liu, Chen3	3041434450	Add strict fp support for instructions fadd/fsub/fmul/fdiv Differential Revision: https://reviews.llvm.org/D68757	2019-12-06 09:44:33 +08:00
Amy Huang	9e978bb01c	Add support for lowering 32-bit/64-bit pointers Summary: This follows a previous patch that changes the X86 datalayout to represent mixed size pointers (32-bit sext, 32-bit zext, and 64-bit) with address spaces (https://reviews.llvm.org/D64931) This patch implements the address space cast lowering to the corresponding sign extension, zero extension, or truncate instructions. Related to https://bugs.llvm.org/show_bug.cgi?id=42359 Reviewers: rnk, craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69639	2019-12-04 11:39:03 -08:00
Craig Topper	cfce8f2cfb	[X86] Add strict fp support for operations of X87 instructions This is the following patch of D68854. This patch adds basic operations of X87 instructions, including +, -, *, / , fp extensions and fp truncations. Patch by Chen Liu(LiuChen3) Differential Revision: https://reviews.llvm.org/D68857	2019-11-26 10:59:41 -08:00
Craig Topper	95f44cf44a	[X86] Mark vector STRICT_FADD/STRICT_FSUB as Legal and add mutation to X86ISelDAGToDAG This prevents LegalizeVectorOps from scalarizing them. We'll need to remove the X86 mutation code when we add isel patterns.	2019-11-21 16:19:18 -08:00
Hiroshi Yamauchi	52e377497d	[PGO][PGSO] DAG.shouldOptForSize part. Summary: (Split off of D67120) SelectionDAG::shouldOptForSize changes for profile guided size optimization. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70095	2019-11-21 14:16:00 -08:00
Craig Topper	c9e8e808cf	[SelectionDAG][X86] Mutate strictFP nodes to non-strict in DoInstructionSelection when the node is marked Expand rather than when it is not Legal. This allows operations that are marked Custom, but have some type combinations that are legal to get past this code. Add custom mutation code to X86's Select function for the nodes that don't have isel patterns yet.	2019-11-20 10:36:02 -08:00
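
Commit bc15bf66dc above notes that `leaq sym+disp(%rip)` is relocated by an R_X86_64_PC32, which overflows once `sym+disp-rip >= 2**31`. The range check behind that statement can be sketched as follows. This is only an illustration, not the code from the patch: the helper name `fitsPC32` and the addresses used in the examples are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): returns true when sym + disp - pc
// fits in the signed 32-bit field that an R_X86_64_PC32 relocation fills in.
constexpr bool fitsPC32(std::int64_t sym, std::int64_t disp, std::int64_t pc) {
  const std::int64_t value = sym + disp - pc;
  return value >= std::numeric_limits<std::int32_t>::min() &&
         value <= std::numeric_limits<std::int32_t>::max();
}

// Example checks using made-up addresses for the symbol and the instruction.
inline void runExamples() {
  const std::int64_t sym = 0x40000000; // assumed symbol address
  const std::int64_t pc = 0x1000;      // assumed %rip value

  // A small displacement stays inside the signed 32-bit range.
  assert(fitsPC32(sym, 16, pc));

  // The offset from the commit message pushes the result past 2**31 - 1
  // for these particular addresses, so the relocation would overflow.
  assert(!fitsPC32(sym, 1701208431, pc));
}
```

Whether a given offset overflows depends on where the symbol and the instruction end up at link time, which is why the fold has to be conservative whenever a symbolic displacement is present.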


950 Commits