llvm-project

Author	SHA1	Message	Date
Wang, Pengfei	6f7f5b54c8	[X86] AVX512FP16 instructions enabling 1/6 1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105263	2021-08-10 12:46:01 +08:00
Craig Topper	c997867dc0	[X86] Add ISD::FREEZE and ISD::AssertAlign to the list of opcodes that don't guarantee upper 32 bits are zero. The freeze issue was reported here https://llvm.discourse.group/t/bug-or-feature-freeze-instruction/3639 I don't have a test for AssertAlign. I just noticed it was missing and assume it should be similar to the other two Asserts. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104178	2021-06-12 09:52:29 -07:00
Florian Hahn	c2d44bd230	[X86] Lower calls with clang.arc.attachedcall bundle This patch adds support for lowering function calls with the `clang.arc.attachedcall` bundle. The goal is to expand such calls to the following sequence of instructions: callq @fn movq %rax, %rdi callq _objc_retainAutoreleasedReturnValue / _objc_unsafeClaimAutoreleasedReturnValue This sequence of instructions triggers Objective-C runtime optimizations, hence we want to ensure no instructions get moved in between them. This patch achieves that by adding a new CALL_RVMARKER ISD node, which gets turned into the CALL64_RVMARKER pseudo, which eventually gets expanded into the sequence mentioned above. The ObjC runtime function to call is determined by the argument in the bundle, which is passed through as a target constant to the pseudo. @ahatanak is working on using this attribute in the front- & middle-end. Together with the front- & middle-end changes, this should address PR31925 for X86. This is the X86 version of `46bc40e502`, which added similar support for AArch64. Reviewed By: ab Differential Revision: https://reviews.llvm.org/D94597	2021-05-21 16:33:58 +01:00
Simon Pilgrim	029e41ec98	[X86] Ensure multiclass ATOMIC_RMW_BINOP is tagged as MayLoad and MayStore These are RMW ops and should be tagged as both loads and stores.	2021-04-27 14:11:22 +01:00
Simon Pilgrim	a0677ff5eb	[X86] Rename multiclass ATOMIC_LOAD_BINOP -> ATOMIC_RMW_BINOP. NFCI. Noticed while triaging the rG2149aa73f640c96 regressions - the LXADD ops are load+store RMW instructions, not just loads.	2021-04-26 15:17:30 +01:00
Alexey Lapshin	cf7cdaff64	[X86][VARARG] Avoid spilling xmm registers for va_start. That review is extracted from D69372. It fixes https://bugs.llvm.org/show_bug.cgi?id=42219 bug. For the noimplicitfloat mode, the compiler mustn't generate floating-point code if it was not asked directly to do so. This rule does not work with variable function arguments currently. Though compiler correctly guards block of code, which copies xmm vararg parameters with a check for %al, it does not protect spills for xmm registers. Thus, such spills are generated in non-protected areas and could break code, which does not expect floating-point data. The problem happens in -O0 optimization mode. With this optimization level there is used FastRegisterAllocator, which spills virtual registers at basic block boundaries. Register Allocator does not protect spills with additional control-flow modifications. Thus to resolve that problem, it is suggested to not copy incoming physical registers into virtual registers. Instead, store incoming physical xmm registers into the memory from scratch. Differential Revision: https://reviews.llvm.org/D80163	2021-03-06 15:25:47 +03:00
Harald van Dijk	9eac818370	[X86] Fix variadic argument handling for x32 The X86-64 ABI defines va_list as typedef struct { unsigned int gp_offset; unsigned int fp_offset; void *overflow_arg_area; void *reg_save_area; } va_list[1]; This means the size, alignment, and reg_save_area offset will depend on whether we are in LP64 or in ILP32 mode, so this commit adds the checks. Additionally, the VAARG_64 pseudo-instruction assumed 64-bit pointers, so this commit adds a VAARG_X32 pseudo-instruction that behaves just like VAARG_64, except for assuming 32-bit pointers. Some of these changes were originally done by Michael Liao <michael.hliao@gmail.com>. Fixes https://bugs.llvm.org/show_bug.cgi?id=48428. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D93160	2020-12-14 23:47:27 +00:00
Harald van Dijk	c9be4ef184	[X86] Add TLS_(base_)addrX32 for X32 mode LLVM has TLS_(base_)addr32 for 32-bit TLS addresses in 32-bit mode, and TLS_(base_)addr64 for 64-bit TLS addresses in 64-bit mode. x32 mode wants 32-bit TLS addresses in 64-bit mode, which were not yet handled. This adds TLS_(base_)addrX32 as copies of TLS_(base_)addr64, except that they use tls32(base)addr rather than tls64(base)addr, and then restricts TLS_(base_)addr64 to 64-bit LP64 mode, TLS_(base_)addrX32 to 64-bit ILP32 mode. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92346	2020-12-02 22:20:36 +00:00
Craig Topper	63ba82ed00	[X86] Use TargetConstant for immediates for VASTART_SAVE_XMM_REGS.	2020-10-25 12:52:56 -07:00
Craig Topper	2ed16aa66f	[X86] Use TargetConstant instead of Constant for operands to X86vaarg64.	2020-10-25 12:24:59 -07:00
Craig Topper	a222d832d5	[X86] Use TargetConstant for FPDiff with X86::TC_RETURN. It's required to be a constant and can never be in a register so make it explicit.	2020-10-25 00:29:11 -07:00
Craig Topper	68e1a8d207	[X86] Defer the creation of LCMPXCHG16B_SAVE_RBX until finalize-isel We need to use LCMPXCHG16B_SAVE_RBX if RBX/EBX is being used as the frame pointer. We previously checked for this during type legalization, but that's too early to know for sure if the base pointer is needed. This patch adds a new pseudo instruction to emit from isel that uses a virtual register for the RBX input. Then we use the custom inserter hook to emit LCMPXCHG16B if RBX isn't needed as a base pointer or LCMPXCHG16B_SAVE_RBX if it is. Fixes PR42064. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D88808	2020-10-07 17:00:43 -07:00
Craig Topper	4da4e7cb20	[X86] Remove X86ISD::LCMPXCHG8_SAVE_EBX_DAG and LCMPXCHG8B_SAVE_EBX pseudo instruction This and its friend X86ISD::LCMPXCHG8_SAVE_RBX_DAG are used if we need to avoid clobbering the frame pointer in EBX/RBX. EBX/RBX are only used as a frame pointer in 64-bit mode. In 64-bit mode we don't use CMPXCHG8B since we have a GR64 cmpxchg available. So we don't need special handling for LCMPXCHG8B. Split from D88808 Differential Revision: https://reviews.llvm.org/D88853	2020-10-05 15:03:07 -07:00
Craig Topper	b18026114a	[X86] MWAITX_SAVE_RBX should not have EBX as an implicit use. RBX was copied to a virtual register before this instruction was created. And the EBX input for the final MWAITX is still in a virtual register. So EBX isn't read by this pseudo.	2020-10-04 20:34:31 -07:00
Craig Topper	4b38ceb0eb	[X86] Remove MWAITX_SAVE_EBX pseudo instruction. Always save/restore the full %rbx register even in gnux32. ebx/rbx only needs to be saved when 64-bit registers are supported anyway. It should be fine to save/restore the whole rbx register even in gnux32 where the base is technically just ebx. This matches what we do for cmpxchg16b where rbx is saved/restored regardless of gnux32.	2020-10-04 16:28:15 -07:00
Craig Topper	952dfd76c6	[X86] Correct the implicit defs/uses for the MWAITX pseudo instructions. MWAITX doesn't touch EFLAGS so no pseudos should def EFLAGS. The SAVE_EBX/RBX pseudos only needs to def the EBX register that the expansion overwrites. The EAX and ECX registers are only read. The pseudo emitted during isel that is used by the custom inserter shouldn't have any implicit defs or uses since everything is in vregs.	2020-10-04 15:28:38 -07:00
Craig Topper	0db97234cf	[X86] Remove usesCustomInserter from MWAITX_SAVE_EBX and MWAITX_SAVE_RBX. NFC These are now emitted by a CustomInserter rather than using a custom inserter themselves.	2020-10-04 15:28:38 -07:00
Craig Topper	7f3da48885	[X86] Remove X86ISD::MWAITX_DAG. Just match the intrinsic to the custom inserter pseudo instruction during isel.	2020-10-03 18:44:53 -07:00
Pierre Gousseau	cda6b09242	[X86] Make sure we do not clobber RBX with mwaitx when used as a base pointer. mwaitx uses EBX as one of its argument. Using this instruction clobbers RBX as it is defined to hold one of the input. When the backend uses dynamically allocated stack, RBX is used as a reserved register for the base pointer. This patch is adapted from @qcolombet patch for cmpxchg at r263325. This fixes PR43528. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D73475	2020-08-26 11:20:31 +01:00
Craig Topper	ac82b918c7	[X86] Use h-register for final XOR of __builtin_parity on 64-bit targets. This adds an isel pattern and special XOR8rr_NOREX instruction to enable the use of h-registers for __builtin_parity. This avoids a copy and a shift instruction. The NOREX instruction is in case register allocation doesn't use the matching l-register for some reason. If a R8-R15 register gets picked instead, we won't be able to encode the instruction since an h-register can't be used with a REX prefix. Fixes PR46954	2020-08-03 10:10:17 -07:00
Craig Topper	d72cb4ce21	Recommit "[X86] Separate imm from relocImm handling." Fix the copy/paste mistake that caused it to fail previously	2020-06-15 10:59:43 -07:00
Hans Wennborg	f47a776628	Revert "[X86] Separate imm from relocImm handling." > relocImm was a complexPattern that handled both ConstantSDNode > and X86Wrapper. But it was only applied selectively because using > it would cause patterns to be not importable into FastISel or > GlobalISel. So it only got applied to flag setting instructions, > stores, RMW arithmetic instructions, and rotates. > > Most of the test changes are a result of making patterns available > to GlobalISel or FastISel. The absolute-cmp.ll change is due to > this fixing a pattern ordering issue to make an absolute symbol > match to an 8-bit immediate before trying a 32-bit immediate. > > I tried to use PatFrags to reduce the repetition, but I was getting > errors from TableGen. This caused "Invalid EmitNode" assertions, see the llvm-commits thread for discussion.	2020-06-15 16:14:59 +02:00
Craig Topper	8885a7640b	[X86] Separate imm from relocImm handling. relocImm was a complexPattern that handled both ConstantSDNode and X86Wrapper. But it was only applied selectively because using it would cause patterns to be not importable into FastISel or GlobalISel. So it only got applied to flag setting instructions, stores, RMW arithmetic instructions, and rotates. Most of the test changes are a result of making patterns available to GlobalISel or FastISel. The absolute-cmp.ll change is due to this fixing a pattern ordering issue to make an absolute symbol match to an 8-bit immediate before trying a 32-bit immediate. I tried to use PatFrags to reduce the repetition, but I was getting errors from TableGen.	2020-06-13 11:29:28 -07:00
Craig Topper	324e13668e	[X86] Split imm handling out of selectMOV64Imm32 and add a separate isel pattern. This makes the pattern available to global isel.	2020-06-10 11:12:36 -07:00
serge-sans-paille	6c733f5a13	Use Pseudo Instruction to carry stack probing information Instead of using a fake call and metadata to temporarily represent a probed static alloca, use a pseudo instruction. This is inspired by the SystemZ approach proposed in https://reviews.llvm.org/D78717. Differential Revision: https://reviews.llvm.org/D80641	2020-06-02 16:14:06 +02:00
Kazuaki Ishizaki	0312b9f550	[llvm] NFC: Fix trivial typo in rst and td files Differential Revision: https://reviews.llvm.org/D77469	2020-04-23 14:26:32 +09:00
Scott Constable	71e8021d82	[X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810	2020-04-02 21:55:13 -07:00
Simon Pilgrim	b3b4727a3e	[X86] Replace (most) X86ISD::SHLD/SHRD usage with ISD::FSHL/FSHR generic opcodes (PR39467) For i32 and i64 cases, X86ISD::SHLD/SHRD are close enough to ISD::FSHL/FSHR that we can use them directly, we just need to account for the operand commutation for SHRD. The i16 SHLD/SHRD case is annoying as the shift amount is modulo-32 (vs funnel shift modulo-16), so I've added X86ISD::FSHL/FSHR equivalents, which matches the generic implementation in all other terms. Something I'm slightly concerned with is that ISD::FSHL/FSHR legality is controlled by the Subtarget.isSHLDSlow() feature flag - we don't normally use non-ISA features for this but it allows the DAG combines to continue to operate after legalization in a lot more cases. The X86 *bits.ll changes are all affected by the same issue - we now have a "FSHR(-1,-1,amt) -> ROTR(-1,amt) -> (-1)" simplification that reduces the dependencies enough for the branch fall through code to mess up. Differential Revision: https://reviews.llvm.org/D75748	2020-03-11 11:17:49 +00:00
Craig Topper	78be618717	[X86] Add CMOV_VR64 pseudo instruction for MMX. Remove mmx handling from combineSelect. The combineSelect code was casting to i64 without any check that i64 was legal. This can break after type legalization. It also required splitting the mmx register on 32-bit targets. It's not clear that this makes sense. Instead switch to using a cmov pseudo like we do for XMM/YMM/ZMM.	2020-02-20 20:30:56 -08:00
Craig Topper	e5782377f3	[X86] Add CMOV_VK1 pseudo so we don't crash on v1i1 ISD::SELECT	2020-02-20 15:13:48 -08:00
Craig Topper	656d66f5fc	[X86] Use custom isel for (X86sbb_flag 0, 0) so we can use 32-bit SBB for i8/i16. We were using MOV32r0 and an extract_subreg as an input. By using custom isel we can move the extract_subreg to after the SBB instead of on the input.	2020-02-09 13:19:35 -08:00
serge_sans_paille	e67cbac812	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of `39f50da2a3` with proper LiveIn declaration, better option handling and more portable testing. Differential Revision: https://reviews.llvm.org/D68720	2020-02-09 10:42:45 +01:00
serge-sans-paille	4546211600	Revert "Support -fstack-clash-protection for x86" This reverts commit `0fd51a4554`. Failures: http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354	2020-02-09 10:06:31 +01:00
serge_sans_paille	0fd51a4554	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of `39f50da2a3` with proper LiveIn declaration, better option handling and more portable testing. Differential Revision: https://reviews.llvm.org/D68720	2020-02-09 09:35:42 +01:00
serge-sans-paille	658495e6ec	Revert "Support -fstack-clash-protection for x86" This reverts commit `e229017732`. Failures: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604 http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308	2020-02-08 14:26:22 +01:00
serge_sans_paille	e229017732	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of `39f50da2a3` with better option handling and more portable testing Differential Revision: https://reviews.llvm.org/D68720	2020-02-08 13:31:52 +01:00
Nico Weber	b03c3d8c62	Revert "Support -fstack-clash-protection for x86" This reverts commit `4a1a0690ad`. Breaks tests on mac and win, see https://reviews.llvm.org/D68720	2020-02-07 14:49:38 -05:00
serge_sans_paille	4a1a0690ad	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html This a recommit of `39f50da2a3` with correct option flags set. Differential Revision: https://reviews.llvm.org/D68720	2020-02-07 19:54:39 +01:00
serge-sans-paille	f6d98429fc	Revert "Support -fstack-clash-protection for x86" This reverts commit `39f50da2a3`. The -fstack-clash-protection is being passed to the linker too, which is not intended. Reverting and fixing that in a later commit.	2020-02-07 11:36:53 +01:00
serge_sans_paille	39f50da2a3	Support -fstack-clash-protection for x86 Implement protection against the stack clash attack [0] through inline stack probing. Probe stack allocation every PAGE_SIZE during frame lowering or dynamic allocation to make sure the page guard, if any, is touched when touching the stack, in a similar manner to GCC[1]. This extends the existing `probe-stack' mechanism with a special value `inline-asm'. Technically the former uses function call before stack allocation while this patch provides inlined stack probes and chunk allocation. Only implemented for x86. [0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt [1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html Differential Revision: https://reviews.llvm.org/D68720	2020-02-07 10:56:15 +01:00
Craig Topper	600f2e1c4d	[X86] Remove SETB_C8r/SETB_C16r pseudo instructions. Use SETB_C32r and EXTRACT_SUBREG instead. Only 32 and 64 bit SBB are dependency breaking instructions on some CPUs. The 8 and 16 bit forms have to preserve upper bits of the GPR. This patch removes the smaller forms and selects the wider form instead. I had to do this with custom code as the tblgen generated code glued the eflags copytoreg to the extract_subreg instead of to the SETB pseudo. Longer term I think we can remove X86ISD::SETCC_CARRY and use (X86ISD::SBB zero, zero). We'll want to keep the pseudo and select (X86ISD::SBB zero, zero) to either a MOV32r0+SBB for targets where there is no dependency break and SETB_C32/SETB_C64 for targets that have a dependency break. May want some way to avoid the MOV32r0 if the instruction that produced the carry flag happened to def a register that we can use for the dependency. I think the flag copy lowering should be using NEG instead of SUB to handle SETB. That would avoid the MOV32r0 there. Or maybe it should use an ADC with -1 to recreate the carry flag and keep the SETB? That would avoid a MOVZX on the input of the SUB. Differential Revision: https://reviews.llvm.org/D74024	2020-02-06 10:22:24 -08:00
Craig Topper	a3d489e87e	[X86] Add a DAG combine for (i32 (sext (i8 (x86isd::setcc_carry)))) -> (i32 (x86isd::setcc_carry)) and remove isel patterns. Same for any_extend though we don't have coverage for that. The test changes are because isel didn't check one use of the setcc_carry. So in isel we would end up with two different sized setcc_carry instructions. And since it clobbers the flags we would need to recreate the flags for the second instruction. This code handles additional uses by truncating the new wide setcc_carry back to the original size for those uses.	2020-02-04 22:40:36 -08:00
Reid Kleckner	2d89e0a098	[SEH] Remove CATCHPAD SDNode and X86::EH_RESTORE MachineInstr The CATCHPAD node mostly existed to be selected into the EH_RESTORE instruction, which sets the frame back up when 32-bit Windows exceptions return to the parent function. However, creating this MachineInstr early increases the risk that other passes will come along and insert instructions that use the stack before ESP and EBP are restored. That happened in PR44697. Instead of representing these in the instruction stream early, delay it until PEI. Mark the blocks where this needs to happen as EHPads, but not funclet entry blocks. Passes after PEI have to be careful not to hoist instructions that can use stack across frame setup instructions, so this should be relatively reliable. Fixes PR44697 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D73752	2020-02-04 15:13:12 -08:00
Craig Topper	cd14b4a62b	[X86] Remove unneeded code that looks for (and (i8 (X86setcc_c)) I don't believe we use this construct anymore so I don't think we need to look for it.	2020-02-03 23:18:11 -08:00
Craig Topper	e3c2163ffe	[X86] Use TargetConstant for condition code on X86ISD::SETCC/CMOV/BRCOND nodes. This removes the need for ConvertToTarget opcodes in the isel table. It's also consistent with the recent changes to use TargetConstant for intrinsic nodes that always take immediates. Differential Revision: https://reviews.llvm.org/D67902 llvm-svn: 372645	2019-09-23 19:48:20 +00:00
Philip Reames	0b4d67ca35	Rename nonvolatile_load/store to simple_load/store [NFC] Implement the TODO from D66318. llvm-svn: 371789	2019-09-12 23:03:39 +00:00
Reid Kleckner	3fa07dee94	Revert [Windows] Disable TrapUnreachable for Win64, add SEH_NoReturn This reverts r370525 (git commit `0bb1630685`) Also reverts r370543 (git commit `185ddc08ee`) The approach I took only works for functions marked `noreturn`. In general, a call that is not known to be noreturn may be followed by unreachable for other reasons. For example, there could be multiple call sites to a function that throws sometimes, and at some call sites, it is known to always throw, so it is followed by unreachable. We need to insert an `int3` in these cases to pacify the Windows unwinder. I think this probably deserves its own standalone, Win64-only fixup pass that runs after block placement. Implementing that will take some time, so let's revert to TrapUnreachable in the mean time. llvm-svn: 370829	2019-09-03 22:27:27 +00:00
Reid Kleckner	0bb1630685	[Windows] Disable TrapUnreachable for Win64, add SEH_NoReturn Users have complained llvm.trap produce two ud2 instructions on Win64, one for the trap, and one for unreachable. This change fixes that. TrapUnreachable was added and enabled for Win64 in r206684 (April 2014) to avoid poorly understood issues with the Windows unwinder. There seem to be two major things in play: - the unwinder - C++ EH, _CxxFrameHandler3 & co The unwinder disassembles forward from the return address to scan for epilogues. Inserting a ud2 had the effect of stopping the unwinder, and ensuring that it ran the EH personality function for the current frame. However, it's not clear what the unwinder does when the return address happens to be the last address of one function and the first address of the next function. The Visual C++ EH personality, _CxxFrameHandler3, needs to figure out what the current EH state number is. It does this by consulting the ip2state table, which maps from PC to state number. This seems to go wrong when the return address is the last PC of the function or catch funclet. I'm not sure precisely which system is involved here, but in order to address these real or hypothetical problems, I believe it is enough to insert int3 after a call site if it would otherwise be the last instruction in a function or funclet. I was able to reproduce some similar problems locally by arranging for a noreturn call to appear at the end of a catch block immediately before an unrelated function, and I confirmed that the problems go away when an extra trailing int3 instruction is added. MSVC inserts int3 after every noreturn function call, but I believe it's only necessary to do it if the call would be the last instruction. This change inserts a pseudo instruction that expands to int3 if it is in the last basic block of a function or funclet. I did what I could to run the Microsoft compiler EH tests, and the ones I was able to run showed no behavior difference before or after this change. Differential Revision: https://reviews.llvm.org/D66980 llvm-svn: 370525	2019-08-30 20:46:39 +00:00
Craig Topper	8582ecd8d9	[X86] Introduce new MOVSSrm/MOVSDrm opcodes that use VR128 register class. Rename the old versions that use FR32/FR64 to MOVSSrm_alt/MOVSDrm_alt. Use the new versions in patterns that previously used a COPY_TO_REGCLASS to VR128. These patterns expect the upper bits to be zero. The current set up appears to work, but I'm not sure we should be enforcing upper bits being zero through a COPY_TO_REGCLASS. I wanted to flip the arrangement and use a COPY_TO_REGCLASS to FR32/FR64 for the patterns that need an f32/f64 result, but that complicated fastisel and globalisel. I've been doing some experiments with reducing some isel patterns and ended up in a situation where I had a (SUBREG_TO_REG (COPY_TO_REGCLASS (VMOVSSrm), VR128)) and our post-isel peephole was unable to avoid using an instruction for the SUBREG_TO_REG due to the COPY_TO_REGCLASS. Having a VR128 instruction removes the COPY_TO_REGCLASS that was breaking this. llvm-svn: 363643	2019-06-18 03:23:11 +00:00
Craig Topper	c9d7484aa3	[X86] Add CMOV_FR32X/CMOV_FR64X pseudo instructions. Use them in fast isel to fix a machine verifier error after adding test cases. Fast isel picks the FR32X/FR64X register classes when lowering pseudo select, but it didn't have the right opcode to go with it. llvm-svn: 360524	2019-05-11 16:00:28 +00:00

344 Commits