llvm-project

Commit Graph

Author	SHA1	Message	Date
Kyungwoo Lee	4f58b1bd29	[AArch64] Homogeneous Prolog and Epilog Size Optimization Second land attempt. MachineVerifier DefRegState expensive check errors fixed. Prologs and epilogs handle callee-save registers and tend to be irregular with different immediate offsets that are not often handled by the MachineOutliner. Commit D18619/a5335647d5e8 (combining stack operations) stretched irregularity further. This patch tries to emit homogeneous stores and loads with the same offset for prologs and epilogs respectively. We have observed that this canonicalizes (homogenizes) prologs and epilogs significantly and results in a greatly increased chance of outlining, resulting in a code size reduction. Despite the above results, there are still size wins to be had that the MachineOutliner does not provide due to the special handling of X30/LR. To handle the LR case, this patch custom-outlines prologs and epilogs in place. It does this by doing the following: * Injects HOM_Prolog and HOM_Epilog pseudo instructions during a Prolog and Epilog Injection Pass. * Lowers and optimizes said pseudos in an AArch64LowerHomogeneousPrologEpilog Pass. * Outlined helpers are created on demand. Identical helpers are merged by the linker. * An opt-in flag is introduced to enable this feature. Another threshold flag is also introduced to control the aggressiveness of outlining for the application's needs. This reduced code size by an average of 4% on LLVM-TestSuite/CTMark targeting arm64/-Oz. Differential Revision: https://reviews.llvm.org/D76570	2021-02-02 14:57:26 -08:00
Puyan Lotfi	8f7f2c4211	Revert "[AArch64] Homogeneous Prolog and Epilog Size Optimization" This reverts commit `0426be3df6`. Reverting due to some expensive-checks failures in tests.	2021-02-02 02:33:44 -05:00
Kyungwoo Lee	0426be3df6	[AArch64] Homogeneous Prolog and Epilog Size Optimization Prologs and epilogs handle callee-save registers and tend to be irregular with different immediate offsets that are not often handled by the MachineOutliner. Commit D18619/a5335647d5e8 (combining stack operations) stretched irregularity further. This patch tries to emit homogeneous stores and loads with the same offset for prologs and epilogs respectively. We have observed that this canonicalizes (homogenizes) prologs and epilogs significantly and results in a greatly increased chance of outlining, resulting in a code size reduction. Despite the above results, there are still size wins to be had that the MachineOutliner does not provide due to the special handling of X30/LR. To handle the LR case, this patch custom-outlines prologs and epilogs in place. It does this by doing the following: * Injects HOM_Prolog and HOM_Epilog pseudo instructions during a Prolog and Epilog Injection Pass. * Lowers and optimizes said pseudos in an AArch64LowerHomogeneousPrologEpilog Pass. * Outlined helpers are created on demand. Identical helpers are merged by the linker. * An opt-in flag is introduced to enable this feature. Another threshold flag is also introduced to control the aggressiveness of outlining for the application's needs. This reduced code size by an average of 4% on LLVM-TestSuite/CTMark targeting arm64/-Oz. Differential Revision: https://reviews.llvm.org/D76570	2021-02-02 00:26:51 -05:00
Bradley Smith	42635856ed	[AArch64][SVE] Allow accesses to SVE stack objects to use frame pointer The layout of the stack frame for SVE means that using the frame pointer rather than the stack pointer for an access to an SVE stack object removes the need for an additional add to jump over the non-SVE objects. Likewise the opposite is true for non-SVE stack objects. This patch allows for the former to be done by having HasFP return true in the presence of both SVE and non-SVE stack objects, and also fixes a minor issue whereby the latter would not be done for certain offsets.	2021-01-28 12:39:57 +00:00
Hsiangkai Wang	914e2f5a02	[NFC] Use generic name for scalable vector stack ID. Differential Revision: https://reviews.llvm.org/D94471	2021-01-13 10:57:43 +08:00
Mark Murray	af7cce2fa4	[AArch64] Add +pauth architecture option, allowing the v8.3a pointer authentication extension. Differential Revision: https://reviews.llvm.org/D94083	2021-01-08 13:21:11 +00:00
Jay Foad	000400ca0a	Fix speling in comments. NFC.	2020-11-23 14:43:24 +00:00
Sander de Smalen	d57bba7cf8	[SVE] Return StackOffset for TargetFrameLowering::getFrameIndexReference. To accommodate frame layouts that have both fixed and scalable objects on the stack, describing a stack location or offset using a pointer + uint64_t is not sufficient. For this reason, we've introduced the StackOffset class, which models both the fixed- and scalable sized offsets. The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset, so that this can be used in other interfaces, such as to eliminate frame indices in PEI or to emit Debug locations for variables on the stack. This patch is purely mechanical and doesn't change the behaviour of how the result of this function is used for fixed-sized offsets. The patch adds various checks to assert that the offset has no scalable component, as frame offsets with a scalable component are not yet supported in various places. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D90018	2020-11-05 11:02:18 +00:00
Sander de Smalen	73b6cb67dc	[NFCI] Replace AArch64StackOffset by StackOffset. This patch replaces the AArch64StackOffset class by the generic one defined in TypeSize.h. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D88983	2020-11-04 08:49:00 +00:00
Evgenii Stepanov	2e794a46b5	[AArch64] Stack frame reordering. Implement stack frame reordering in the AArch64 backend. Unlike the X86 implementation, AArch64 does not seem to benefit from "access density" based frame reordering, mainly because it has a much smaller variety of addressing modes, and the fact that all instructions are 4 bytes so each frame object is either in range of an instruction (and then the access is "free") or not (and that has a code size cost of 4 bytes). This change improves Memory Tagging codegen by * Placing an object that has been chosen as the base tagged pointer of the function at SP + 0. This saves one instruction to setup the pointer (IRG does not have an offset immediate), and more because that object can now be referenced without materializing its tagged address in a scratch register. * Placing objects that go out of scope simultaneously together. This exposes opportunities for instruction merging in tryMergeAdjacentSTG. Differential Revision: https://reviews.llvm.org/D72366	2020-10-15 12:50:16 -07:00
Evgenii Stepanov	2f63e57fa5	[MTE] Pin the tagged base pointer to one of the stack slots. Summary: Pin the tagged base pointer to one of the stack slots, and (if necessary) rewrite tag offsets so that an object that occupies that slot has both address and tag offsets of 0. This allows ADDG instructions for that object to be eliminated and their uses replaced with the tagged base pointer itself. This optimization must be done in machine instructions and not in the IR instrumentation pass, because referring to a stack slot through an IRG pointer would confuse the stack coloring pass. The optimization makes a (pretty naive) attempt to find the slot that would benefit the most by counting the uses of stack slots in the function. Reviewers: ostannard, pcc Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72365	2020-10-15 12:50:16 -07:00
Martin Storsjö	7d07405761	[AArch64] Prefer prologues with sp adjustments merged into stp/ldp for WinCFI, if optimizing for size This makes the prologue match the windows canonical layout, for cases without a frame pointer. This can potentially be slower (a longer dependency chain of the sp register, and potentially one arithmetic operation more on some cores), but gives notable size improvements. The previous two commits shrink a 166 KB xdata section by 49 KB, and if the change from this commit is enabled, it shrinks the xdata section by another 25 KB. In total, since the start of the recent arm64 unwind info cleanups and optimizations (since before commit `37ef743cbf`), the xdata+pdata sections of the same test DLL have shrunk from 407 KB in total originally, to 163 KB now. Differential Revision: https://reviews.llvm.org/D88701	2020-10-03 21:37:22 +03:00
Martin Storsjö	890af2f003	[AArch64] Allow pairing lr with other GPRs for WinCFI This saves one instruction per prologue/epilogue for any function with an odd number of callee-saved GPRs, but more importantly, allows such functions to match the packed unwind format. Differential Revision: https://reviews.llvm.org/D88699	2020-10-03 21:37:22 +03:00
Martin Storsjö	3780a4e568	[AArch64] Match the windows canonical callee saved register order On windows, the callee saved registers in a canonical prologue are ordered starting from a lower register number at a lower stack address (with the possible gap for aligning the stack at the top); this is the opposite order that llvm normally produces. To achieve this, reverse the order of the registers in the assignCalleeSavedSpillSlots callback, to get the stack objects laid out by PrologEpilogInserter in the right order, and adjust computeCalleeSaveRegisterPairs to lay them out from the bottom up. This allows generated prologs more often to match the format that allows the unwind info to be written as packed info. Differential Revision: https://reviews.llvm.org/D88677	2020-10-03 21:37:22 +03:00
Martin Storsjö	afb4e0f289	[AArch64] Omit SEH directives for the epilogue if none are needed For these cases, we already omit the prologue directives, if (!AFI->hasStackFrame() && !windowsRequiresStackProbe && !NumBytes). When writing the epilogue (after the prolog has been written), if the function doesn't have the WinCFI flag set (i.e. if no prologue was generated), assume that no epilogue will be needed either, and don't emit any epilog start pseudo instruction. After completing the epilogue, make sure that it actually matched the prologue. Previously, when epilogue start/end was generated, but no prologue, the unwind info for such functions actually was huge; 12 bytes xdata (4 bytes header, 4 bytes for one non-folded epilogue header, 4 bytes for padded opcodes) and 8 bytes pdata. Because the epilog consisted of one opcode (end) but the prolog was empty (no .seh_endprologue), the epilogue couldn't be folded into the prologue, and thus couldn't be considered for packed form either. On a 6.5 MB DLL with 110 KB pdata and 166 KB xdata, this gets rid of 38 KB pdata and 62 KB xdata. Differential Revision: https://reviews.llvm.org/D88641	2020-10-02 09:12:56 +03:00
Martin Storsjö	51e74e21aa	[AArch64] Remove a duplicate call to setHasWinCFI. NFCI. The function already has a cleanup scope that calls the same whenever the function is exited. When reading the code, seeing that this return codepath has an explicit call while other return paths lack it is confusing. In the hypothetical case of a function having a prologue that set the HasWinCFI flag in the MF, but the epilogue containing no WinCFI instructions, the HasWinCFI flag in the MF would end up reset back to false. Differential Revision: https://reviews.llvm.org/D88636	2020-10-01 19:03:27 +03:00
Momchil Velikov	a88c722e68	[AArch64] PAC/BTI code generation for LLVM generated functions PAC/BTI-related codegen in the AArch64 backend is controlled by a set of LLVM IR function attributes, added to the function by Clang, based on command-line options and GCC-style function attributes. However, functions generated in the LLVM middle end (for example, asan.module.ctor or __llvm_gcov_write_out) do not get any attributes and the backend incorrectly does not do any PAC/BTI code generation. This patch records the default state of PAC/BTI codegen in a set of LLVM IR module-level attributes, based on command-line options: * "sign-return-address", with non-zero value means generate code to sign return addresses (PAC-RET), zero value means disable PAC-RET. * "sign-return-address-all", with non-zero value means enable PAC-RET for all functions, zero value means enable PAC-RET only for functions which spill LR. * "sign-return-address-with-bkey", with non-zero value means use B-key for signing, zero value means use A-key. This set of attributes is always added for AArch64 targets (as opposed, for example, to interpreting a missing attribute as having a value 0) in order to be able to check for conflicts when combining module attributes during LTO. Module-level attributes are overridden by function level attributes. All the decision making about whether or not to generate PAC and/or BTI code is factored out into AArch64FunctionInfo, there shouldn't be any places left, other than AArch64FunctionInfo, which directly examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which is/will-be handled by a separate patch. Differential Revision: https://reviews.llvm.org/D85649	2020-09-25 11:47:14 +01:00
Eli Friedman	b92d084910	[AArch64][SVE] Fix frame offset calculation when d8 is saved. If d8 is saved, the fp is not actually adjacent to the SVE spills/allocations. Fix the offset calculation to account for this. Differential Revision: https://reviews.llvm.org/D88117	2020-09-23 11:33:53 -07:00
Owen Anderson	5987da8764	Revert "Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain"" This reverts commit `bc9a29b9ee`. The reasoning that this patch was wrong was itself incorrect (see discussion on llvm-commits). This patch does seem to be exposing a latent SVE code generation bug on non-public tests, which should not block a correctness fix for public, non-SVE use cases.	2020-09-01 19:29:03 +00:00
Paul Walker	bc9a29b9ee	Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain" This reverts commit `e9d9a61208`. This patch was previously reverted by `04879086b4` with the reapplication being done after breaking the assert used to ensure SP is always 16-byte aligned, which is a requirement of the AAPCS. For extra context the latest patch caused runtime failures when building with "-march=armv8-a+sve -mllvm -aarch64-sve-vector-bits-min=256".	2020-09-01 16:09:37 +01:00
Owen Anderson	e9d9a61208	Reapply D70800: Fix AArch64 AAPCS frame record chain Original Commit Message: After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register) may be placed in the middle of a stack frame if a function has both callee-saved general-purpose registers and floating point registers. This will break the stack unwinders that simply walk through the frame records (based on the guarantee from AAPCS64 "The Frame Pointer" section). This commit fixes the problem by adding the frame record offset. Patch By: logan Differential Revision: D70800	2020-08-27 17:29:41 +00:00
Martin Storsjö	04879086b4	Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain" This reverts commit `9936455204`. That commit caused failed assertions e.g. like this: $ cat alloca.c a; b() { float c; d(); a = __builtin_alloca(d); c = e(); f(a); return c; } $ clang -target aarch64-linux-gnu -c alloca.c -O2 clang: ../lib/Target/AArch64/AArch64InstrInfo.cpp:3446: void llvm::emitFrameOffset(llvm::MachineBasicBlock&, llvm::MachineBasicBlock::iterator, const llvm::DebugLoc&, unsigned int, unsigned int, llvm::StackOffset, const llvm::TargetInstrInfo, llvm::MachineInstr::MIFlag, bool, bool, bool): Assertion `(DestReg != AArch64::SP \|\| Bytes % 16 == 0) && "SP increment/decrement not 16-byte aligned"' failed.	2020-08-27 09:39:56 +03:00
Owen Anderson	9936455204	Reapply D70800: Fix AArch64 AAPCS frame record chain Original Commit Message: After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register) may be placed in the middle of a stack frame if a function has both callee-saved general-purpose registers and floating point registers. This will break the stack unwinders that simply walk through the frame records (based on the guarantee from AAPCS64 "The Frame Pointer" section). This commit fixes the problem by adding the frame record offset. Patch By: logan	2020-08-26 19:38:38 +00:00
Owen Anderson	9061eb8245	Revert "Fix frame pointer layout on AArch64 Linux." This broke stage2 of clang-cmake-aarch64-full. This reverts commit `a0aed80b22`.	2020-08-26 17:17:14 +00:00
Owen Anderson	a0aed80b22	Fix frame pointer layout on AArch64 Linux. When floating point callee-saved registers were used, the frame pointer would incorrectly point to the bottom of the CSR space (containing saved floating-point registers), rather than to the frame record. While all frame offsets were calculated consistently, resulting in working code, this prevented stack walkers from being able to traverse the frame list.	2020-08-26 16:09:49 +00:00
Sander de Smalen	5f47d4456d	[AArch64][SVE] Fix calculation restore point for SVE callee saves. This fixes an issue where the restore point of callee-saves in the function epilogues was incorrectly calculated when the basic block consisted of only a RET instruction. This caused dealloc instructions to be inserted in between the block of callee-save restore instructions, rather than before it. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D86099	2020-08-26 10:02:31 +01:00
David Blaikie	e31cfc4cd3	Fix -Wconstant-conversion warning with explicit cast Introduced by `fd6584a220` Following similar use of casts in AsmParser.cpp, for instance - ideally this type would use unsigned chars as they're more representative of raw data and don't get confused around implementation defined choices of char's signedness, but this is what it is & the signed/unsigned conversions are (so far as I understand) safe/bit preserving in this usage and what's intended, given the API design here.	2020-08-04 10:41:27 -07:00
Sander de Smalen	bb3344c7d8	[AArch64][SVE] Add missing unwind info for SVE registers. This patch adds a CFI entry for each SVE callee saved register that needs unwind info at an offset from the CFA. The offset is a DWARF expression because the offset is partly scalable. The CFI entries only cover a subset of the SVE callee-saves and only encodes the lower 64-bits, thus implementing the lowest common denominator ABI. Existing unwinders may support VG but only restore the lower 64-bits. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84044	2020-08-04 11:47:06 +01:00
Sander de Smalen	fd6584a220	[AArch64][SVE] Fix CFA calculation in presence of SVE objects. The CFA is calculated as (SP/FP + offset), but when there are SVE objects on the stack the SP offset is partly scalable and should instead be expressed as the DWARF expression: SP + offset + scalable_offset * VG where VG is the Vector Granule register, containing the number of 64bits 'granules' in a scalable vector. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84043	2020-08-04 11:47:06 +01:00
Sander de Smalen	cda2eb3ad2	[AArch64][SVE] Fix epilogue for SVE when the stack is realigned. While deallocating the stackframe, the offset used to reload the callee-saved registers was not pointing to the SVE callee-saves, but rather to the whole SVE area. +--------------+ \| GRP callee \| \| saves \| +--------------+ <- FP \| SVE callee \| \| saves \| +--------------+ <- Should restore SVE callee saves from here \| SVE Spills \| \| and Locals \| +--------------+ <- instead of from here. \| \| : : \| \| +--------------+ <- SP Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84539	2020-07-28 15:45:53 +01:00
Sander de Smalen	26b4ef3694	[AArch64][SVE] Don't align the last SVE callee save. Instead of aligning the last callee-saved-register slot to the stack alignment (16 bytes), just align the SVE callee-saved block. This also simplifies the code that allocates space for the callee-saves. This change is needed to make sure the offset to which the callee-saved register is spilled, corresponds to the offset used for e.g. unwind call frame instructions. Reviewers: efriedma, paulwalker-arm, david-arm, rengolin Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84042	2020-07-28 15:45:53 +01:00
Sander de Smalen	54492a5843	[AArch64][SVE] Don't support fixedStack for SVE objects. Fixed stack objects are preallocated and defined to be allocated before any of the regular stack objects. These are normally used to model stack arguments. The AAPCS does not support passing SVE registers on the stack by value (only by reference). The current layout also doesn't place them before all stack objects, but rather before all SVE objects. Removing this simplifies the code that emits the allocation/deallocation around callee-saved registers (D84042). This patch also removes all uses of fixedStack from framelayout-sve.mir, where this was used purely for testing purposes. Reviewers: paulwalker-arm, efriedma, rengolin Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84538	2020-07-28 15:45:53 +01:00
Sander de Smalen	a8f4f85d84	[AArch64][SVE] Remove erroneous assert in resolveFrameOffsetReference The code already supports addressing a fixed-size stack object from the frame-pointer, by first subtracting sizeof(SVE area) from FP. Reviewers: efriedma, cameron.mcinally, david-arm, rengolin Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D83125	2020-07-14 09:22:45 +01:00
Kyungwoo Lee	7af27b65b3	[NFC][AArch64] Refactor getArgumentPopSize Differential Revision: https://reviews.llvm.org/D83456	2020-07-09 11:58:15 -07:00
Guillaume Chatelet	4f5133a4dc	[Alignment][NFC] Migrate AArch64, ARM, Hexagon, MSP and NVPTX backends to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82749	2020-06-30 07:56:17 +00:00
Kristof Beyls	c35ed40f4f	[AArch64] Extend AArch64SLSHardeningPass to harden BLR instructions. To make sure that no barrier gets placed on the architectural execution path, each BLR x<N> instruction gets transformed to a BL __llvm_slsblr_thunk_x<N> instruction, with __llvm_slsblr_thunk_x<N> a thunk that contains __llvm_slsblr_thunk_x<N>: BR x<N> <speculation barrier> Therefore, the BLR instruction gets split into 2; one BL and one BR. This transformation results in not inserting a speculation barrier on the architectural execution path. The mitigation is off by default and can be enabled by the harden-sls-blr subtarget feature. As a linker is allowed to clobber X16 and X17 on function calls, the above code transformation would not be correct in case a linker does so when N=16 or N=17. Therefore, when the mitigation is enabled, generation of BLR x16 or BLR x17 is avoided. As BLRA* indirect calls are not produced by LLVM currently, this does not aim to implement support for those. Differential Revision: https://reviews.llvm.org/D81402	2020-06-12 07:34:33 +01:00
Martin Storsjö	cf97e0ec42	[AArch64] Treat x18 as callee-saved in functions with windows calling convention on non-windows OSes Treat it as callee-saved, and always back it up. When windows code calls entry points in unix code, marked with the windows calling convention, that unix code can call other functions that aren't compiled with -ffixed-x18 which may clobber x18 freely. By backing it up and restoring it on return, we preserve the register across the function call, fulfilling this part of the windows calling convention on another OS. This isn't enough for making sure that x18 is preserved when non-windows code does a callback to windows code, but is a clear improvement over the current status quo. Additionally, wine is nowadays building many modules as PE DLLs, which avoids the callback issue altogether for those DLLs. Differential Revision: https://reviews.llvm.org/D61892	2020-05-30 09:22:09 +03:00
Fangrui Song	0840d725c4	[MC] Change MCCFIInstruction::createDefCfaOffset to cfiDefCfaOffset which does not negate Offset The negative Offset has caused a bunch of problems and confused quite a few call sites. Delete the unneeded negation and fix all call sites.	2020-05-22 17:07:11 -07:00
Fangrui Song	7e49dc6184	[MC] Change MCCFIInstruction::createDefCfa to cfiDefCfa which does not negate Offset The negative Offset has caused a bunch of problems and confused quite a few call sites. Delete the unneeded negation and fix all call sites.	2020-05-22 15:47:26 -07:00
Matt Arsenault	2481f26ac3	CodeGen: Use Register in TargetFrameLowering	2020-04-07 17:07:44 -04:00
Guillaume Chatelet	fc63c4d8ce	[Alignment][NFC] Remove remaining uses of MachineFrameInfo::setObjectAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77217	2020-04-01 14:38:05 +00:00
Guillaume Chatelet	1dffa2550b	[Alignment][NFC] Transition to MachineFrameInfo::getObjectAlign() Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77215	2020-04-01 14:08:28 +00:00
Daniel Frampton	494abe139a	[AArch64] Change AArch64 Windows EH UnwindHelp object to be a fixed object The UnwindHelp object is used during exception handling by runtime code. It must be findable from a fixed offset from FP. This change allocates the UnwindHelp object as a fixed object (as is done for x86_64) to ensure that both the generated code and runtime agree on the location of the object. Fixes https://bugs.llvm.org/show_bug.cgi?id=45346 Differential Revision: https://reviews.llvm.org/D77016	2020-03-31 14:21:21 -07:00
Daniel Frampton	522b4c4b88	[AArch64] Fix mismatch in prologue and epilogue for funclets on Windows The generated code for a funclet can have an add to sp in the epilogue for which there is no corresponding sub in the prologue. This patch removes the early return from emitPrologue that was preventing the sub to sp, and instead conditionalizes the appropriate parts of the rest of the function. Fixes https://bugs.llvm.org/show_bug.cgi?id=45345 Differential Revision: https://reviews.llvm.org/D77015	2020-03-31 14:21:18 -07:00
Guillaume Chatelet	998118c3d3	[Alignment][NFC] Deprecate MachineMemOperand::getMachineMemOperand version that takes an untyped alignment. Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77138	2020-03-31 16:05:31 +00:00
Guillaume Chatelet	b727aabcb8	[Alignment][NFC] Use llvmTargetFrameLowering::getStackAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Reviewed By: courbet Subscribers: wuzish, arsenm, jyknight, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, fedor.sergeev, jrtc27, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76613	2020-03-26 18:15:53 +00:00
Guillaume Chatelet	d000655a8c	[Alignment][NFC] Deprecate getMaxAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76348	2020-03-18 14:48:45 +01:00
Benjamin Kramer	186dd63182	ArrayRef'ize restoreCalleeSavedRegisters. NFCI. restoreCalleeSavedRegisters can mutate the contents of the CalleeSavedInfos, so use a MutableArrayRef.	2020-02-29 09:50:23 +01:00
Benjamin Kramer	e4230a9f6c	ArrayRef'ize spillCalleeSavedRegisters. NFCI.	2020-02-08 12:19:23 +01:00
Evgenii Stepanov	d081962dea	Merge memtag instructions with adjacent stack slots. Summary: Detect a run of memory tagging instructions for adjacent stack frame slots, and replace them with a shorter instruction sequence * replace STG + STG with ST2G * replace STGloop + STGloop with STGloop This code needs to run when stack slot offsets are already known, but before FrameIndex operands in STG instructions are eliminated; that's the reason for the new hook in PrologueEpilogue. This change modifies STGloop and STZGloop pseudos to take the size as an immediate integer operand, and adds _untied variants of those pseudos that are allowed to take the base address as a FI operand. This is needed to simplify recognizing an STGloop instruction as operating on a stack slot post-regalloc. This improves memtag code size by ~0.25%, and it looks like an additional ~0.1% is possible by rearranging the stack frame such that consecutive STG instructions reference adjacent slots (patch pending). Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70286	2020-01-17 15:19:29 -08:00

226 Commits