llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanne Wouda	e503fee904	[AArch64] Fix MUL/SUB fusing Summary: When MUL is the first operand to SUB, we can't use MLS because the accumulator should be negated. Emit a NEG of the accumulator and an MLA instead, similar to what we do for FMUL / FSUB fusing. Reviewers: dmgreen, SjoerdMeijer, fhahn, Gerolf, mstorsjo, asbirlea Reviewed By: asbirlea Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71067	2019-12-05 18:10:06 +00:00
David Tellenbach	cec2d5c174	Reland [AArch64][MachineOutliner] Return address signing for outlined functions Summary: Reland after fixing an ASan failure by stopping outlining early if the constraints for return address signing removed too many outlining candidates. During AArch64 frame lowering instructions to enable return address signing are inserted into functions if needed. Functions generated during machine outlining don't run through target frame lowering and hence are missing such instructions. This patch introduces the following changes: 1. If not all functions that potentially participate in function outlining agree on their return address signing scope and their return address signing key, outlining is disabled for these functions. 2. If not all functions that potentially participate in function outlining agree on their support for v8.3A features, outlining is disabled for these functions. 3. If an outlining candidate would outline instructions that modify sp in a way that invalidates return address signing, outlining is disabled for that particular candidate. 4. If all candidate functions agree on the signing scope, signing key and their support for v8.3 features, the outlined function behaves as if it had the same scope and key attributes and as if it would provide the same v8.3A support as the original functions. Reviewers: ostannard, paquette Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70635	2019-12-05 02:20:59 +01:00
Sterling Augustine	f65267ee16	Revert "Reland [AArch64][MachineOutliner] Return address signing for outlined functions" This reverts commit `02760b750b`. The original commit is not asan clean. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/37147/steps/check-llvm%20asan/logs/stdio	2019-12-04 16:31:10 -08:00
David Tellenbach	02760b750b	Reland [AArch64][MachineOutliner] Return address signing for outlined functions Summary: Reland after fixing a bug that allowed outlining of SP modifying instructions that invalidated return address signing. During AArch64 frame lowering instructions to enable return address signing are inserted into functions if needed. Functions generated during machine outlining don't run through target frame lowering and hence are missing such instructions. This patch introduces the following changes: 1. If not all functions that potentially participate in function outlining agree on their return address signing scope and their return address signing key, outlining is disabled for these functions. 2. If not all functions that potentially participate in function outlining agree on their support for v8.3A features, outlining is disabled for these functions. 3. If an outlining candidate would outline instructions that modify sp in a way that invalidates return address signing, outlining is disabled for that particular candidate. 4. If all candidate functions agree on the signing scope, signing key and their support for v8.3 features, the outlined function behaves as if it had the same scope and key attributes and as if it would provide the same v8.3A support as the original functions. Reviewers: ostannard, paquette Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70635	2019-12-04 19:39:52 +01:00
Sanne Wouda	f2e7de81c6	[AArch64] Fix over-eager fusing of NEON SIMD MUL/ADD Summary: The ISel pattern for SIMD MLA is a bit too eager: it replaces the ADD with an MLA even when the MUL cannot be eliminated, e.g. when it has another use. An MLA is usually has a higher latency than an ADD (and there are fewer pipes available that can execute it), so trading an MLA for an ADD is not great. ISel is not taking the number of uses of the MUL result into account, nor any other factors such as the length of the critical path or other resource pressure. The MachineCombiner is able to make these judgments so this patch ports the ISel pattern for MUL/ADD fusing to the MachineCombiner. Similarly for MUL/SUB -> MLS, as well as the indexed variants. The change has no impact on SPEC CPU© intrate nor fprate. Reviewers: dmgreen, SjoerdMeijer, fhahn, Gerolf Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70673	2019-12-03 15:48:37 +00:00
Djordje Todorovic	2eb0862ed8	[AArch64][DebugInfo] Fix incorrect call site param value produced by MOVZXi This resolves the problem with the truncation of the immediate operand. Differential Revision: https://reviews.llvm.org/D70168	2019-11-14 11:02:35 +01:00
Sander de Smalen	3367686b4d	[AArch64] Extend storeRegToStackSlot to spill SVE registers. This patch allows the register allocator to spill SVE registers to the stack. Reviewers: ostannard, efriedma, rengolin, cameron.mcinally Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D70082	2019-11-13 10:09:32 +00:00
Matt Arsenault	e6c9a9af39	Use MCRegister in copyPhysReg	2019-11-11 14:42:33 +05:30
Djordje Todorovic	8d2ccd1ac3	Reland: [TII] Use optional destination and source pair as a return value; NFC Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622	2019-11-08 13:00:39 +01:00
Oliver Stannard	a3f4745428	Revert "[AArch64][MachineOutliner] Return address signing for outlined functions" This is causing faults when an instruction which modifies SP is outlined, causing the PAC and AUT instructions to not match. This reverts commits `70caa1fc30` and `55314d3237`.	2019-11-01 16:06:09 +00:00
Simon Pilgrim	3842b94c4e	Revert rG57ee0435bd47f23f3939f402914c231b4f65ca5e - [TII] Use optional destination and source pair as a return value; NFC This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375	2019-10-31 18:00:29 +00:00
Djordje Todorovic	57ee0435bd	[TII] Use optional destination and source pair as a return value; NFC Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622	2019-10-31 15:34:49 +01:00
David Tellenbach	70caa1fc30	[AArch64][MachineOutliner] Return address signing for outlined functions Summary: During AArch64 frame lowering instructions to enable return address signing are inserted into function if needed. Functions generated during machine outlining don't run through target frame lowering and hence are missing such instructions. This patch introduces the following changes: 1. If not all functions that potentially participate in function outlining agree on their return address signing scope and their return address signing key, outlining is disabled for these functions. 2. If not all functions that potentially participate in function outlining agree on their support for v8.3A features, outlining is disabled for these functions. 2. If all candidate functions agree on the signing scope, signing key and and their support for v8.3 features, the outlined function behaves as if it had the same scope and key attributes and as if it would provide the same v8.3A support as the original functions. Reviewers: olista01, paquette, t.p.northover, ostannard Reviewed By: ostannard Subscribers: ostannard, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69097	2019-10-30 15:20:16 +00:00
Djordje Todorovic	532815dd5c	[ARM][AArch64][DebugInfo] Improve call site instruction interpretation Extend the describeLoadedValue() with support for target specific ARM and AArch64 instructions interpretation. The patch provides specialization for ADD and SUB operations that include a register and an immediate/offset operand. Some of the instructions can operate with global string addresses or constant pool indexes but such cases are omitted since we currently lack flexible support for processing such operands at DWARF production stage. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D67556	2019-10-30 13:58:14 +01:00
Shoaib Meenai	6d1891c508	[AArch64] Fix offset calculation r374772 changed Offset to be an int64_t but left NewOffset as an int. Scale is unsigned, so in the calculation `Offset - NewOffset * Scale`, `NewOffset * Scale` was promoted to unsigned and was then zero-extended to 64 bits, leading to an incorrect computation which manifested as an out-of-memory when building the Swift standard library for Android aarch64. Promote NewOffset to int64_t to fix this, and promote EmittableOffset as well, since its one user passes it to a function which takes an int64_t anyway. Test case based on a suggestion by Sander de Smalen! Differential Revision: https://reviews.llvm.org/D69018 llvm-svn: 375043	2019-10-16 21:41:05 +00:00
Sander de Smalen	7774812965	[AArch64] Stackframe accesses to SVE objects. Materialize accesses to SVE frame objects from SP or FP, whichever is available and beneficial. This patch still assumes the objects are pre-allocated. The automatic layout of SVE objects within the stackframe will be added in a separate patch. Reviewers: greened, cameron.mcinally, efriedma, rengolin, thegameg, rovka Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D67749 llvm-svn: 374772	2019-10-14 13:11:34 +00:00
Sebastian Pop	d0d52edae9	fix fmls fp16 Tim Northover remarked that the added patterns for fmls fp16 produce wrong code in case the fsub instruction has a multiplication as its first operand, i.e., all the patterns FMLSv_OP1: > define <8 x half> @test_FMLSv8f16_OP1(<8 x half> %a, <8 x half> %b, <8 x half> %c) { > ; CHECK-LABEL: test_FMLSv8f16_OP1: > ; CHECK: fmls {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h > entry: > > %mul = fmul fast <8 x half> %c, %b > %sub = fsub fast <8 x half> %mul, %a > ret <8 x half> %sub > } > > This doesn't look right to me. The exact instruction produced is "fmls > v0.8h, v2.8h, v1.8h", which I think calculates "v0 - v2v1", but the > IR is calculating "v2v1-v0". The equivalent <4 x float> code also > doesn't emit an fmls. This patch generates an fmla and negates the value of the operand2 of the fsub. Inspecting the pattern match, I found that there was another mistake in the opcode to be selected: matching FMULv416 should generate FMLSv416 and not FMLSv232. Tested on aarch64-linux with make check-all. Differential Revision: https://reviews.llvm.org/D67990 llvm-svn: 374044	2019-10-08 13:23:57 +00:00
Sander de Smalen	4f99b6f0fe	[AArch64] Static (de)allocation of SVE stack objects. Adds support to AArch64FrameLowering to allocate fixed-stack SVE objects. The focus of this patch is purely to allow the stack frame to allocate/deallocate space for scalable SVE objects. More dynamic allocation (at compile-time, i.e. determining placement of SVE objects on the stack), or resolving frame-index references that include scalable-sized offsets, are left for subsequent patches. SVE objects are allocated in the stack frame as a separate region below the callee-save area, and above the alignment gap. This is done so that the SVE objects can be accessed directly from the FP at (runtime) VL-based offsets to benefit from using the VL-scaled addressing modes. The layout looks as follows: +-------------+ \| stack arg \| +-------------+ \| Callee Saves\| \| X29, X30 \| (if available) \|-------------\| <- FP (if available) \| : \| \| SVE area \| \| : \| +-------------+ \|/////////////\| alignment gap. \| : \| \| Stack objs \| \| : \| +-------------+ <- SP after call and frame-setup SVE and non-SVE stack objects are distinguished using different StackIDs. The offsets for objects with TargetStackID::SVEVector should be interpreted as purely scalable offsets within their respective SVE region. Reviewers: thegameg, rovka, t.p.northover, efriedma, rengolin, greened Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61437 llvm-svn: 373585	2019-10-03 11:33:50 +00:00
Changpeng Fang	f5524f0451	Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D58360 llvm-svn: 373024	2019-09-26 22:53:44 +00:00
Kerry McLaughlin	e55b3bf40e	[SVE][Inline-Asm] Add constraints for SVE predicate registers Summary: Adds the following inline asm constraints for SVE: - Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive - Upa: SVE predicate register with full range, P0 to P15 Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin Reviewed By: rovka Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66524 llvm-svn: 371967	2019-09-16 09:45:27 +00:00
Sjoerd Meijer	395a86731d	[AArch64] MachineCombiner FMA matching. NFC. Follow-up of rL371321 that added some more FP16 FMA patterns, and an attempt to reduce the copy-pasting and make this more readable. Differential Revision: https://reviews.llvm.org/D67403 llvm-svn: 371818	2019-09-13 07:38:54 +00:00
Tim Northover	f1c2892912	AArch64: support arm64_32, an ILP32 slice for watchOS. This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM. FastISel is mostly disabled for now since it would generate incorrect code for ILP32. llvm-svn: 371722	2019-09-12 10:22:23 +00:00
Sebastian Pop	eacb2c2c97	[aarch64] Add combine patterns for fp16 fmla This patch enables generation of fused multiply add/sub for instructions operating on fp16. Tested on aarch64-linux. Differential Revision: https://reviews.llvm.org/D67297 llvm-svn: 371321	2019-09-07 20:24:51 +00:00
Kerry McLaughlin	da4ef9b4c8	[SVE][Inline-Asm] Support for SVE asm operands Summary: Adds the following inline asm constraints for SVE: - w: SVE vector register with full range, Z0 to Z31 - x: Restricted to registers Z0 to Z15 inclusive. - y: Restricted to registers Z0 to Z7 inclusive. This change also adds the "z" modifier to interpret a register as an SVE register. Not all of the bitconvert patterns added by this patch are used, but they have been included here for completeness. Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, rengolin, cameron.mcinally, greened Reviewed By: sdesmalen Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66302 llvm-svn: 370673	2019-09-02 16:12:31 +00:00
Paul Walker	26295676a4	Revert Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369132 (git commit `19301d75f0`) llvm-svn: 369186	2019-08-17 09:22:36 +00:00
Paul Walker	93c7a4a47c	Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369133 (git commit `2632c677f8`) llvm-svn: 369185	2019-08-17 09:22:28 +00:00
Paul Walker	2632c677f8	[AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. Recommit with fixes for mac builders. Summary: AArch64InstrInfo::getInstSizeInBytes is incorrectly treating meta instructions (e.g. CFI_INSTRUCTION) as normal instructions and giving them a size of 4. This results in branch relaxation calculating block sizes wrong. Branch relaxation also considers alignment and thus a single mistake can result in later blocks being incorrectly sized even when they themselves do not contain meta instructions. The net result is we might not relax a branch whose destination is not within range. Reviewers: nickdesaulniers, peter.smith Reviewed By: peter.smith Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66337 > llvm-svn: 369111 llvm-svn: 369133	2019-08-16 17:29:53 +00:00
Paul Walker	19301d75f0	Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369111 (git commit `3ccee5f7c4`) llvm-svn: 369132	2019-08-16 17:29:42 +00:00
Paul Walker	3ccee5f7c4	[AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. Summary: AArch64InstrInfo::getInstSizeInBytes is incorrectly treating meta instructions (e.g. CFI_INSTRUCTION) as normal instructions and giving them a size of 4. This results in branch relaxation calculating block sizes wrong. Branch relaxation also considers alignment and thus a single mistake can result in later blocks being incorrectly sized even when they themselves do not contain meta instructions. The net result is we might not relax a branch whose destination is not within range. Reviewers: nickdesaulniers, peter.smith Reviewed By: peter.smith Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66337 llvm-svn: 369111	2019-08-16 14:17:52 +00:00
Evgeniy Stepanov	10ce5f88d1	Add missing MIR serialization text for AArch64II::MO_TAGGED. Reviewers: pcc Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66312 llvm-svn: 369053	2019-08-15 22:03:55 +00:00
Daniel Sanders	5ae66e56cf	[aarch64] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Manual fixups in: AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned* AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register Depends on D65919 Reviewers: aemerson Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368628	2019-08-12 22:40:53 +00:00
Sander de Smalen	1d2bfa4a86	[AArch64][WinCFI] Do not pair callee-save instructions in LoadStoreOptimizer Prevent the LoadStoreOptimizer from pairing any load/store instructions with instructions from the prologue/epilogue if the CFI information has encoded the operations as separate instructions. This would otherwise lead to a mismatch of the actual prologue size from the size as recorded in the Windows CFI. Reviewers: efriedma, mstorsjo, ssijaric Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65817 llvm-svn: 368164	2019-08-07 12:41:38 +00:00
Sander de Smalen	ad7e95df5a	[AArch64] NFC: Generalize emitFrameOffset to support more than byte offsets. Refactor emitFrameOffset to accept a StackOffset struct as its offset argument. This method currently only supports byte offsets (MVT::i8) but will be extended in a later patch to support scalable offsets (MVT::nxv1i8) as well. Reviewers: thegameg, rovka, t.p.northover, efriedma, greened Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61436 llvm-svn: 368049	2019-08-06 15:06:31 +00:00
Sander de Smalen	612b038966	[AArch64] NFC: Add generic StackOffset to describe scalable offsets. To support spilling/filling of scalable vectors we need a more generic representation of a stack offset than simply 'int'. For this we introduce the StackOffset struct, which comprises multiple offsets sized by their respective MVTs. Byte-offsets will thus be a simple tuple such as { offset, MVT::i8 }. Adding two byte-offsets will result in a byte offset { offsetA + offsetB, MVT::i8 }. When two offsets have different types, we can canonicalise them to use the same MVT, as long as their runtime sizes are guaranteed to have the same size-ratio as they would have at compile-time. When we have both scalable- and fixed-size objects on the stack, we can create an offset that is: ({ offset_fixed, MVT::i8 } + { offset_scalable, MVT::nxv1i8 }) The struct also contains a getForFrameOffset() method that is specific to AArch64 and decomposes the frame-offset to be used directly in instructions that operate on the stack or index into the stack. Note: This patch adds StackOffset as an AArch64-only concept, but we would like to make this a generic concept/struct that is supported by all interfaces that take or return stack offsets (currently as 'int'). Since that would be a bigger change that is currently pending on D32530 landing, we thought it makes sense to first show/prove the concept in the AArch64 target before proposing to roll this out further. Reviewers: thegameg, rovka, t.p.northover, efriedma, greened Reviewed By: rovka, greened Differential Revision: https://reviews.llvm.org/D61435 llvm-svn: 368024	2019-08-06 13:06:40 +00:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Peter Collingbourne	09f39967a2	AArch64: Add a tagged-globals backend feature. This feature instructs the backend to allow locally defined global variable addresses to contain a pointer tag in bits 56-63 that will be ignored by the hardware (i.e. TBI), but may be used by an instrumentation pass such as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD sequence that sets bits 48-63 to the corresponding bits of the global, with the linker bounds check disabled on the ADRP instruction to prevent the tag from causing a link failure. This implementation of the feature omits the MOVK when loading from or storing to a global, which is sufficient for TBI. If the same approach is extended to MTE, assuming that 0 is not configured as a catch-all tag, we will most likely also need the MOVK in this case in order to avoid a tag mismatch. Differential Revision: https://reviews.llvm.org/D65364 llvm-svn: 367475	2019-07-31 20:14:19 +00:00
Peter Collingbourne	33773d5cfc	SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned char to unsigned. This makes the field wider than MachineOperand::SubReg_TargetFlags so that we don't end up silently truncating any higher bits. We should still catch any bits truncated from the MachineOperand field as a consequence of the assertion in MachineOperand::setTargetFlags(). Differential Revision: https://reviews.llvm.org/D65465 llvm-svn: 367474	2019-07-31 20:14:09 +00:00
Evgeniy Stepanov	d752f5e953	Basic codegen for MTE stack tagging. Implement IR intrinsics for stack tagging. Generated code is very unoptimized for now. Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are used to implement a tagged stack frame pointer in a virtual register. Differential Revision: https://reviews.llvm.org/D64172 llvm-svn: 366360	2019-07-17 19:24:02 +00:00
Jonas Paulsson	fdc4ea34e3	[SystemZ, RegAlloc] Favor 3-address instructions during instruction selection. This patch aims to reduce spilling and register moves by using the 3-address versions of instructions per default instead of the 2-address equivalent ones. It seems that both spilling and register moves are improved noticeably generally. Regalloc hints are passed to increase conversions to 2-address instructions which are done in SystemZShortenInst.cpp (after regalloc). Since the SystemZ reg/mem instructions are 2-address (dst and lhs regs are the same), foldMemoryOperandImpl() can no longer trivially fold a spilled source register since the reg/reg instruction is now 3-address. In order to remedy this, new 3-address pseudo memory instructions are used to perform the folding only when the dst and lhs virtual registers are known to be allocated to the same physreg. In order to not let MachineCopyPropagation run and change registers on these transformed instructions (making it 3-address), a new target pass called SystemZPostRewrite.cpp is run just after VirtRegRewriter, that immediately lowers the pseudo to a target instruction. If it would have been possibe to insert a COPY instruction and change a register operand (convert to 2-address) in foldMemoryOperandImpl() while trusting that the caller (e.g. InlineSpiller) would update/repair the involved LiveIntervals, the solution involving pseudo instructions would not have been needed. This is perhaps a potential improvement (see Phabricator post). Common code changes: * A new hook TargetPassConfig::addPostRewrite() is utilized to be able to run a target pass immediately before MachineCopyPropagation. * VirtRegMap is passed as an argument to foldMemoryOperand(). Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D60888 llvm-svn: 362868	2019-06-08 06:19:15 +00:00
Nick Desaulniers	33bc64202b	[AArch64] check for INLINEASM_BR along w/ INLINEASM Summary: It looks like since INLINEASM_BR was created off of INLINEASM, a few checks for INLINEASM needed to be updated to check for either case. pr/41999 Reviewers: t.p.northover, peter.smith Reviewed By: peter.smith Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62402 llvm-svn: 361661	2019-05-24 19:00:13 +00:00
Alexey Lapshin	53726588f6	[DebugInfo][AArch64] Recognise target specific instruction as mov instr This fix is for the problem from https://bugs.llvm.org/show_bug.cgi?id=38714. Specifically, Simple Register Coalescing creates following conversion : undef %0.sub_32:gpr64 = ORRWrs $wzr, %3:gpr32common, 0, debug-location !24; It copies 32-bit value from gpr32 into gpr64. But Live DEBUG_VALUE analysis is not able to create debug location record for that instruction. So the problem is in that debug info for argc variable is incorrect. The fix is to write custom isCopyInstrImpl() which would recognize the ORRWrs instr. llvm-svn: 361417	2019-05-22 18:48:58 +00:00
Mandeep Singh Grang	814435fe87	[AArch64] only indicate CFI on Windows if we emitted CFI Summary: Otherwise, we emit directives for CFI without any actual CFI opcodes to go with them, which causes tools to malfunction. The technique is similar to what the x86 backend already does. Fixes https://bugs.llvm.org/show_bug.cgi?id=40876 Patch by: froydnj (Nathan Froyd) Reviewers: mstorsjo, eli.friedman, rnk, mgrang, ssijaric Reviewed By: rnk Subscribers: javed.absar, kristof.beyls, llvm-commits, dmajor Tags: #llvm Differential Revision: https://reviews.llvm.org/D61960 llvm-svn: 360816	2019-05-15 21:23:41 +00:00
Javed Absar	1cdc3dbc58	[AArch64] Add support for MTE intrinsics This patch provides intrinsics support for Memory Tagging Extension (MTE), which was introduced with the Armv8.5-a architecture. The intrinsics are described in detail in the latest ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest Reviewed by: David Spickett Differential Revision: https://reviews.llvm.org/D60486 llvm-svn: 358963	2019-04-23 09:39:58 +00:00
Bjorn Pettersson	238c9d6308	[CodeGen] Add "const" to MachineInstr::mayAlias Summary: The basic idea here is to make it possible to use MachineInstr::mayAlias also when the MachineInstr is const (or the "Other" MachineInstr is const). The addition of const in MachineInstr::mayAlias then rippled down to the need for adding const in several other places, such as TargetTransformInfo::getMemOperandWithOffset. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, MatzeB, arsenm, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60856 llvm-svn: 358744	2019-04-19 09:08:38 +00:00
Evandro Menezes	85bd3978ae	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731	2019-04-04 22:40:06 +00:00
Sander de Smalen	90d1b551e1	[AArch64] NFC: Cleanup isAArch64FrameOffsetLegal Cleanup isAArch64FrameOffsetLegal by: - Merging the large switch statement to reuse AArch64InstrInfo::getMemOpInfo(). - Using AArch64InstrInfo::getUnscaledLdSt() to determine whether an instruction has an unscaled variant. - Simplifying the logic that calculates the offset to fit the immediate. Reviewers: paquette, evandro, eli.friedman, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59636 llvm-svn: 357064	2019-03-27 13:16:19 +00:00
Sander de Smalen	46edefe3c4	[AArch64] Adds cases for LDRSHWui and LDRSHXui to getMemOpInfo This patch also adds cases PRFUMi and PRFMui. This change was discussed in https://reviews.llvm.org/D59635. llvm-svn: 357059	2019-03-27 10:39:03 +00:00
Tim Northover	638110a208	AArch64: implement copy for paired GPR registers. When doing 128-bit atomics using CASP we might need to copy a GPRPair to a different register, but that was unimplemented up to now. llvm-svn: 353383	2019-02-07 10:35:34 +00:00
Oliver Stannard	78dc38ec94	[AArch64][Outliner] Don't outline BTI instructions We can't outline BTI instructions, because they need to be the very first instruction executed after an indirect call or branch. If we outline them, then an indirect call might go to the branch to the outlined function, which will fault. Differential revision: https://reviews.llvm.org/D57753 llvm-svn: 353190	2019-02-05 17:21:57 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Mandeep Singh Grang	859cb2e35d	[AArch64] Emit the correct MCExpr relocations specifiers like VK_ABS_G0, etc Summary: D55896 and D56029 add support to emit fixups for :abs_g0: , :abs_g1_s: , etc. This patch adds the necessary enums and MCExpr needed for lowering these. Reviewers: rnk, mstorsjo, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D56037 llvm-svn: 350798	2019-01-10 04:59:44 +00:00
Kristof Beyls	c650ff77eb	Initial AArch64 SLH implementation. This is an initial implementation for Speculative Load Hardening for AArch64. It builds on top of the recently introduced AArch64SpeculationHardening pass. This doesn't implement (yet) some of the optimizations implemented for the X86SpeculativeLoadHardening pass. I thought introducing the optimizations incrementally in follow-up patches should make this easier to review. Differential Revision: https://reviews.llvm.org/D55929 llvm-svn: 350729	2019-01-09 15:13:34 +00:00
Kristof Beyls	e66bc1f756	Introduce control flow speculation tracking pass for AArch64 The pass implements tracking of control flow miss-speculation into a "taint" register. That taint register can then be used to mask off registers with sensitive data when executing under miss-speculation, a.k.a. "transient execution". This pass is aimed at mitigating against SpectreV1-style vulnarabilities. At the moment, it implements the tracking of miss-speculation of control flow into a taint register, but doesn't implement a mechanism yet to then use that taint register to mask off vulnerable data in registers (something for a follow-on improvement). Possible strategies to mask out vulnerable data that can be implemented on top of this are: - speculative load hardening to automatically mask of data loaded in registers. - using intrinsics to mask of data in registers as indicated by the programmer (see https://lwn.net/Articles/759423/). For AArch64, the following implementation choices are made. Some of these are different than the implementation choices made in the similar pass implemented in X86SpeculativeLoadHardening.cpp, as the instruction set characteristics result in different trade-offs. - The speculation hardening is done after register allocation. With a relative abundance of registers, one register is reserved (X16) to be the taint register. X16 is expected to not clash with other register reservation mechanisms with very high probability because: . The AArch64 ABI doesn't guarantee X16 to be retained across any call. . The only way to request X16 to be used as a programmer is through inline assembly. In the rare case a function explicitly demands to use X16/W16, this pass falls back to hardening against speculation by inserting a DSB SYS/ISB barrier pair which will prevent control flow speculation. - It is easy to insert mask operations at this late stage as we have mask operations available that don't set flags. - The taint variable contains all-ones when no miss-speculation is detected, and contains all-zeros when miss-speculation is detected. Therefore, when masking, an AND instruction (which only changes the register to be masked, no other side effects) can easily be inserted anywhere that's needed. - The tracking of miss-speculation is done by using a data-flow conditional select instruction (CSEL) to evaluate the flags that were also used to make conditional branch direction decisions. Speculation of the CSEL instruction can be limited with a CSDB instruction - so the combination of CSEL + a later CSDB gives the guarantee that the flags as used in the CSEL aren't speculated. When conditional branch direction gets miss-speculated, the semantics of the inserted CSEL instruction is such that the taint register will contain all zero bits. One key requirement for this to work is that the conditional branch is followed by an execution of the CSEL instruction, where the CSEL instruction needs to use the same flags status as the conditional branch. This means that the conditional branches must not be implemented as one of the AArch64 conditional branches that do not use the flags as input (CB(N)Z and TB(N)Z). This is implemented by ensuring in the instruction selectors to not produce these instructions when speculation hardening is enabled. This pass will assert if it does encounter such an instruction. - On function call boundaries, the miss-speculation state is transferred from the taint register X16 to be encoded in the SP register as value 0. Future extensions/improvements could be: - Implement this functionality using full speculation barriers, akin to the x86-slh-lfence option. This may be more useful for the intrinsics-based approach than for the SLH approach to masking. Note that this pass already inserts the full speculation barriers if the function for some niche reason makes use of X16/W16. - no indirect branch misprediction gets protected/instrumented; but this could be done for some indirect branches, such as switch jump tables. Differential Revision: https://reviews.llvm.org/D54896 llvm-svn: 349456	2018-12-18 08:50:02 +00:00
Evandro Menezes	53f0d41dc4	[AArch64] Refactor the Exynos scheduling predicates Refactor the scheduling predicates based on `MCInstPredicate`. In this case, for the Exynos processors. Differential revision: https://reviews.llvm.org/D55345 llvm-svn: 348774	2018-12-10 17:17:26 +00:00
Evandro Menezes	799b76eae2	[AArch64] Fix Exynos predicate Fix predicate for arithmetic instructions with shift and/or extend. llvm-svn: 348510	2018-12-06 18:25:37 +00:00
Simon Pilgrim	58d44235e5	Fix -Wparentheses warning. NFCI. llvm-svn: 348254	2018-12-04 12:24:10 +00:00
Jessica Paquette	bce2086ad1	[MachineOutliner] Move stack instr check logic to getOutliningCandidateInfo This moves the stack check logic into a lambda within getOutliningCandidateInfo. This allows us to be less conservative with stack checks. Whether or not a stack instruction is safe to outline is dependent on the frame variant and call variant of the outlined function; only in cases where we modify the stack can these be unsafe. So, if we move that logic later, when we're looking at an individual candidate, we can make better decisions here. This gives some code size savings as a result. llvm-svn: 348220	2018-12-04 00:31:55 +00:00
Jessica Paquette	2f5833ecd9	[MachineOutliner][AArch64][NFC] Add early exit to candidate discarding logic If we dropped too many candidates to be beneficial when dropping candidates that modify the stack, there's no reason to check for other cost model qualities. llvm-svn: 348219	2018-12-04 00:31:47 +00:00
Jessica Paquette	2accb31690	[MachineOutliner] Drop candidates that require fixups if it's beneficial If it's a bigger code size win to drop candidates that require stack fixups than to demote every candidate to that variant, the outliner should do that. This happens if the number of bytes taken by calls to functions that don't require fixups, plus the number of bytes that'd be left is less than the number of bytes that it'd take to emit a save + restore for all candidates. Also add tests for each possible new behaviour. - machine-outliner-compatible-candidates shows that when we have candidates that don't use the stack, we can use the default call variant along with the no save/regsave variant. - machine-outliner-all-stack shows that when it's better to fix up the stack, we still will demote all candidates to that case - machine-outliner-drop-stack shows that we can discard candidates that require stack fixups when it would be beneficial to do so. llvm-svn: 348168	2018-12-03 19:11:27 +00:00
Jessica Paquette	9a7103b0f8	[MachineOutliner][AArch64] Improve checks for stack instructions If we know that we'll definitely save LR to a register, there's no reason to pre-check whether or not a stack instruction is unsafe to fix up. This makes it so that we check for that condition before mapping instructions. This allows us to outline more, since we don't pessimise as many instructions. Also update some tests, since we outline more. llvm-svn: 348081	2018-12-01 21:24:06 +00:00
Jessica Paquette	1cb18ec4ec	[MachineOutliner] Outline both register save calls + no LR save calls together Instead of treating the outlined functions for these as distinct frames, they should be combined into one case. Neither allows for stack fixups, and both generate the same frame. Thus, they ought to be considered one case. This makes the code far easier to understand, for one thing. It also offers some small code size improvements. It's fairly rare to see a class of outlined functions that doesn't fall entirely into one variant (on CTMark anyway). It does happen from time to time though. This mostly offers some serious simplification. Also update the test to show the added functionality. llvm-svn: 348036	2018-11-30 21:14:58 +00:00
Francis Visoiu Mistrih	0b8dd4488e	[MachineScheduler] Order FI-based memops based on stack direction It makes more sense to order FI-based memops in descending order when the stack goes down. This allows offsets to stay "consecutive" and allow easier pattern matching. llvm-svn: 347906	2018-11-29 20:03:19 +00:00
Francis Visoiu Mistrih	879087ce5b	[MachineScheduler] Add support for clustering mem ops with FI base operands Before this patch, the following stores in `merge_fail` would fail to be merged, while they would get merged in `merge_ok`: ``` void use(unsigned long long ); void merge_fail(unsigned key, unsigned index) { unsigned long long args[8]; args[0] = key; args[1] = index; use(args); } void merge_ok(unsigned long long dst, unsigned a, unsigned b) { dst[0] = a; dst[1] = b; } ``` The reason is that `getMemOpBaseImmOfs` would return false for FI base operands. This adds support for this. Differential Revision: https://reviews.llvm.org/D54847 llvm-svn: 347747	2018-11-28 12:00:28 +00:00
Francis Visoiu Mistrih	d7eebd6d83	[CodeGen][NFC] Make `TII::getMemOpBaseImmOfs` return a base operand Currently, instructions doing memory accesses through a base operand that is not a register can not be analyzed using `TII::getMemOpBaseRegImmOfs`. This means that functions such as `TII::shouldClusterMemOps` will bail out on instructions using an FI as a base instead of a register. The goal of this patch is to refactor all this to return a base operand instead of a base register. Then in a separate patch, I will add FI support to the mem op clustering in the MachineScheduler. Differential Revision: https://reviews.llvm.org/D54846 llvm-svn: 347746	2018-11-28 12:00:20 +00:00
Evandro Menezes	9ef79c884a	[TableGen] Refactor macro names (NFC) Make the names for the macros for `TargetInstrInfo` uniform. llvm-svn: 347706	2018-11-27 20:58:27 +00:00
Evandro Menezes	6a38a5effe	[AArch64] Refactor the scheduling predicates (3/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::hasExtendedReg()`. Differential revision: https://reviews.llvm.org/D54822 llvm-svn: 347599	2018-11-26 21:47:46 +00:00
Evandro Menezes	56368c6fa5	[AArch64] Refactor the scheduling predicates (2/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::hasShiftedReg()`. Differential revision: https://reviews.llvm.org/D54820 llvm-svn: 347598	2018-11-26 21:47:41 +00:00
Evandro Menezes	b02ac8bd21	[AArch64] Refactor the scheduling predicates (1/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::isScaledAddr()` Differential revision: https://reviews.llvm.org/D54777 llvm-svn: 347597	2018-11-26 21:47:28 +00:00
Jessica Paquette	27e1754fc9	[MachineOutliner][NFC] Don't compute liveness if X16/X17/NZCV are unused Using the MBB flags, we can tell if X16/X17/NZCV are unused in a block, and also not live out. If this holds for all MBBs, then we can avoid checking for liveness on that candidate. Furthermore, if it holds for an individual candidate's MBB, then we can avoid checking for liveness on that candidate. llvm-svn: 346901	2018-11-14 22:23:38 +00:00
Jessica Paquette	4e97ec94d9	[MachineOutliner][NFC] Use flags set in all candidates to check for calls If we keep track of if the ContainsCalls bit is set in the MBB flags for each candidate, then we have a better chance of not checking the candidate for calls at all. This saves quite a few checks in some CTMark tests (~200 in Bullet, for example.) llvm-svn: 346816	2018-11-13 23:41:31 +00:00
Jessica Paquette	cad864d49e	[MachineOutliner][NFC] Use MBB flags to avoid call checks in getOutliningInfo We already determine a bunch of information about an MBB in getMachineOutlinerMBBFlags. We can reuse that information to avoid calculating things that must be false/true. The first thing we can easily check is if an outlined sequence could ever contain calls. There's no reason to walk over the outlined range, checking for calls, if we already know that there are no calls in the block containing the sequence. llvm-svn: 346809	2018-11-13 23:01:34 +00:00
Jessica Paquette	b2d53c5d7d	[MachineOutliner][NFC] Exit getOutliningType if there are < 2 candidates Since we never outline anything with fewer than 2 occurrences, there's no reason to compute cost model information if there's less than that. llvm-svn: 346803	2018-11-13 22:16:27 +00:00
Jessica Paquette	106946329d	[MachineOutliner][NFC] Simplify isMBBSafeToOutlineFrom check in AArch64 outliner Turns out it's way simpler to do this check with one LRU. Instead of maintaining two, just keep one. Check if each of the registers is available, and then check if it's a live out from the block. If it's a live out, but available in the block, we know we're in an unsafe case. llvm-svn: 346721	2018-11-13 00:32:09 +00:00
Jessica Paquette	82d9c0a3fa	[MachineOutliner][NFC] Change getMachineOutlinerMBBFlags to isMBBSafeToOutlineFrom Instead of returning Flags, return true if the MBB is safe to outline from. This lets us check for unsafe situations, like say, in AArch64, X17 is live across a MBB without being defined in that MBB. In that case, there's no point in performing an instruction mapping. llvm-svn: 346718	2018-11-12 23:51:32 +00:00
Eli Friedman	ad1151cf6a	[ARM64] [Windows] Handle funclets This patch adds support for funclets in frame lowering and ISel lowering. Together with D50288 and D50166, it enables C++ exception handling. Patch by Sanjin Sijaric, with some fixes by me. Differential Revision: https://reviews.llvm.org/D51524 llvm-svn: 346568	2018-11-09 23:33:30 +00:00
Evandro Menezes	f1a0d93b1d	[PATCH] [AArch64] Refactor helper functions (NFC) Refactor helper functions in AArch64InstrInfo to be static methods. llvm-svn: 346273	2018-11-06 22:17:14 +00:00
Sanjin Sijaric	fadebc8aae	[ARM64] [Windows] Exception handling support in frame lowering Emit pseudo instructions indicating unwind codes corresponding to each instruction inside the prologue/epilogue. These are used by the MCLayer to populate the .xdata section. Differential Revision: https://reviews.llvm.org/D50288 llvm-svn: 345701	2018-10-31 09:27:01 +00:00
Eli Friedman	93d0129b78	[AArch64] [Windows] SEH opcodes should be scheduling boundaries. Prevents the post-RA scheduler from modifying the prologue sequences emitting by frame lowering. This is roughly similar to what we do for other targets: TargetInstrInfo::isSchedulingBoundary checks isPosition(), which checks for CFI_INSTRUCTION. isSEHInstruction is taken from D50288; it'll land with whatever patch lands first. Differential Revision: https://reviews.llvm.org/D53851 llvm-svn: 345634	2018-10-30 19:24:51 +00:00
Evandro Menezes	096e2497b5	[AArch64] Refactor Exynos machine model Effectively, NFC. llvm-svn: 345201	2018-10-24 21:40:43 +00:00
Tim Northover	1c353419ab	AArch64: add a pass to compress jump-table entries when possible. llvm-svn: 345188	2018-10-24 20:19:09 +00:00
Evandro Menezes	769d4cebad	[AArch64] Refactor Exynos machine model (NFC) llvm-svn: 345187	2018-10-24 20:03:24 +00:00
Fangrui Song	2e83b2e9ee	Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC llvm-svn: 344774	2018-10-19 06:12:02 +00:00
Oliver Stannard	367b4741f4	[AArch64][v8.5A] Don't create BR instructions in outliner when BTI enabled When branch target identification is enabled, we can only do indirect tail-calls through x16 or x17. This means that the outliner can't transform a BLR instruction at the end of an outlined region into a BR. Differential revision: https://reviews.llvm.org/D52869 llvm-svn: 343969	2018-10-08 14:12:08 +00:00
Oliver Stannard	9ecdac8ee0	[AArch64] Fix verifier error when outlining indirect calls The MachineOutliner for AArch64 transforms indirect calls into indirect tail calls, replacing the call with the TCRETURNri pseudo-instruction. This pseudo lowers to a BR, but has the isCall and isReturn flags set. The problem is that TCRETURNri takes a tcGPR64 as the register argument, to prevent indiret tail-calls from using caller-saved registers. The indirect calls transformed by the outliner could use caller-saved registers. This is fine, because the outliner ensures that the register is available at all call sites. However, this causes a verifier failure when the register is not in tcGPR64. The fix is to add a new pseudo-instruction like TCRETURNri, but which accepts any GPR. Differential revision: https://reviews.llvm.org/D52829 llvm-svn: 343959	2018-10-08 09:18:48 +00:00
Matthias Braun	81578e9f77	X86, AArch64, ARM: Do not attach debug location to spill/reload instructions This rebases and recommits r343520. hwasan should be fixed now and this shouldn't break the tests anymore. Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343895	2018-10-05 22:00:13 +00:00
Matthias Braun	0c67a4e958	AArch64: Fix XSeqPairs/WSeqPairs problems - Fix spill/reloads of XSeqPairs failing with vregs (only physregs worked correctly) - Add missing spill/reload code for WSeqPairs class Differential Revision: https://reviews.llvm.org/D52761 llvm-svn: 343799	2018-10-04 17:02:53 +00:00
Matt Morehouse	4b1ec17fb0	Revert "X86, AArch64, ARM: Do not attach debug location to spill/reload instructions" This reverts r343520 due to breakage of HWASan tests on Android. llvm-svn: 343616	2018-10-02 18:35:44 +00:00
Matthias Braun	3e081703c3	X86, AArch64, ARM: Do not attach debug location to spill/reload instructions Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343520	2018-10-01 18:56:39 +00:00
Evandro Menezes	55b9a5395b	[AArch64] Refactor cheap cost model Refactor the order in `TII::isAsCheapAsAMove()` to ease future development and maintenance. Practically NFC. llvm-svn: 343489	2018-10-01 16:11:19 +00:00
Evandro Menezes	fc1852ff1c	[AArch64] Split zero cycle feature more granularly Split the `zcz` feature into specific ones got GP and FP registers, `zcz-gp` and `zcz-fp`, respectively, while retaining the original feature option to mean both. Differential revision: https://reviews.llvm.org/D52621 llvm-svn: 343354	2018-09-28 19:05:09 +00:00
Martin Storsjo	fed420d6b6	[MinGW] [AArch64] Add stubs for potential automatic dllimported variables The runtime pseudo relocations can't handle the AArch64 format PC relative addressing in adrp+add/ldr pairs. By using stubs, the potentially dllimported addresses can be touched up by the runtime pseudo relocation framework. Differential Revision: https://reviews.llvm.org/D51452 llvm-svn: 341401	2018-09-04 20:56:21 +00:00
Martin Storsjo	9e4d5f9b7b	[AArch64] Hook up the missed machine operand flag name for MO_DLLIMPORT llvm-svn: 341178	2018-08-31 08:00:34 +00:00
David Green	9dd1d451d9	[AArch64] Add Tiny Code Model for AArch64 This adds the plumbing for the Tiny code model for the AArch64 backend. This, instead of loading addresses through the normal ADRP;ADD pair used in the Small model, uses a single ADR. The 21 bit range of an ADR means that the code and its statically defined symbols need to be within 1MB of each other. This makes it mostly interesting for embedded applications where we want to fit as much as we can in as small a space as possible. Differential Revision: https://reviews.llvm.org/D49673 llvm-svn: 340397	2018-08-22 11:31:39 +00:00
Fangrui Song	f78650a8de	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293	2018-07-30 19:41:25 +00:00
Jessica Paquette	fa3bee4756	[MachineOutliner][AArch64] Add support for saving LR to a register This teaches the outliner to save LR to a register rather than the stack when possible. This allows us to avoid bumping the stack in outlined functions in some cases. By doing this, in a later patch, we can teach the outliner to do something like this: f1: ... bl OUTLINED_FUNCTION ... f2: ... move LR's contents to a register bl OUTLINED_FUNCTION move the register's contents back instead of falling back to saving LR in both cases. llvm-svn: 338278	2018-07-30 17:45:28 +00:00
Jessica Paquette	f90edbe3d6	Recommit "Enable MachineOutliner by default under -Oz for AArch64" Fixed the ASAN failure from before in r338148, so recommiting. This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338160	2018-07-27 20:18:27 +00:00
Jessica Paquette	9d93c6026a	[MachineOutliner] Exit getOutliningCandidateInfo when we erase all candidates There was a missing check for if a candidate list was entirely deleted. This adds that check. This fixes an asan failure caused by running test/CodeGen/AArch64/addsub_ext.ll with the MachineOutliner enabled. llvm-svn: 338148	2018-07-27 18:21:57 +00:00
Jessica Paquette	faea2d3130	Revert "Enable MachineOutliner by default under -Oz for AArch64" It failed an Asan test on a bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/21543/steps/check-llvm%20asan/logs/stdio Fixing that before recommitting. llvm-svn: 338136	2018-07-27 17:25:38 +00:00
Jessica Paquette	d4229b985c	Enable MachineOutliner by default under -Oz for AArch64 This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338133	2018-07-27 16:44:42 +00:00
Jessica Paquette	69f517df27	[MachineOutliner][NFC] Move target frame info into OutlinedFunction Just some gardening here. Similar to how we moved call information into Candidates, this moves outlined frame information into OutlinedFunction. This allows us to remove TargetCostInfo entirely. Anywhere where we returned a TargetCostInfo struct, we now return an OutlinedFunction. This establishes OutlinedFunctions as more of a general repeated sequence, and Candidates as occurrences of those repeated sequences. llvm-svn: 337848	2018-07-24 20:13:10 +00:00

1 2 3 4 5 ...

394 Commits