llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	cc5a1b3dd9	llvm-reduce: Add cloning of target MachineFunctionInfo MIR support is totally unusable for AMDGPU without this, since the set of reserved registers is set from fields here. Add a clone method to MachineFunctionInfo. This is a subtle variant of the copy constructor that is required if there are any MIR constructs that use pointers. Specifically, at minimum fields that reference MachineBasicBlocks or the MachineFunction need to be adjusted to the values in the new function.	2022-06-07 10:14:48 -04:00
Guillaume Chatelet	0788186182	[Alignment][NFC] Remove usage of MemSDNode::getAlignment I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times. Differential Revision: https://reviews.llvm.org/D126910	2022-06-07 13:52:20 +00:00
David Green	53be6ab25c	[ARM] Fix MVE getShuffleCost legalized type check The MVE shuffle costing for VREV instructions was making incorrect assumptions as to legalized vector types remaining as vectors. Add a quick check to ensure they are indeed vectors before attempting to get the number of elements.	2022-06-07 14:36:04 +01:00
Fangrui Song	15d82c62dc	[MC] De-capitalize MCStreamer functions Follow-up to `c031378ce0` . The class is mostly consistent now.	2022-06-07 00:31:02 -07:00
ksyx	3204272f0f	[ARM] Use llvm::dbgs() to print debug info (NFC) For consistency with other parts of code. Approved by efriedma in differential revision https://reviews.llvm.org/D127055	2022-06-06 16:43:16 -04:00
Martin Storsjö	98dc3e86fd	[ARM] [MinGW] Default to WinEH exception handling instead of Dwarf Switching this target to WinEH also seems to affect the `-windows-itanium` target. Differential Revision: https://reviews.llvm.org/D126870	2022-06-06 23:27:19 +03:00
Fangrui Song	77e300ffdf	[MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline"	2022-06-05 15:11:01 -07:00
Fangrui Song	8c911f8e9a	[ARM][MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline" The directive name is not useful because the next line replicates the error line which includes the directive. The prevailing style uses "expected newline".	2022-06-05 14:53:59 -07:00
Kazu Hirata	3b9707dbc0	[llvm] Convert for_each to range-based for loops (NFC)	2022-06-05 12:07:14 -07:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Martin Storsjö	485432f3c8	[ARM] Make a narrow tMOVi8 where possible in SEH prologues We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering. But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering. Differential Revision: https://reviews.llvm.org/D126949	2022-06-03 22:33:55 +03:00
Martin Storsjö	bd52506d24	[ARM] Make narrow push/pop in SEH prologues/epilogues where applicable We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering. But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering. Differential Revision: https://reviews.llvm.org/D126948	2022-06-03 22:33:55 +03:00
Martin Storsjö	40c937cba2	[ARM] Fix restoring stack for varargs with SEH split frame pointer push Previously, the "add sp, #12" ended up inserted after "bx lr". Differential Revision: https://reviews.llvm.org/D126872	2022-06-03 09:32:00 +03:00
Martin Storsjö	668bb96379	[ARM] Implement lowering of the sponentry intrinsic This is needed for SEH based setjmp on Windows. Differential Revision: https://reviews.llvm.org/D126763	2022-06-02 12:29:59 +03:00
Martin Storsjö	2ab19bfa41	[ARM] Adjust the frame pointer when it's needed for SEH unwinding For functions that require restoring SP from FP (e.g. that need to align the stack, or that have variable sized allocations), the prologue and epilogue previously used to look like this: push {r4-r5, r11, lr} add r11, sp, #8 ... sub r4, r11, #8 mov sp, r4 pop {r4-r5, r11, pc} This is problematic, because this unwinding operation (restoring sp from r11 - offset) can't be expressed with the SEH unwind opcodes (probably because this unwind procedure doesn't map exactly to individual instructions; note the detour via r4 in the epilogue too). To make unwinding work, the GPR push is split into two; the first one pushing all other registers, and the second one pushing r11+lr, so that r11 can be set pointing at this spot on the stack: push {r4-r5} push {r11, lr} mov r11, sp ... mov sp, r11 pop {r11, lr} pop {r4-r5} bx lr For the same setup, MSVC generates code that uses two registers; r11 still pointing at the {r11,lr} pair, but a separate register used for restoring the stack at the end: push {r4-r5, r7, r11, lr} add r11, sp, #12 mov r7, sp ... mov sp, r7 pop {r4-r5, r7, r11, pc} For cases with clobbered float/vector registers, they are pushed after the GPRs, before the {r11,lr} pair. Differential Revision: https://reviews.llvm.org/D125649	2022-06-02 12:28:46 +03:00
Martin Storsjö	d8e67c1ccc	[ARM] Add SEH opcodes in frame lowering Skip inserting regular CFI instructions if using WinCFI. This is based a fair amount on the corresponding ARM64 implementation, but instead of trying to insert the SEH opcodes one by one where we generate other prolog/epilog instructions, we try to walk over the whole prolog/epilog range and insert them. This is done because in many cases, the exact number of instructions inserted is abstracted away deeper. For some cases, we manually insert specific SEH opcodes directly where instructions are generated, where the automatic mapping of instructions to SEH opcodes doesn't hold up (e.g. for __chkstk stack probes). Skip Thumb2SizeReduction for SEH prologs/epilogs, and force tail calls to wide instructions (just like on MachO), to make sure that the unwind info actually matches the width of the final instructions, without heuristics about what later passes will do. Mark SEH instructions as scheduling boundaries, to make sure that they aren't reordered away from the instruction they describe by PostRAScheduler. Mark the SEH instructions with the NoMerge flag, to avoid doing tail merging of functions that have multiple epilogs that all end with the same sequence of "b <other>; .seh_nop_w, .seh_endepilogue". Differential Revision: https://reviews.llvm.org/D125648	2022-06-02 12:28:46 +03:00
Martin Storsjö	298e9cac92	[MC] [Win64EH] Check that the SEH unwind opcodes match the actual instructions It's a fairly common issue that the generating code incorrectly marks instructions as narrow or wide; check that the instruction lengths add up to the expected value, and error out if it doesn't. This allows catching code generation bugs. Also check that prologs and epilogs are properly terminated, to catch other code generation issues. Differential Revision: https://reviews.llvm.org/D125647	2022-06-01 11:25:49 +03:00
Martin Storsjö	6b75a3523f	[ARM] [MC] Add support for writing ARM WinEH unwind info This includes .seh_* directives for generating it from assembly. It is designed fairly similarly to the ARM64 handling. For .seh_handler directives, such as ".seh_handler __C_specific_handler, @except" (which is supported on x86_64 and aarch64 so far), the "@except" bit doesn't work in ARM assembly, as '@' is used as a comment character (on all current platforms). Allow using '%' instead of '@' for this purpose. This convention is used by GAS in similar contexts already, e.g. [1]: Note on targets where the @ character is the start of a comment (eg ARM) then another character is used instead. For example the ARM port uses the % character. In practice, this unfortunately means that all such .seh_handler directives will need ifdefs for ARM. Contrary to ARM64, on ARM, it's quite common that we can't evaluate e.g. the function length at this point, due to instructions whose length is finalized later. (Also, inline jump tables end with a ".p2align 1".) If unable to to evaluate the function length immediately, emit it as an MCExpr instead. If we'd implement splitting the unwind info for a function (which isn't implemented for ARM64 yet either), we wouldn't know whether we need to split it though. Avoid calling getFrameIndexOffset() on an unset FuncInfo.UnwindHelpFrameIdx, to avoid triggering asserts in the preexisting testcase CodeGen/ARM/Windows/wineh-basic.ll. (Once MSVC exception handling is fully implemented, those changes can be reverted.) [1] https://sourceware.org/binutils/docs/as/Section.html#Section Differential Revision: https://reviews.llvm.org/D125645	2022-06-01 11:25:48 +03:00
Zongwei Lan	ad73ce318e	[Target] use getSubtarget<> instead of static_cast<>(getSubtarget()) Differential Revision: https://reviews.llvm.org/D125391	2022-05-26 11:22:41 -07:00
David Penry	917dc0749b	[ARM] Recognize t2LoopEnd for software pipelining - Add t2LoopEnd to TargetInstrInfo::analyzeBranch and related functions. As there are many side effects of analyzing a branch, only do so if software pipelining is enabled to maintain previous behavior when pipelining is not desired. - Make sure that t2LoopEndDec is immediately followed by a t2B when it is synthesized from a t2LoopEnd. This is done because the t2LoopEnd might have acquired a fall-through path, but IfConversion assumes that fall-through are only possible on analyzable branches. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D126322	2022-05-26 09:55:42 -07:00
Fangrui Song	9ee15bba47	[MC] Lower case the first letter of EmitCOFF* EmitWin* EmitCV*. NFC	2022-05-26 00:14:08 -07:00
Maksim Panchenko	bed9efed71	[MCDisassembler] Disambiguate Size parameter in tryAddingSymbolicOperand() MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter to specify either the instruction size or the operand size depending on the architecture. However, for proper symbolic disassembly on X86, we need to know both sizes, as an instruction can have two operands, and the instruction size cannot be reliably calculated based on the operand offset and its size. Hence, split Size into OpSize and InstSize. For X86, the new interface allows to fix a couple of issues: * Correctly adjust the value of PC-relative operands. * Set operand size to zero when the operand is specified implicitly. Differential Revision: https://reviews.llvm.org/D126101	2022-05-25 13:44:32 -07:00
David Green	18cb3b3506	[ARM] Fix vcvtb/t.f16 input liveness The `vcvtb.f16.f32 Sd, Sn` (and vcvtt.f16.f32) instruction convert a f32 into a f16, writing either the top or bottom halves of the register. That means that half of the input register Sd is used in the output. This wasn't being modelled in the instructions, leading later analyses to believe that the registers were dead where they were not, generating invalid scheduling Fix that be specifying the input Sda register for the instructions too, allowing them to be set for cases like vector inserts. Most of the changes are plumbing through the constraint string, cstr. Differential Revision: https://reviews.llvm.org/D126118	2022-05-25 12:16:26 +01:00
David Green	a86cfaea54	[ARM] Add register-mask for tail returns The TC_RETURN/TCRETURNdi under Arm does not currently add the register-mask operand when tail folding, which leads to the register (like LR) not being 'used' by the return. This changes the code to unconditionally set the register mask on the call, as opposed to skipping it for tail calls. I don't believe this will currently alter any codegen, but should glue things together better post-frame lowering. It matches the AArch64 code better. Differential Revision: https://reviews.llvm.org/D125906	2022-05-21 15:28:24 +01:00
David Green	b4dd9fc370	[ARM] Cost modelling for MVE vector fptoi_sat Building on top of D125665, this adds MVE costs for fptosi.sat and fptoui.sat, providing MVE is available and the types are legal. Differential Revision: https://reviews.llvm.org/D125666	2022-05-20 11:00:34 +01:00
David Green	80aab0312a	[ARM] Cost modelling for scalar fptoi_sat Similar to D124357, this adds some cost modelling for fptoi_sat for Arm targets. Where VFP2 is available (and FP64/FP16 for the relevant types), the operations are legal as the Arm instructions naturally saturate. Otherwise they will need an extra smin/smax clamp, similar to AArch64. Differential Revision: https://reviews.llvm.org/D125665	2022-05-19 19:53:21 +01:00
Archibald Elliott	2321c36fbf	[ARM] Don't Enable AES Pass for Generic Cores This brings clang/llvm into line with GCC. The Pass is still enabled for the affected cores, but is now opt-in when using `-march=`. I also took the opportunity to add release notes for this change. Reviewed By: john.brawn Differential Revision: https://reviews.llvm.org/D125775	2022-05-18 13:10:31 +01:00
Martin Storsjö	68f37e7991	[ARM] Rename the isARMAreaXRegister parameter isIOS to SplitFramePushPop. NFC. In `f8b0a7af52` in 2016, this parameter was generalized on the caller side (previously passing STI.isTargetMachO(), now passing STI.splitFramePushPop()). Rename the parameter on the receiver side to match the generalization. Differential Revision: https://reviews.llvm.org/D125681	2022-05-17 00:41:38 +03:00
NAKAMURA Takumi	bdab5c4b3d	ARMFixCortexA57AES1742098Pass.cpp: Suppress a warning. [-Wunused-but-set-variable]	2022-05-15 18:01:42 +09:00
Sheng	c644488a8b	Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h` The name `MCFixedLenDisassembler.h` is out of date after D120958. Rename it as `MCDecoderOps.h` to reflect the change. Reviewed By: myhsu Differential Revision: https://reviews.llvm.org/D124987	2022-05-15 08:44:58 +08:00
Archibald Elliott	3a24df992c	[ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum This adds a late Machine Pass to work around a Cortex CPU Erratum affecting Cortex-A57 and Cortex-A72: - Cortex-A57 Erratum 1742098 - Cortex-A72 Erratum 1655431 The pass inserts instructions to make the inputs to the fused AES instruction pairs no longer trigger the erratum. Here the pass errs on the side of caution, inserting the instructions wherever we cannot prove that the inputs came from a safe instruction. The pass is used: - for Cortex-A57 and Cortex-A72, - for "generic" cores (which are used when using `-march=`), - when the user specifies `-mfix-cortex-a57-aes-1742098` or `mfix-cortex-a72-aes-1655431` in the command-line arguments to clang. Reviewed By: dmgreen, simon_tatham Differential Revision: https://reviews.llvm.org/D119720	2022-05-13 10:47:33 +01:00
David Green	f848798b7d	[ARM] Delay creation of MVE Imm shifts to legalization The reasoning for creating VSHLIMM/VSHRsIMM/VSHRuIMM nodes in a combine - because matching i64 constants is difficult - does not apply for MVE, as there are not v2i64 shifts. Delaying the creation of the nodes can allow extra transforms on target independant shl/shr.	2022-05-04 22:12:09 +01:00
serge-sans-paille	7030654296	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `fa5a4e1b95` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D124847	2022-05-04 08:32:38 +02:00
Zhiyao Ma	bd606afe26	[ARM] Only update the successor edges for immediate predecessors of PrologueMBB When adjusting the function prologue for segmented stacks, only update the successor edges of the immediate predecessors of the original prologue. Differential Revision: https://reviews.llvm.org/D122959	2022-05-03 12:36:35 +01:00
David Penry	dcb77643e3	Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM Fixed "private field is not used" warning when compiled with clang. original commit: `28d09bbbc3` reverted in: `fa49021c68` ------ This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In additional, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-29 10:54:39 -07:00
David Penry	fa49021c68	Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM" This reverts commit `28d09bbbc3` while I investigate a buildbot failure.	2022-04-28 13:29:27 -07:00
David Penry	28d09bbbc3	[CodeGen][ARM] Enable Swing Module Scheduling for ARM This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In additional, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-28 13:01:18 -07:00
Ties Stuij	051deb2d9d	[ARM] add Armv9 build attribute The build attribute number can be found in the Arm ABI addenda32 document: https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#335target-related-attributes Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D124090	2022-04-28 10:48:26 +01:00
Vasileios Porpodas	fa8a9fea47	Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit `6a9bbd9f20`. Code review: https://reviews.llvm.org/D124202	2022-04-26 14:02:40 -07:00
Vasileios Porpodas	6a9bbd9f20	Revert "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit `55ce296d6f`.	2022-04-26 11:25:26 -07:00
Vasileios Porpodas	55ce296d6f	[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost` Before this patch `Args` was used to pass a broadcat's arguments by SLP. This patch changes this. `Args` is now used for passing the operands of the shuffle. Differential Revision: https://reviews.llvm.org/D124202	2022-04-26 11:11:29 -07:00
Matt Arsenault	d7938b1a81	MachineModuleInfo: Move HasSplitStack handling to AsmPrinter This is used to emit one field in doFinalization for the module. We can accumulate this when emitting all individual functions directly in the AsmPrinter, rather than accumulating additional state in MachineModuleInfo. Move the special case behavior predicate into MachineFrameInfo to share it. This now promotes it to generic behavior. I'm assuming this is fine because no other target implements adjustForSegmentedStacks, or has tests using the split-stack attribute.	2022-04-20 10:54:29 -04:00
Ilia Diachkov	6c69427e88	[SPIR-V](3/6) Add MC layer, object file support, and InstPrinter The patch adds SPIRV-specific MC layer implementation, SPIRV object file support and SPIRVInstPrinter. Differential Revision: https://reviews.llvm.org/D116462 Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov, Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com> Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com> Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com> Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com> Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>	2022-04-20 01:10:25 +02:00
Craig Topper	65942554e2	[ARM] Add missing return to ARMTTIImpl::isLoweredToCall. I assume we meant to return the result of the call to BaseT::isLoweredToCall(F). This might not be a functional change in practice because it would still hit the default case in the switch and call BaseT::isLoweredToCall(F) at the end. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D123333	2022-04-07 12:52:54 -07:00
Ties Stuij	edeceb8647	remove dead code in parseRegisterList checking for ARM::RA_AUTH_CODE Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122577	2022-04-07 14:53:46 +01:00
Matt Arsenault	c4ea925f50	AtomicExpand: Change return type for shouldExpandAtomicStoreInIR Use the same enum as the other atomic instructions for consistency, in preparation for addition of another strategy. Introduce a new "Expand" option, since the store expansion does not use cmpxchg. Alternatively, the existing CmpXChg strategy could be renamed to Expand.	2022-04-06 22:34:04 -04:00
Matt Arsenault	0fb6856aff	ARM/GlobalISel: Get pointer type from value instead of getPointerSize Avoid using getPointerSize and pass through the original value type.	2022-03-31 16:46:23 -04:00
Chris Bieneman	9130e471fe	Add DXContainer DXIL is wrapped in a container format defined by the DirectX 11 specification. Codebases differ in calling this format either DXBC or DXILContainer. Since eventually we want to add support for DXBC as a target architecture and the format is used by DXBC and DXIL, I've termed it DXContainer here. Most of the changes in this patch are just adding cases to switch statements to address warnings. Reviewed By: pete Differential Revision: https://reviews.llvm.org/D122062	2022-03-29 14:34:23 -05:00
Shao-Ce SUN	662b9fa02c	[NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D122557	2022-03-29 09:53:24 +08:00
Kazu Hirata	6212871968	[Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC)	2022-03-27 22:22:37 -07:00
Maksim Panchenko	4ae9745af1	[Disassember][NFCI] Use strong type for instruction decoder All LLVM backends use MCDisassembler as a base class for their instruction decoders. Use "const MCDisassembler " for the decoder instead of "const void ". Remove unnecessary static casts. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D122245	2022-03-25 18:53:59 -07:00
Vasileios Porpodas	39aa202aff	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `e6ead19b77`.	2022-03-23 18:32:17 -07:00
Arthur Eubanks	e6ead19b77	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash." This reverts commit `27bd8f9492`. Causes crashes, see comments in D121973	2022-03-23 10:57:45 -07:00
Vasileios Porpodas	27bd8f9492	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `f7d7d2a08d`.	2022-03-22 16:41:55 -07:00
Arthur Eubanks	f7d7d2a08d	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads."" This reverts commit `79613185d3`. Causes crashes, see comments in https://reviews.llvm.org/D121973.	2022-03-22 13:33:49 -07:00
Craig Topper	9b0f227d7b	[TableGen][RISCV] Add InstAliases with zero_reg to cover unmasked vnot.v, vncvt.x.x.w, vneg.v, etc. The mask being NoRegister prevented the existing aliases from matching since NoRegister isn't in the VMV0 register class. To workaround this I've added new aliases that look for zero_reg. I had to motify tablegen to generate matching code for zero_reg. And as a consequence, I had to change the EmitPriority for an ARM alias that used zero_reg that started printing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D121496	2022-03-22 10:14:43 -07:00
Vasileios Porpodas	79613185d3	Recommit "[SLP] Fix lookahead operand reordering for splat loads." Original review: https://reviews.llvm.org/D121354 The original commit `9136145eb0` broke the build on several targets. Differential Revision: https://reviews.llvm.org/D121973	2022-03-21 15:57:32 -07:00
Eli Friedman	ddca66622c	[ARM] Fix shouldExpandAtomicLoadInIR for subtargets without ldrexd. Regression from 2f497ec3; we should not try to generate ldrexd on targets that don't have it. Also, while I'm here, fix shouldExpandAtomicStoreInIR, for consistency. That doesn't really have any practical effect, though. On Thumb targets where we need to use __sync_* libcalls, there is no libcall for stores, so SelectionDAG calls __sync_lock_test_and_set_8 anyway.	2022-03-18 15:54:38 -07:00
Eli Friedman	2f497ec3a0	[ARM] Fix ARM backend to correctly use atomic expansion routines. Without this patch, clang would generate calls to __sync_* routines on targets where it does not make sense; we can't assume the routines exist on unknown targets. Linux has special implementations of the routines that work on old ARM targets; other targets have no such routines. In general, atomics operations which aren't natively supported should go through libatomic (__atomic_) APIs, which can support arbitrary atomics through locks. ARM targets older than v6, where this patch makes a difference, are rare in practice, but not completely extinct. See, for example, discussion on D116088. This also affects Cortex-M0, but I don't think __sync_ routines actually exist in any Cortex-M0 libraries. So in practice this just leads to a slightly different linker error for those cases, I think. Mechanically, this patch does the following: - Ensures we run atomic expansion unconditionally; it never makes sense to completely skip it. - Fixes getMaxAtomicSizeInBitsSupported() so it returns an appropriate number on all ARM subtargets. - Fixes shouldExpandAtomicRMWInIR() and shouldExpandAtomicCmpXchgInIR() to correctly handle subtargets that don't have atomic instructions. Differential Revision: https://reviews.llvm.org/D120026	2022-03-18 12:43:57 -07:00
Tomas Matheson	831ab35b2f	[ARM][AArch64] generate subtarget feature flags Reland of D120906 after sanitizer failures. This patch aims to reduce a lot of the boilerplate around adding new subtarget features. From the SubtargetFeatures tablegen definitions, a series of calls to the macro GET_SUBTARGETINFO_MACRO are generated in ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use this macro to define bool members and the corresponding getter methods. Some naming inconsistencies have been fixed to allow this, and one unused member removed. This implementation only applies to boolean members; in future both BitVector and enum members could also be generated. Differential Revision: https://reviews.llvm.org/D120906	2022-03-18 16:07:00 +00:00
Tomas Matheson	62c481542e	Revert "[ARM][AArch64] generate subtarget feature flags" This reverts commit `dd8b0fecb9`.	2022-03-18 11:58:20 +00:00
Tomas Matheson	dd8b0fecb9	[ARM][AArch64] generate subtarget feature flags This patch aims to reduce a lot of the boilerplate around adding new subtarget features. From the SubtargetFeatures tablegen definitions, a series of calls to the macro GET_SUBTARGETINFO_MACRO are generated in ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use this macro to define bool members and the corresponding getter methods. Some naming inconsistencies have been fixed to allow this, and one unused member removed. This implementation only applies to boolean members; in future both BitVector and enum members could also be generated. Differential Revision: https://reviews.llvm.org/D120906	2022-03-18 11:48:20 +00:00
Sterling Augustine	bd38234d76	Reland "Use a stable-sort when combining bases" Differential Revision: https://reviews.llvm.org/D121922	2022-03-17 11:32:16 -07:00
Archibald Elliott	f496330f97	[ARM] Fix Decode of tsb csync There is a crash in the ARM backend when attempting to decode a "tsb csync" instruction using `llvm-objdump --triple=armv8.4a -d`. The crash was in `ARMMCInstrAnalysis::evaluateBranch` where the number of operands in the decoded instruction (0) did not match the number of operands in the instruction description (1). This is becuase `tsb csync` looks like it has an operand during assembly, but there is only one valid operand (csync), so there is no encoding space in the instruction for the operand, so the decoder never has a field to decode that represents `csync`. The fix is to add a custom decode method, which ensures that this instruction does have the right number of operands after decoding. This method merely adds the only available operand value, `ARM_TSB::CSYNC`. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D121479	2022-03-17 17:29:31 +00:00
Sterling Augustine	84810e1f74	Revert "Use a stable-sort when combining bases" This reverts commit `81417261a1`.	2022-03-17 09:54:13 -07:00
Sterling Augustine	81417261a1	Use a stable-sort when combining bases While experimenting with different algorithms for std::sort I discovered that combine-vmovdrr.ll fails if this sort is not stable. I suspect that the test is too stringent in its check--the resultant code looks functionally identical to me under both stable and unstable sorting, but a generic fix is quite a bit more difficult to implement. Thanks to scw@google.com for finding the proper fix. Differential Revision: https://reviews.llvm.org/D121870	2022-03-17 09:02:13 -07:00
Arthur Eubanks	2371c5a0e0	[OpaquePtr][ARM] Use elementtype on ldrex/ldaex/stlex/strex Includes verifier changes checking the elementtype, clang codegen changes to emit the elementtype, and ISel changes using the elementtype. Basically the same as D120527. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D121847	2022-03-16 14:11:53 -07:00
Shengchen Kan	37b378386e	[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments	2022-03-16 20:25:42 +08:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Fangrui Song	689c3a2552	[MC] Fix letter case of some MCSection member functions	2022-03-11 20:07:00 -08:00
Zhiyao Ma	adc26b4eae	[ARM] Fix 8-bit immediate overflow in the instruction of segmented stack prologue. It fixes the overflow of 8-bit immediate field in the emitted instruction that allocates large stacklet. For thumb2 targets, load large immediate by a pair of movw and movt instruction. For thumb1 and ARM targets, load large immediate by reading from literal pool. Differential Revision: https://reviews.llvm.org/D118545	2022-03-10 15:15:24 -08:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit `7f230feeea`. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Mircea Trofin	cb2160760e	[nfc][codegen] Move RegisterBank[Info].h under CodeGen This wraps up from D119053. The 2 headers are moved as described, fixed file headers and include guards, updated all files where the old paths were detected (simple grep through the repo), and `clang-format`-ed it all. Differential Revision: https://reviews.llvm.org/D119876	2022-03-01 21:53:25 -08:00
Jameson Nash	c4b1a63a1b	mark getTargetTransformInfo and getTargetIRAnalysis as const Seems like this can be const, since Passes shouldn't modify it. Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D120518	2022-02-25 14:30:44 -05:00
Craig Topper	c7d6448d03	[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable. Internally to DAGCombiner the SDValues were passed by non-const reference despite not being modified. They were then passed by const reference to TLI. This patch passes them by value which is consistent with the vast majority of code. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120420	2022-02-23 12:40:45 -08:00
David Green	a10789d6cd	[ARM] Recognize SSAT and USAT from SMIN/SMAX We have some recognition of SSAT and USAT from SELECT_CC at the moment. This extends the matching to SMIN/SMAX which can help catch more cases, either from min/max being the canonical form in instcombine or from some expanded nodes like fp_to_si_sat. Differential Revision: https://reviews.llvm.org/D119819	2022-02-23 08:55:54 +00:00
Egor Zhdan	3a1cb36237	Add DriverKit support This patch is the first in a series of patches to upstream the support for Apple's DriverKit. Once complete, it will allow targeting DriverKit platform with Clang similarly to AppleClang. This code was originally authored by JF Bastien. Differential Revision: https://reviews.llvm.org/D118046	2022-02-22 13:42:53 +00:00
Craig Topper	a6fb1bb306	[ARM] Remove unused lowerABS function. NFC This function was added in D49837, but no setOperationAction call was added with it. The code is equivalent to what is done by the default ExpandIntRes_ABS implementation when ADDCARRY is supported. Test case added to verify this. There was some existing coverage from Thumb2 MVE tests, but they started from vectors.	2022-02-20 22:43:23 -08:00
Craig Topper	440c4b705a	[SelectionDAG][RISCV][ARM][PowerPC][X86][WebAssembly] Change default abs expansion to use sra (X, size(X)-1); sub (xor (X, Y), Y). Previous we used sra (X, size(X)-1); xor (add (X, Y), Y). By placing sub at the end, we allow RISCV to combine sign_extend_inreg with it to form subw. Some X86 tests for Z - abs(X) seem to have improved as well. Other targets look to be a wash. I had to modify ARM's abs matching code to match from sub instead of xor. Maybe instead ISD::ABS should be made legal. I'll try that in parallel to this patch. This is an alternative to D119099 which was focused on RISCV only. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D119171	2022-02-20 21:11:23 -08:00
Simon Pilgrim	ae4bec20c4	[ARM] ARMAsmPrinter::emitAttributes - remove unnecessary nullptr test. The MMI pointer has already been dereferenced several times.	2022-02-18 10:36:40 +00:00
Jessica Paquette	6d58f4ab07	[MachineOutliner] NFC: Hide LRU-related stuff behind helper functions It's not particularly user-friendly to have to call `initLRU` everywhere. Also, it wasn't particularly great that the LRU for registers used in a sequence was also initialized by `initLRU`. This patch hides this stuff behind some helper functions: * `isAvailableAcrossAndOutOfSeq` * `isAnyUnavailableAcrossOrOutOfSeq` * `isAvailableInsideSeq` This allows the user to avoid calling `initLRU` explicitly. Also, it allows us to separate initializing the used-in-sequence LRU from the main LRU. Since both ARM and AArch64 check LR liveness in `insertOutlinedCall`, this refactor requires that we de-const the Candidate there. Some other quality-of-code improvements: * LRUs in outliner::Candidate now have more descriptive names * Use `Register` instead of `unsigned` in some places * Improve readability in some places by using ranges rather than `std::for_each` This is a preparatory commit for a larger compile time related change for the AArch64 outliner.	2022-02-16 11:39:07 -08:00
Shao-Ce SUN	2aed07e96c	[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter` Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846	2022-02-16 13:10:09 +08:00
David Green	ea6ebbcfb3	[ARM] MVE hadd and rhadd This uses the nodes from D106237 to add MVE HADD and RHADD lowering. Differential Revision: https://reviews.llvm.org/D106238	2022-02-14 11:55:40 +00:00
Yuanfang Chen	f927021410	Reland "[clang-cl] Support the /JMC flag" This relands commit `b380a31de0`. Restrict the tests to Windows only since the flag symbol hash depends on system-dependent path normalization.	2022-02-10 15:16:17 -08:00
Yuanfang Chen	b380a31de0	Revert "[clang-cl] Support the /JMC flag" This reverts commit `bd3a1de683`. Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8822587673277278177/overview	2022-02-10 14:17:37 -08:00
Yuanfang Chen	bd3a1de683	[clang-cl] Support the /JMC flag The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/ The `/JMC` flag enables these instrumentations: - Insert at the beginning of every function immediately after the prologue with a call to `void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag)`. The argument for `__CheckForDebuggerJustMyCode` is the address of a boolean global variable (the global variable is initialized to 1) with the name convention `__<hash>_<filename>`. All such global variables are placed in the `.msvcjmc` section. - The `<hash>` part of `__<hash>_<filename>` has a one-to-one mapping with a directory path. MSVC uses some unknown hashing function. Here I used DJB. - Add a dummy/empty COMDAT function `__JustMyCode_Default`. - Add `/alternatename:__CheckForDebuggerJustMyCode=__JustMyCode_Default` link option via ".drectve" section. This is to prevent failure in case `__CheckForDebuggerJustMyCode` is not provided during linking. Implementation: All the instrumentations are implemented in an IR codegen pass. The pass is placed immediately before CodeGenPrepare pass. This is to not interfere with mid-end optimizations and make the instrumentation target-independent (I'm still working on an ELF port in a separate patch). Reviewed By: hans Differential Revision: https://reviews.llvm.org/D118428	2022-02-10 10:26:30 -08:00
Oliver Stannard	a76620143c	[ARM] Patterns for vector conversion between half and float These patterns were omitted because clang only allows converting between these types using intrinsics, but other front-ends or optimisation passes may want to use them. Differential revision: https://reviews.llvm.org/D119354	2022-02-10 09:51:55 +00:00
serge-sans-paille	ef736a1c39	Cleanup LLVMMC headers There's a few relevant forward declarations in there that may require downstream adding explicit includes: llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h Counting preprocessed lines required to rebuild llvm-project on my setup: before: 1052436830 after: 1049293745 Which is significant and backs up the change in addition to the usual benefits of decreasing coupling between headers and compilation units. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119244	2022-02-09 11:09:17 +01:00
Mark Murray	3d7662142d	[ARM] Undeprecate complex IT blocks AArch32/Armv8A introduced the performance deprecation of certain patterns of IT instructions. After some debate internal to ARM, this is now being reverted; i.e. no IT instruction patterns are performance deprecated anymore, as the perfomance degredation is not significant enough. This reverts the following: "ARMv8-A deprecates some uses of the T32 IT instruction. All uses of IT that apply to instructions other than a single subsequent 16-bit instruction from a restricted set are deprecated, as are explicit references to the PC within that single 16-bit instruction. This permits the non-deprecated forms of IT and subsequent instructions to be treated as a single 32-bit conditional instruction." The deprecation no longer applies, but the behaviour may be controlled by the -arm-restrict-it and -arm-no-restrict-it command-line options, with the latter being the default. No warnings about complex IT blocks will be generated. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118044	2022-02-07 15:47:53 +00:00
Kazu Hirata	3a3cb929ab	[llvm] Use = default (NFC)	2022-02-06 22:18:35 -08:00
David Green	b7d3a2b62f	[ARM] Mark i64 and f64 shuffles as Custom for MVE This way they get lowered through the ARMISD::BUILD_VECTOR, which can produce more efficient D register moves. Also helps D115653 not get stuck in a loop.	2022-02-06 16:17:06 +00:00
Simon Pilgrim	5aa2acc86b	[DAG] SimplifyDemandedVectorElts - remove KnownZero/KnownUndef from DCI helper wrapper None of the external users actual touch these (they're purely used internally down the recursive call) - its trivial to add another wrapper if anything ever does want to track known elements.	2022-02-02 12:04:49 +00:00
tyb0807	762f0b5463	[ARM] Make getInstSizeInBytes() use instruction size from InstrInfo.td Currently, ARMBaseInstrInfo::getInstSizeInBytes() uses hard-coded instruction size for some pseudo-instructions, while this information should ideally be found in ARMInstrInfo.td, ARMInstrThumb(2).td files (which can be accessed via MCInstrDesc). Hence, the .td files should be updated and no hard-coded instruction sizes should be used by getInstSizeInBytes() anymore. Differential Revision: https://reviews.llvm.org/D118009	2022-02-01 10:39:14 +00:00
David Sherwood	daa80339df	[CodeGen] Support folds of not(cmp(cc, ...)) -> cmp(!cc, ...) for scalable vectors I have updated TargetLowering::isConstTrueVal to also consider SPLAT_VECTOR nodes with constant integer operands. This allows the optimisation to also work for targets that support scalable vectors. Differential Revision: https://reviews.llvm.org/D117210	2022-02-01 09:50:00 +00:00
Ties Stuij	6b1e844b69	[ARM] Add Cortex-X1C Support for Clang and LLVM This patch upstreams support for the Arm-v8 Cortex-X1C processor for AArch64 and ARM. For more information, see: - https://community.arm.com/arm-community-blogs/b/announcements/posts/arm-cortex-x1c - https://developer.arm.com/documentation/101968/0002/Functional-description/Technical-overview/Components The following people contributed to this patch: - Simon Tatham - Ties Stuij Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D117202	2022-01-31 14:23:35 +00:00
Nikita Popov	93122b2567	[ARM] Don't look through pointer types in canTailPredicateLoop() Inspecting the pointer element type here is incompatible with opaque pointers, and doesn't seem necessary to me. I think the intention might have been to check the type of load/store pointer arguments, but I believe those should get checked through their return type or value operand anyway. I don't get any test failures if I simply drop this. Differential Revision: https://reviews.llvm.org/D118353	2022-01-28 09:34:13 +01:00
David Green	82973edfb7	[ARM][AArch64] Introduce qrdmlah and qrdmlsh intrinsics Since it's introduction, the qrdmlah has been represented as a qrdmulh and a sadd_sat. This doesn't produce the same result for all input values though. This patch fixes that by introducing a qrdmlah (and qrdmlsh) intrinsic specifically for the vqrdmlah and sqrdmlah instructions. The old test cases will now produce a qrdmulh and sqadd, as expected. Fixes #53120 and #50905 and #51761. Differential Revision: https://reviews.llvm.org/D117592	2022-01-27 19:19:46 +00:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit `ef82063207`. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
Jim Lin	da1cac7d19	[NFC] Remove duplicate include	2022-01-26 15:10:16 +08:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
serge-sans-paille	75e164f61d	[llvm] Cleanup header dependencies in ADT and Support The cleanup was manual, but assisted by "include-what-you-use". It consists in 1. Removing unused forward declaration. No impact expected. 2. Removing unused headers in .cpp files. No impact expected. 3. Removing unused headers in .h files. This removes implicit dependencies and is generally considered a good thing, but this may break downstream builds. I've updated llvm, clang, lld, lldb and mlir deps, and included a list of the modification in the second part of the commit. 4. Replacing header inclusion by forward declaration. This has the same impact as 3. Notable changes: - llvm/Support/TargetParser.h no longer includes llvm/Support/AArch64TargetParser.h nor llvm/Support/ARMTargetParser.h - llvm/Support/TypeSize.h no longer includes llvm/Support/WithColor.h - llvm/Support/YAMLTraits.h no longer includes llvm/Support/Regex.h - llvm/ADT/SmallVector.h no longer includes llvm/Support/MemAlloc.h nor llvm/Support/ErrorHandling.h You may need to add some of these headers in your compilation units, if needs be. As an hint to the impact of the cleanup, running clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Support/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l before: 8000919 lines after: 7917500 lines Reduced dependencies also helps incremental rebuilds and is more ccache friendly, something not shown by the above metric :-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831	2022-01-21 13:54:49 +01:00
Jim Lin	d6b0734837	[NFC] Use Register instead of unsigned	2022-01-19 20:17:04 +08:00
Fangrui Song	2e589c9c42	[MC][ARM] Replace MCContext::reportFatalError call with reportError This call is slightly try. We need to postpone getFixupKindNumBytes.	2022-01-15 00:32:03 -08:00
Fangrui Song	e2b66928e5	[MC][ARM] Replace MCContext::reportFatalError call with reportError	2022-01-15 00:13:49 -08:00
Ties Stuij	7c70f96a91	[ARM] fix bug causing shrinkwrapping not always being off using PAC If you want to check for all uses of PAC, the SpillsLR argument to shouldSignReturnAddress should be true instead of false, as that value will be returned from the function if the other checks fall through. Reviewed By: miyuki Differential Revision: https://reviews.llvm.org/D116213	2022-01-13 10:37:00 +00:00
Rosie Sumpter	552eb372cb	[LoopVectorize] Pass a vector type to isLegalMaskedGather/Scatter This is required to query the legality more precisely in the LoopVectorizer. This adds another TTI function named 'forceScalarizeMaskedGather/Scatter' function to work around the hack introduced for MVE, where isLegalMaskedGather/Scatter would return an answer by second-guessing where the function was called from, based on the Type passed in (vector vs scalar). The new interface makes this explicit. It is also used by X86 to check for vector widths where gather/scatters aren't profitable (or don't exist) for certain subtargets. Differential Revision: https://reviews.llvm.org/D115329	2022-01-12 13:34:12 +00:00
David Green	351edf1c47	[ARM] Remove FeaturePerfMon from armv7-m FeaturePerfMon relates to the PMU extensions available in armv7-a, and should not be available in v7-m (it requires loading from a system register with a mrc). Sink it down a level in the dependency map so that it isn't present in ARMv7m or HasV8MMainlineOps. It is also removed from the Neoverse-N2, as it will already be transitively included. Differential Revision: https://reviews.llvm.org/D117022	2022-01-12 09:44:53 +00:00
Tim Northover	0b5b35fdbd	ARM: make FastISel & GISel pass -1 to ADJCALLSTACKUP to signal no callee pop. The interface for these instructions changed with support for mandatory tail calls, and now -1 indicates the CalleePopAmount argument is not valid. Unfortunately I didn't realise FastISel or GISel did calls at the time so didn't update them.	2022-01-11 11:31:13 +00:00
Kazu Hirata	4e2ec7e38d	[llvm] Remove unused forward declarations (NFC)	2022-01-07 20:00:34 -08:00
Kazu Hirata	2aed08131d	[llvm] Use true/false instead of 1/0 (NFC) Identified with modernize-use-bool-literals.	2022-01-07 00:39:14 -08:00
Kazu Hirata	f3a344d212	[Target] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-06 22:01:44 -08:00
Kazu Hirata	e5947760c2	Revert "[llvm] Remove redundant member initialization (NFC)" This reverts commit `fd4808887e`. This patch causes gcc to issue a lot of warnings like: warning: base class ‘class llvm::MCParsedAsmOperand’ should be explicitly initialized in the copy constructor [-Wextra]	2022-01-03 11:28:47 -08:00
Lucas Prates	cd7f621a0a	[ARM][AArch64] Introduce Armv9.3-A This patch introduces support for targetting the Armv9.3-A architecture, which should map to the existing Armv8.8-A extensions. Differential Revision: https://reviews.llvm.org/D116158	2022-01-03 12:40:43 +00:00
Kazu Hirata	41bfac6aed	[Target] Remove unused forward declarations (NFC)	2022-01-02 10:20:15 -08:00
Kazu Hirata	fd4808887e	[llvm] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-01 16:18:18 -08:00
David Green	319e77592f	[ARM] Verify addressing immediates This adds at extra check into ARMBaseInstrInfo::verifyInstruction to verify the offsets used in addressing mode immediates using isLegalAddressImm. Some tests needed fixing up as a result, adjusting the opcode created from CMSE stack adjustments. Differential Revision: https://reviews.llvm.org/D114939	2022-01-01 20:08:45 +00:00
Kazu Hirata	69ccc96162	[llvm] Use the default constructor for SDValue (NFC)	2022-01-01 10:36:59 -08:00
Simon Tatham	d50072f74e	[ARM] Introduce an empty "armv8.8-a" architecture. This is the first commit in a series that implements support for "armv8.8-a" architecture. This should contain all the necessary boilerplate to make the 8.8-A architecture exist from LLVM and Clang's point of view: it adds the new arch as a subtarget feature, a definition in TargetParser, a name on the command line, an appropriate set of predefined macros, and adds appropriate tests. The new architecture name is supported in both AArch32 and AArch64. However, in this commit, no actual _functionality_ is added as part of the new architecture. If you specify -march=armv8.8a, the compiler will accept it and set the right predefines, but generate no code any differently. Differential Revision: https://reviews.llvm.org/D115694	2021-12-31 16:43:53 +00:00
Kazu Hirata	5a667c0e74	[llvm] Use nullptr instead of 0 (NFC) Identified with modernize-use-nullptr.	2021-12-28 08:52:25 -08:00
Kazu Hirata	8445883327	[llvm] Drop unnecessary const from return types (NFC) Identified with readability-const-return-type.	2021-12-27 15:58:03 -08:00
David Green	2ec3ca7477	[ARM] Extend IsCMPZCSINC to handle CMOV A 'CMOV 1, 0, CC, %cpsr, Cmp' is the same as a 'CSINC 0, 0, CC, Cmp', and can be treated the same in IsCMPZCSINC added in D114013. This allows us to remove the unnecessary CMOV in the same way that we could remove a CSINC. Differential Revision: https://reviews.llvm.org/D115188	2021-12-27 14:15:03 +00:00
Kazu Hirata	0a5788ab57	[Target] Use range-based for loops (NFC)	2021-12-26 23:49:38 -08:00
Kazu Hirata	c5cf7d910e	[ARM] Use range-based for loops (NFC)	2021-12-20 23:06:47 -08:00
Kazu Hirata	de90490060	Revert "[ARM] Use range-based for loops (NFC)" This reverts commit `93d79cac2e`. This patch seems to break llvm/test/CodeGen/ARM/constant-islands-cfg.mir under asan.	2021-12-20 10:51:36 -08:00
Kazu Hirata	93d79cac2e	[ARM] Use range-based for loops (NFC)	2021-12-20 00:04:53 -08:00
David Green	4ece4cd77e	[ARM] Fold away CMP/CSINC from CMOV This makes use of the code in D114013 to fold away unnecessary CMPZ/CSINC starting from a CMOV, in a similar way to how we fold away CSINV/CSINC/etc Differential Revision: https://reviews.llvm.org/D115185	2021-12-19 21:53:50 +00:00
Kazu Hirata	26bd534a79	[llvm] Use none_of instead of \!any_of (NFC)	2021-12-17 13:48:57 -08:00
David Green	6bd8f114c8	[ARM] Handle splats of constants for MVE qr instruction Some MVE instructions have qr variants that take a Q and R register, splatting the R register for each lane. This is usually handled fine for standard splats as we sink the splat into the loop and combine the resulting dup into the qr instruction. It does not work for constant splats though, as we generate a vmovimm or constant pool load instead. This intercepts that, generating a vdup of the constant instead where we can turn the result into a qr instruction variant. Differential Revision: https://reviews.llvm.org/D115242	2021-12-17 09:16:28 +00:00
David Green	26f6fbe2be	[ARM] Add AddrModeT2_i8neg addressing mode support for frame lowering. As reported from a failing firefox build, we can sometimes get frame indices with negative offsets from a t2LDRi8. This adds support for them, to prevent the crash.	2021-12-14 12:49:27 +00:00
Kazu Hirata	483499670e	[Target] Use llvm::reverse (NFC)	2021-12-12 08:34:24 -08:00
Kazu Hirata	c2bb9637d9	Use llvm::any_of and llvm::all_of (NFC)	2021-12-11 11:54:37 -08:00
Kazu Hirata	36b8a4f9f3	[llvm] Use llvm::is_contained (NFC)	2021-12-11 11:42:09 -08:00
Mikael Holmen	d0f55a0d80	[ARM] Fix gcc warning about mix of enumeral and non-enumeral types gcc warned with ../lib/Target/ARM/ARMFrameLowering.cpp:797:31: warning: enumeral and non-enumeral type in conditional expression [-Wextra] 797 \| Reg == ARM::R12 ? ARM::RA_AUTH_CODE : Reg, true); \| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~	2021-12-09 10:31:56 +01:00
Mircea Trofin	b012742405	[NFC] Rename MachineFunction::deleteMachineInstr (coding style)	2021-12-08 20:36:13 -08:00
David Green	d43c801d13	[ARM] Peek through And 1 in IsCMPZCSINC We can be in situations where And 1 zext nodes will not have been yet, preventing us from detecting removable cmpz/csinc patterns. This peeks through those nodes allowing us to simplify more code. Differential Revision: https://reviews.llvm.org/D115176	2021-12-08 15:40:23 +00:00
Ties Stuij	63eb7ff47d	[ARM] Implement PAC return address signing mechanism for PACBTI-M This patch implements PAC return address signing for armv8-m. This patch roughly accomplishes the following things: - PAC and AUT instructions are generated. - They're part of the stack frame setup, so that shrink-wrapping can move them inwards to cover only part of a function - The auth code generated by PAC is saved across subroutine calls so that AUT can find it again to check - PAC is emitted before stacking registers (so that the SP it signs is the one on function entry). - The new pseudo-register ra_auth_code is mentioned in the DWARF frame data - With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT validates the corresponding value of SP - Emit correct unwind information when PAC is replaced by PACBTI - Handle tail calls correctly Some notes: We make the assembler accept the `.save {ra_auth_code}` directive that is emitted by the compiler when it saves a register that contains a return address authentication code. For EHABI we need to have the `FrameSetup` flag on the instruction and handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit `.save {ra_auth_code}`, instead of `.save {r12}`. For PACBTI-M, the instruction which computes return address PAC should use SP value before adjustment for the argument registers save are (used for variadic functions and when a parameter is is split between stack and register), but at the same it should be after the instruction that saves FPCXT when compiling a CMSE entry function. This patch moves the varargs SP adjustment after the FPCXT save (they are never enabled at the same time), so in a following patch handling of the `PAC` instruction can be placed between them. Epilogue emission code adjusted in a similar manner. PACBTI-M code generation should not emit any instructions for architectures v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases is handled separately by a future ticket. note on tail calls: If the called function has four arguments that occupy registers `r0`-`r3`, the only option for holding the function pointer itself is `r12`, but this register is used to keep the PAC during function/prologue epilogue and clobbers the function pointer. When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to keep six values - the four function arguments, the function pointer and the PAC, which is obviously impossible. One option would be to authenticate the return address before all callee-saved registers are restored, so we have a scratch register to temporarily keep the value of `r12`. The issue with this approach is that it violates a fundamental invariant that PAC is computed using CFA as a modifier. It would also mean using separate instructions to pop `lr` and the rest of the callee-saved registers, which would offset the advantages of doing a tail call. Instead, this patch disables indirect tail calls when the called function take four or more arguments and the return address sign and authentication is enabled for the caller function, conservatively assuming the caller function would spill LR. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Momchil Velikov - Ties Stuij Reviewed By: danielkiss Differential Revision: https://reviews.llvm.org/D112429	2021-12-07 10:15:19 +00:00
Ties Stuij	0fbb17458a	[ARM] Implement setjmp BTI placement for PACBTI-M This patch intends to guard indirect branches performed by longjmp by inserting BTI instructions after calls to setjmp. Calls with 'returns-twice' are lowered to a new pseudo-instruction named t2CALL_BTI that is later expanded to a bundle of {tBL,t2BTI}. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Alexandros Lamprineas - Ties Stuij Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D112427	2021-12-06 11:07:10 +00:00
David Green	d8495e0352	[ARM] Add a vrinta.f16.f16 alias The v8.1-m ARMARM uses the vrinta.f16.f16 names, as opposed to vrinta.f16. This adds an alias for it in the same way that we have for f32 and f64. Differential Revision: https://reviews.llvm.org/D68127	2021-12-06 11:06:25 +00:00
David Green	ab0c5cea0b	[ARM] Use v2i1 for MVE and CDE intrinsics This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal type, to use a <2 x i1> as opposed to emulating the predicate with a <4 x i1>. The v4i1 workarounds have been removed leaving the natural v2i1 types, notably in vctp64 which now generates a v2i1 type. AutoUpgrade code has been added to upgrade old IR, which needs to convert the old v4i1 to a v2i1 be converting it back and forth to an integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be optimized away in the final assembly. Differential Revision: https://reviews.llvm.org/D114455	2021-12-03 15:27:58 +00:00
David Green	255ad73424	[ARM] Make MVE v2i1 predicates legal MVE can treat v16i1, v8i1, v4i1 and v2i1 as different views onto the same 16bit VPR.P0 register, with v2i1 holding two 8 bit values for the two halves. This was never treated as a legal type in llvm in the past as there are not many 64bit instructions and no 64bit compares. There are a few instructions that could use it though, notably a VSELECT (as it can handle any size using the underlying v16i8 VPSEL), AND/OR/XOR for similar reasons, some gathers/scatter and long multiplies and VCTP64 instructions. This patch goes through and makes v2i1 a legal type, handling all the cases that fall out of that. It also makes VSELECT legal for v2i64 as a side benefit. A lot of the codegen changes as a result - usually in way that is a little better or a little worse, but still expensive. Costs can change a little too in the process, again in a way that expensive things remain expensive. A lot of the tests that changed are mainly to ensure correctness - the code can hopefully be improved in the future where it comes up in practice. The intrinsics currently remain using the v4i1 they previously did to emulate a v2i1. This will be changed in a followup patch but this one was already large enough. Differential Revision: https://reviews.llvm.org/D114449	2021-12-03 14:05:41 +00:00
David Green	b8f1ccb0ac	[ARM] Introduce i8neg and i8pos addressing modes Some instructions with i8 immediate ranges can only hold negative values (like t2LDRHi8), only hold positive values (like t2STRT) or hold +/- depending on the U bit (like the pre/post inc instructions. e.g t2LDRH_POST). This patch splits the AddrModeT2_i8 into AddrModeT2_i8, AddrModeT2_i8pos and AddrModeT2_i8neg to make this clear. This allows us to get the offset ranges of t2LDRHi8 correct in the load/store optimizer, fixing issues where we could end up creating instructions with positive offsets (which may then be encoded as ldrht). Differential Revision: https://reviews.llvm.org/D114638	2021-12-02 17:10:26 +00:00
David Green	e629302558	[ARM] Correct range in isLegalAddressImm The ranges in isLegalAddressImm were off by one, not allowing the maximum values for unscaled offsets. Differential Revision: https://reviews.llvm.org/D114636	2021-12-02 11:33:40 +00:00
David Green	646c872f9d	[ARM] Teach getIntImmCostInst about the cost of saturating fp converts Given a min(max(fptosi, INT_MIN), INT_MAX) with the correct constants, we can now generate a fptosi.sat. But in the arm backend, the constant can be treated as high cost, pulling it out of the basic block in a way that the DAG combine can no longer see it. This teaches it again that it is a low cost constant, not worth hoisting out. Recommitted from `0e98659ea1` with a fix for APInt comparison. Differential Revision: https://reviews.llvm.org/D114380	2021-12-02 07:56:27 +00:00
David Green	13e66c070b	Revert "[ARM] Teach getIntImmCostInst about the cost of saturating fp converts" This reverts commit `6d41de380f` as the windows bots are not happy, in a way I do not understand. Revert whilst we figure out what is wrong.	2021-12-01 15:25:19 +00:00
Ties Stuij	f5f28d5b0c	[ARM] Implement BTI placement pass for PACBTI-M This patch implements a new MachineFunction in the ARM backend for placing BTI instructions. It is similar to the existing AArch64 aarch64-branch-targets pass. BTI instructions are inserted into basic blocks that: - Have their address taken - Are the entry block of a function, if the function has external linkage or has its address taken - Are mentioned in jump tables - Are exception/cleanup landing pads Each BTI instructions is placed in the beginning of a BB after the so-called meta instructions (e.g. exception handler labels). Each outlining candidate and the outlined function need to be in agreement about whether BTI placement is enabled or not. If branch target enforcement is disabled for a function, the outliner should not covertly enable it by emitting a call to an outlined function, which begins with BTI. The cost mode of the outliner is adjusted to account for the extra BTI instructions in the outlined function. The ARM Constant Islands pass will maintain the count of the jump tables, which reference a block. A `BTI` instruction is removed from a block only if the reference count reaches zero. PAC instructions in entry blocks are replaced with PACBTI instructions (tests for this case will be added in a later patch because the compiler currently does not generate PAC instructions). The ARM Constant Island pass is adjusted to handle BTI instructions correctly. Functions with static linkage that don't have their address taken can still be called indirectly by linker-generated veneers and thus their entry points need be marked with BTI or PACBTI. The changes are tested using "LLVM IR -> assembly" tests, jump tables also have a MIR test. Unfortunately it is not possible add MIR tests for exception handling and computed gotos because of MIR parser limitations. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Mikhail Maltsev - Momchil Velikov - Ties Stuij Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D112426	2021-12-01 12:54:05 +00:00
Ties Stuij	b430782be3	[ARM] emit PACBTI-M build attributes This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Victor Campos - Ties Stuij Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D112425	2021-12-01 11:05:29 +00:00
Ties Stuij	c12c7a84b0	[ARM] add common parts for PACBTI-M support in the backend This patch encapsulates decision logic about when and how to generate PAC/BTI related code. It's a part shared by PAC-RET, BTI placement, build attribute emission, etc, so it make sense committing it separately in order to unblock the aforementioned parts, which can proceed concurrently. This patch adds a few member functions to `ARMFunctionInfo`, which are currently unused, therefore there is no testing for them at the moment. This code is tested in follow-up PAC/BTI code gen patches. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Momchil Velikov - Ties Stuij Reviewed By: danielkiss Differential Revision: https://reviews.llvm.org/D112423	2021-12-01 10:48:30 +00:00

1 2 3 4 5 ...

11884 Commits