llvm-project

Commit Graph

Author	SHA1	Message	Date
David Green	1da6940275	[ARM] Add more opaque pointer gather/scatter tests. NFC Some of the newly added tests are incorrect, fixed in D127733.	2022-06-14 14:08:43 +01:00
Lucas Prates	7625e01d66	[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it. In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none\|aapcs\|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125094	2022-06-14 13:37:51 +01:00
Guillaume Chatelet	6725d80640	[NFC][Alignment] Use Align in shouldAlignPointerArgs	2022-06-14 10:56:36 +00:00
Guillaume Chatelet	01a8b89edb	[NFC][Alignment] Use getAlign in ARMFastISel	2022-06-13 13:36:36 +00:00
Simon Pilgrim	f97e15ef45	[ARM] Fix "local variable is initialized but not referenced" MSVX warning. NFC	2022-06-13 11:48:06 +01:00
Lucas Prates	33b9ad647e	Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records" Reverting change due to test failure. This reverts commit `6119053dab`.	2022-06-13 11:00:49 +01:00
Lucas Prates	6119053dab	[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it. In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none\|aapcs\|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125094	2022-06-13 10:21:06 +01:00
Lucas Prates	7775124b5c	[NFC][Thumb1] Use FrameDestroy flag to identify epilog instructions Simiarly to what's done on both ARM's and AArch64's frame lowering code, this updates Thumb1FrameLowering to use the FrameDestroy Machine Instruction flag to identify instructions inserted as part of the epilog instead of relying on assumptions about specific machine instructions. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D126285	2022-06-13 10:19:10 +01:00
Fangrui Song	adf4142f76	[MC] De-capitalize SwitchSection. NFC Add SwitchSection to return switchSection. The API will be removed soon.	2022-06-10 22:50:55 -07:00
Guillaume Chatelet	38637ee477	[clang] Add support for __builtin_memset_inline In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`. The idea is to get support from the compiler to easily write efficient memory function implementations. This patch could be split in two: - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics. - and another one for the Clang part providing the instrinsic as a builtin. Differential Revision: https://reviews.llvm.org/D126903	2022-06-10 13:13:59 +00:00
David Sherwood	007917b95c	[MVE] Fold fadd(select(..., +0.0)) into a predicated fadd We already have patterns for matching fadd(select(..., -0.0)), but an upcoming patch will lead to patterns using +0.0 as the identity instead of -0.0. I'm adding support for these patterns now to avoid any regressions for MVE. Differential Revision: https://reviews.llvm.org/D127275	2022-06-10 11:09:55 +01:00
Sam Parker	447c411fef	[ARM][ParallelDSP] Fix self reference bug Ensure we don't generate a smlad intrinsic that takes itself as an argument. Differential Revision: https://reviews.llvm.org/D127213	2022-06-09 09:10:57 +00:00
Matt Arsenault	cc5a1b3dd9	llvm-reduce: Add cloning of target MachineFunctionInfo MIR support is totally unusable for AMDGPU without this, since the set of reserved registers is set from fields here. Add a clone method to MachineFunctionInfo. This is a subtle variant of the copy constructor that is required if there are any MIR constructs that use pointers. Specifically, at minimum fields that reference MachineBasicBlocks or the MachineFunction need to be adjusted to the values in the new function.	2022-06-07 10:14:48 -04:00
Guillaume Chatelet	0788186182	[Alignment][NFC] Remove usage of MemSDNode::getAlignment I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times. Differential Revision: https://reviews.llvm.org/D126910	2022-06-07 13:52:20 +00:00
David Green	53be6ab25c	[ARM] Fix MVE getShuffleCost legalized type check The MVE shuffle costing for VREV instructions was making incorrect assumptions as to legalized vector types remaining as vectors. Add a quick check to ensure they are indeed vectors before attempting to get the number of elements.	2022-06-07 14:36:04 +01:00
Fangrui Song	15d82c62dc	[MC] De-capitalize MCStreamer functions Follow-up to `c031378ce0` . The class is mostly consistent now.	2022-06-07 00:31:02 -07:00
ksyx	3204272f0f	[ARM] Use llvm::dbgs() to print debug info (NFC) For consistency with other parts of code. Approved by efriedma in differential revision https://reviews.llvm.org/D127055	2022-06-06 16:43:16 -04:00
Martin Storsjö	98dc3e86fd	[ARM] [MinGW] Default to WinEH exception handling instead of Dwarf Switching this target to WinEH also seems to affect the `-windows-itanium` target. Differential Revision: https://reviews.llvm.org/D126870	2022-06-06 23:27:19 +03:00
Fangrui Song	77e300ffdf	[MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline"	2022-06-05 15:11:01 -07:00
Fangrui Song	8c911f8e9a	[ARM][MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline" The directive name is not useful because the next line replicates the error line which includes the directive. The prevailing style uses "expected newline".	2022-06-05 14:53:59 -07:00
Kazu Hirata	3b9707dbc0	[llvm] Convert for_each to range-based for loops (NFC)	2022-06-05 12:07:14 -07:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Martin Storsjö	485432f3c8	[ARM] Make a narrow tMOVi8 where possible in SEH prologues We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering. But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering. Differential Revision: https://reviews.llvm.org/D126949	2022-06-03 22:33:55 +03:00
Martin Storsjö	bd52506d24	[ARM] Make narrow push/pop in SEH prologues/epilogues where applicable We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering. But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering. Differential Revision: https://reviews.llvm.org/D126948	2022-06-03 22:33:55 +03:00
Martin Storsjö	40c937cba2	[ARM] Fix restoring stack for varargs with SEH split frame pointer push Previously, the "add sp, #12" ended up inserted after "bx lr". Differential Revision: https://reviews.llvm.org/D126872	2022-06-03 09:32:00 +03:00
Martin Storsjö	668bb96379	[ARM] Implement lowering of the sponentry intrinsic This is needed for SEH based setjmp on Windows. Differential Revision: https://reviews.llvm.org/D126763	2022-06-02 12:29:59 +03:00
Martin Storsjö	2ab19bfa41	[ARM] Adjust the frame pointer when it's needed for SEH unwinding For functions that require restoring SP from FP (e.g. that need to align the stack, or that have variable sized allocations), the prologue and epilogue previously used to look like this: push {r4-r5, r11, lr} add r11, sp, #8 ... sub r4, r11, #8 mov sp, r4 pop {r4-r5, r11, pc} This is problematic, because this unwinding operation (restoring sp from r11 - offset) can't be expressed with the SEH unwind opcodes (probably because this unwind procedure doesn't map exactly to individual instructions; note the detour via r4 in the epilogue too). To make unwinding work, the GPR push is split into two; the first one pushing all other registers, and the second one pushing r11+lr, so that r11 can be set pointing at this spot on the stack: push {r4-r5} push {r11, lr} mov r11, sp ... mov sp, r11 pop {r11, lr} pop {r4-r5} bx lr For the same setup, MSVC generates code that uses two registers; r11 still pointing at the {r11,lr} pair, but a separate register used for restoring the stack at the end: push {r4-r5, r7, r11, lr} add r11, sp, #12 mov r7, sp ... mov sp, r7 pop {r4-r5, r7, r11, pc} For cases with clobbered float/vector registers, they are pushed after the GPRs, before the {r11,lr} pair. Differential Revision: https://reviews.llvm.org/D125649	2022-06-02 12:28:46 +03:00
Martin Storsjö	d8e67c1ccc	[ARM] Add SEH opcodes in frame lowering Skip inserting regular CFI instructions if using WinCFI. This is based a fair amount on the corresponding ARM64 implementation, but instead of trying to insert the SEH opcodes one by one where we generate other prolog/epilog instructions, we try to walk over the whole prolog/epilog range and insert them. This is done because in many cases, the exact number of instructions inserted is abstracted away deeper. For some cases, we manually insert specific SEH opcodes directly where instructions are generated, where the automatic mapping of instructions to SEH opcodes doesn't hold up (e.g. for __chkstk stack probes). Skip Thumb2SizeReduction for SEH prologs/epilogs, and force tail calls to wide instructions (just like on MachO), to make sure that the unwind info actually matches the width of the final instructions, without heuristics about what later passes will do. Mark SEH instructions as scheduling boundaries, to make sure that they aren't reordered away from the instruction they describe by PostRAScheduler. Mark the SEH instructions with the NoMerge flag, to avoid doing tail merging of functions that have multiple epilogs that all end with the same sequence of "b <other>; .seh_nop_w, .seh_endepilogue". Differential Revision: https://reviews.llvm.org/D125648	2022-06-02 12:28:46 +03:00
Martin Storsjö	298e9cac92	[MC] [Win64EH] Check that the SEH unwind opcodes match the actual instructions It's a fairly common issue that the generating code incorrectly marks instructions as narrow or wide; check that the instruction lengths add up to the expected value, and error out if it doesn't. This allows catching code generation bugs. Also check that prologs and epilogs are properly terminated, to catch other code generation issues. Differential Revision: https://reviews.llvm.org/D125647	2022-06-01 11:25:49 +03:00
Martin Storsjö	6b75a3523f	[ARM] [MC] Add support for writing ARM WinEH unwind info This includes .seh_* directives for generating it from assembly. It is designed fairly similarly to the ARM64 handling. For .seh_handler directives, such as ".seh_handler __C_specific_handler, @except" (which is supported on x86_64 and aarch64 so far), the "@except" bit doesn't work in ARM assembly, as '@' is used as a comment character (on all current platforms). Allow using '%' instead of '@' for this purpose. This convention is used by GAS in similar contexts already, e.g. [1]: Note on targets where the @ character is the start of a comment (eg ARM) then another character is used instead. For example the ARM port uses the % character. In practice, this unfortunately means that all such .seh_handler directives will need ifdefs for ARM. Contrary to ARM64, on ARM, it's quite common that we can't evaluate e.g. the function length at this point, due to instructions whose length is finalized later. (Also, inline jump tables end with a ".p2align 1".) If unable to to evaluate the function length immediately, emit it as an MCExpr instead. If we'd implement splitting the unwind info for a function (which isn't implemented for ARM64 yet either), we wouldn't know whether we need to split it though. Avoid calling getFrameIndexOffset() on an unset FuncInfo.UnwindHelpFrameIdx, to avoid triggering asserts in the preexisting testcase CodeGen/ARM/Windows/wineh-basic.ll. (Once MSVC exception handling is fully implemented, those changes can be reverted.) [1] https://sourceware.org/binutils/docs/as/Section.html#Section Differential Revision: https://reviews.llvm.org/D125645	2022-06-01 11:25:48 +03:00
Zongwei Lan	ad73ce318e	[Target] use getSubtarget<> instead of static_cast<>(getSubtarget()) Differential Revision: https://reviews.llvm.org/D125391	2022-05-26 11:22:41 -07:00
David Penry	917dc0749b	[ARM] Recognize t2LoopEnd for software pipelining - Add t2LoopEnd to TargetInstrInfo::analyzeBranch and related functions. As there are many side effects of analyzing a branch, only do so if software pipelining is enabled to maintain previous behavior when pipelining is not desired. - Make sure that t2LoopEndDec is immediately followed by a t2B when it is synthesized from a t2LoopEnd. This is done because the t2LoopEnd might have acquired a fall-through path, but IfConversion assumes that fall-through are only possible on analyzable branches. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D126322	2022-05-26 09:55:42 -07:00
Fangrui Song	9ee15bba47	[MC] Lower case the first letter of EmitCOFF* EmitWin* EmitCV*. NFC	2022-05-26 00:14:08 -07:00
Maksim Panchenko	bed9efed71	[MCDisassembler] Disambiguate Size parameter in tryAddingSymbolicOperand() MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter to specify either the instruction size or the operand size depending on the architecture. However, for proper symbolic disassembly on X86, we need to know both sizes, as an instruction can have two operands, and the instruction size cannot be reliably calculated based on the operand offset and its size. Hence, split Size into OpSize and InstSize. For X86, the new interface allows to fix a couple of issues: * Correctly adjust the value of PC-relative operands. * Set operand size to zero when the operand is specified implicitly. Differential Revision: https://reviews.llvm.org/D126101	2022-05-25 13:44:32 -07:00
David Green	18cb3b3506	[ARM] Fix vcvtb/t.f16 input liveness The `vcvtb.f16.f32 Sd, Sn` (and vcvtt.f16.f32) instruction convert a f32 into a f16, writing either the top or bottom halves of the register. That means that half of the input register Sd is used in the output. This wasn't being modelled in the instructions, leading later analyses to believe that the registers were dead where they were not, generating invalid scheduling Fix that be specifying the input Sda register for the instructions too, allowing them to be set for cases like vector inserts. Most of the changes are plumbing through the constraint string, cstr. Differential Revision: https://reviews.llvm.org/D126118	2022-05-25 12:16:26 +01:00
David Green	a86cfaea54	[ARM] Add register-mask for tail returns The TC_RETURN/TCRETURNdi under Arm does not currently add the register-mask operand when tail folding, which leads to the register (like LR) not being 'used' by the return. This changes the code to unconditionally set the register mask on the call, as opposed to skipping it for tail calls. I don't believe this will currently alter any codegen, but should glue things together better post-frame lowering. It matches the AArch64 code better. Differential Revision: https://reviews.llvm.org/D125906	2022-05-21 15:28:24 +01:00
David Green	b4dd9fc370	[ARM] Cost modelling for MVE vector fptoi_sat Building on top of D125665, this adds MVE costs for fptosi.sat and fptoui.sat, providing MVE is available and the types are legal. Differential Revision: https://reviews.llvm.org/D125666	2022-05-20 11:00:34 +01:00
David Green	80aab0312a	[ARM] Cost modelling for scalar fptoi_sat Similar to D124357, this adds some cost modelling for fptoi_sat for Arm targets. Where VFP2 is available (and FP64/FP16 for the relevant types), the operations are legal as the Arm instructions naturally saturate. Otherwise they will need an extra smin/smax clamp, similar to AArch64. Differential Revision: https://reviews.llvm.org/D125665	2022-05-19 19:53:21 +01:00
Archibald Elliott	2321c36fbf	[ARM] Don't Enable AES Pass for Generic Cores This brings clang/llvm into line with GCC. The Pass is still enabled for the affected cores, but is now opt-in when using `-march=`. I also took the opportunity to add release notes for this change. Reviewed By: john.brawn Differential Revision: https://reviews.llvm.org/D125775	2022-05-18 13:10:31 +01:00
Martin Storsjö	68f37e7991	[ARM] Rename the isARMAreaXRegister parameter isIOS to SplitFramePushPop. NFC. In `f8b0a7af52` in 2016, this parameter was generalized on the caller side (previously passing STI.isTargetMachO(), now passing STI.splitFramePushPop()). Rename the parameter on the receiver side to match the generalization. Differential Revision: https://reviews.llvm.org/D125681	2022-05-17 00:41:38 +03:00
NAKAMURA Takumi	bdab5c4b3d	ARMFixCortexA57AES1742098Pass.cpp: Suppress a warning. [-Wunused-but-set-variable]	2022-05-15 18:01:42 +09:00
Sheng	c644488a8b	Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h` The name `MCFixedLenDisassembler.h` is out of date after D120958. Rename it as `MCDecoderOps.h` to reflect the change. Reviewed By: myhsu Differential Revision: https://reviews.llvm.org/D124987	2022-05-15 08:44:58 +08:00
Archibald Elliott	3a24df992c	[ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum This adds a late Machine Pass to work around a Cortex CPU Erratum affecting Cortex-A57 and Cortex-A72: - Cortex-A57 Erratum 1742098 - Cortex-A72 Erratum 1655431 The pass inserts instructions to make the inputs to the fused AES instruction pairs no longer trigger the erratum. Here the pass errs on the side of caution, inserting the instructions wherever we cannot prove that the inputs came from a safe instruction. The pass is used: - for Cortex-A57 and Cortex-A72, - for "generic" cores (which are used when using `-march=`), - when the user specifies `-mfix-cortex-a57-aes-1742098` or `mfix-cortex-a72-aes-1655431` in the command-line arguments to clang. Reviewed By: dmgreen, simon_tatham Differential Revision: https://reviews.llvm.org/D119720	2022-05-13 10:47:33 +01:00
David Green	f848798b7d	[ARM] Delay creation of MVE Imm shifts to legalization The reasoning for creating VSHLIMM/VSHRsIMM/VSHRuIMM nodes in a combine - because matching i64 constants is difficult - does not apply for MVE, as there are not v2i64 shifts. Delaying the creation of the nodes can allow extra transforms on target independant shl/shr.	2022-05-04 22:12:09 +01:00
serge-sans-paille	7030654296	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `fa5a4e1b95` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D124847	2022-05-04 08:32:38 +02:00
Zhiyao Ma	bd606afe26	[ARM] Only update the successor edges for immediate predecessors of PrologueMBB When adjusting the function prologue for segmented stacks, only update the successor edges of the immediate predecessors of the original prologue. Differential Revision: https://reviews.llvm.org/D122959	2022-05-03 12:36:35 +01:00
David Penry	dcb77643e3	Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM Fixed "private field is not used" warning when compiled with clang. original commit: `28d09bbbc3` reverted in: `fa49021c68` ------ This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In additional, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-29 10:54:39 -07:00
David Penry	fa49021c68	Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM" This reverts commit `28d09bbbc3` while I investigate a buildbot failure.	2022-04-28 13:29:27 -07:00
David Penry	28d09bbbc3	[CodeGen][ARM] Enable Swing Module Scheduling for ARM This patch permits Swing Modulo Scheduling for ARM targets turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In additional, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-28 13:01:18 -07:00
Ties Stuij	051deb2d9d	[ARM] add Armv9 build attribute The build attribute number can be found in the Arm ABI addenda32 document: https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#335target-related-attributes Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D124090	2022-04-28 10:48:26 +01:00
Vasileios Porpodas	fa8a9fea47	Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit `6a9bbd9f20`. Code review: https://reviews.llvm.org/D124202	2022-04-26 14:02:40 -07:00
Vasileios Porpodas	6a9bbd9f20	Revert "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit `55ce296d6f`.	2022-04-26 11:25:26 -07:00
Vasileios Porpodas	55ce296d6f	[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost` Before this patch `Args` was used to pass a broadcat's arguments by SLP. This patch changes this. `Args` is now used for passing the operands of the shuffle. Differential Revision: https://reviews.llvm.org/D124202	2022-04-26 11:11:29 -07:00
Matt Arsenault	d7938b1a81	MachineModuleInfo: Move HasSplitStack handling to AsmPrinter This is used to emit one field in doFinalization for the module. We can accumulate this when emitting all individual functions directly in the AsmPrinter, rather than accumulating additional state in MachineModuleInfo. Move the special case behavior predicate into MachineFrameInfo to share it. This now promotes it to generic behavior. I'm assuming this is fine because no other target implements adjustForSegmentedStacks, or has tests using the split-stack attribute.	2022-04-20 10:54:29 -04:00
Ilia Diachkov	6c69427e88	[SPIR-V](3/6) Add MC layer, object file support, and InstPrinter The patch adds SPIRV-specific MC layer implementation, SPIRV object file support and SPIRVInstPrinter. Differential Revision: https://reviews.llvm.org/D116462 Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov, Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com> Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com> Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com> Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com> Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>	2022-04-20 01:10:25 +02:00
Craig Topper	65942554e2	[ARM] Add missing return to ARMTTIImpl::isLoweredToCall. I assume we meant to return the result of the call to BaseT::isLoweredToCall(F). This might not be a functional change in practice because it would still hit the default case in the switch and call BaseT::isLoweredToCall(F) at the end. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D123333	2022-04-07 12:52:54 -07:00
Ties Stuij	edeceb8647	remove dead code in parseRegisterList checking for ARM::RA_AUTH_CODE Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122577	2022-04-07 14:53:46 +01:00
Matt Arsenault	c4ea925f50	AtomicExpand: Change return type for shouldExpandAtomicStoreInIR Use the same enum as the other atomic instructions for consistency, in preparation for addition of another strategy. Introduce a new "Expand" option, since the store expansion does not use cmpxchg. Alternatively, the existing CmpXChg strategy could be renamed to Expand.	2022-04-06 22:34:04 -04:00
Matt Arsenault	0fb6856aff	ARM/GlobalISel: Get pointer type from value instead of getPointerSize Avoid using getPointerSize and pass through the original value type.	2022-03-31 16:46:23 -04:00
Chris Bieneman	9130e471fe	Add DXContainer DXIL is wrapped in a container format defined by the DirectX 11 specification. Codebases differ in calling this format either DXBC or DXILContainer. Since eventually we want to add support for DXBC as a target architecture and the format is used by DXBC and DXIL, I've termed it DXContainer here. Most of the changes in this patch are just adding cases to switch statements to address warnings. Reviewed By: pete Differential Revision: https://reviews.llvm.org/D122062	2022-03-29 14:34:23 -05:00
Shao-Ce SUN	662b9fa02c	[NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D122557	2022-03-29 09:53:24 +08:00
Kazu Hirata	6212871968	[Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC)	2022-03-27 22:22:37 -07:00
Maksim Panchenko	4ae9745af1	[Disassember][NFCI] Use strong type for instruction decoder All LLVM backends use MCDisassembler as a base class for their instruction decoders. Use "const MCDisassembler " for the decoder instead of "const void ". Remove unnecessary static casts. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D122245	2022-03-25 18:53:59 -07:00
Vasileios Porpodas	39aa202aff	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `e6ead19b77`.	2022-03-23 18:32:17 -07:00
Arthur Eubanks	e6ead19b77	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash." This reverts commit `27bd8f9492`. Causes crashes, see comments in D121973	2022-03-23 10:57:45 -07:00
Vasileios Porpodas	27bd8f9492	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `f7d7d2a08d`.	2022-03-22 16:41:55 -07:00
Arthur Eubanks	f7d7d2a08d	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads."" This reverts commit `79613185d3`. Causes crashes, see comments in https://reviews.llvm.org/D121973.	2022-03-22 13:33:49 -07:00
Craig Topper	9b0f227d7b	[TableGen][RISCV] Add InstAliases with zero_reg to cover unmasked vnot.v, vncvt.x.x.w, vneg.v, etc. The mask being NoRegister prevented the existing aliases from matching since NoRegister isn't in the VMV0 register class. To workaround this I've added new aliases that look for zero_reg. I had to motify tablegen to generate matching code for zero_reg. And as a consequence, I had to change the EmitPriority for an ARM alias that used zero_reg that started printing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D121496	2022-03-22 10:14:43 -07:00
Vasileios Porpodas	79613185d3	Recommit "[SLP] Fix lookahead operand reordering for splat loads." Original review: https://reviews.llvm.org/D121354 The original commit `9136145eb0` broke the build on several targets. Differential Revision: https://reviews.llvm.org/D121973	2022-03-21 15:57:32 -07:00
Eli Friedman	ddca66622c	[ARM] Fix shouldExpandAtomicLoadInIR for subtargets without ldrexd. Regression from 2f497ec3; we should not try to generate ldrexd on targets that don't have it. Also, while I'm here, fix shouldExpandAtomicStoreInIR, for consistency. That doesn't really have any practical effect, though. On Thumb targets where we need to use __sync_* libcalls, there is no libcall for stores, so SelectionDAG calls __sync_lock_test_and_set_8 anyway.	2022-03-18 15:54:38 -07:00
Eli Friedman	2f497ec3a0	[ARM] Fix ARM backend to correctly use atomic expansion routines. Without this patch, clang would generate calls to __sync_* routines on targets where it does not make sense; we can't assume the routines exist on unknown targets. Linux has special implementations of the routines that work on old ARM targets; other targets have no such routines. In general, atomics operations which aren't natively supported should go through libatomic (__atomic_) APIs, which can support arbitrary atomics through locks. ARM targets older than v6, where this patch makes a difference, are rare in practice, but not completely extinct. See, for example, discussion on D116088. This also affects Cortex-M0, but I don't think __sync_ routines actually exist in any Cortex-M0 libraries. So in practice this just leads to a slightly different linker error for those cases, I think. Mechanically, this patch does the following: - Ensures we run atomic expansion unconditionally; it never makes sense to completely skip it. - Fixes getMaxAtomicSizeInBitsSupported() so it returns an appropriate number on all ARM subtargets. - Fixes shouldExpandAtomicRMWInIR() and shouldExpandAtomicCmpXchgInIR() to correctly handle subtargets that don't have atomic instructions. Differential Revision: https://reviews.llvm.org/D120026	2022-03-18 12:43:57 -07:00
Tomas Matheson	831ab35b2f	[ARM][AArch64] generate subtarget feature flags Reland of D120906 after sanitizer failures. This patch aims to reduce a lot of the boilerplate around adding new subtarget features. From the SubtargetFeatures tablegen definitions, a series of calls to the macro GET_SUBTARGETINFO_MACRO are generated in ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use this macro to define bool members and the corresponding getter methods. Some naming inconsistencies have been fixed to allow this, and one unused member removed. This implementation only applies to boolean members; in future both BitVector and enum members could also be generated. Differential Revision: https://reviews.llvm.org/D120906	2022-03-18 16:07:00 +00:00
Tomas Matheson	62c481542e	Revert "[ARM][AArch64] generate subtarget feature flags" This reverts commit `dd8b0fecb9`.	2022-03-18 11:58:20 +00:00
Tomas Matheson	dd8b0fecb9	[ARM][AArch64] generate subtarget feature flags This patch aims to reduce a lot of the boilerplate around adding new subtarget features. From the SubtargetFeatures tablegen definitions, a series of calls to the macro GET_SUBTARGETINFO_MACRO are generated in ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use this macro to define bool members and the corresponding getter methods. Some naming inconsistencies have been fixed to allow this, and one unused member removed. This implementation only applies to boolean members; in future both BitVector and enum members could also be generated. Differential Revision: https://reviews.llvm.org/D120906	2022-03-18 11:48:20 +00:00
Sterling Augustine	bd38234d76	Reland "Use a stable-sort when combining bases" Differential Revision: https://reviews.llvm.org/D121922	2022-03-17 11:32:16 -07:00
Archibald Elliott	f496330f97	[ARM] Fix Decode of tsb csync There is a crash in the ARM backend when attempting to decode a "tsb csync" instruction using `llvm-objdump --triple=armv8.4a -d`. The crash was in `ARMMCInstrAnalysis::evaluateBranch` where the number of operands in the decoded instruction (0) did not match the number of operands in the instruction description (1). This is becuase `tsb csync` looks like it has an operand during assembly, but there is only one valid operand (csync), so there is no encoding space in the instruction for the operand, so the decoder never has a field to decode that represents `csync`. The fix is to add a custom decode method, which ensures that this instruction does have the right number of operands after decoding. This method merely adds the only available operand value, `ARM_TSB::CSYNC`. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D121479	2022-03-17 17:29:31 +00:00
Sterling Augustine	84810e1f74	Revert "Use a stable-sort when combining bases" This reverts commit `81417261a1`.	2022-03-17 09:54:13 -07:00
Sterling Augustine	81417261a1	Use a stable-sort when combining bases While experimenting with different algorithms for std::sort I discovered that combine-vmovdrr.ll fails if this sort is not stable. I suspect that the test is too stringent in its check--the resultant code looks functionally identical to me under both stable and unstable sorting, but a generic fix is quite a bit more difficult to implement. Thanks to scw@google.com for finding the proper fix. Differential Revision: https://reviews.llvm.org/D121870	2022-03-17 09:02:13 -07:00
Arthur Eubanks	2371c5a0e0	[OpaquePtr][ARM] Use elementtype on ldrex/ldaex/stlex/strex Includes verifier changes checking the elementtype, clang codegen changes to emit the elementtype, and ISel changes using the elementtype. Basically the same as D120527. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D121847	2022-03-16 14:11:53 -07:00
Shengchen Kan	37b378386e	[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments	2022-03-16 20:25:42 +08:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Fangrui Song	689c3a2552	[MC] Fix letter case of some MCSection member functions	2022-03-11 20:07:00 -08:00
Zhiyao Ma	adc26b4eae	[ARM] Fix 8-bit immediate overflow in the instruction of segmented stack prologue. It fixes the overflow of 8-bit immediate field in the emitted instruction that allocates large stacklet. For thumb2 targets, load large immediate by a pair of movw and movt instruction. For thumb1 and ARM targets, load large immediate by reading from literal pool. Differential Revision: https://reviews.llvm.org/D118545	2022-03-10 15:15:24 -08:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit `7f230feeea`. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Mircea Trofin	cb2160760e	[nfc][codegen] Move RegisterBank[Info].h under CodeGen This wraps up from D119053. The 2 headers are moved as described, fixed file headers and include guards, updated all files where the old paths were detected (simple grep through the repo), and `clang-format`-ed it all. Differential Revision: https://reviews.llvm.org/D119876	2022-03-01 21:53:25 -08:00
Jameson Nash	c4b1a63a1b	mark getTargetTransformInfo and getTargetIRAnalysis as const Seems like this can be const, since Passes shouldn't modify it. Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D120518	2022-02-25 14:30:44 -05:00
Craig Topper	c7d6448d03	[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable. Internally to DAGCombiner the SDValues were passed by non-const reference despite not being modified. They were then passed by const reference to TLI. This patch passes them by value which is consistent with the vast majority of code. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120420	2022-02-23 12:40:45 -08:00
David Green	a10789d6cd	[ARM] Recognize SSAT and USAT from SMIN/SMAX We have some recognition of SSAT and USAT from SELECT_CC at the moment. This extends the matching to SMIN/SMAX which can help catch more cases, either from min/max being the canonical form in instcombine or from some expanded nodes like fp_to_si_sat. Differential Revision: https://reviews.llvm.org/D119819	2022-02-23 08:55:54 +00:00
Egor Zhdan	3a1cb36237	Add DriverKit support This patch is the first in a series of patches to upstream the support for Apple's DriverKit. Once complete, it will allow targeting DriverKit platform with Clang similarly to AppleClang. This code was originally authored by JF Bastien. Differential Revision: https://reviews.llvm.org/D118046	2022-02-22 13:42:53 +00:00
Craig Topper	a6fb1bb306	[ARM] Remove unused lowerABS function. NFC This function was added in D49837, but no setOperationAction call was added with it. The code is equivalent to what is done by the default ExpandIntRes_ABS implementation when ADDCARRY is supported. Test case added to verify this. There was some existing coverage from Thumb2 MVE tests, but they started from vectors.	2022-02-20 22:43:23 -08:00
Craig Topper	440c4b705a	[SelectionDAG][RISCV][ARM][PowerPC][X86][WebAssembly] Change default abs expansion to use sra (X, size(X)-1); sub (xor (X, Y), Y). Previous we used sra (X, size(X)-1); xor (add (X, Y), Y). By placing sub at the end, we allow RISCV to combine sign_extend_inreg with it to form subw. Some X86 tests for Z - abs(X) seem to have improved as well. Other targets look to be a wash. I had to modify ARM's abs matching code to match from sub instead of xor. Maybe instead ISD::ABS should be made legal. I'll try that in parallel to this patch. This is an alternative to D119099 which was focused on RISCV only. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D119171	2022-02-20 21:11:23 -08:00
Simon Pilgrim	ae4bec20c4	[ARM] ARMAsmPrinter::emitAttributes - remove unnecessary nullptr test. The MMI pointer has already been dereferenced several times.	2022-02-18 10:36:40 +00:00
Jessica Paquette	6d58f4ab07	[MachineOutliner] NFC: Hide LRU-related stuff behind helper functions It's not particularly user-friendly to have to call `initLRU` everywhere. Also, it wasn't particularly great that the LRU for registers used in a sequence was also initialized by `initLRU`. This patch hides this stuff behind some helper functions: * `isAvailableAcrossAndOutOfSeq` * `isAnyUnavailableAcrossOrOutOfSeq` * `isAvailableInsideSeq` This allows the user to avoid calling `initLRU` explicitly. Also, it allows us to separate initializing the used-in-sequence LRU from the main LRU. Since both ARM and AArch64 check LR liveness in `insertOutlinedCall`, this refactor requires that we de-const the Candidate there. Some other quality-of-code improvements: * LRUs in outliner::Candidate now have more descriptive names * Use `Register` instead of `unsigned` in some places * Improve readability in some places by using ranges rather than `std::for_each` This is a preparatory commit for a larger compile time related change for the AArch64 outliner.	2022-02-16 11:39:07 -08:00
Shao-Ce SUN	2aed07e96c	[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter` Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846	2022-02-16 13:10:09 +08:00
David Green	ea6ebbcfb3	[ARM] MVE hadd and rhadd This uses the nodes from D106237 to add MVE HADD and RHADD lowering. Differential Revision: https://reviews.llvm.org/D106238	2022-02-14 11:55:40 +00:00
Yuanfang Chen	f927021410	Reland "[clang-cl] Support the /JMC flag" This relands commit `b380a31de0`. Restrict the tests to Windows only since the flag symbol hash depends on system-dependent path normalization.	2022-02-10 15:16:17 -08:00
Yuanfang Chen	b380a31de0	Revert "[clang-cl] Support the /JMC flag" This reverts commit `bd3a1de683`. Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8822587673277278177/overview	2022-02-10 14:17:37 -08:00
Yuanfang Chen	bd3a1de683	[clang-cl] Support the /JMC flag The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/ The `/JMC` flag enables these instrumentations: - Insert at the beginning of every function immediately after the prologue with a call to `void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag)`. The argument for `__CheckForDebuggerJustMyCode` is the address of a boolean global variable (the global variable is initialized to 1) with the name convention `__<hash>_<filename>`. All such global variables are placed in the `.msvcjmc` section. - The `<hash>` part of `__<hash>_<filename>` has a one-to-one mapping with a directory path. MSVC uses some unknown hashing function. Here I used DJB. - Add a dummy/empty COMDAT function `__JustMyCode_Default`. - Add `/alternatename:__CheckForDebuggerJustMyCode=__JustMyCode_Default` link option via ".drectve" section. This is to prevent failure in case `__CheckForDebuggerJustMyCode` is not provided during linking. Implementation: All the instrumentations are implemented in an IR codegen pass. The pass is placed immediately before CodeGenPrepare pass. This is to not interfere with mid-end optimizations and make the instrumentation target-independent (I'm still working on an ELF port in a separate patch). Reviewed By: hans Differential Revision: https://reviews.llvm.org/D118428	2022-02-10 10:26:30 -08:00

1 2 3 4 5 ...

11846 Commits