llvm-project

Commit Graph

Author	SHA1	Message	Date
David Green	3e0bf1c7a9	[CodeGen] Move instruction predicate verification to emitInstruction D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses. This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. The allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally. The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments. Recommitted with some fixes for the leftover MCII variables in release builds. Differential Revision: https://reviews.llvm.org/D129506	2022-07-14 09:33:28 +01:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
David Green	95252133e1	Revert "Move instruction predicate verification to emitInstruction" This reverts commit `e2fb8c0f4b` as it does not build for Release builds, and some buildbots are giving more warning than I saw locally. Reverting to fix those issues.	2022-07-13 13:28:11 +01:00
David Green	e2fb8c0f4b	Move instruction predicate verification to emitInstruction D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses. This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. The allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally. The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments. Differential Revision: https://reviews.llvm.org/D129506	2022-07-13 12:53:32 +01:00
John Brawn	ddd9485129	[MVE] Don't distribute add of vecreduce if it has more than one use If the add has more than one use then applying the transformation won't cause it to be removed, so we can end up applying it again causing an infinite loop. Differential Revision: https://reviews.llvm.org/D129361	2022-07-11 14:13:29 +01:00
David Sherwood	03fee6712a	[LoopVectorize] Add option to use active lane mask for loop control flow Currently, for vectorised loops that use the get.active.lane.mask intrinsic we only use the mask for predicated vector operations, such as masked loads and stores, etc. The loop itself is still controlled by comparing the canonical induction variable with the trip count. However, for some targets this is inefficient when it's cheap to use the mask itself to control the loop. This patch adds support for using the active lane mask for control flow by: 1. Generating the active lane mask for the next iteration of the vector loop, rather than the current one. If there are still any remaining iterations then at least the first bit of the mask will be set. 2. Extract the first bit of this mask and use this bit for the conditional branch. I did this by creating a new VPActiveLaneMaskPHIRecipe that sets up the initial PHI values in the vector loop pre-header. I've also made use of the new BranchOnCond VPInstruction for the final instruction in the loop region. Differential Revision: https://reviews.llvm.org/D125301	2022-07-11 13:46:55 +01:00
David Green	0a11ad2aa8	[ARM] Expand MVE i1 fptoint and inttofp if mve.fp is not present. If MVE.fp is not present then we cannot select the vector i1 fp operations to VCMP instructions, so need to expand.	2022-07-11 13:03:30 +01:00
David Green	438ffdb821	[ARM] Switch the costs of mve1beat and mve4beat These three subtarget features are meant to control where MVE instructions take 1 vs 2 vs 4 architectural beats. The mve1beat feature is described as "Model MVE instructions as a 1 beat per tick architecture", meaning MVE instruction will execute over 4 cycles. mve4beat is the opposite where the entire 4 beats of the MVE instruction execute in a single cycle. The costs for the two were backwards though, not matching the cycle counts like they should. This patch switches the costs on the two to bring them in-line with expectations. Differential Revision: https://reviews.llvm.org/D129141	2022-07-07 16:10:00 +01:00
Archibald Elliott	1666f09933	[ARM] Add Support for Cortex-M85 This patch adds support for Arm's Cortex-M85 CPU. The Cortex-M85 CPU is an Arm v8.1m Mainline CPU, with optional support for MVE and PACBTI, both of which are enabled by default. Parts have been coauthored by by Mark Murray, Alexandros Lamprineas and David Green. Differential Revision: https://reviews.llvm.org/D128415	2022-07-05 10:43:31 +01:00
Lucas Prates	70a5c52534	[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it. In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none\|aapcs\|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125094	2022-06-27 14:08:48 +01:00
Tim Northover	69ae441e4c	ARM: don't try to load function pointer before long call. Deciding to load an arbitrary global based on whether the entire module is being built for long calls is pretty clearly spurious, and in fact the existing indirect logic is sufficient.	2022-06-27 13:59:35 +01:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit `aa8feeefd3`.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
serge-sans-paille	27fd01d3f8	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `fb67d683db` detected a few regressions, fixing them. The impact on preprocessed output is negligible: -4k lines.	2022-06-22 18:50:39 +02:00
David Green	979400be78	[ARM] Fix MVE gather/scatter merged gep offsets This fixes the combining of constant vector GEP operands in the optimization of MVE gather/scatter addresses, when opaque pointers are enabled. As opaque pointers reduce the number of bitcasts between geps, more can be folded than before. This can cause problems if the index types are now different between the two geps. This fixes that by making sure each constant is scaled appropriately, which has the effect of transforming the geps to have a scale of 1, changing [r0, q0, uxtw #1] gathers to [r0, q0] with a larger q0. This helps use a simpler instruction that doesn't need the extra uxtw. Differential Revision: https://reviews.llvm.org/D127733	2022-06-22 11:04:22 +01:00
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
Kazu Hirata	0916d96d12	Don't use Optional::hasValue (NFC)	2022-06-20 20:17:57 -07:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
David Green	76f60931e2	[ARM] Allow distributing postinc with PHI uses Although this doesn't usually come up, we can have uses of the BaseAccess of a distributed postinc being a PHI. This doesn't need the usual dominance check as we will dominate along the phi edge, allowing us to still create a postinc load/store. Differential Revision: https://reviews.llvm.org/D127676	2022-06-20 10:08:21 +01:00
Kazu Hirata	437f960062	[llvm] Call *set::insert without checking membership first (NFC)	2022-06-18 10:22:05 -07:00
Kazu Hirata	b254d67160	[llvm] Call *set::insert without checking membership first (NFC)	2022-06-18 08:32:54 -07:00
Kazu Hirata	621f58e716	[Target, CodeGen] Use isImm(), isReg(), etc (NFC)	2022-06-18 07:41:04 -07:00
Corentin Jabot	b62e3a73e1	Replace to_hexString by touhexstr [NFC] LLVM had 2 methods to convert a number to an hexa string, this remove one of them. Differential Revision: https://reviews.llvm.org/D127958	2022-06-16 17:29:50 +02:00
Krasimir Georgiev	8f2ba36336	Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records AND [NFC][Thumb] Update frame-chain codegen test to use thumbv6m" This reverts commit `7625e01d66` and dependent `cbcce82ef6`. Commit `7625e01d66` causes some new codegen test failures under asan, e.g., CodeGen/ARM/execute-only.ll: https://lab.llvm.org/buildbot/#/builders/5/builds/24659/steps/15/logs/stdio.	2022-06-15 16:10:02 +02:00
David Green	1da6940275	[ARM] Add more opaque pointer gather/scatter tests. NFC Some of the newly added tests are incorrect, fixed in D127733.	2022-06-14 14:08:43 +01:00
Lucas Prates	7625e01d66	[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it. In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none\|aapcs\|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125094	2022-06-14 13:37:51 +01:00
Guillaume Chatelet	6725d80640	[NFC][Alignment] Use Align in shouldAlignPointerArgs	2022-06-14 10:56:36 +00:00
Guillaume Chatelet	01a8b89edb	[NFC][Alignment] Use getAlign in ARMFastISel	2022-06-13 13:36:36 +00:00
Simon Pilgrim	f97e15ef45	[ARM] Fix "local variable is initialized but not referenced" MSVX warning. NFC	2022-06-13 11:48:06 +01:00
Lucas Prates	33b9ad647e	Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records" Reverting change due to test failure. This reverts commit `6119053dab`.	2022-06-13 11:00:49 +01:00
Lucas Prates	6119053dab	[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records Currently the a AAPCS compliant frame record is not always created for functions when it should. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available trough it. In order to enable the use of AAPCS compliant frame records whilst keep backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none\|aapcs\|aapcs+leaf]`) for Aarch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125094	2022-06-13 10:21:06 +01:00
Lucas Prates	7775124b5c	[NFC][Thumb1] Use FrameDestroy flag to identify epilog instructions Simiarly to what's done on both ARM's and AArch64's frame lowering code, this updates Thumb1FrameLowering to use the FrameDestroy Machine Instruction flag to identify instructions inserted as part of the epilog instead of relying on assumptions about specific machine instructions. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D126285	2022-06-13 10:19:10 +01:00
Fangrui Song	adf4142f76	[MC] De-capitalize SwitchSection. NFC Add SwitchSection to return switchSection. The API will be removed soon.	2022-06-10 22:50:55 -07:00
Guillaume Chatelet	38637ee477	[clang] Add support for __builtin_memset_inline In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`. The idea is to get support from the compiler to easily write efficient memory function implementations. This patch could be split in two: - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics. - and another one for the Clang part providing the instrinsic as a builtin. Differential Revision: https://reviews.llvm.org/D126903	2022-06-10 13:13:59 +00:00
David Sherwood	007917b95c	[MVE] Fold fadd(select(..., +0.0)) into a predicated fadd We already have patterns for matching fadd(select(..., -0.0)), but an upcoming patch will lead to patterns using +0.0 as the identity instead of -0.0. I'm adding support for these patterns now to avoid any regressions for MVE. Differential Revision: https://reviews.llvm.org/D127275	2022-06-10 11:09:55 +01:00
Sam Parker	447c411fef	[ARM][ParallelDSP] Fix self reference bug Ensure we don't generate a smlad intrinsic that takes itself as an argument. Differential Revision: https://reviews.llvm.org/D127213	2022-06-09 09:10:57 +00:00
Matt Arsenault	cc5a1b3dd9	llvm-reduce: Add cloning of target MachineFunctionInfo MIR support is totally unusable for AMDGPU without this, since the set of reserved registers is set from fields here. Add a clone method to MachineFunctionInfo. This is a subtle variant of the copy constructor that is required if there are any MIR constructs that use pointers. Specifically, at minimum fields that reference MachineBasicBlocks or the MachineFunction need to be adjusted to the values in the new function.	2022-06-07 10:14:48 -04:00
Guillaume Chatelet	0788186182	[Alignment][NFC] Remove usage of MemSDNode::getAlignment I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times. Differential Revision: https://reviews.llvm.org/D126910	2022-06-07 13:52:20 +00:00
David Green	53be6ab25c	[ARM] Fix MVE getShuffleCost legalized type check The MVE shuffle costing for VREV instructions was making incorrect assumptions as to legalized vector types remaining as vectors. Add a quick check to ensure they are indeed vectors before attempting to get the number of elements.	2022-06-07 14:36:04 +01:00
Fangrui Song	15d82c62dc	[MC] De-capitalize MCStreamer functions Follow-up to `c031378ce0` . The class is mostly consistent now.	2022-06-07 00:31:02 -07:00
ksyx	3204272f0f	[ARM] Use llvm::dbgs() to print debug info (NFC) For consistency with other parts of code. Approved by efriedma in differential revision https://reviews.llvm.org/D127055	2022-06-06 16:43:16 -04:00
Martin Storsjö	98dc3e86fd	[ARM] [MinGW] Default to WinEH exception handling instead of Dwarf Switching this target to WinEH also seems to affect the `-windows-itanium` target. Differential Revision: https://reviews.llvm.org/D126870	2022-06-06 23:27:19 +03:00
Fangrui Song	77e300ffdf	[MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline"	2022-06-05 15:11:01 -07:00
Fangrui Song	8c911f8e9a	[ARM][MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline" The directive name is not useful because the next line replicates the error line which includes the directive. The prevailing style uses "expected newline".	2022-06-05 14:53:59 -07:00
Kazu Hirata	3b9707dbc0	[llvm] Convert for_each to range-based for loops (NFC)	2022-06-05 12:07:14 -07:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Martin Storsjö	485432f3c8	[ARM] Make a narrow tMOVi8 where possible in SEH prologues We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering. But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering. Differential Revision: https://reviews.llvm.org/D126949	2022-06-03 22:33:55 +03:00
Martin Storsjö	bd52506d24	[ARM] Make narrow push/pop in SEH prologues/epilogues where applicable We intentionally disable Thumb2SizeReduction for SEH prologues/epilogues, to avoid needing to guess what will happen with the instructions in a potential future pass in frame lowering. But for this specific case, where we know we can express the intent with a narrow instruction, change to that instruction form directly in frame lowering. Differential Revision: https://reviews.llvm.org/D126948	2022-06-03 22:33:55 +03:00
Martin Storsjö	40c937cba2	[ARM] Fix restoring stack for varargs with SEH split frame pointer push Previously, the "add sp, #12" ended up inserted after "bx lr". Differential Revision: https://reviews.llvm.org/D126872	2022-06-03 09:32:00 +03:00
Martin Storsjö	668bb96379	[ARM] Implement lowering of the sponentry intrinsic This is needed for SEH based setjmp on Windows. Differential Revision: https://reviews.llvm.org/D126763	2022-06-02 12:29:59 +03:00

1 2 3 4 5 ...

11820 Commits