llvm-project

Commit Graph

Author	SHA1	Message	Date
Igor Kudrin	2c14798ead	[ARM][llvm-objdump] Annotate PC-relative memory operands of VLDR instructions This extends D105979 and adds support for VLDR instructions. Differential Revision: https://reviews.llvm.org/D105980	2021-08-05 14:11:11 +07:00
Igor Kudrin	ddbe812bcc	[ARM][llvm-objdump] Annotate PC-relative memory operands This implements `MCInstrAnalysis::evaluateMemoryOperandAddress()` for Arm so that the disassembler can print the target address of memory operands that use PC+immediate addressing. Differential Revision: https://reviews.llvm.org/D105979	2021-08-05 14:11:11 +07:00
Daniel Egger	98c2e4115d	[ARM] Add lowering of uadd_sat to uq{add\|sub}8 and uq{add\|sub}16 This follow the lead of https://reviews.llvm.org/D68974 to add lowering of unsigned saturated addition/subtraction. Differential Revision: https://reviews.llvm.org/D105413	2021-07-11 15:58:11 +01:00
Eli Friedman	74909e4b6e	Rename MachineMemOperand::getOrdering -> getSuccessOrdering. Since this method can apply to cmpxchg operations, make sure it's clear what value we're actually retrieving. This will help ensure we don't accidentally ignore the failure ordering of cmpxchg in the future. We could potentially introduce a getOrdering() method on AtomicSDNode that asserts the operation isn't cmpxchg, but not sure that's worthwhile. Differential Revision: https://reviews.llvm.org/D103338	2021-06-21 16:49:27 -07:00
Tim Northover	d88f96dff3	ARM: support mandatory tail calls for tailcc & swifttailcc This adds support for callee-pop conventions to the ARM backend so that it can ensure a call marked "tail" is actually a tail call.	2021-05-28 11:10:51 +01:00
Tomas Matheson	9d86095ff8	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit `753185031d`.	2021-05-03 21:48:20 +01:00
Tomas Matheson	753185031d	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164 Originally submitted as `3338290c18`. Reverted in `c7df6b1223`.	2021-05-03 20:25:15 +01:00
Tomas Matheson	c7df6b1223	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit `3338290c18`. Broke expensive checks on debian.	2021-04-30 16:53:14 +01:00
Tomas Matheson	3338290c18	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164	2021-04-30 16:40:33 +01:00
David Green	8de7d8b2c2	[ARM] Recognize VIDUP from BUILDVECTORs of additions This adds a pattern to recognize VIDUP from BUILD_VECTOR of incrementing adds. This can come up from either geps or adds, and came up recently in D100550. We are just looking for a BUILD_VECTOR where each lane is an add of the first lane with N*i, where i is the lane and N is one of 1, 2, 4, or 8, supported by the VIDUP instruction. Differential Revision: https://reviews.llvm.org/D101263	2021-04-27 19:33:24 +01:00
Oliver Stannard	aac056c528	[objdump][ARM] Use correct offset when printing ARM/Thumb branch targets llvm-objdump only uses one MCInstrAnalysis object, so if ARM and Thumb code is mixed in one object, or if an object is disassembled without explicitly setting the triple to match the ISA used, then branch and call targets will be printed incorrectly. This could be fixed by creating two MCInstrAnalysis objects in llvm-objdump, like we currently do for SubtargetInfo. However, I don't think there's any reason we need two separate sub-classes of MCInstrAnalysis, so instead these can be merged into one, and the ISA determined by checking the opcode of the instruction. Differential revision: https://reviews.llvm.org/D97766	2021-03-04 11:15:57 +00:00
Kristof Beyls	df8ed39283	[ARM] harden-sls-blr: avoid r12 and lr in indirect calls. As a linker is allowed to clobber r12 on function calls, the code transformation that hardens indirect calls is not correct in case a linker does so. Similarly, the transformation is not correct when register lr is used. This patch makes sure that r12 or lr are not used for indirect calls when harden-sls-blr is enabled. Differential Revision: https://reviews.llvm.org/D92469	2020-12-19 12:39:59 +00:00
Kristof Beyls	195f44278c	[ARM] Implement harden-sls-retbr for ARM mode Some processors may speculatively execute the instructions immediately following indirect control flow, such as returns, indirect jumps and indirect function calls. To avoid a potential miss-speculatively executed gadget after these instructions leaking secrets through side channels, this pass places a speculation barrier immediately after every indirect control flow where control flow doesn't return to the next instruction, such as returns and indirect jumps, but not indirect function calls. Hardening of indirect function calls will be done in a later, independent patch. This patch is implementing the same functionality as the AArch64 counter part implemented in https://reviews.llvm.org/D81400. For AArch64, returns and indirect jumps only occur on RET and BR instructions and hence the function attribute to control the hardening is called "harden-sls-retbr" there. On AArch32, there is a much wider variety of instructions that can trigger an indirect unconditional control flow change. I've decided to stick with the name "harden-sls-retbr" as introduced for the corresponding AArch64 mitigation. This patch implements this for ARM mode. A future patch will extend this to also support Thumb mode. The inserted barriers are never on the correct, architectural execution path, and therefore performance overhead of this is expected to be low. To ensure these barriers are never on an architecturally executed path, when the harden-sls-retbr function attribute is present, indirect control flow is never conditionalized/predicated. On targets that implement that Armv8.0-SB Speculation Barrier extension, a single SB instruction is emitted that acts as a speculation barrier. On other targets, a DSB SYS followed by a ISB is emitted to act as a speculation barrier. These speculation barriers are implemented as pseudo instructions to avoid later passes to analyze them and potentially remove them. The mitigation is off by default and can be enabled by the harden-sls-retbr subtarget feature. Differential Revision: https://reviews.llvm.org/D92395	2020-12-19 11:42:39 +00:00
David Green	92205bf122	[ARM] Remove some dead code. NFC	2020-10-24 17:22:49 +01:00
Meera Nakrani	675431b987	[ARM] Added more patterns to generate SSAT/USAT with shift Added patterns to generate an SSAT or USAT with shift for SSAT/USAT instructions that are matched from IR patterns. Differential Revision: https://reviews.llvm.org/D88145	2020-09-28 14:50:19 +00:00
Meera Nakrani	20283ff491	[ARM] Generated SSAT and USAT instructions with shift Added patterns so that both SSAT and USAT instructions are generated with shifts. Added corresponding regression tests. Differential Review: https://reviews.llvm.org/D85120	2020-08-04 09:38:17 +00:00
Sjoerd Meijer	85342c27a3	[ARM] Optimize immediate selection Optimize some specific immediates selection by materializing them with sub/mvn instructions as opposed to loading them from the constant pool. Patch by Ben Shi, powerman1st@163.com. Differential Revision: https://reviews.llvm.org/D83745	2020-07-29 13:29:17 +01:00
David Green	f8abecf337	[ARM] Extra MVE select(binop) patterns This is very similar to 243970d03cace2, but handling a slightly different form of predicated operations. When starting with a pattern of the form select(p, BinOp(x, y), x), Instcombine will often transform this to BinOp(x, select(p, y, 0)), where 0 is the identity value of the binop (0 for adds/subs, 1 for muls, -1 for ands etc). This adds the patterns that transforms those back into predicated binary operations. There is also a very minor adjustment to tablegen null_frag in here, to allow it to also be recognized as a PatLeaf node, so that it can be used in MVE_TwoOpPattern to easily exclude the cases where we do not need the alternate transform. Differential Revision: https://reviews.llvm.org/D84091	2020-07-22 14:08:29 +01:00
David Green	76e0e1a55d	[ARM] VCVTT instruction selection We current extract and convert from a top lane of a f16 vector using a VMOVX;VCVTB pair. We can simplify that to use a single VCVTT. The pattern is mostly copied from a vector extract pattern, but produces a VCVTTHS f32 directly. This had to move some code around so that ARMInstrVFP had access to the required pattern frags that were previously part of ARMInstrNEON. Differential Revision: https://reviews.llvm.org/D81556	2020-06-26 08:58:55 +01:00
Stefan Agner	b7d41a11cd	[ARM] Make cp10 and cp11 usage a warning The ARM ARM considers p10/p11 valid arguments for MCR/MRC instructions. MRC instructions with p10 arguments are also used in kernel code which is shared for different architectures. Turn usage of p10/p11 to warnings for ARMv7/ARMv8-M. Reviewers: rengolin, olista01, t.p.northover, efriedma, psmith, simon_tatham Reviewed By: simon_tatham Subscribers: hiraditya, danielkiss, jcai19, tpimh, nickdesaulniers, peter.smith, javed.absar, kristof.beyls, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59733	2020-06-24 23:37:54 +02:00
Victor Campos	c010d4d195	[ARM] Improve codegen of volatile load/store of i64 Summary: Instead of generating two i32 instructions for each load or store of a volatile i64 value (two LDRs or STRs), now emit LDRD/STRD. These improvements cover architectures implementing ARMv5TE or Thumb-2. The code generation explicitly deviates from using the register-offset variant of LDRD/STRD. In this variant, the register allocated to the register-offset cannot be reused in any of the remaining operands. Such restriction seems to be non-trivial to implement in LLVM, thus it is left as a to-do. Differential Revision: https://reviews.llvm.org/D70072	2020-05-28 10:52:43 +01:00
Victor Campos	872ee78f65	Revert "[ARM] Improve codegen of volatile load/store of i64" This reverts commit `8a12553223`. A bug has been found when generating code for Thumb2. In some very specific cases, the prologue/epilogue emitter generates erroneous stack offsets for the new LDRD instructions that access the stack. This bug does not seem to be caused by the reverted patch though. Likely the latter has made an undiscovered issue emerge in the prologue/epilogue emission pass. Nevertheless, this reversion is necessary since it is blocking users of the ARM backend.	2020-05-22 11:01:57 +01:00
Momchil Velikov	bc2e572f51	Re-commit: [ARM] CMSE code generation This patch implements the final bits of CMSE code generation: * emit special linker symbols * restrict parameter passing to no use memory * emit BXNS and BLXNS instructions for returns from non-secure entry functions, and non-secure function calls, respectively * emit code to save/restore secure floating-point state around calls to non-secure functions * emit code to save/restore non-secure floating-pointy state upon entry to non-secure entry function, and return to non-secure state * emit code to clobber registers not used for arguments and returns * when switching to no-secure state Patch by Momchil Velikov, Bradley Smith, Javed Absar, David Green, possibly others. Differential Revision: https://reviews.llvm.org/D76518	2020-05-14 16:46:16 +01:00
David Green	6eee2d9b5b	[ARM] Convert VDUPLANE to VDUP under MVE Unlike Neon, MVE does not have a way of duplicating from a vector lane, so a VDUPLANE currently selects to a VDUP(move_from_lane(..)). This forces that to be done earlier as a dag combine to allow other folds to happen. It converts to a VDUP(EXTRACT). On FP16 this is then folded to a VGETLANEu to prevent it from creating a vmovx;vmovhr pair, using a single move_from_reg instead. Differential Revision: https://reviews.llvm.org/D79606	2020-05-09 18:58:13 +01:00
Momchil Velikov	fb18dffaeb	Revert "[ARM] CMSE code generation" This reverts commit `7cbbf89d23`. The regression tests fail with the expensive checks.	2020-05-05 19:05:40 +01:00
Momchil Velikov	7cbbf89d23	[ARM] CMSE code generation This patch implements the final bits of CMSE code generation: * emit special linker symbols * restrict parameter passing to not use memory * emit BXNS and BLXNS instructions for returns from non-secure entry functions, and non-secure function calls, respectively * emit code to save/restore secure floating-point state around calls to non-secure functions * emit code to save/restore non-secure floating-pointy state upon entry to non-secure entry function, and return to non-secure state * emit code to clobber registers not used for arguments and returns when switching to no-secure state Patch by Momchil Velikov, Bradley Smith, Javed Absar, David Green, possibly others. Differential Revision: https://reviews.llvm.org/D76518	2020-05-05 18:23:28 +01:00
Kazuaki Ishizaki	0312b9f550	[llvm] NFC: Fix trivial typo in rst and td files Differential Revision: https://reviews.llvm.org/D77469	2020-04-23 14:26:32 +09:00
David Green	fbd53ffc3a	[ARM] MVE VMULL patterns This adds MVE vmull patterns, which are conceptually the same as mul(vmovl, vmovl), and so the tablegen patterns follow the same structure. For i8 and i16 this is simple enough, but in the i32 version the multiply (in 64bits) is illegal, meaning we need to catch the pattern earlier in a dag fold. Because bitcasts are involved in the zext versions and the patterns are a little different in little and big endian. I have only added little endian support in this patch. Differential Revision: https://reviews.llvm.org/D76740	2020-04-02 10:57:40 +01:00
David Green	2c5f43f9dd	[ARM] Fix qdadd operand order qdadd is defined as sat(Rm + sat(2*Rn)). We had the Rm and Rn switched the wrong way around. Differential Revision: https://reviews.llvm.org/D77049	2020-03-31 10:11:36 +01:00
Stefan Agner	f87563661d	[MC][ARM] add implicit immediate form for ldrsbt/ldrht/ldrsht Add pseudo instructions for ldrsbt/ldrht/ldrsht with implicit immediate and add fall back C++ code to transform the instruction to the equivalent LDRSBTi/LDRHTi/LDRSHTi form. This is similar to how it has been done in commit `fb3950ec63` This fixes: https://bugs.llvm.org/show_bug.cgi?id=45070	2020-03-19 22:36:42 +01:00
Victor Campos	8a12553223	[ARM] Improve codegen of volatile load/store of i64 Summary: Instead of generating two i32 instructions for each load or store of a volatile i64 value (two LDRs or STRs), now emit LDRD/STRD. These improvements cover architectures implementing ARMv5TE or Thumb-2. The code generation explicitly deviates from using the register-offset variant of LDRD/STRD. In this variant, the register allocated to the register-offset cannot be reused in any of the remaining operands. Such restriction seems to be non-trivial to implement in LLVM, thus it is left as a to-do. Reviewers: dmgreen, efriedma, john.brawn, nickdesaulniers Reviewed By: efriedma, nickdesaulniers Subscribers: danielkiss, alanphipps, hans, nathanchance, nickdesaulniers, vvereschaka, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70072	2020-03-11 10:19:27 +00:00
Simon Tatham	9dcc1667ab	[ARM] Allow `ARMVectorRegCast` to match bitconverts too. (NFC) Summary: When we start putting instances of `ARMVectorRegCast` in complex isel patterns, it will be awkward that they're often turned into the more standard `bitconvert` in little-endian mode. We'd rather not have to write separate isel patterns for the two endiannesses, matching different but equivalent cast operations. This change aims to fix that awkwardness in advance, by turning the Tablegen record `ARMVectorRegCast` from a simple `SDNode` instance into a `PatFrags` that can match either kind of cast – with a predicate that prevents it matching a bitconvert in the big-endian case, where bitconvert isn't semantically identical. No existing code generation should be affected by this change, but it will enable the patterns introduced by D74336 to work in both endiannesses. Reviewers: dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74716	2020-02-18 09:34:50 +00:00
Mikhail Maltsev	dd4d093762	[ARM] Add initial support for Custom Datapath Extension (CDE) Summary: This patch adds assembly-level support for a new Arm M-profile architecture extension, Custom Datapath Extension (CDE). A brief description of the extension is available at https://developer.arm.com/architectures/instruction-sets/custom-instructions The latest specification for CDE is currently a beta release and is available at https://static.docs.arm.com/ddi0607/aa/DDI0607A_a_armv8m_arm_supplement_cde.pdf CDE allows chip vendors to add custom CPU instructions. The CDE instructions re-use the same encoding space as existing coprocessor instructions (such as MRC, MCR, CDP etc.). Each coprocessor in range cp0-cp7 can be configured as either general purpose (GCP) or custom datapath (CDEv1). This configuration is defined by the CPU vendor and is provided to LLVM using 8 subtarget features: cdecp0 ... cdecp7. The semantics of CDE instructions are implementation-defined, but the instructions are guaranteed to be pure (that is, they are stateless, they do not access memory or any registers except their explicit inputs/outputs). CDE requires the CPU to support at least Armv8.0-M mainline architecture. CDE includes 3 sets of instructions: * Instructions that operate on general purpose registers and NZCV flags * Instructions that operate on the S or D register file (require either FP or MVE extension) * Instructions that operate on the Q register file, require MVE The user-facing names that can be specified on the command line are the same as the 8 subtarget feature names. For example: $ clang -target arm-none-none-eabi -march=armv8m.main+cdecp0+cdecp3 tells the compiler that the coprocessors 0 and 3 are configured as CDEv1 and the remaining coprocessors are configured as GCP (which is the default). Reviewers: simon_tatham, ostannard, dmgreen, eli.friedman Reviewed By: simon_tatham Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D74044	2020-02-17 15:39:16 +00:00
David Green	9d4c597541	[ARM] Fix ReconstructShuffle for bigendian Simon pointed out that this function is doing a bitcast, which can be incorrect for big endian. That makes the lowering of VMOVN in MVE wrong, but the function is shared between Neon and MVE so both can be incorrect. This attempts to fix things by using the newly added VECTOR_REG_CAST instead of the BITCAST. As it may now be used on Neon, I've added the relevant patterns for it there too. I've also added a quick dag combine for it to remove them where possible. Differential Revision: https://reviews.llvm.org/D74485	2020-02-13 09:56:46 +00:00
Victor Campos	af2a384581	Revert "[ARM] Improve codegen of volatile load/store of i64" This reverts commit `60e0120c91`.	2020-02-08 13:18:45 +00:00
Simon Tatham	4321c6af28	[ARM,MVE] Support immediate vbicq,vorrq,vmvnq intrinsics. Summary: Immediate vmvnq is code-generated as a simple vector constant in IR, and left to the backend to recognize that it can be created with an MVE VMVN instruction. The predicated version is represented as a select between the input and the same constant, and I've added a Tablegen isel rule to turn that into a predicated VMVN. (That should be better than the previous VMVN + VPSEL: it's the same number of instructions but now it can fold into an adjacent VPT block.) The unpredicated forms of VBIC and VORR are done by enabling the same isel lowering as for NEON, recognizing appropriate immediates and rewriting them as ARMISD::VBICIMM / ARMISD::VORRIMM SDNodes, which I then instruction-select into the right MVE instructions (now that I've also reworked those instructions to use the same MC operand encoding). In order to do that, I had to promote the Tablegen SDNode instance `NEONvorrImm` to a general `ARMvorrImm` available in MVE as well, and similarly for `NEONvbicImm`. The predicated forms of VBIC and VORR are represented as a vector select between the original input vector and the output of the unpredicated operation. The main convenience of this is that it still lets me use the existing isel lowering for VBICIMM/VORRIMM, and not have to write another copy of the operand encoding translation code. This intrinsic family is the first to use the `imm_simd` system I put into the MveEmitter tablegen backend. So, naturally, it showed up a bug or two (emitting bogus range checks and the like). Fixed those, and added a full set of tests for the permissible immediates in the existing Sema test. Also adjusted the isel pattern for `vmovlb.u8`, which stopped matching because lowering started turning its input into a VBICIMM. Now it recognizes the VBICIMM instead. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72934	2020-01-23 11:53:52 +00:00
Victor Campos	60e0120c91	[ARM] Improve codegen of volatile load/store of i64 Summary: Instead of generating two i32 instructions for each load or store of a volatile i64 value (two LDRs or STRs), now emit LDRD/STRD. These improvements cover architectures implementing ARMv5TE or Thumb-2. Reviewers: dmgreen, efriedma, john.brawn, nickdesaulniers Reviewed By: efriedma, nickdesaulniers Subscribers: nickdesaulniers, vvereschaka, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70072	2020-01-07 13:16:18 +00:00
Matt Arsenault	0d9f919b73	DAG: Use TargetConstant for FENCE operands	2020-01-02 17:16:10 -05:00
Victor Campos	2ff5a596cb	Revert "[ARM] Improve codegen of volatile load/store of i64" This reverts commit `bbcf1c3496`.	2019-12-20 18:08:24 +00:00
Victor Campos	bbcf1c3496	[ARM] Improve codegen of volatile load/store of i64 Summary: Instead of generating two i32 instructions for each load or store of a volatile i64 value (two LDRs or STRs), now emit LDRD/STRD. These improvements cover architectures implementing ARMv5TE or Thumb-2. Reviewers: dmgreen, efriedma, john.brawn Reviewed By: efriedma Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70072	2019-12-19 11:23:01 +00:00
Simon Tatham	25305a9311	[ARM][MVE] Add intrinsics for more immediate shifts. Summary: This fills in the remaining shift operations that take a single vector input and an immediate shift count: the `vqshl`, `vqshlu`, `vrshr` and `vshll[bt]` families. `vshll[bt]` (which shifts each input lane left into a double-width output lane) is the most interesting one. There are separate MC instruction ids for shifting by exactly the input lane width and shifting by less than that, because the instruction encoding is so completely different for the lane-width special case. So I had to write two sets of patterns to match based on the immediate shift count, which involved adding a ComplexPattern matcher to avoid the general-case pattern accidentally matching the special case too. For that family I've made sure to add an llc codegen test for both versions of each instruction. I'm experimenting with a new strategy for parametrising the isel patterns for all these instructions: adding extra fields to the relevant `Instruction` subclass itself, which are ignored by the Tablegen backends that generate the MC data, but can be retrieved from each instance of that instruction subclass when it's passed as a template parameter to the multiclass that generates its isel patterns. A nice effect of that is that I can fill in those informational fields using `let` blocks, rather than having to type them out once per instruction at `defm` time. (As a result, quite a lot of existing instruction `def`s are reindented by this patch, so it's clearer to read with whitespace changes ignored.) Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71458	2019-12-13 13:07:39 +00:00
David Green	469ee617a0	[ARM] Add ARMVCCThen to tablegen and make use of it. NFC Similar to the parent, this adds some constants to tablegen to replace the existing magic values. Differential Revision: https://reviews.llvm.org/D70825	2019-12-02 19:57:12 +00:00
David Green	a223a4d66f	[ARM] Add ARMCC constants to tablegen. NFC I got tired of looking at magic constants in tablegen files. This adds condition codes like ARMCCeq and makes use of them. I also removed the extra patterns for reverse condition codes from D70296, they should now be covered by the parent commit. Differential Revision: https://reviews.llvm.org/D70824	2019-12-02 19:57:12 +00:00
David Green	0765a4c288	[ARM] Extra qdadd patterns This adds some new qdadd patterns to go along with the other recently added qadd's. Differential Revision: https://reviews.llvm.org/D68999 llvm-svn: 375414	2019-10-21 14:06:49 +00:00
David Green	d7b77f2203	[ARM] Add qadd lowering from a sadd_sat This lowers a sadd_sat to a qadd by treating it as legal. Also adds qsub at the same time. The qadd instruction sets the q flag, but we already have many cases where we do not model this in llvm. Differential Revision: https://reviews.llvm.org/D68976 llvm-svn: 375411	2019-10-21 12:33:46 +00:00
David Green	fba831e791	[ARM] Lower sadd_sat to qadd8 and qadd16 Lower the target independent signed saturating intrinsics to qadd8 and qadd16. This custom lowers them from a sadd_sat, catching the node early before it is promoted. It also adds a QADD8b and QADD16b node to mean the bottom "lane" of a qadd8/qadd16, so that we can call demand bits on it to show that it does not use the upper bits. Also handles QSUB8 and QSUB16. Differential Revision: https://reviews.llvm.org/D68974 llvm-svn: 375402	2019-10-21 09:53:38 +00:00
Kristof Beyls	78bfe3ab94	[ARM] Generate vcmp instead of vcmpe Based on the discussion in http://lists.llvm.org/pipermail/llvm-dev/2019-October/135574.html, the conclusion was reached that the ARM backend should produce vcmp instead of vcmpe instructions by default, i.e. not be producing an Invalid Operation exception when either arguments in a floating point compare are quiet NaNs. In the future, after constrained floating point intrinsics for floating point compare have been introduced, vcmpe instructions probably should be produced for those intrinsics - depending on the exact semantics they'll be defined to have. This patch logically consists of the following parts: - Revert http://llvm.org/viewvc/llvm-project?rev=294945&view=rev and http://llvm.org/viewvc/llvm-project?rev=294968&view=rev, which implemented fine-tuning for when to produce vcmpe (i.e. not do it for equality comparisons). The complexity introduced by those patches isn't needed anymore if we just always produce vcmp instead. Maybe these patches need to be reintroduced again once support is needed to map potential LLVM-IR constrained floating point compare intrinsics to the ARM instruction set. - Simply select vcmp, instead of vcmpe, see simple changes in lib/Target/ARM/ARMInstrVFP.td - Adapt lots of tests that tested for vcmpe (instead of vcmp). For all of these test, the intent of what is tested for isn't related to whether the vcmp should produce an Invalid Operation exception or not. Fixes PR43374. Differential Revision: https://reviews.llvm.org/D68463 llvm-svn: 374025	2019-10-08 08:25:42 +00:00
David Green	120a5e9a74	[ARM] Cortex-M4 schedule additions This is an attempt to fill in some of the missing instructions from the Cortex-M4 schedule, and make it easier to do the same for other ARM cpus. - Some instructions are marked as hasNoSchedulingInfo as they are pseudos or otherwise do not require scheduling info - A lot of features have been marked not supported - Some WriteRes's have been added for cvt instructions. - Some extra instruction latencies have been added, notably by relaxing the regex for dsp instruction to catch more cases, and some fp instructions. This goes a long way to get the CompleteModel working for this CPU. It does not go far enough as to get all scheduling info for all output operands correct. Differential Revision: https://reviews.llvm.org/D67957 llvm-svn: 373163	2019-09-29 08:38:48 +00:00
Matt Arsenault	3ecab8e455	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338	2019-09-19 16:26:14 +00:00
Hans Wennborg	13bdae8541	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314	2019-09-19 12:33:07 +00:00

1 2 3 4 5 ...

1304 Commits