llvm-project

Commit Graph

Author	SHA1	Message	Date
Amara Emerson	89e84dec18	[AArch64][GlobalISel] Fix fallbacks introduced for G_SITOFP in `8f283cafdd` If we have an integer->fp convert that has differing sizes, e.g. s32 to s64, then don't try to convert it to AArch64::G_SITOF since it won't select.	2021-01-15 01:10:49 -08:00
KAWASHIMA Takahiro	b54337070b	[AArch64] Add Fujitsu A64FX scheduling model Basic support of A64FX was added in D75594 but its scheduling model was missing. This commit adds the scheduling model. Also, this commit amends/adds some subtarget parameters of A64FX. The A64FX Microarchitecture Manual, which is source information of this commit, is on GitHub. https://github.com/fujitsu/A64FX/ Differential Revision: https://reviews.llvm.org/D93791	2021-01-15 17:14:04 +09:00
Kazu Hirata	7dc3575ef2	[llvm] Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-01-14 20:30:34 -08:00
Amara Emerson	8f283cafdd	[AArch64][GlobalISel] Add selection support for fpr bank source variants of G_SITOFP and G_UITOFP. In order to import patterns for these, we need to define new ops that can map to the AArch64ISD::[SU]ITOF nodes. We then transform fpr->fpr variants of the generic opcodes to these custom opcodes in preisel-lowering. We have to do it here and not the PostLegalizer combiner because this has to run after regbankselect. Differential Revision: https://reviews.llvm.org/D94702	2021-01-14 19:31:19 -08:00
Amara Emerson	036bc798f2	[AArch64][GlobalISel] Assign FPR banks to loads which are used by integer->float conversions. G_[US]ITOFP users of loads on AArch64 can operate on both gpr and fpr banks for scalars. Because of this, if their source is a load, then that load can be assigned to an fpr bank and therefore avoid having to do a cross bank copy via a gpr->fpr conversion. Differential Revision: https://reviews.llvm.org/D94701	2021-01-14 16:33:34 -08:00
Martin Storsjö	dbaa6a1858	Revert "[AArch64] Attempt to sink mul operands" This reverts commit `dda60035e9`. This commit caused failures to compile some sources, erroring out with "error in backend: Cannot select: t85: v2i32 = AArch64ISD::DUP t15", see https://reviews.llvm.org/D91271 for the full reproduction case.	2021-01-14 17:28:18 +02:00
Lucas Prates	2b1e25befe	[AArch64] Adding ACLE intrinsics for the LS64 extension This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`, and `__arm_st64bv0`. These are selected into the LS64 instructions LD64B, ST64B, ST64BV and ST64BV0, respectively. Based on patches written by Simon Tatham. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D93232	2021-01-14 09:43:58 +00:00
Jordan Rupprecht	752fafda3d	[NFC] Fix -Wsometimes-uninitialized After `49142991a6`, clang detects that MUL may be uninitialized. Set it to nullptr to suppress this check. Adding an assert to check that it is ultimately set fails two test cases. Since this is not a new issue, leave the assertion commented out until a code owner can fix the bug. The two failing test cases are noted in the assertion comment.	2021-01-13 20:32:38 -08:00
Kazu Hirata	4c1617dac8	[llvm] Use std::any_of (NFC)	2021-01-13 19:14:44 -08:00
Muhammad Asif Manzoor	4e8e888905	[AArch64][GlobalISel] Add support for FCONSTANT of FP128 type Add support for G_FCONSTANT of FP128 (Quadruple precision) type. It replaces the constant by emitting a load with a constant pool entry. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D94437	2021-01-13 10:46:10 -05:00
Nicholas Guy	dda60035e9	[AArch64] Attempt to sink mul operands Following on from D91255, this patch is responsible for sinking relevant mul operands to the same block so that umull/smull instructions can be correctly generated by the mul combine implemented in the aforementioned patch. Differential revision: https://reviews.llvm.org/D91271	2021-01-13 15:23:36 +00:00
Cullen Rhodes	ad85e39670	[SVE] Add ISel pattern for addvl Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D94504	2021-01-13 10:57:49 +00:00
Joe Ellis	3122c66aee	[AArch64][SVE] Remove chains of unnecessary SVE reinterpret intrinsics This commit extends SVEIntrinsicOpts::optimizeConvertFromSVBool to identify and remove longer chains of redundant SVE reinterpret intrinsics. For example, the following chain of redundant SVE reinterprets is now recognised as redundant: %a = <vscale x 2 x i1> %1 = <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool(<vscale x 2 x i1> %a) %2 = <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool(<vscale x 16 x i1> %1) %3 = <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool(<vscale x 4 x i1> %2) %4 = <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool(<vscale x 16 x i1> %3) %5 = <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool(<vscale x 4 x i1> %4) %6 = <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool(<vscale x 16 x i1> %5) ret <vscale x 2 x i1> %6 and will be replaced with: ret <vscale x 2 x i1> %a Eliminating these can sometimes mean emitting fewer unnecessary loads/stores when lowering to assembly. Differential Revision: https://reviews.llvm.org/D94074	2021-01-13 09:44:09 +00:00
Hsiangkai Wang	914e2f5a02	[NFC] Use generic name for scalable vector stack ID. Differential Revision: https://reviews.llvm.org/D94471	2021-01-13 10:57:43 +08:00
Martin Storsjö	d1fa7afc7a	[AArch64] [Windows] Properly add :lo12: reloc specifiers when generating assembly This makes sure that assembly output actually can be assembled. Set the correct MCExpr relocations specifier VK_PAGEOFF - and also set VK_PAGE consistently even though it's not visible in the assembly output. Differential Revision: https://reviews.llvm.org/D94365	2021-01-12 23:56:03 +02:00
Bjorn Pettersson	32c073acb3	[GlobalISel] Map extractelt to G_EXTRACT_VECTOR_ELT Before this patch there was generic mapping from vector_extract to G_EXTRACT_VECTOR_ELT added in SelectionDAGCompat.td. That mapping is now replaced by a mapping from extractelt instead. The reasoning is that vector_extract is marked as deprecated, so it is assumed that a majority of targets will use extractelt and not vector_extract (and that the long term solution for all targets would be to use extractelt). Targets like AArch64 that still use vector_extract can add an additional mapping from the deprecated vector_extract as target specific tablegen definitions. Such a mapping is added for AArch64 in this patch to avoid breaking tests. When adding the extractelt => G_EXTRACT_VECTOR_ELT mapping we triggered some new code paths in GlobalISelEmitter, ending up in an assert when trying to import a pattern containing EXTRACT_SUBREG for ARM. Therefore this patch also adds a "failedImport" warning for that situation (instead of hitting the assert). Differential Revision: https://reviews.llvm.org/D93416	2021-01-11 21:53:56 +01:00
Kerry McLaughlin	c37f68a888	[SVE][CodeGen] Fix legalisation of floating-point masked gathers Changes in this patch: - When lowering floating-point masked gathers, cast the result of the gather back to the original type with reinterpret_cast before returning. - Added patterns for reinterpret_casts from integer to floating point, and concat_vector patterns for bfloat16. - Tests for various legalisation scenarios with floating point types. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D94171	2021-01-11 10:57:46 +00:00
Kazu Hirata	e3d3dbd339	[llvm] Ensure newlines at the end of files (NFC) This patch eliminates pesky "No newline at end of file" messages from git diff.	2021-01-10 09:24:57 -08:00
Mark Murray	7d4a8bc417	[AArch64] Add +flagm architecture option, allowing the v8.4a flag modification extension. Differential Revision: https://reviews.llvm.org/D94081	2021-01-08 13:21:12 +00:00
Mark Murray	af7cce2fa4	[AArch64] Add +pauth architecture option, allowing the v8.3a pointer authentication extension. Differential Revision: https://reviews.llvm.org/D94083	2021-01-08 13:21:11 +00:00
Nicholas Guy	ed23229a64	[AArch64] Fix crash caused by invalid vector element type Fixes a crash caused by D91255, when LLVMTy is null when calling changeExtendedVectorElementType. Differential Revision: https://reviews.llvm.org/D94234	2021-01-08 12:02:54 +00:00
David Sherwood	d1bf26fd94	[AArch64][SVE] Add lowering for llvm abs intrinsic Add functionality to permit lowering of the abs and neg intrinsics using the passthru variants. Differential Revision: https://reviews.llvm.org/D94160	2021-01-08 08:55:25 +00:00
Kazu Hirata	b934160aaa	[Target] Use llvm::find_if (NFC)	2021-01-07 20:29:36 -08:00
Cameron McInally	f4013359b3	[SVE] Add unpacked scalable floating point ZIP/UZP/TRN patterns Differential Revision: https://reviews.llvm.org/D94193	2021-01-07 09:56:53 -06:00
Simon Pilgrim	037b058e41	[AArch64] SVEIntrinsicOpts - use range loop and cast<> instead of dyn_cast<> for dereferenced pointer. NFCI. Don't directly dereference a dyn_cast<> - use cast<> so we assert for the correct type. Also, simplify the for loop to a range loop. Fixes clang static analyzer warning.	2021-01-07 14:21:55 +00:00
Caroline Concatto	01c190e907	[AArch64][CostModel]Fix gather scatter cost model This patch fixes a bug introduced in the patch: https://reviews.llvm.org/D93030 This patch pulls the test for scalable vector to be the first instruction to be checked. This avoids the Gather and Scatter cost model for AArch64 to compute the number of vector elements for something that is not a vector and therefore crashing.	2021-01-07 14:02:08 +00:00
Nicholas Guy	350247a93c	[AArch64] Rearrange mul(dup(sext/zext)) to mul(sext/zext(dup)) Performing this rearrangement allows for existing patterns to match cases where the vector may be built after an extend, instead of before. Differential Revision: https://reviews.llvm.org/D91255	2021-01-06 16:02:16 +00:00
Tomas Matheson	643e3c9076	[AArch64] Add BRB IALL and BRB INJ instructions BRB IALL: Invalidate the Branch Record Buffer BRB INJ: Branch Record Injection into the Branch Record Buffer Parser changes based on work by Simon Tatham. These are two-word mnemonics. The assembly parser works by special-casing the mnemonic in order to parse the second word as a plain identifier token. Reviewed by: MarkMurrayARM Differential Revision: https://reviews.llvm.org/D93899	2021-01-06 12:10:22 +00:00
David Green	a9b6440edd	[AArch64] Handle any extend whilst lowering addw/addl/subw/subl This adds an extra tablegen PatFrag, zanyext, which matches either any extend or zext and uses that in the aarch64 backend to handle any extends in addw/addl/subw/subl patterns. Differential Revision: https://reviews.llvm.org/D93833	2021-01-06 10:35:23 +00:00
David Green	78d8a821e2	[AArch64] Handle any extend whilst lowering mull Demanded bits may turn a sext or zext into an anyext if the top bits are not needed. This currently prevents the lowering to instructions like mull, addl and addw. This patch fixes the mull generation by keeping it simple and treating them like zextends. Differential Revision: https://reviews.llvm.org/D93832	2021-01-06 10:08:43 +00:00
Sander de Smalen	a7e3339f3b	[AArch64][SVE] Emit DWARF location expression for SVE stack objects. Extend PEI to emit a DWARF expression for StackOffsets that have a fixed and scalable component. This means the expression that needs to be added is either: <base> + offset or: <base> + offset + scalable_offset * scalereg where for SVE, the scale reg is the Vector Granule Dwarf register, which encodes the number of 64bit 'granules' in an SVE vector and which the debugger can evaluate at runtime. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D90020	2021-01-06 09:40:53 +00:00
Sander de Smalen	a9f5e4375b	[AArch64] Use faddp to implement fadd reductions. Custom-expand legal VECREDUCE_FADD SDNodes to benefit from pair-wise faddp instructions. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D59259	2021-01-06 09:36:51 +00:00
Kazu Hirata	cd088ba7e6	[llvm] Use llvm::lower_bound and llvm::upper_bound (NFC)	2021-01-05 21:15:59 -08:00
Christudasan Devadasan	d68458bd56	[GlobalISel] Base implementation for sret demotion. If the return values can't be lowered to registers SelectionDAG performs the sret demotion. This patch contains the basic implementation for the same in the GlobalISel pipeline. Furthermore, targets should bring relevant changes during lowerFormalArguments, lowerReturn and lowerCall to make use of this feature. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D92953	2021-01-06 10:30:50 +05:30
Bradley Smith	c73ae747cb	[AArch64][SVE] Add optimization to remove redundant ptest instructions Co-Authored-by: Graham Hunter <graham.hunter@arm.com> Co-Authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D93292	2021-01-05 15:28:36 +00:00
Paul Walker	eba6deab22	[SVE] Lower vector CTLZ, CTPOP and CTTZ operations. CTLZ and CTPOP are lowered to CLZ and CNT instructions respectively. CTTZ is not a native SVE operation but is instead lowered to: CTTZ(V) => CTLZ(BITREVERSE(V)) In the case of fixed-length support using SVE we also lower CTTZ operating on NEON sized vectors because of its reliance on BITREVERSE which is also lowered to SVE instructions at these lengths. Differential Revision: https://reviews.llvm.org/D93607	2021-01-05 10:42:35 +00:00
Kazu Hirata	eb198f4c3c	[llvm] Use llvm::any_of (NFC)	2021-01-04 11:42:47 -08:00
Caroline Concatto	060cfd9795	[AArch64][SVE]Add cost model for masked gather and scatter for scalable vector. A new TTI interface has been added 'Optional <unsigned>getMaxVScale' that returns the maximum vscale for a given target. When known getMaxVScale is used to compute the cost of masked gather scatter for scalable vector. Depends on D92094 Differential Revision: https://reviews.llvm.org/D93030	2021-01-04 13:59:58 +00:00
Florian Hahn	d38a0258a5	[AArch64] Add patterns for FCMLA*_indexed. This patch adds patterns for the indexed variants of FCMLA. Mostly based on a patch by Tim Northover. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D92947	2021-01-04 13:45:51 +00:00
Usman Nadeem	685c8b537a	[AARCH64] Improve accumulator forwarding for Cortex-A57 model The old CPU model only had MLA->MLA forwarding. I added some missing MUL->MLA read advances and a missing absolute diff accumulator read advance according to the Cortex A57 Software Optimization Guide. The patch improves performance in EEMBC rgbyiqv2 by about 6%-7% and spec2006/milc by 8% (repeated runs on multiple devices), causes no significant regressions (none in SPEC). Differential Revision: https://reviews.llvm.org/D92296	2021-01-04 10:58:43 +00:00
David Sherwood	a65092040a	[SVE] Fix inline assembly parsing crash This patch fixes a crash encountered when compiling this code: ... float16_t a; __asm__("fminv %h[a], %[b], %[c].h" : [a] "=r" (a) : [b] "Upl" (b), [c] "w" (c)) The issue here is when using the 'h' modifier for a register constraint 'r'. Differential Revision: https://reviews.llvm.org/D93537	2021-01-04 09:11:05 +00:00
Mark Murray	5abfeccf10	[ARM][AArch64] Add Cortex-A78C Support for Clang and LLVM This patch upstreams support for the Armv8-a Cortex-A78C processor for AArch64 and ARM. In detail: Adding cortex-a78c as cpu option for aarch64 and arm targets in clang Adding Cortex-A78C CPU name and ProcessorModel in llvm Details of the CPU can be found here: https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78c	2020-12-29 10:18:59 +00:00
Nikita Popov	fb77d95022	[AArch64] Fix legalization of i128 ctpop without neon If neon is disabled, LowerCTPOP will return SDValue() to indicate that normal legalization should be used. However, ReplaceNodeResults does not check for this and pushes the empty SDValue() onto the result vector, which will subsequently result in a crash. Differential Revision: https://reviews.llvm.org/D93825	2020-12-27 17:24:41 +01:00
Amara Emerson	e0721a0992	[AArch64][GlobalISel] Notify observer of mutated instruction for shift custom legalization. No test for this because it's a CSE verifier failure that's only exposed in a WIP patch for enabling CSE throughout the AArch64 GISel pipeline.	2020-12-25 00:31:47 -08:00
Kazu Hirata	d6ff5cf995	[Target] Use llvm::any_of (NFC)	2020-12-24 19:43:26 -08:00
Matt Arsenault	581d13f8ae	GlobalISel: Return APInt from getConstantVRegVal Returning int64_t was arbitrarily limiting for wide integer types, and the functions should handle the full generality of the IR. Also changes the full form which returns the originally defined vreg. Add another wrapper for the common case of just immediately converting to int64_t (arguably this would be useful for the full return value case as well). One possible issue with this change is some of the existing uses did break without conversion to getConstantVRegSExtVal, and it's possible some without adequate test coverage are now broken.	2020-12-22 22:23:58 -05:00
Paul Walker	8eec7294fe	[SVE] Lower vector BITREVERSE and BSWAP operations. These operations are lowered to RBIT and REVB instructions respectively. In the case of fixed-length support using SVE we also lower BITREVERSE operating on NEON sized vectors as this results in fewer instructions. Differential Revision: https://reviews.llvm.org/D93606	2020-12-22 16:49:50 +00:00
Kazu Hirata	966f1431de	[Target] Use llvm::erase_if (NFC)	2020-12-20 17:43:22 -08:00
Lucas Prates	1a9577bde1	[AArch64] Add support for ls64 to the .arch_extension asm directive This adds support for the 'ls64' AArch64 extension to the `.arch_extension` asm directive. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D92574	2020-12-18 15:55:55 +00:00
Tomas Matheson	fc712eb7aa	[AArch64] Fix Copy Elimination for negative values Redundant Copy Elimination was eliminating a MOVi32imm -1 when it determined that the value of the destination register is already -1. However, it didn't take into account that the MOVi32imm zeroes the upper 32 bits (which are FFFFFFFF) and therefore cannot be eliminated. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D93100	2020-12-18 13:30:46 +00:00
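
The identity quoted in commit eba6deab22, CTTZ(V) => CTLZ(BITREVERSE(V)), can be checked on a single lane with a small model. This is only an illustrative sketch, not LLVM code: the helper names (`bitreverse`, `ctlz`, `cttz_via_ctlz`, `cttz_reference`) and the 32-bit lane width are assumptions chosen for the example.

```python
BITS = 32  # assumed lane width for this illustration


def bitreverse(v, bits=BITS):
    """Reverse the order of the low `bits` bits of v."""
    v &= (1 << bits) - 1
    return int(format(v, f"0{bits}b")[::-1], 2)


def ctlz(v, bits=BITS):
    """Count leading zeros in a `bits`-wide value (returns `bits` for 0)."""
    v &= (1 << bits) - 1
    return bits - v.bit_length()


def cttz_via_ctlz(v, bits=BITS):
    """Count trailing zeros using the CTLZ(BITREVERSE(V)) rewrite."""
    return ctlz(bitreverse(v, bits), bits)


def cttz_reference(v, bits=BITS):
    """Direct trailing-zero count, used only to cross-check the rewrite."""
    v &= (1 << bits) - 1
    if v == 0:
        return bits
    # v & -v isolates the lowest set bit; its position is the answer.
    return (v & -v).bit_length() - 1


if __name__ == "__main__":
    for value in (0, 1, 8, 0x80000000, 0xFFFF0000):
        assert cttz_via_ctlz(value) == cttz_reference(value)
```

Reversing the bits turns the trailing zeros of the input into leading zeros, so a leading-zero count on the reversed value gives the trailing-zero count of the original.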


4816 Commits
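
The problem described in commit fc712eb7aa can be modelled in a few lines. On AArch64, a write to a 32-bit W register zeroes the upper 32 bits of the corresponding 64-bit X register, so a 32-bit move of -1 is not redundant when the full register already holds 64-bit -1. The sketch below is a simplified illustration, not the pass's actual logic; `mov_w_imm`, `looks_redundant_low32` and `is_redundant_full` are hypothetical names.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mov_w_imm(imm):
    """Model a 32-bit immediate move: the low 32 bits take the immediate
    and the upper 32 bits of the 64-bit register become zero."""
    return imm & MASK32


def looks_redundant_low32(x_before, imm):
    """Hypothetical check that compares only the low 32 bits (too weak)."""
    return (x_before & MASK32) == (imm & MASK32)


def is_redundant_full(x_before, imm):
    """The move is redundant only if the whole 64-bit register is unchanged."""
    return x_before == mov_w_imm(imm)


if __name__ == "__main__":
    x_before = -1 & MASK64  # register holds 64-bit -1 (all ones)
    print(hex(x_before), hex(mov_w_imm(-1)))
    print(looks_redundant_low32(x_before, -1), is_redundant_full(x_before, -1))
```

In this model the low 32 bits match before and after the move, but the upper 32 bits change from FFFFFFFF to zero, which is why the move has to be kept.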