llvm-project

Commit Graph

Author	SHA1	Message	Date
Nemanja Ivanovic	de4e0195ae	[PowerPC] Add missed test case updates In commit `1674d9b6b2`, I missed adding the updates to existing test cases. This should bring the bots back to green.	2021-12-21 14:55:19 -06:00
Nemanja Ivanovic	1674d9b6b2	[PowerPC] Fix vector equality comparison for v2i64 pre-Power8 The current code makes the assumption that equality comparison can be performed with a word comparison instruction. While this is true if the entire 64-bit results are used, it does not generally work. It is possible that the low order words and high order words produce different results and a user of only one will get the wrong result. This patch adds an and of the result words so that each word has the result of the comparison of the entire doubleword that contains it. Differential revision: https://reviews.llvm.org/D115678	2021-12-21 14:28:41 -06:00
Mircea Trofin	09103807e7	[NFC][regalloc] Introduce the RegAllocEvictionAdvisorAnalysis This patch introduces the eviction analysis and the eviction advisor, the default implementation, and the scaffolding for introducing the other implementations of the advisor. Differential Revision: https://reviews.llvm.org/D115707	2021-12-16 17:56:46 -08:00
Florian Hahn	59a85a7a52	[PPC] Update test after `f5f421e0ee`.	2021-12-16 11:28:54 +00:00
Chen Zheng	d0022a7250	[PowerPC] copy byval parameter to caller's stack when needed Now we won't copy the byval parameter (bigger than 8 bytes) to caller's parameter save area. Instead, we will only copy the byval parameter when it can not be passed entirely in registers which means we have to use parameter save area according to the 64 bit SVR4 ABI. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D111485	2021-12-09 01:00:47 +00:00
Chen Zheng	c16c99ab03	[PowerPC] testcases for D111485; nfc	2021-12-08 02:22:00 +00:00
Chen Zheng	63cd1842a7	[PowerPC] use lvx + splat directly for aligned splat load Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D114062	2021-12-08 02:02:18 +00:00
Chen Zheng	d0a8f86667	[PowerPC][NFC] add cases for D114062	2021-12-07 01:12:01 +00:00
Qiu Chaofan	e3c2694da9	[PowerPC] Implement general back2back fusion Implement 'back-to-back' FX fusion according to Power10 User Manual '19.1.5.4 Fusion', not enabled by default. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D114345	2021-12-06 10:15:05 +08:00
Nemanja Ivanovic	d6c0ef7887	[PowerPC] Handle base load with reservation mnemonic The Power ISA defined l[bhwdq]arx as both base and extended mnemonics. The base mnemonic takes the EH bit as an operand and the extended mnemonic omits it, making it implicitly zero. The existing implementation only handles the base mnemonic when EH is 1 and internally produces a different instruction. There are historical reasons for this. This patch simply removes the limitation introduced by this implementation that disallows the base mnemonic with EH = 0 in the ASM parser. This resolves an issue that prevented some files in the Linux kernel from being built with -fintegrated-as. Also fix a crash if the value is not an integer immediate.	2021-12-03 09:13:02 -06:00
Simon Pilgrim	e85667a2fb	[PowerPC] Add non-constant fcopysign f128 test coverage As discussed on D114589 as the constant case gets affected by SimplifyDemandedBits a lot - the non-constant case currently falls back to copysignl libcalls	2021-12-03 12:04:06 +00:00
Amy Kwan	c27734c183	[PowerPC] Fix load/store selection infrastructure when load/store intrinsics are used on P10. The load/store infrastructure previously made an incorrect assumption that whenever it is used with a load/store intrinsic on Power10 - those intrinsics would automatically be the lxvp/stxvp intrinsics introduced in Power10. However, this is obviously not the case as there are multiple instances of pre-P10 intrinsics that use the refactored load/store implementation. This patch corrects this assumption, and produces the expected intrinsic on pre-P10. Differential Revision: https://reviews.llvm.org/D114978	2021-12-02 15:59:29 -06:00
Simon Pilgrim	6803d08c38	[DAG][PowerPC] Enable initial ISD::BITCAST SimplifyDemandedBits/SimplifyMultipleUseDemandedBits big-endian handling This patch begins extending handling for peeking through bitcast nodes to big-endian targets as well as the existing little-endian case. Differential Revision: https://reviews.llvm.org/D114676	2021-12-02 11:47:53 +00:00
Yousuf Ali	415e821a50	[PowerPC][AIX] Add toc-data support for 64-bit AIX small code model. The patch expands the existing 32-bit toc-data attribute support to 64-bit. In both 32-bit and 64-bit it is supported for small code model only. Differential Revision: https://reviews.llvm.org/D114654	2021-12-01 10:56:21 -05:00
Qiu Chaofan	15826eb437	[Legalizer] Avoid expansion to BR_CC if illegal Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110616	2021-12-01 12:22:21 +08:00
Tarique Islam	0850655da6	Big-endian version of vpermxor A big-endian version of vpermxor, named vpermxor_be, is added to LLVM and Clang. vpermxor_be can be called directly on both the little-endian and the big-endian platforms. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D114540	2021-11-30 22:49:55 +00:00
Philip Reames	8906a0fe64	[SCEVExpander] Drop poison generating flags when reusing instructions The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple such instructions (potentially with different flags), this is analogous to our need to drop flags when performing CSE. A trivial implementation would simply drop flags on any instruction we decided to reuse, and that would be correct. This patch is almost that trivial patch except that we preserve flags on the reused instruction when existing users would imply UB on overflow already. Adding new users can, at most, refine this program to one which doesn't execute UB which is valid. In practice, this fixes two conceptual problems with the previous code: 1) a binop could have been canonicalized into a form with different opcode or operands, or 2) the inbounds GEP case which was simply unhandled. On the test changes, most are pretty straightforward. We lose some flags (in some cases, they'd have been dropped on the next CSE pass anyway). The one that took me the longest to understand was the ashr-expansion test. What's happening there is that we're considering reuse of the mul, previously we disallowed it entirely, now we allow it with no flags. The surrounding diffs are all effects of generating the same mul with a different operand order, and then doing simple DCE. The loss of the inbounds is unfortunate, but even there, we can recover most of those once we actually treat branch-on-poison as immediate UB. Differential Revision: https://reviews.llvm.org/D112734	2021-11-29 15:23:34 -08:00
Simon Pilgrim	7ba64ab05a	[PowerPC] Regenerate ppc64-P9-vabsd.ll tests	2021-11-27 16:43:50 +00:00
Nikita Popov	2b160e95c8	Reland [SCEV] Fix and validate ValueExprMap/ExprValueMap consistency Relative to the previous landing attempt, this introduces an additional flag on forgetMemoizedResults() to not remove SCEVUnknown phis from the value map. The invalidation after BECount calculation wants to leave these alone and skips them in its own use-def walk, but we can still end up invalidating them via forgetMemoizedResults() if there is another IR value with the same SCEV. This is intended as a temporary workaround only, and the need for this should go away once the getBackedgeTakenInfo() invalidation is refactored in the spirit of D114263. ----- This adds validation for consistency of ValueExprMap and ExprValueMap, and fixes identified issues: * Addrec construction directly wrote to ValueExprMap in a few places, without updating ExprValueMap. Add a helper to ensure they stay consistent. The adjustment in forgetSymbolicName() explicitly drops the old value from the map, so that we don't rely on it being overwritten. * forgetMemoizedResultsImpl() was dropping the SCEV from ExprValueMap, but not dropping the corresponding entries from ValueExprMap. Differential Revision: https://reviews.llvm.org/D113349	2021-11-27 12:37:15 +01:00
Nikita Popov	719354a571	Revert "[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency" This reverts commit `bee8dcda1f`. Some sanitizer buildbots fail with: > Attempt to use a SCEVCouldNotCompute object! For example: https://lab.llvm.org/buildbot/#/builders/85/builds/7020/steps/9/logs/stdio	2021-11-26 22:18:23 +01:00
Nikita Popov	bee8dcda1f	[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency Relative to the previous landing attempt, this makes insertValueToMap() resilient against the value already being present in the map -- previously I only checked this for the createSimpleAffineAddRec() case, but the same issue can also occur for the general createNodeForPHI(). In both cases, the addrec may be constructed and added to the map in a recursive query trying to create said addrec. In this case, this happens due to the invalidation when the BE count is computed, which ends up clearing out the symbolic name as well. ----- This adds validation for consistency of ValueExprMap and ExprValueMap, and fixes identified issues: * Addrec construction directly wrote to ValueExprMap in a few places, without updating ExprValueMap. Add a helper to ensure they stay consistent. The adjustment in forgetSymbolicName() explicitly drops the old value from the map, so that we don't rely on it being overwritten. * forgetMemoizedResultsImpl() was dropping the SCEV from ExprValueMap, but not dropping the corresponding entries from ValueExprMap. Differential Revision: https://reviews.llvm.org/D113349	2021-11-26 20:57:47 +01:00
Simon Pilgrim	a25e08dd3c	[PowerPC] Regenerate fp128-bitcast-after-operation test checks	2021-11-25 13:39:57 +00:00
Nemanja Ivanovic	b7bf937bbe	[PowerPC] Provide XL-compatible vec_round implementation The XL implementation of vec_round for vector double uses "round-to-nearest, ties to even" just as the vector float version does. However clang and gcc use "round-to-nearest-away" for vector double and "round-to-nearest, ties to even" for vector float. The XL behaviour is implemented under the __XL_COMPAT_ALTIVEC__ macro similarly to other instances of incompatibility. Differential revision: https://reviews.llvm.org/D113642	2021-11-24 06:43:56 -06:00
Nemanja Ivanovic	c9cb8edc51	[PowerPC] Allow scalars for asm constraint "v" with VSX Similarly to what GCC does, we should allow scalars with the "v" constraint rather than introducing unnecessary new constraints for scalars in Altivec registers. Differential revision: https://reviews.llvm.org/D113635	2021-11-23 17:03:04 -06:00
Nemanja Ivanovic	c933c2eb33	[PowerPC] Add BCD add/sub/cmp builtins Support for builtins that use bcdadd./bcdsub. to add/subtract Binary Coded Decimal values as well as to determine validity and compare BCD values. Differential revision: https://reviews.llvm.org/D114088	2021-11-23 11:42:36 -06:00
Qiu Chaofan	59f4b3d308	[PowerPC] Implement more fusion types for Power10 This implements the rest of Power10 instruction fusion pairs, according to user manual, including 'wide immediate', 'load compare', 'zero move' and 'SHA3 assist'. Only 'SHA3 assist' is enabled by default. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D112912	2021-11-23 17:21:17 +08:00
Nikita Popov	62e9acad0a	Revert "[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency" This reverts commit `d633db8f9d`. Causes bootstrap assertion failures: https://lab.llvm.org/buildbot/#/builders/168/builds/3459/steps/9/logs/stdio	2021-11-22 15:47:33 +01:00
Nikita Popov	d633db8f9d	[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency This adds validation for consistency of ValueExprMap and ExprValueMap, and fixes identified issues: * Addrec construction directly wrote to ValueExprMap in a few places, without updating ExprValueMap. Add a helper to ensure they stay consistent. The adjustment in forgetSymbolicName() explicitly drops the old value from the map, so that we don't rely on it being overwritten. * forgetMemoizedResultsImpl() was dropping the SCEV from ExprValueMap, but not dropping the corresponding entries from ValueExprMap. Differential Revision: https://reviews.llvm.org/D113349	2021-11-22 15:27:25 +01:00
Simon Pilgrim	357d636289	[PowerPC] Regenerate rlwinm2.ll test	2021-11-21 18:33:28 +00:00
Stefan Pintilie	e9d12c2480	[PowerPC][NFC] Add a series of codegen tests for vector reductions. This patch only adds tests for PowerPC. The purpose of these tests is to track what code is generated for various vector reductions. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D113801	2021-11-19 15:03:01 -06:00
Victor Huang	86e77cdb08	[PowerPC] Add a flag for conditional trap optimization This patch adds a flag to enable/disable conditional trap optimization. Optimization disabled by default. Peer reviewed by: nemanjai	2021-11-19 10:24:54 -06:00
Simon Pilgrim	812e64ef0c	[DAG] MatchRotate - support rotate-by-constant of illegal types Patch to fix some of the regressions in D77804. By folding to rotate/funnel-shift by constant amounts for illegal types, we prevent SimplifyDemandedBits from destroying the patterns prematurely, allowing us to use the rotate/funnel-shift legalization that was added in D112443. Differential Revision: https://reviews.llvm.org/D113192	2021-11-19 11:12:04 +00:00
Victor Huang	40c65655af	[PowerPC] Remove the redundant terminator instruction when optimizing conditional trap This patch is a follow-up to `ae27ca9a67` that removes the redundant terminator when optimizing conditional traps. Peer reviewed by: nemanjai	2021-11-18 17:52:26 -06:00
Victor Huang	ae27ca9a67	[PowerPC] PPC backend optimization on conditional trap instructions This patch adds a PPC back end optimization that analyzes the arguments of a conditional trap instruction and does one of the following: 1. Delete it if it never traps 2. Replace it if it always traps 3. Otherwise keep it Reviewed By: nemanjai, amyk, PowerPC Differential revision: https://reviews.llvm.org/D111434	2021-11-16 13:11:57 -06:00
Kai Luo	c0da8a4e40	[CGP][PowerPC] Pre-commit test case for D113872. NFC.	2021-11-16 09:18:49 +00:00
Lei Huang	f50c6c1718	[PowerPC] Fix 32bit vector insert instructions for ISA3.1 The platform independent ISD::INSERT_VECTOR_ELT takes an element index, but vins* instructions take a byte index. Update 32bit td patterns for vector insert to handle the element index accordingly. Since vector insert with a non-constant index is supported in ISA3.1, there is no need to use the platform specific ISD node, PPCISD::VECINSERT. Update td pattern to directly use ISD::INSERT_VECTOR_ELT instead. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D113802	2021-11-15 14:36:39 -06:00
Chen Zheng	eec9ca622c	[PowerPC] guard update form prepare with non-const increment with option Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D113471	2021-11-15 02:16:46 +00:00
Victor Huang	18fe0a0d9e	[PowerPC] PPC backend optimization to lower int_ppc_tdw/int_ppc_tw intrinsics to TDI/TWI machine instructions This patch adds a backend optimization to match XL behavior for the two builtins __tdw and __tw: when the second input argument is an immediate, emit tdi/twi instructions instead of td/tw. Reviewed By: nemanjai, amyk, PowerPC Differential revision: https://reviews.llvm.org/D112285	2021-11-11 09:52:00 -06:00
Qiu Chaofan	5e9021c606	[NFC] Clean-up typos in PowerPC CodeGen tests	2021-11-11 15:42:08 +08:00
Qiu Chaofan	bc39ce9fa5	[NFC] Remove unnecessary check prefix of AIX test `9e9b0f4` introduced support for asm-full-reg-names on AIX. Now we can merge the test check prefix.	2021-11-11 13:27:42 +08:00
Nemanja Ivanovic	5840f7197d	[PowerPC] Respect rounding mode in the back end Currently, the floating point instructions that depend on rounding mode are correctly marked in the PPC back end with an implicit use of the RM register. Similarly, instructions that explicitly define the register are marked with an implicit def of the same register. So for the most part, RM-using code won't be moved across RM-setting instructions. However, calls are not marked as RM-setting instructions so code can be moved across calls. This is generally desired, but so is the ability to turn off this behaviour with an appropriate option - and -frounding-math really should be that option. This patch provides a set of call instructions (for direct and indirect calls) that are marked with an implicit def of the RM register. These will be used for calls that are marked with the strictfp attribute. Differential revision: https://reviews.llvm.org/D111433	2021-11-10 08:19:58 -06:00
Qiu Chaofan	9b5e2b5261	[PowerPC] Implement basic macro fusion in Power10 Including basic fusion types around arithmetic and logical instructions. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D111693	2021-11-08 17:23:56 +08:00
Chen Zheng	50acbbe3cd	[AsmPrinter][ORE] use correct opcode name Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D113173	2021-11-08 01:51:24 +00:00
Chen Zheng	c7d27f90e7	[ORE][AsmPrinter] add testcase for D113173; NFC	2021-11-08 01:47:22 +00:00
Alfredo Dal'Ava Junior	1cb9f37a17	[FreeBSD] Do not mark __stack_chk_guard as dso_local This symbol is defined in libc.so so it is definitely not DSO-Local. Marking it as such causes problems on some platforms (such as PowerPC). Differential revision: https://reviews.llvm.org/D109090	2021-11-05 07:29:50 -05:00
Chen Zheng	fed2889f07	[PowerPC] use correct selection for v16i8/v8i16 splat load Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D113236	2021-11-05 10:04:03 +00:00
Qiu Chaofan	5fd406e254	[PowerPC] Add intrinsic to convert between ppc_fp128 and fp128 ppc_fp128 and fp128 are both 128-bit floating point types. However, we can't do conversion between them now, since trunc/ext are not allowed for same-size fp types. This patch adds two new intrinsics: llvm.ppc.convert.f128.to.ppcf128 and llvm.ppc.convert.ppcf128.to.f128, to support such conversion. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D109421	2021-11-05 16:58:38 +08:00
Qiu Chaofan	a84118756c	[PowerPC] Enforce side effects to FPSCR read/set intrinsics Currently, FPSCR is not modeled, so in some early passes (such as early-cse), the read/set intrinsics to FPSCR may get incorrect simplification. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D112380	2021-11-04 11:45:32 +08:00
Qiu Chaofan	741aeda97d	[PowerPC] Implement longdouble pack/unpack builtins Implement two builtins to pack/unpack IBM extended long double float, according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D112055	2021-11-03 17:57:25 +08:00
Chen Zheng	5a8b196340	[PowerPC] handle more splat loads without stack operation This mostly improves splat loads code generation on Power7 Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D106555	2021-11-03 05:17:41 +00:00
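
The v2i64 equality fix described in `1674d9b6b2` can be modelled in a few lines. This is only a sketch of the idea in the commit message (compare words, then AND the two result words of each doubleword), not the instruction sequence the patch actually emits. The function names, the list-of-four-words representation, and the pairing of words (0, 1) and (2, 3) into doublewords are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF  # all-ones result of a 32-bit word compare


def word_eq(a_words, b_words):
    """Per-word equality: all-ones where the words match, zero otherwise."""
    return [MASK32 if x == y else 0 for x, y in zip(a_words, b_words)]


def v2i64_eq_via_words(a_words, b_words):
    """Model a v2i64 equality compare built from word compares.

    Each vector is four 32-bit words. Words (0, 1) are assumed to form one
    doubleword and words (2, 3) the other (hypothetical layout).
    """
    cmp = word_eq(a_words, b_words)
    # Swap the two words inside each doubleword, then AND with the original
    # compare so every word holds the result for its whole doubleword.
    swapped = [cmp[1], cmp[0], cmp[3], cmp[2]]
    return [c & s for c, s in zip(cmp, swapped)]


a = [1, 2, 3, 4]
b = [1, 9, 3, 4]  # first doubleword differs only in its second word
plain = word_eq(a, b)
fixed = v2i64_eq_via_words(a, b)
```

With these inputs the plain word compare still reports a match in word 0 of the first doubleword, which is the wrong answer for a user that reads only that word. After the AND, both words of that doubleword are zero.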

3231 Commits