llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	73f01e3df5	[TTI] Add VecPred argument to getCmpSelInstrCost. On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV. Reviewed By: dmgreen, RKSimon Differential Revision: https://reviews.llvm.org/D90070	2020-10-30 13:49:08 +00:00
Nemanja Ivanovic	5459d08795	[PowerPC] Fix single-use check and update chain users for ld-splat When converting a BUILD_VECTOR or VECTOR_SHUFFLE to a splatting load as of `1461fb6e78`, we inaccurately check for a single user of the load and neglect to update the users of the output chain of the original load. As a result, we can emit a new load when the original load is kept and the new load can be reordered after a dependent store. This patch fixes those two issues. Fixes https://bugs.llvm.org/show_bug.cgi?id=47891	2020-10-27 16:49:38 -05:00
Victor Huang	2e1a737f46	[PowerPC][PCRelative] Turn on TLS support for PCRel by default Turn on TLS support for PCRel by default and update the test cases. Differential Revision: https://reviews.llvm.org/D88738 Reviewed by: stefanp, kamaub	2020-10-27 13:58:44 -05:00
Chen Zheng	00e573cadb	[LSR] fix typo in comments and rename for a new added hook.	2020-10-26 22:29:22 -04:00
Amy Kwan	803cc3aff2	[PowerPC] Implement Set Boolean Condition Instructions This patch implements the set boolean condition instructions introduced in POWER10. The set boolean condition instructions (set[n]bc[r]) are used during the following situations: - sign/zero/any extending i1 to an i32 or i64, - reg+reg, reg+imm or floating point comparisons being sign/zero extended to i32 or i64, - spilling CR bits (using the setnbc instruction) Differential Revision: https://reviews.llvm.org/D87705	2020-10-26 18:42:51 -05:00
Baptiste Saleil	edb27912a3	[PowerPC] Add intrinsics for MMA This patch adds support for MMA intrinsics. Authored by: Baptiste Saleil Reviewed By: #powerpc, bsaleil, amyk Differential Revision: https://reviews.llvm.org/D89345	2020-10-23 13:16:02 -05:00
Victor Huang	7a74bb899a	[PowerPC] Fix the Predicates for enabling pcrelative-memops and PLXVP/PSTXVP definitions In this patch, Predicates fix added for the following: * disable prefix-instrs will disable pcrelative-memops * set two predicates PairedVectorMemops and PrefixInstrs for PLXVP/PSTXVP definitions Differential Revision: https://reviews.llvm.org/D89727 Reviewed by: amyk, steven.zhang	2020-10-23 11:33:20 -05:00
Chen Zheng	1e0b6c1df0	[LSR] ignore profitable chain when reg num is not major cost. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D89665	2020-10-23 09:35:48 -04:00
Nicholas Guy	9a2d2bedb7	Add "SkipDead" parameter to TargetInstrInfo::DefinesPredicate Some instructions may be removable through processes such as IfConversion, however DefinesPredicate can not be made aware of when this should be considered. This parameter allows DefinesPredicate to distinguish these removable instructions on a per-call basis, allowing for more fine-grained control from processes like ifConversion. Renames DefinesPredicate to ClobbersPredicate, to better reflect its purpose. Differential Revision: https://reviews.llvm.org/D88494	2020-10-21 11:52:47 +01:00
Amy Kwan	6a946fd06f	[DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead. MULH is often expanded on targets. This patch removes the isMulhCheaperThanMulShift hook and uses isOperationLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D80485	2020-10-19 12:23:04 -05:00
Kai Luo	354d3106c6	[PowerPC] Skip combining (uint_to_fp x) if x is not simple type Current powerpc64le backend hits ``` Combining: t7: f64 = uint_to_fp t6 llc: llvm-project/llvm/include/llvm/CodeGen/ValueTypes.h:291: llvm::MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed. ``` This patch fixes it by skipping combination if `t6` is not simple type. Fixed https://bugs.llvm.org/show_bug.cgi?id=47660. Reviewed By: #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D88388	2020-10-19 05:23:46 +00:00
Albion Fung	d30155feaa	[PowerPC] Implementation of 128-bit Binary Vector Rotate builtins This patch implements 128-bit Binary Vector Rotate builtins for PowerPC10. Differential Revision: https://reviews.llvm.org/D86819	2020-10-16 18:03:22 -04:00
David Sherwood	47f2dc7e5f	[SVE][NFC] Replace some TypeSize comparisons in non-AArch64 Targets In most of lib/Target we know that we are not dealing with scalable types so it's perfectly fine to replace TypeSize comparison operators with their fixed width equivalents, making use of getFixedSize() and so on. Differential Revision: https://reviews.llvm.org/D89101	2020-10-15 09:01:21 +01:00
Ahsan Saghir	f3202b30b8	[PowerPC] Add assemble disassemble intrinsics for MMA This patch adds support for assemble disassemble intrinsics for MMA. Reviewed By: bsaleil, #powerpc Differential Revision: https://reviews.llvm.org/D88739	2020-10-13 13:21:58 -05:00
Simon Pilgrim	2c3e4a21f9	[PowerPC] ReplaceNodeResults - bail on funnel shifts and let generic legalizers deal with it Fixes regression raised on D88834 for 32-bit triple + 64-bit cpu cases (which apparently is a thing).	2020-10-10 19:13:16 +01:00
Fangrui Song	2bd4730850	[PowerPC] Fix signed overflow in decomposeMulByConstant after D88201 Caught by multipliers LONG_MAX (after +1) and LONG_MIN (after -1) in CodeGen/PowerPC/mul-const-i64.ll	2020-10-09 18:29:12 -07:00
Esme-Yi	e9fd8823ba	[DAGCombiner] Add decomposition patterns for Mul-by-Imm. Summary: This patch is derived from D87384. In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns: ``` mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M)) mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M)) ``` The conversion will be triggered if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`. Moreover, the conversion benefits from an ILP improvement since the instructions are independent. A case with a sequence like the following also benefits since a shift instruction is saved. ``` res1 = a * 0x8800; res2 = a * 0x8080; ``` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88201	2020-10-09 08:51:40 +00:00
diggerlin	92bca12843	[AIX] add new option -mignore-xcoff-visibility SUMMARY: In the IBM compiler xlclang, there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html. We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default in AIX). With this option llvm does not emit any visibility attribute to ASM or XCOFF object file. The option only works on the AIX OS; on non-AIX OSes, using the option will report an unsupported option error. In AIX OS: 1.1 the option -mignore-xcoff-visibility is enabled by default if neither -fvisibility=* nor -mignore-xcoff-visibility is given explicitly in the clang command. 1.2 if -fvisibility=* is given explicitly but -mignore-xcoff-visibility is not, it will generate visibility attributes. 1.3 if both -fvisibility=* and -mignore-xcoff-visibility are given explicitly in the clang command, the option "-mignore-xcoff-visibility" wins and the visibility attribute is not emitted. The option -mignore-xcoff-visibility has no effect on the visibility attribute when compiling with the -emit-llvm option to generate LLVM IR. Reviewer: daltenty,Jason Liu Differential Revision: https://reviews.llvm.org/D87451	2020-10-08 09:34:58 -04:00
Chen Zheng	f05608707c	[PowerPC] implement target hook getTgtMemIntrinsic This patch can make pass recognize Powerpc related memory intrinsics. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D88373	2020-10-07 00:02:44 -04:00
Chen Zheng	0492dd91c4	[PowerPC] add more builtins for PPCTargetLowering::getTgtMemIntrinsic Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D88374	2020-10-06 23:48:33 -04:00
Craig Topper	1127662c6d	[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS. getNode handling for ISD:SETCC calls FoldSETCC which can canonicalize FP constants to the RHS. When this happens we should create the node with the FMF that was requested. By using FlagInserter when can ensure any calls to getNode/getSetcc during canonicalization will also get the flags. Differential Revision: https://reviews.llvm.org/D88063	2020-10-05 14:55:23 -07:00
Esme-Yi	e3475f5b91	[PowerPC] Add builtins for xvtdiv(dp\|sp) and xvtsqrt(dp\|sp). Summary: This patch implements the builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp. The instructions correspond to the following builtins: int vec_test_swdiv(vector double v1, vector double v2); int vec_test_swdivs(vector float v1, vector float v2); int vec_test_swsqrt(vector double v1); int vec_test_swsqrts(vector float v1); This patch depends on D88274, which fixes the bug in copying from CRRC to GPRC/G8RC. Reviewed By: steven.zhang, amyk Differential Revision: https://reviews.llvm.org/D88278	2020-10-04 16:24:20 +00:00
Esme-Yi	c4690b0077	[PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC. Summary: Currently, we copy the CRRC to GRC using a single MFOCRF, which copies the contents of CR field n (CR bits 4×n+32:4×n+35) into bits 4×n+32:4×n+35 of register GRC. That’s not correct because we expect the value of the destination register to equal the source, so we have to put the contents of the CR field in the lowest 4 bits. This patch adds a RLWINM after MFOCRF to achieve that. The problem came up when adding builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp, as posted in D88278. We need to move the outputs (in CR register) to GRC. However, the outputs of these instructions may not be in a fixed CR# register, so we can’t directly add a rotation instruction in the .td patterns, but need to wait until the CR register is determined. Then we confirmed this should be a bug in POST-RA PSEUDO PASS. Reviewed By: nemanjai, shchenz Differential Revision: https://reviews.llvm.org/D88274	2020-10-02 01:26:18 +00:00
jasonliu	78a9e62aa6	[XCOFF] Enable -fdata-sections on AIX Summary: A design decision worth noting: I've noticed a recent mailing list discussion about why string literals are not affected by -fdata-sections for ELF targets: http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html But on AIX, our linker cannot split mergeable strings like other targets. So I think it would make more sense for us to emit a separate csect for every mergeable string in -fdata-sections mode, as there might not be other ways for the linker to do garbage collection on unused mergeable strings. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D88339	2020-10-02 00:16:24 +00:00
Ahsan Saghir	66d2e3f495	[PowerPC] Add outer product instructions for MMA This patch adds outer product instructions for MMA, including related infrastructure, and their tests. Depends on D84968. Reviewed By: #powerpc, bsaleil, amyk Differential Revision: https://reviews.llvm.org/D88043	2020-09-30 18:06:49 -05:00
Zarko Todorovski	052c5bf40a	[PPC] Do not emit extswsli in 32BIT mode when using -mcpu=pwr9 It looks like in some circumstances when compiling with `-mcpu=pwr9` we create an EXTSWSLI node, which causes llc to fail. No such error occurs in pwr8 or lower. This occurs in 32BIT AIX and BE Linux. The cause seems to be that the default return in combineSHL is to create an EXTSWSLI node. Adding a check for whether we are in PPC64 before that fixes the issue. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D87046	2020-09-30 11:06:20 -04:00
Benjamin Kramer	2c394bd407	[PowerPC] Avoid unused variable warning in Release builds PPCFrameLowering.cpp:632:8: warning: unused variable 'isAIXABI' [-Wunused-variable]	2020-09-30 17:02:55 +02:00
Sean Fertile	dfb717da1f	[PowerPC] Remove support for VRSAVE save/restore/update. After removal of Darwin as a PowerPC subtarget, the VRSAVE save/restore/spill/update code is no longer needed by any supported subtarget, so remove it while keeping support for vrsave and related instruction aliases for inline asm. I've pre-committed tests to document the existing vrsave handling in relation to @llvm.eh.unwind.init and inline asm usage, as well as a test which shows a behaviour change on AIX related to returning vector type as we were wrongly emitting VRSAVE_UPDATE on AIX.	2020-09-30 10:05:53 -04:00
Baptiste Saleil	0156914275	[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types This patch legalizes the v256i1 and v512i1 types that will be used for MMA. It implements loads and stores of these types. v256i1 is a pair of VSX registers, so for this type, we load/store the two underlying registers. v512i1 is used for MMA accumulators. So in addition to loading and storing the 4 associated VSX registers, we generate instructions to prime (copy the VSX registers to the accumulator) after loading and unprime (copy the accumulator back to the VSX registers) before storing. This patch also adds the UACC register class that is necessary to implement the loads and stores. This class represents accumulators in their unprimed form and allows the distinction between primed and unprimed accumulators to avoid invalid copies of the VSX registers associated with primed accumulators. Differential Revision: https://reviews.llvm.org/D84968	2020-09-28 14:39:37 -05:00
Qiu Chaofan	40e86ca749	[PowerPC] Clean-up mayRaiseFPException bits According to POWER ISA, floating point instructions altering exception bits in FPSCR should be 'may raise FP exception'. (excluding those that read or write the whole FPSCR directly, like mffs/mtfsf) We need to model FPSCR well in future patches to handle the special case properly. Instructions added mayRaiseFPException: - fre(s)/frsqrte(s) - fmadd(s)/fmsub(s)/fnmadd(s)/fnmsub(s) - xscmpoqp/xscmpuqp/xscmpeqdp/xscmpgedp/xscmpgtdp - xscvdphp/xscvhpdp/xvcvhpsp/xvcvsphp/xsrqpxp - xsmaxcdp/xsmincdp/xsmaxjdp/xsminjdp Instructions removed mayRaiseFPException: - xstdivdp/xvtdiv(d\|s)p/xstsqrtdp/xvtsqrt(d\|s)p - xsabsdp/xsnabsdp/xvabs(d\|s)p/xvnabs(d\|s)p - xsnegdp/xscpsgndp/xvneg(d\|s)p/xvcpsgn(d\|s)p - xvcvsxwdp/xvcvuxwdp - xscvdpspn/xscvspdpn Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D87738	2020-09-28 18:22:12 +08:00
Amy Kwan	6f24774fc4	[NFC][PowerPC] Change PPCSubTarget (introduced from D87671) to Subtarget In D87671, it introduced PPCSubTarget in PPCISelDAGToDAG. This should have been Subtarget instead. This patch changes PPCSubTarget into Subtarget.	2020-09-26 17:53:51 -05:00
Baptiste Saleil	9b86b70094	[PowerPC] Add accumulator register class and instructions This patch adds the xxmfacc, xxmtacc and xxsetaccz instructions to manipulate accumulator registers. It also adds the ACC register class definition for the accumulator registers. Differential Revision: https://reviews.llvm.org/D84847	2020-09-25 12:25:13 -05:00
Amy Kwan	2e7117f847	[PowerPC] Implement the 128-bit vec_[all\|any]_[eq \| ne \| lt \| gt \| le \| ge] builtins in Clang/LLVM This patch implements the vec_[all\|any]_[eq \| ne \| lt \| gt \| le \| ge] builtins for vector signed/unsigned __int128. Differential Revision: https://reviews.llvm.org/D87910	2020-09-23 16:49:40 -04:00
Albion Fung	88cdbeab41	[PowerPC] Implement Vector signed/unsigned __int128 overloads for the comparison builtins This patch implements Vector signed/unsigned __int128 overloads for the comparison builtins. Differential Revision: https://reviews.llvm.org/D87804	2020-09-23 16:49:40 -04:00
Victor Huang	652a8f150d	[PowerPC][PCRelative] Thread Local Storage Support for Local Dynamic This patch is the initial support for the Local Dynamic Thread Local Storage model to produce code sequence and relocation correct to the ABI for the model when using PC relative memory operations. Differential Revision: https://reviews.llvm.org/D87721	2020-09-23 13:48:06 -05:00
Albion Fung	d7eb917a7c	[PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10. Differential: https://reviews.llvm.org/D87394#inline-815858	2020-09-23 01:18:14 -05:00
Stefanos Baziotis	a7873e5abc	Small fixes for "[LoopInfo] empty() -> isInnermost(), add isOutermost()"	2020-09-22 23:59:34 +03:00
Hubert Tong	b0f58aa116	[NFC] Replace tabs with spaces in PPCInstrPrefix.td	2020-09-22 14:23:32 -04:00
Amy Kwan	079757b551	[PowerPC] Implement Vector String Isolate Builtins in Clang/LLVM This patch implements the vector string isolate (predicate and non-predicate versions) builtins. The predicate builtins are custom selected within PPCISelDAGToDAG. Differential Revision: https://reviews.llvm.org/D87671	2020-09-22 11:31:44 -05:00
Amy Kwan	b3147058de	[PowerPC] Implement the 128-bit Vector Divide Extended Builtins in Clang/LLVM This patch implements the 128-bit vector divide extended builtins in Clang/LLVM. These builtins map to the vdivesq and vdiveuq instructions respectively. Differential Revision: https://reviews.llvm.org/D87729	2020-09-22 11:31:44 -05:00
Stefan Pintilie	7e78d89052	[PowerPC] Fix for compiler side issue in PCRelative Local Exec Stop combining loads and stores with PPCISD::ADD_TLS before we can merge the node with TLS_LOCAL_EXEC_MAT_ADDR. The issue is that TLS_LOCAL_EXEC_MAT_ADDR cannot be selected by itself and requires the previous ADD_TLS node that goes with it. However, we sometimes try to combine ADD_TLS with loads and stores that come after it. If this happens then the ADD_TLS is removed and TLS_LOCAL_EXEC_MAT_ADDR cannot be selected. While this bug fix will address the issue it may not be ideal from a performance perspective as we may be able to add patterns to combine TLS_LOCAL_EXEC_MAT_ADDR with ADD_TLS with the load and store that comes after it all in one. However, this is beyond the scope of this patch. Reviewed By: NeHuang Differential Revision: https://reviews.llvm.org/D88030	2020-09-22 08:28:06 -05:00
Meera Nakrani	a3d0dce260	[ARM][TTI] Prevents constants in a min(max) or max(min) pattern from being hoisted when in a loop Changes TTI function getIntImmCostInst to take an additional Instruction parameter, which enables us to be able to check it is part of a min(max())/max(min()) pattern that will match SSAT. We can then mark the constant used as free to prevent it being hoisted so SSAT can still be generated. Required minor changes in some non-ARM backends to allow for the optional parameter to be included. Differential Revision: https://reviews.llvm.org/D87457	2020-09-22 11:54:10 +00:00
Baptiste Saleil	bb82135538	[PowerPC] Remove unnecessary patterns and types These patterns and type uses were added by mistake by commit `1372e23c7d`	2020-09-21 16:08:54 -05:00
Baptiste Saleil	1372e23c7d	[PowerPC] Add vector pair load/store instructions and vector pair register class This patch adds support for the lxvp, lxvpx, plxvp, stxvp, stxvpx and pstxvp instructions in the PowerPC backend. These instructions allow loading and storing VSX register pairs. This patch also adds the VSRp register class definition needed for these instructions. Differential Revision: https://reviews.llvm.org/D84359	2020-09-21 10:27:47 -05:00
Qiu Chaofan	1d782c2987	[PowerPC] Pass nofpexcept flag to custom lowered constrained ops This is a follow-up of D86605. For strict DAG FP node, if its FP exception behavior metadata is ignore, it should have nofpexcept flag. But during custom lowering, this flag isn't passed down. This is also seen on X86 target. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87390	2020-09-21 10:44:25 +08:00
Amy Kwan	37e7673c21	[PowerPC] Implement Move to VSR Mask builtins in LLVM/Clang This patch implements the vec_gen[b\|h\|w\|d\|q]m function prototypes in altivec.h in order to utilize the move to VSR with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82725	2020-09-18 18:16:14 -05:00
Amy Kwan	6f3c0991bf	[PowerPC] Add Set Boolean Condition Instruction Definitions and MC Tests This patch adds the instruction definitions and assembly/disassembly tests for the set boolean condition instructions. This also includes the negative, and reverse variants of the instruction. Differential Revision: https://reviews.llvm.org/D86252	2020-09-17 18:20:54 -05:00
Amy Kwan	2c3bc918db	[PowerPC] Implement Vector Count Mask Bits builtins in LLVM/Clang This patch implements the vec_cntm function prototypes in altivec.h in order to utilize the vector count mask bits instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82726	2020-09-17 18:20:53 -05:00
Simon Pilgrim	f026812110	InstCombiner.h - remove unnecessary KnownBits.h include. NFCI. Move the include down to cpp files with an implicit dependency.	2020-09-17 14:28:42 +01:00
Qiu Chaofan	ebfbdebe96	[PowerPC] Fix store-fptoi combine of f128 on Power8 llc would crash for (store (fptosi-f128-i32)) when -mcpu=pwr8, we should not generate FP_TO_(S\|U)INT_IN_VSR for f128 types at this time. This patch fixes it. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D86686	2020-09-17 10:21:35 +08:00
Albion Fung	05aa997d51	[PowerPC] Implement __int128 vector divide operations This patch implements __int128 vector divide operations for ISA3.1. Differential Revision: https://reviews.llvm.org/D85453	2020-09-15 15:19:35 -04:00
Kamau Bridgeman	c0f199e566	[PowerPC] Implement Thread Local Storage Support for Local Exec This patch is the initial support for the Local Exec Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Patch by: Kamau Bridgeman Differential Revision: https://reviews.llvm.org/D83404	2020-09-14 14:16:28 -05:00
jasonliu	9868ea764f	[XCOFF][AIX] Handle TOC entries that could not be reached by positive range in small code model Summary: In small code model, AIX assembler could not deal with labels that could not be reached within the [-0x8000, 0x8000) range from TOC base. So when generating the assembly, we would need to help the assembler by subtracting an offset from the label to keep the actual value within [-0x8000, 0x8000). Reviewed By: hubert.reinterpretcast, Xiangling_L Differential Revision: https://reviews.llvm.org/D86879	2020-09-14 13:41:34 +00:00
David Blaikie	ce89eeee16	PPCInstrInfo: Fix readability-inconsistent-declaration-parameter-name clang-tidy warning Reduces the chance of confusion when calling the function with autocomplete (will show the more accurate/informative variable name), etc.	2020-09-13 13:08:17 -07:00
Qiu Chaofan	bec81dc67d	Reland "[PowerPC] Implement instruction clustering for stores" Commit `3c0b3250` introduced store fusion for PowerPC target, but it brought failure under UB sanitizer and was reverted. This patch fixes them.	2020-09-13 19:51:01 +08:00
QingShan Zhang	0680a3d56d	[Power10] Enable the heuristic for Power10 and switch the sched model with P9 Model Enable the pre-ra and post-ra scheduler strategy for Power10 as we want to customize the heuristic later. And switch the scheduler model with P9 model before P10 Model is available. The NoSchedModel is modelled as in-order cpu and the pre-ra scheduler is not bi-directional which will have big impact on the scheduler. Reviewed By: jji Differential Revision: https://reviews.llvm.org/D86865	2020-09-12 02:49:47 +00:00
QingShan Zhang	528554c39b	[PowerPC] Set the mayRaiseFPException for FCMPUS/FCMPUD From ISA, fcmpu will raise the Floating-Point Invalid Operation Exception (SNaN) if either of the operands is a Signaling NaN by setting the bit VXSNAN. But the instruction description didn't set the mayRaiseFPException which might have impact on the scheduling or some backend optimization. Reviewed By: qiucf Differential Revision: https://reviews.llvm.org/D83937	2020-09-12 02:42:22 +00:00
Kit Barton	009cd4e491	[PPC][GlobalISel] Add initial GlobalIsel infrastructure This adds the initial GlobalISel skeleton for PowerPC. It can only run ir-translator and legalizer for `ret void`. This is largely based on the initial GlobalISel patch for RISCV (https://reviews.llvm.org/D65219). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D83100	2020-09-10 11:58:01 -05:00
Qiu Chaofan	6afb279100	[PowerPC] [FPEnv] Disable strict FP mutation by default `22a0edd0` introduced a config IsStrictFPEnabled, which controls the strict floating point mutation (transforming some strict-fp operations into non-strict in ISel). This patch disables the mutation by default since we've finished PowerPC strict-fp enablement in backend. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87222	2020-09-10 13:28:09 +08:00
Qiu Chaofan	88ff4d2ca1	[PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering In the standard C library, both rint and nearbyint return the rounding result in the current rounding mode. But nearbyint never raises the inexact exception. On PowerPC, x(v\|s)r(d\|s)pic may modify FPSCR XX, raising the inexact exception. So we can't select constrained fnearbyint into xvrdpic. One exception here is xsrqpi, which will not raise the inexact exception, so fnearbyint f128 is okay here. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87220	2020-09-09 22:40:58 +08:00
Brad Smith	88b368a1c4	[PowerPC] Set setMaxAtomicSizeInBitsSupported appropriately for 32-bit PowerPC in PPCTargetLowering Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D86165	2020-09-08 21:21:14 -04:00
Craig Topper	b1e68f885b	[SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact. This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130. This required adding a SDNodeFlags to SelectionDAG::getSetCC. Now we manage to constant fold some stuff undefs during the initial getNode that we don't do in later DAG combines. Differential Revision: https://reviews.llvm.org/D87200	2020-09-08 15:27:21 -07:00
Qiu Chaofan	8d9c13f37d	Revert "[PowerPC] Implement instruction clustering for stores" This reverts commit `3c0b325023`, (along with `ea795304` and `bb39eb9e`) since it breaks test with UB sanitizer.	2020-09-08 17:24:08 +08:00
Qiu Chaofan	bb39eb9e7f	[PowerPC] Fix getMemOperandWithOffsetWidth Commit `3c0b3250` introduced memory cluster under pwr10 target, but a check for operands was unexpectedly removed. This adds it back to avoid regression.	2020-09-08 15:35:25 +08:00
Mikael Holmen	ea795304ec	[PowerPC] Add parentheses to silence gcc warning Without them, gcc 7.4 warns with ../lib/Target/PowerPC/PPCInstrInfo.cpp:2284:25: warning: suggest parentheses around '&&' within '\|\|' [-Wparentheses] BaseOp1.isFI() && ~~~~~~~~~~~~~~~^~ "Only base registers and frame indices are supported."); ~	2020-09-08 08:39:57 +02:00
Qiu Chaofan	3c0b325023	[PowerPC] Implement instruction clustering for stores On Power10, it's profitable to schedule some stores with adjacent target address together. This patch implements this feature. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D86754	2020-09-08 11:03:09 +08:00
Amy Kwan	efa57f9a7a	[PowerPC] Implement Vector Expand Mask builtins in LLVM/Clang This patch implements the vec_expandm function prototypes in altivec.h in order to utilize the vector expand with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82727	2020-09-06 17:13:21 -05:00
Qiu Chaofan	705271d9cd	[PowerPC] Expand constrained ppc_fp128 to i32 conversion Libcall __gcc_qtou is not available, which breaks some tests needing it. On PowerPC, we have code to manually expand the operation, this patch applies it to constrained conversion. To keep it strict-safe, it's using the algorithm similar to expandFP_TO_UINT. For constrained operations marking FP exception behavior as 'ignore', we should set the NoFPExcept flag. However, in some custom lowering the flag is missed. This should be fixed by future patches. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D86605	2020-09-05 13:16:20 +08:00
Nemanja Ivanovic	69289cc10f	[PowerPC] Fix broken kill flag after MI peephole The test case in https://bugs.llvm.org/show_bug.cgi?id=47373 exposed two bugs in the PPC back end. The first one was fixed in commit `2771407584` but the test case had to be added without -verify-machineinstrs due to the second bug. This commit fixes the use-after-kill that is left behind by the PPC MI peephole optimization.	2020-09-02 17:07:49 -05:00
Nemanja Ivanovic	2771407584	[PowerPC] Do not legalize vector FDIV without VSX Quite a while ago, we legalized these nodes as we added custom handling for reciprocal estimates in the back end. We have since moved to target-independent combines but neglected to turn off legalization. As a result, we can now get selection failures on non-VSX subtargets as evidenced in the listed PR. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47373	2020-09-02 16:03:36 -05:00
Albion Fung	5d1fe3f903	[PowerPC] Implemented Vector Multiply Builtins This patch implements the builtins for Vector Multiply Builtins (vmulxxd family of instructions), and adds the appropriate test cases for these builtins. The builtins utilize the vector multiply instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D83955	2020-09-02 14:16:21 -05:00
Amy Kwan	0c2d872d5d	[PowerPC] Implement builtins for xvcvspbf16 and xvcvbf16spn This patch adds the builtin implementation for the xvcvspbf16 and xvcvbf16spn instructions. Differential Revision: https://reviews.llvm.org/D86795	2020-09-01 17:16:43 -05:00
Sean Fertile	fecc27db11	[PowerPC][AIX] Update save/restore offset for frame and base pointers. General purpose registers 30 and 31 are handled differently when they are reserved as the base-pointer and frame-pointer respectively. This fixes the offset of their fixed-stack objects when there are fpr callee-saved registers. Differential Revision: https://reviews.llvm.org/D85850	2020-09-01 14:13:05 -04:00
Qiu Chaofan	29ae448595	[PowerPC] Handle STRICT_FSETCC(S) in more cases On -O0, i1 strict_fsetcc will be promoted to i32. We don't handle that in TD patterns. This patch fills logic in PPCISelDAGToDAG to handle more cases. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D86595	2020-09-02 00:33:21 +08:00
Amy Kwan	ca2227c1b3	[PowerPC] Implement instruction definitions/MC Tests for xvcvspbf16 and xvcvbf16spn This patch adds the td instruction definitions of the xvcvspbf16 and xvcvbf16spn instructions, along with their respective MC tests. Differential Revision: https://reviews.llvm.org/D86794	2020-09-01 10:59:43 -05:00
Craig Topper	aab90384a3	[Attributes] Add a method to check if an Attribute has AttrKind None. Use instead of hasAttribute(Attribute::None) There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None. So if we just want to check for None its sufficient to just check that pImpl is null. Which can even be done inline. This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations. Differential Revision: https://reviews.llvm.org/D86744	2020-08-28 13:23:45 -07:00
Albion Fung	331dcc43ea	[PowerPC] Implemented Vector Load with Zero and Signed Extend Builtins This patch implements the builtins for Vector Load with Zero and Signed Extend Builtins (lxvr_x for b, h, w, d), and adds the appropriate test cases for these builtins. The builtins utilize the vector load instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D82502#inline-797941	2020-08-28 11:28:58 -05:00
Kai Luo	cbea17568f	[PowerPC] PPCBoolRetToInt: Don't translate Constant's operands When collecting `i1` values via `findAllDefs`, ignore Constant's operands, since Constant's operands might not be `i1`. Fixes https://bugs.llvm.org/show_bug.cgi?id=46923 which causes ICE ``` llvm-project/llvm/lib/IR/Constants.cpp:1924: static llvm::Constant *llvm::ConstantExpr::getZExt(llvm::Constant *, llvm::Type *, bool): Assertion `C->getType()->getScalarSizeInBits() < Ty->getScalarSizeInBits()&& "SrcTy must be smaller than DestTy for ZExt!"' failed. ``` Differential Revision: https://reviews.llvm.org/D85007	2020-08-28 01:56:12 +00:00
Amy Kwan	76b0f99ea8	[PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang This patch implements the function prototypes vec_mulh and vec_dive in order to utilize the vector multiply high (vmulh[s\|u][w\|d]) and vector divide extended (vdive[s\|u][w\|d]) instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82609	2020-08-26 23:14:34 -05:00
jasonliu	413054400d	[XCOFF][AIX] Support relocation generation for large code model Summary: Support TOCU and TOCL relocation type for object file generation. Reviewed by: DiggerLin Differential Revision: https://reviews.llvm.org/D84549	2020-08-26 17:12:28 +00:00
Mikael Holmen	59e1fbe557	[PowerPC] Fix gcc warning [NFC] Without the fix gcc 7.4 warns with ../lib/Target/PowerPC/PPCAsmPrinter.cpp: In member function 'void {anonymous}::PPCAsmPrinter::EmitTlsCall(const llvm::MachineInstr*, llvm::MCSymbolRefExpr::VariantKind)': ../lib/Target/PowerPC/PPCAsmPrinter.cpp:525:53: warning: enumeral and non-enumeral type in conditional expression [-Wextra] MCInstBuilder(Subtarget->isPPC64() ? Opcode : PPC::BL_TLS) ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~	2020-08-25 12:58:38 +02:00
Nemanja Ivanovic	075a92dea1	[PowerPC] Do not use FISel for calls and TOC-based accesses with PC-Rel PC-Relative addressing introduces a fair bit of complexity for correctly eliminating TOC accesses. FastISel does not include any of that handling so we miscompile code with -mcpu=pwr10 -O0 if it includes an external call that FastISel does not handle followed by any of the following: Floating point constant materialization Materialization of a GlobalValue Call that FastISel does handle This patch switches to SDISel for any of the above. Differential revision: https://reviews.llvm.org/D86343	2020-08-24 16:51:44 -05:00
Nemanja Ivanovic	c485343c83	[PowerPC] Handle SUBFIC in reg+reg -> reg+imm transformation We initially missed the subtract-immediate in this transformation. This patch just adds that. Differential revision: https://reviews.llvm.org/D84659	2020-08-24 16:22:59 -05:00
Roland Froese	b6d7ed469f	[PowerPC] Extend custom lower of vector truncate to handle wider input Current custom lowering of truncate vector handles a source of up to 128 bits, but that only uses one of the two shuffle vector operands. Extend it to use both operands to handle 256 bit sources. Differential Revision: https://reviews.llvm.org/D68035	2020-08-24 15:33:43 -04:00
Baptiste Saleil	512e256c0d	[PowerPC] Add clang options to control MMA support This patch adds frontend and backend options to enable and disable the PowerPC MMA operations added in ISA 3.1. Instructions using these options will be added in subsequent patches. Differential Revision: https://reviews.llvm.org/D81442	2020-08-24 09:35:55 -05:00
Qiu Chaofan	fed6107dcb	[PowerPC] Allow constrained FP intrinsics in mightUseCTR We may meet Invalid CTR loop crash when there's constrained ops inside. This patch adds constrained FP intrinsics to the list so that CTR loop verification doesn't complain about it. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81924	2020-08-24 11:09:58 +08:00
Qiu Chaofan	41ba9d7723	[PowerPC] Support constrained vector fp/int conversion This patch makes these operations legal, and adds necessary codegen patterns. There's still some issue similar to D77033 for conversion from v1i128 type. But normal type tests synced in vector-constrained-fp-intrinsic are passed successfully. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D83654	2020-08-24 10:10:27 +08:00
Qiu Chaofan	a5b7b8cce0	[PowerPC] Support constrained scalar sitofp/uitofp This patch adds support for constrained scalar int to fp operations on PowerPC. Besides, this also fixes the FP exception bit of FCFID* instructions. Reviewed By: steven.zhang, uweigand Differential Revision: https://reviews.llvm.org/D81669	2020-08-22 02:10:29 +08:00
Kamau Bridgeman	365f861c45	[PowerPC][PCRelative] Thread Local Storage Support for Initial Exec This patch is the initial support for the Initial Exec Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D81947	2020-08-21 10:13:11 -05:00
Kang Zhang	95e18b2d9d	[PowerPC] Fix a typo for InstAlias of mfsprg D77531 has a typo for mfsprg, it should be mtsprg. This patch is to fix this typo.	2020-08-21 01:10:52 +00:00
Kamau Bridgeman	b74b80bb2d	[PowerPC][PCRelative] Thread Local Storage Support for General Dynamic This patch is the initial support for the General Dynamic Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Patch by: NeHuang Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D82315	2020-08-20 15:08:13 -05:00
Qiu Chaofan	131b3b9ed4	[PowerPC] Support constrained scalar fptosi/fptoui This patch adds support for constrained scalar fp to int operations on PowerPC. Besides, this fixes the FP exception bit of quad-precision convert & truncate instructions. Reviewed By: steven.zhang, uweigand Differential Revision: https://reviews.llvm.org/D81537	2020-08-20 13:29:43 +08:00
jasonliu	f48eced390	[XCOFF] emit .rename for .lcomm when necessary Summary: This is a follow up for D82481. For .lcomm directive, although it's not necessary to have .rename emitted, it's still desirable to do it so that we do not see internal 'Rename..' gets print out in symbol table. And we could have consistent naming between TC entry and .lcomm. And also have consistent naming between IR and final object file. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D86075	2020-08-18 15:32:45 +00:00
Amy Kwan	c7ec3a7e33	[PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang This patch implements the vec_extractm function prototypes in altivec.h in order to utilize the vector extract with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82675	2020-08-17 21:14:17 -05:00
Chen Zheng	4d52ebb9b9	[PowerPC] Make StartMI ignore COPY like instructions. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D85659	2020-08-17 02:12:30 -04:00
Craig Topper	c7a0b2684f	[X86][MC][Target] Initial backend support for a tune CPU to support -mtune This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line. This patch adds MC layer support for a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables. These feature lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all targets as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned. One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU. I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning. Differential Revision: https://reviews.llvm.org/D85165	2020-08-14 15:31:50 -07:00
Xiangling Liao	f759b4e43b	[AIX] Generate unique module id based on Pid and timestamp A module id, which is a part of sinit and sterm function names, must be unique. However, `getUniqueModuleId` will fail if there is no strong external symbol within a module. We turn to use Pid and timestamp when this happens. Differential Revision: https://reviews.llvm.org/D85527	2020-08-14 16:22:50 -04:00
Albion Fung	3136cbe29e	[PowerPC] Implement Vector Shift Builtins This patch implements the builtins for the vector shifts (shl, srl, sra), and adds the appropriate test cases for these builtins. The builtins utilize the vector shift instructions introduced within ISA 3.1. Differential Revision: https://reviews.llvm.org/D83338	2020-08-12 18:26:58 -05:00
diggerlin	e9ac1495e2	[AIX][XCOFF] change the operand of branch instruction from symbol name to qualified symbol name for function declarations SUMMARY: 1. in the patch, remove setting storageclass in function .getXCOFFSection and construct function of class MCSectionXCOFF there are XCOFF::StorageMappingClass MappingClass; XCOFF::SymbolType Type; XCOFF::StorageClass StorageClass; in the MCSectionXCOFF class, these attributes are only used in the XCOFFObjectWriter, (asm path does not need the StorageClass) we need to get the value of StorageClass, Type, MappingClass before we invoke the getXCOFFSection every time. actually, we can get the StorageClass of the MCSectionXCOFF from its delegated symbol. 2. we also change the operand of branch instruction from symbol name to qualified symbol name. for example change bl .foo extern .foo to bl .foo[PR] extern .foo[PR] 3. and if there is reference indirect call a function bar. we also add extern .bar[PR] Reviewers: Jason liu, Xiangling Liao Differential Revision: https://reviews.llvm.org/D84765	2020-08-11 15:26:19 -04:00
Kerry McLaughlin	85c7e89f3b	[CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize Changes the Offset arguments to both functions from int64_t to TypeSize & updates all uses of the functions to create the offset using TypeSize::Fixed() Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85220	2020-08-11 12:17:10 +01:00
jasonliu	20abff0481	[XCOFF][AIX] Use TE storage mapping class when large code model is enabled Summary: Use TE SMC instead of TC SMC in large code model mode, so that large code model TOC entries could get placed after all the small code model TOC entries, which reduces the chance of TOC overflow. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D85455	2020-08-10 19:52:10 +00:00
jasonliu	7866442b3f	[XCOFF] Adjust .rename emission sequence Summary: AIX assembler does not generate correct relocation when .rename appear between tc entry label and .tc directive. So only emit .rename after .tc/.comm or other linkage is emitted. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D85317	2020-08-10 14:48:24 +00:00
Xiangling Liao	6ef801aa6b	[AIX] Static init frontend recovery and backend support On the frontend side, this patch recovers AIX static init implementation to use the linkage type and function names Clang chooses for sinit related function. On the backend side, this patch sets correct linkage and function names on aliases created for sinit/sterm functions. Differential Revision: https://reviews.llvm.org/D84534	2020-08-10 10:10:49 -04:00
Stefan Pintilie	81883ca074	[PowerPC] Add option to control PCRel GOT indirect linker optimization Add a hidden option to the compiler to control the PC Relative GOT indirect linker optimization. If this option is set to false the compiler will no longer produce the relocations required by the linker to perform the optimization. Reviewed By: nemanjai, NeHuang, #powerpc Differential Revision: https://reviews.llvm.org/D85377	2020-08-10 09:07:17 -05:00
Qiu Chaofan	dbcfbffc7a	[PowerPC] Add intrinsic to read or set FPSCR register This patch introduces two intrinsics: llvm.ppc.setflm and llvm.ppc.readflm. They read from or write to FPSCR register (floating-point status & control) which contains rounding mode and exception status. To ensure correctness of program, we need to prevent FP operations from being moved across these intrinsics (mffs/mtfsf instruction), so here I set them as scheduling boundaries. We can relax such restriction if FPSCR is modeled well in the future. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84914	2020-08-10 18:27:45 +08:00
Arthur Eubanks	1bf4629f11	[PPC] Rename bool-ret-to-int -> ppc-bool-ret-to-int Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D85391	2020-08-07 11:27:05 -07:00
Amy Kwan	98eccec3ae	[PowerPC] Add Vector Extract/Expand/Count with Mask, Move to VSR Mask Instruction Definitions and MC Tests This patch adds the instruction definitions and assembly/disassembly tests for the following set of instructions: Vector Extract [byte \| half \| word \| doubleword \| quad] with mask Vector Expand [byte \| half \| word \| doubleword \| quad] with mask Move to VSR [byte \| byte immediate \| half \| word \| doubleword \| quad] with mask Vector Count Mask Bits [byte \| half \| word \| doubleword] Differential Revision: https://reviews.llvm.org/D83724	2020-08-07 11:02:08 -05:00
Kamau Bridgeman	d8c6d083c9	[PowerPC][PCRelative] Set TLS unsupported with PC relative memops Introduce a fatal error if any thread local storage code is compiled using pc relative memory operations as well as a hidden override option `-enable-ppc-pcrel-tls` so that this support can be incrementally added if possible. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D85448	2020-08-07 10:56:24 -05:00
biplmish	cce1b0e891	[PowerPC] Implement Vector Extract Low/High Order Builtins in LLVM/Clang This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D84622	2020-08-07 01:02:29 -05:00
QingShan Zhang	55de46f3b2	[PowerPC] Support constrained fp operation for setcc The constrained fp operation fcmp was added by https://reviews.llvm.org/D69281. This patch is trying to add the support for PowerPC backend. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D81727	2020-08-07 05:16:36 +00:00
Nemanja Ivanovic	14d726acd6	[PowerPC] Don't remove single swap between the load and store The swap removal pass looks to remove swaps when a loaded value is swapped, some number of lane-insensitive operations are performed and then the value is swapped again and stored. However, in a situation where we load the value, swap it and then store it without swapping again, the pass erroneously removes the single swap. The reason is that both checks in the same equivalence class: - load feeds a swap - swap feeds a store pass. However, there is no check that the two swaps are actually a single swap. This patch just fixes that. Differential revision: https://reviews.llvm.org/D84785	2020-08-04 10:38:15 -05:00
Jay Foad	28e322ea93	[PowerPC] Custom lowering for funnel shifts The custom lowering saves an instruction over the generic expansion, by taking advantage of the fact that PowerPC shift instructions are well defined in the shift-by-bitwidth case. Differential Revision: https://reviews.llvm.org/D83948	2020-08-04 16:30:49 +01:00
Qiu Chaofan	6a78a8dd37	[NFC] [PowerPC] Refactor fp/int conversion lowering For FP_TO_INT and INT_TO_FP lowering, we have direct-move and non-direct-move methods. But they share some conversion logic, so we can reduce redundant code by introducing new methods. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81818	2020-08-04 15:48:16 +08:00
Chen Zheng	45c46d180e	[PowerPC] mark r+i as legal address mode for vector type after pwr9 Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84735	2020-08-04 00:02:37 -04:00
Chen Zheng	ba955397ac	[SCEVExpander][PowerPC]clear scev rewriter before deleting instructions. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D85130	2020-08-03 20:36:08 -04:00
Christopher Tetreault	b43791e701	[SVE] Remove bad calls to VectorType::getNumElements() from PowerPC Differential Revision: https://reviews.llvm.org/D85154	2020-08-03 15:15:20 -07:00
Fangrui Song	40da58a04b	[MC] Default MCAsmBackend::mayNeedRelaxation() to false	2020-08-02 22:13:59 -07:00
QingShan Zhang	62e4644616	[NFC][PowerPC] Add a multiclass for fsetcc to define them in a uniform way This is a refactor patch to prepare for adding the support for strict-fsetcc in PowerPC backend. We want to move their definition into a uniform way so that, we could add the strict node easier. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D81712	2020-08-03 03:28:03 +00:00
Kazu Hirata	60434989e5	Use llvm::is_contained where appropriate (NFC) Use llvm::is_contained where appropriate (NFC) Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D85083	2020-08-01 21:51:06 -07:00
Justin Hibbits	7e9153e940	PowerPC: Don't lower SELECT_CC to PPCISD::FSEL on SPE SPE doesn't have a fsel instruction, so don't try to lower to it. This fixes a "Cannot select: tN: f64 = PPCISD::FSEL tX, tY, tZ" error. Reviewed By: #powerpc, lkail Differential Revision: https://reviews.llvm.org/D77773	2020-07-31 22:52:47 -05:00
Justin Hibbits	914dbf4808	PowerPC: Fix SPE extloadf32 handling. The patterns were incorrect copies from the FPU code, and are unnecessary, since there's no extended load for SPE. Just let LLVM itself do the work by marking it expand. Reviewed By: #powerpc, lkail Differential Revision: https://reviews.llvm.org/D78670	2020-07-31 22:42:57 -05:00
Albion Fung	93fd8dbdc2	[PowerPC] Add Vector String Isolate instruction definitions and MC Tests This patch implements the instruction definition and MC tests for the vector string isolate instructions. Differential Revision: https://reviews.llvm.org/D84197	2020-07-31 12:32:29 -05:00
Matt Arsenault	57bd64ff84	Support addrspacecast initializers with isNoopAddrSpaceCast Moves isNoopAddrSpaceCast to the TargetMachine. It logically belongs with the DataLayout.	2020-07-31 10:42:43 -04:00
QingShan Zhang	9b04fec002	[PowerPC] Retrieve the offset from load/store if it stores to stack slots Scheduler will try to retrieve the offset and base addr to determine if two loads/stores are disjoint memory access. PowerPC failed to handle this for frame index which will bring extra memory dependency for loads/stores. Reviewed By: jji Differential Revision: https://reviews.llvm.org/D84308	2020-07-31 07:08:20 +00:00
jasonliu	04dc9691eb	[XCOFF][AIX] Enable -ffunction-sections Summary: This patch implements -ffunction-sections on AIX. This patch focuses on assembly generation. Follow-on patch needs to handle: 1. -ffunction-sections implication for jump table. 2. Object file generation path and associated testing. Differential Revision: https://reviews.llvm.org/D83875	2020-07-30 13:30:01 +00:00
Simon Pilgrim	cc529285fd	VectorUtils.h - reduce unnecessary includes. NFC. Replace TargetLibraryInfo.h include with forward declaration and fix implicit dependencies. Reduce SmallSet.h include to SmallVector.h include.	2020-07-30 12:27:49 +01:00
Kang Zhang	a18953c1c0	[PowerPC] Fix RM operands for some instructions Summary: Some instructions have set the wrong [RM] flag, this patch is to fix it. Instructions x(v\|s)r(d\|s)pi[zmp]? and fri[npzm] use fixed rounding directions without referencing current rounding mode. Also, the SETRNDi, SETRND, BCLRn, MTFSFI, MTFSB0, MTFSB1, MTFSFb, MTFSFI, MTFSFI_rec, MTFSF, MTFSF_rec should also fix the RM flag. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D81360	2020-07-30 02:10:49 +00:00
Baptiste Saleil	7aaa85627b	[PowerPC] Add options to control paired vector memops support Adds frontend and backend options to enable and disable the PowerPC paired vector memory operations added in ISA 3.1. Instructions using these options will be added in subsequent patches. Differential Revision: https://reviews.llvm.org/D83722	2020-07-29 14:00:53 -05:00
Kang Zhang	802c043078	[PowerPC] Set v1i128 to expand for SETCC to avoid crash Summary: PPC only supports the instruction selection for v16i8, v8i16, v4i32, v2i64, v4f32 and v2f64 for ISD::SETCC, don't support the v1i128, so v1i128 for ISD::SETCC will crash. This patch is to set v1i128 to expand to avoid crash. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84238	2020-07-29 16:39:27 +00:00
David Green	60280e9818	[Analysis] TTI: Add CastContextHint for getCastInstrCost Currently, getCastInstrCost has limited information about the cast it's rating, often just the opcode and types. Sometimes there is a context instruction as well, but it isn't trustworthy: for instance, when the vectorizer is rating a plan, it calls getCastInstrCost with the old instructions when, in fact, it's trying to evaluate the cost of the instruction post-vectorization. Thus, the current system can get the cost of certain casts incorrect as the correct cost can vary greatly based on the context in which it's used. For example, if the vectorizer queries getCastInstrCost to evaluate the cost of a sext(load) with tail predication enabled, getCastInstrCost will think it's free most of the time, but it's not always free. On ARM MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar situations can come up with how masked loads can be extended when being split. To fix that, this patch adds a new parameter to getCastInstrCost to give it a hint about the context of the cast. It adds a CastContextHint enum which contains the type of the load/store being created by the vectorizer - one for each of the types it can produce. Original patch by Pierre van Houtryve Differential Revision: https://reviews.llvm.org/D79162	2020-07-29 13:32:53 +01:00
Kang Zhang	00046d789c	[PowerPC] Add Def CR1 for MTFSFI_rec and MTFSF_rec	2020-07-29 01:47:23 +00:00
jasonliu	f8ab66538c	[NFC][XCOFF] Use getFunctionEntryPointSymbol from TLOF to simplify logic Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D84693	2020-07-28 18:59:51 +00:00
Jinsong Ji	d28f86723f	Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `bf544fa1c3`. Fixed the typo in PPCInstrInfo.cpp.	2020-07-28 14:00:11 +00:00
Stefan Pintilie	97470897c4	[PowerPC] Split s34imm into two types Currently the instruction paddi always takes s34imm as the type for the 34 bit immediate. However, the PC Relative form of the instruction should not produce the same fixup as the non PC Relative form. This patch splits the s34imm type into s34imm and s34imm_pcrel so that two different fixups can be emitted. Reviewed By: nemanjai, #powerpc, kamaub Differential Revision: https://reviews.llvm.org/D83255	2020-07-28 05:55:56 -05:00
Jinsong Ji	bf544fa1c3	Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `adffce7153`. This is breaking test-suite; revert while investigating.	2020-07-27 21:07:00 +00:00
Jinsong Ji	adffce7153	[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html no one is making use of QPX/A2Q/BGQ/BGP CNK anymore. This patch removes the support of QPX/A2Q in llvm, BGQ/BGP in clang, CNK support in openmp/polly. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D83915	2020-07-27 19:24:39 +00:00
jasonliu	c25f61cf6a	[XCOFF][AIX] Handle llvm.used and llvm.compiler.used global array For now, just return and do nothing when we see llvm.used and llvm.compiler.used global array. Hopefully, we could come up with a good solution later to prevent linker from eliminating symbols in llvm.used array. Reviewed By: DiggerLin, daltenty Differential Revision: https://reviews.llvm.org/D84363	2020-07-27 15:28:32 +00:00
biplmish	825ed2d43d	[PowerPC] Add Vector Extract Double Instruction Definitions and MC tests. This patch adds the td definitions and asm/disasm tests for the following instructions: Vector Extract Double Left Index - vextdubvlx, vextduhvlx, vextduwvlx, vextddvlx Vector Extract Double Right Index - vextdubvrx, vextduhvrx, vextduwvrx, vextddvrx Differential Revision: https://reviews.llvm.org/D84384	2020-07-26 23:56:19 -05:00
Nemanja Ivanovic	cdead4f89c	[PowerPC][NFC] Fix an assert that cannot trip from `7d076e19e3` I mixed up the precedence of operators in the assert and thought I had it right since there was no compiler warning. This just adds the parentheses in the expression as needed.	2020-07-25 20:28:52 -04:00
Amy Kwan	739cd2638b	[PowerPC] Exploit the High Order Vector Multiply Instructions on Power10 This patch aims to exploit the following vector multiply high instructions on Power10. vmulhsw VRT, VRA, VRB vmulhsd VRT, VRA, VRB vmulhuw VRT, VRA, VRB vmulhud VRT, VRA, VRB Differential Revision: https://reviews.llvm.org/D82584	2020-07-24 20:57:57 -05:00
Amy Kwan	74790a5dde	[PowerPC] Implement Truncate and Store VSX Vector Builtins This patch implements the `vec_xst_trunc` function in altivec.h in order to utilize the Store VSX Vector Rightmost [byte \| half \| word \| doubleword] Indexed instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82467	2020-07-24 19:22:39 -05:00
Nemanja Ivanovic	7d076e19e3	[PowerPC] Fix computation of offset for load-and-splat for permuted loads Unfortunately this is another regression from my canonicalization patch (`1fed131660`). The patch contained two implicit assumptions: 1. That we would have a permuted load only if we are loading a partial vector 2. That a partial vector load would necessarily be as wide as the splat However, assumption 2 is not correct since it is possible to do a wider load and only splat a half of it. This patch corrects this assumption by simply checking if the load is permuted and adjusting the offset if it is.	2020-07-24 15:38:46 -04:00
Amy Kwan	1dc1a3fb0c	[PowerPC] Implement low-order Vector Multiply, Modulus and Divide Instructions This patch aims to implement the low order vector multiply, divide and modulo instructions available on Power10. The patch involves legalizing the ISD nodes MUL, UDIV, SDIV, UREM and SREM for v2i64 and v4i32 vector types in order to utilize the following instructions: - Vector Multiply Low Doubleword: vmulld - Vector Modulus Word/Doubleword: vmodsw, vmoduw, vmodsd, vmodud - Vector Divide Word/Doubleword: vdivsw, vdivsd, vdivuw, vdivud Differential Revision: https://reviews.llvm.org/D82510	2020-07-23 17:18:36 -05:00
Amy Kwan	5f11027395	[PowerPC][Power10] Fix vinsvlx instructions to have i32 arguments. Previously, the vinsvlx instructions were incorrectly defined with i64 as the second argument. This patch fixes this issue by correcting the second argument of the vins*vlx instructions/intrinsics to be i32. Differential Revision: https://reviews.llvm.org/D84277	2020-07-22 17:58:14 -05:00
Amy Kwan	08b4a50e39	[PowerPC][Power10] Fix the Test LSB by Byte (xvtlsbb) Builtins Implementation The implementation of the xvtlsbb builtins/intrinsics were not correct as the intrinsics previously used i1 as an argument type. This patch changes the i1 argument type used in these intrinsics to be i32 instead, as having the second as an i1 can lead to issues in the backend. Differential Revision: https://reviews.llvm.org/D84291	2020-07-22 13:27:05 -05:00
Stefan Pintilie	a60251d739	[PowerPC] Add linker opt for PC Relative GOT indirect accesses A linker optimization is available on PowerPC for GOT indirect PCRelative loads. The idea is that we can mark a usual GOT indirect load: pld 3, vec@got@pcrel(0), 1 lwa 3, 4(3) With a relocation to say that if we don't need to go through the GOT we can let the linker further optimize this and replace a load with a nop. pld 3, vec@got@pcrel(0), 1 .Lpcrel1: .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8) lwa 3, 4(3) This patch adds the logic that allows the compiler to add the R_PPC64_PCREL_OPT. Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D79864	2020-07-22 09:08:23 -05:00
jasonliu	b98b1700ef	[XCOFF] Enable symbol alias for AIX Summary: AIX assembly's .set directive is not usable for aliasing purpose. We need to use extra-label-at-definition strategy to generate symbol aliasing on AIX. Reviewed By: DiggerLin, Xiangling_L Differential Revision: https://reviews.llvm.org/D83252	2020-07-22 14:03:55 +00:00
Sebastian Neubauer	2a6c871596	[InstCombine] Move target-specific inst combining For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target). This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64. This allows to move about 3000 lines out from InstCombine to the targets. Differential Revision: https://reviews.llvm.org/D81728	2020-07-22 15:59:49 +02:00
Chen Zheng	36f9fe2d34	[PowerPC] fixupIsDeadOrKill start and end in different block fixing In fixupIsDeadOrKill, we assume StartMI and EndMI not exist in same basic block, so we add an assertion in that function. This is wrong before RA, as before RA the true definition may exist in another block through copy like instructions. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D83365	2020-07-22 06:27:13 -04:00
Kai Luo	c3f9697f1f	[PowerPC] Fix wrong codegen when stack pointer has to realign performing dynalloc Current powerpc backend generates wrong code sequence if stack pointer has to realign if `-fstack-clash-protection` enabled. When probing dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes `NegSizeReg` as input and returns `FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated correctly, however code following `PREPARE_PROBED_ALLOCA` still uses value of `NegSizeReg`, which does not contain `ActualNegSize` if `MaxAlign > TargetAlign`, to calculate loop trip count and residual number of bytes. This patch is part of fix of https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84152	2020-07-22 06:35:12 +00:00

6407 Commits