llvm-project

Commit Graph

Author	SHA1	Message	Date
Chen Zheng	66a03d1022	[PowerPC] prepare more dq form for P10 pair load/store Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92393	2020-12-08 21:01:40 -05:00
Stefan Pintilie	2812c15156	[PowerPC] Fix missing nop after call to weak callee. Weak functions can be replaced by other functions at link time. Previously it was assumed that no matter what the weak callee function was replaced with it would still share the same TOC as the caller. This is no longer true as a weak callee with a TOC setup can be replaced by another function that was compiled with PC Relative and does not have a TOC at all. This patch makes sure that all calls to functions defined as weak from a caller that has a valid TOC have a nop after the call to allow a place for the linker to restore the TOC. Reviewed By: NeHuang Differential Revision: https://reviews.llvm.org/D91983	2020-12-08 09:38:44 -06:00
Qiu Chaofan	5e85a2ba16	[PowerPC] Implement intrinsic for DARN instruction Instruction darn was introduced in ISA 3.0. It means 'Deliver A Random Number'. The immediate number L means: - L=0, the number is 32-bit (higher 32-bits are all-zero) - L=1, the number is 'conditioned' (processed by hardware to reduce bias) - L=2, the number is not conditioned, directly from noise source GCC implements them in three separate intrinsics: __builtin_darn, __builtin_darn_32 and __builtin_darn_raw. This patch implements the same intrinsics. And this change also addresses Bugzilla PR39800. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92465	2020-12-08 14:08:52 +08:00
Esme-Yi	49599cb1a2	[PowerPC] Correct the bit-width definition for some imm operand in td. Summary: The imm operands of some instructions are not defined accurately in td. This is a small patch to correct these definitions. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D91603	2020-12-08 03:20:12 +00:00
Stefan Pintilie	49921d1c3c	[PowerPC] Exploitation of xxeval instruction for AND and NAND The xxeval instruction was introduced in PowerPC with Power 10. The instruction accepts three vector registers and an immediate. Depending on the value of the immediate, the instruction can be used to perform certain bitwise boolean operations (and, or, xor, ...) on the given vector registers. This patch implements the AND and NAND patterns that can be used by the instruction. Reviewed By: nemanjai, #powerpc, bsaleil, NeHuang, jsji Differential Revision: https://reviews.llvm.org/D92420	2020-12-07 12:36:54 -06:00
Esme-Yi	28fdeea952	[PowerPC] Add support for intrinsics dcbfps and dcbstps in P10. Summary: This patch added support for the intrinsics llvm.ppc.dcbfps and llvm.ppc.dcbstps. dcbfps and dcbstps are actually extended mnemonics of dcbf. dcbfps RA,RB ---> dcbf RA,RB,4 dcbstps RA,RB ---> dcbf RA,RB,6 Reviewed By: amyk, steven.zhang Differential Revision: https://reviews.llvm.org/D91323	2020-12-07 05:19:06 +00:00
Qiu Chaofan	efdd463050	[PowerPC] Fix chain for i1-to-fp operation A simple SELECT is used for converting i1 to floating types on ppc32, but in constrained cases, the chain is not handled properly. This patch will fix that. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92365	2020-12-07 10:38:56 +08:00
Jinsong Ji	c8ec685ca5	[llvm-exegesis][PowerPC] Add more register classes This PR adds more register class support in PowerPC, mark OperandType for imm and memory operands. Also added more unit tests for SnippetGenerator. Reviewed By: #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D88044	2020-12-04 15:02:12 +00:00
QingShan Zhang	c25b039e21	[PowerPC] Fix the regression caused by commit `9c588f53fc` Add a TypeLegal check for MVT::i1 and add the test.	2020-12-04 10:22:13 +00:00
Baptiste Saleil	45ec3a37b0	[PowerPC] Fix for excessive ACC copies due to PHI nodes When using accumulators in loops, they are passed around in PHI nodes of unprimed accumulators, causing the generation of additional prime/unprime instructions. This patch detects these cases and changes these PHI nodes to primed accumulator PHI nodes. We also add IR and MIR test cases for several PHI node cases. Differential Revision: https://reviews.llvm.org/D91391	2020-12-03 09:51:23 -06:00
QingShan Zhang	9bf0fea372	[PowerPC] Add the hw sqrt test for vector type v4f32/v2f64 The PowerPC ISA supports the input test for the vector types v4f32 and v2f64. Replacing the software compare with the hardware test improves performance. Reviewed By: ChenZheng Differential Revision: https://reviews.llvm.org/D90914	2020-12-03 03:19:18 +00:00
jasonliu	a65d8c5d72	[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences are: 1. AIX does not have CFI instructions. 2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality routine, because the interoperability is not there yet on AIX. 3. AIX does not use eh_frame sections. Instead, it uses an eh_info section (compact unwind section) to store the information about the personality routine and the LSDA data address. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D91455	2020-12-02 18:42:44 +00:00
Qiu Chaofan	ffa2dce590	[PowerPC] Fix FLT_ROUNDS_ on little endian In lowering of FLT_ROUNDS_, FPSCR content will be moved into FP register and then GPR, and then truncated into word. For subtargets without direct move support, it will store and then load. The load address needs adjustment (+4) only on big-endian targets. This patch fixes it on using generic opcodes on little-endian and subtargets with direct-move. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D91845	2020-12-02 17:16:32 +08:00
QingShan Zhang	47f784ace6	[PowerPC] Promote the i1 to i64 for SINT_TO_FP/FP_TO_SINT i1 is the native type for PowerPC if crbits is enabled. However, we need to promote the i1 to i64 as we didn't have the pattern for i1. Reviewed By: Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D92067	2020-12-02 05:37:45 +00:00
Chen Zheng	95d6042dd4	[NFC][PowerPC] code refactor: split IsReassociable to fma and add. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92070	2020-12-01 21:18:57 -05:00
Fangrui Song	7c4555f60d	[PowerPC] Delete remnant Darwin code in PPCAsmParser Continue the work started at D50989. The code has been long dead since the triple has been removed (D75494). Reviewed By: nickdesaulniers, void Differential Revision: https://reviews.llvm.org/D91836	2020-11-30 10:16:19 -08:00
QingShan Zhang	4d83aba422	[DAGCombine] Adding a hook to improve the precision of fsqrt if the input is denormal For now, we hardcode the result as 0.0 if the input is denormal or 0, which hurts precision. As the added fsqrt belongs to the cold path of the cmp+branch, it won't impact the performance for normal inputs on PowerPC, but it improves the precision if the input is denormal. Reviewed By: Spatel Differential Revision: https://reviews.llvm.org/D80974	2020-11-27 02:10:55 +00:00
Zarko Todorovski	6d648e69c0	[AIX] Add support for non var_arg extended vector ABI calling convention on AIX This patch enables passing non variadic vector type parameters on the caller and callee side and vector return on AIX that are passed in vector registers only. So far, support is enabled for only the AIX extended Altivec ABI Calling convention. Reviewed By: sfertile, DiggerLin Differential Revision: https://reviews.llvm.org/D86476	2020-11-26 12:03:51 -05:00
Simon Pilgrim	0637dfe88b	[DAG] Legalize abs(x) -> smax(x,sub(0,x)) iff smax/sub are legal If smax() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method. This is also what PowerPC has been doing for its abs implementation, so it lets us get rid of a load of custom lowering code there (and which was never updated when they added smax lowering). Alive2: https://alive2.llvm.org/ce/z/xRk3cD Differential Revision: https://reviews.llvm.org/D92095	2020-11-25 15:03:03 +00:00
Kai Luo	97e7ce3b15	[PowerPC] Probe the gap between stackptr and realigned stackptr During reviewing https://reviews.llvm.org/D84419, @efriedma mentioned the gap between realigned stack pointer and origin stack pointer should be probed too whatever the alignment is. This patch fixes the issue for PPC64. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D88078	2020-11-25 07:01:45 +00:00
QingShan Zhang	9c588f53fc	[DAGCombine] Add hook to allow target specific test for sqrt input PowerPC has instruction ftsqrt/xstsqrtdp etc to do the input test for software square root. LLVM now tests it with smallest normalized value using abs + setcc. We should add hook to target that has test instructions. Reviewed By: Spatel, Chen Zheng, Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D80706	2020-11-25 05:37:15 +00:00
Zarko Todorovski	be7d425edc	[PPC][AIX] Add vector callee saved registers for AIX extended vector ABI This patch is the initial patch for support of the AIX extended vector ABI. The extended ABI treats vector registers V20-V31 as non-volatile and we add them as callee saved registers in this patch. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D88676	2020-11-24 23:01:51 -05:00
QingShan Zhang	fa42f08b26	[PowerPC][FP128] Fix the incorrect calling convention for IEEE long double on Power8 For now, we are using the GPR to pass the arguments/return value for fp128 on Power8, which is incorrect. It should be VSR. The reason we do it this way is that we set fp128 as illegal, which makes LLVM try to emulate it with i128 on Power8. So, we need to correct it as legal. Reviewed By: Nemanjai Differential Revision: https://reviews.llvm.org/D91527	2020-11-25 01:43:48 +00:00
Zarko Todorovski	c92f29b05e	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Sean Fertile	4f5355ee73	[PowerPC] Don't reuse an illegal typed load for int_to_fp conversion. When the operand to an (s/u)int_to_fp node is an illegally typed load, we cannot reuse the load address since we cannot build a proper dependency chain. The legalized loads will use a different chain output than the illegal load. If we reuse the load address, then we will build a conversion node that uses the chain of the illegal load, and operations which modify the memory address in the other dependency chain can be scheduled before the floating point load which feeds the conversion. Differential Revision: https://reviews.llvm.org/D91265	2020-11-24 15:45:33 -05:00
Victor Huang	1f5c4a0d04	[PowerPC][PCRelative] Add new pseudo instructions for PCRel TLS to fix R2 clobber issue New pseudo instructions GETtlsADDRPCREL and GETtlsldADDRPCREL are added for properly setting REGMASK for tls_get_addr function when using PCRelative address. Differential Revision: https://reviews.llvm.org/D91420 Reviewed by: bsaleil	2020-11-24 11:34:32 -06:00
Masoud Ataei	b86a1cd2f8	[PowerPC] dyn_cast should be dyn_cast_or_null in MASSV pass It is possible that we have different constants in different slots of second vector double (float) of pow function. So, in this case Exp->getSplatValue() will return nullptr. Here, I handle it properly. Reviewed By: steven.zhang, PowerPC Differential Revision: https://reviews.llvm.org/D91729	2020-11-24 16:21:12 +00:00
Richard Smith	97c8fba7e4	Fix signed integer overflow bug that's causing test failures with UBSan.	2020-11-23 17:20:58 -08:00
Xiangling Liao	01b3e6e026	[AIX] Support init priority Support reserved [0-100] and non-reserved[101-65535] Clang/GNU init priority values on AIX. This patch maps Clang/GNU values into priority values used in sinit/sterm functions. User can play with values and be able to get init to occur before or after XL init and vice versa. Differential Revision: https://reviews.llvm.org/D91272	2020-11-23 14:50:05 -05:00
Esme-Yi	1c0941e152	[PowerPC] Extend folding RLWINM + RLWINM to post-RA. Summary: We have the patterns to fold 2 RLWINMs before RA, while some RLWINMs will be generated after RA, for example rGc4690b007743. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization too. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D89855	2020-11-22 07:37:24 +00:00
Craig Topper	a7eae62a42	[SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version. The default version only works if the returned node has a single result. The X86 and PowerPC versions support multiple results and allow a single result to be returned from a node with multiple outputs. And allow a single result that is not result 0 of the node. Also replace the Mips version since the new version should work for it. The original version handled multiple results, but only if the new node and original node had the same number of results. Differential Revision: https://reviews.llvm.org/D91846	2020-11-20 10:06:53 -08:00
Bill Wendling	b2f6630739	[PowerPC] Allow a '%' prefix for registers in CFI directives Clang generates a '%' prefix for some registers in CFI directives. E.g. ".cfi_register lr, r12" becomes ".cfi_register lr, %r12" after processing. Differential Revision: https://reviews.llvm.org/D91735	2020-11-19 18:19:51 -08:00
Baptiste Saleil	18db29ea6f	[PowerPC] Add peephole to remove redundant accumulator prime/unprime instructions In some situations, the compiler may insert an accumulator prime instruction and an accumulator unprime instruction with no use of that accumulator between the two. That's for example the case when we store an accumulator after assembling it or restoring it. This patch adds a peephole to remove these prime and unprime instructions. Differential Revision: https://reviews.llvm.org/D91386	2020-11-18 15:01:07 -06:00
Simon Pilgrim	5f3a8074a4	[PPC] Fix dead store value clang static analyzer warning. NFCI. Simplify the SplatBits 2-byte -> 4-byte 'splat'.	2020-11-17 16:27:45 +00:00
Florian Hahn	b2f4c5fddc	[AsmWriter] Factor out mnemonic generation to accessible getMnemonic. This patch factors out the part of printInstruction that gets the mnemonic string for a given MCInst. This is intended to be used subsequently for the instruction-mix remarks to display the final mnemonic (D90040). Unfortunately making `getMnemonic` available to the AsmPrinter seems to require making it virtual. Not sure if there's a way around that with the current layering of the AsmPrinters. Reviewed By: Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D90039	2020-11-17 09:47:38 +00:00
Baptiste Saleil	3f78605a8c	[PowerPC] Add paired vector load and store builtins and intrinsics This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs. Differential Revision: https://reviews.llvm.org/D90799	2020-11-13 12:35:10 -06:00
serge-sans-paille	9218ff50f9	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These function store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Baptiste Saleil	37c4ac8545	[PowerPC] Accumulator/Unprimed Accumulator register copy, spill and restore This patch adds support for accumulator/unprimed accumulator register copy, spill and restore for MMA. Authored By: Baptiste Saleil Reviewed By: #powerpc, bsaleil, amyk Differential Revision: https://reviews.llvm.org/D90616	2020-11-11 16:23:45 -06:00
Esme-Yi	6e0ad5bc8c	[PowerPC] Add an ISEL pattern for Mul with Imm. Summary: This patch tries to do the following transformation if the multiplier doesn't fit int16: (mul X, c1 << c2) -> (rldicr (mulli X, c1) c2) Reviewed By: jsji, steven.zhang Differential Revision: https://reviews.llvm.org/D87384	2020-11-10 06:52:39 +00:00
Mircea Trofin	2ac3a7d0c4	[NFC] Use [MC]Register Differential Revision: https://reviews.llvm.org/D90795	2020-11-09 08:37:14 -08:00
jasonliu	42d2109380	[XCOFF] Enable explicit sections on AIX Implement mechanism to allow explicit sections to be generated on AIX. Reviewed By: DiggerLin Differential Revision: https://reviews.llvm.org/D88615	2020-11-09 16:27:38 +00:00
Esme-Yi	5053eab890	Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA." This reverts commit `119ab2181e`.	2020-11-03 16:34:02 +00:00
Esme-Yi	119ab2181e	[PowerPC] Extend folding RLWINM + RLWINM to post-RA. Summary: This patch depends on D89846. We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINMs will be generated after RA, for example rGc4690b007743. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too. Reviewed By: shchenz, steven.zhang Differential Revision: https://reviews.llvm.org/D89855	2020-11-03 07:44:11 +00:00
Esme-Yi	b969dfe26f	[NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo. Summary: We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINMs will be generated after RA, for example D88274. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too. This is an NFC patch to move the folding patterns to PPCInstrInfo; the follow-up work will call it in pre-emit-peephole and expand the patterns to handle more cases. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D89846	2020-11-03 06:28:56 +00:00
Qiu Chaofan	d14e51806b	[PowerPC] Skip IEEE 128-bit FP type in FastISel Vector types, quadword integers and f128 currently cannot be handled in FastISel. We did not skip f128 type in lowering arguments, which causes a crash. This patch will fix it. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D90206	2020-11-03 11:17:11 +08:00
Qiu Chaofan	3204ffeade	[PowerPC] [NFC] Rename VCMPo to VCMP_rec Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D90581	2020-11-03 11:10:59 +08:00
Fangrui Song	ca01a6b3ac	[PowerPC] Parse and ignore .machine ppc64 In the wild, kexec-tools purgatory/arch/ppc64/v2wrap.S and hvcall.S use this directive.	2020-11-02 16:49:57 -08:00
Florian Hahn	b3b993a7ad	Reland "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts the revert commit `408c4408fa`. This version of the patch includes a fix for a crash caused by treating ICmp/FCmp constant expressions as instructions. Original message: On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV.	2020-11-02 15:39:29 +00:00
Qiu Chaofan	2762e6734f	[PowerPC] Fix a crash in POWER 9 setb peephole Variable InnerIsSel references FalseRes, while FalseRes might be zext/sext. So InnerIsSel should reference SetOrSelCC, otherwise a crash will happen. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D90142	2020-11-02 14:29:43 +08:00
Florian Hahn	408c4408fa	Revert "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts commit `73f01e3df5`. This appears to break http://lab.llvm.org:8011/#/builders/85/builds/383.	2020-10-30 21:26:14 +00:00
Florian Hahn	73f01e3df5	[TTI] Add VecPred argument to getCmpSelInstrCost. On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV. Reviewed By: dmgreen, RKSimon Differential Revision: https://reviews.llvm.org/D90070	2020-10-30 13:49:08 +00:00
Nemanja Ivanovic	5459d08795	[PowerPC] Fix single-use check and update chain users for ld-splat When converting a BUILD_VECTOR or VECTOR_SHUFFLE to a splatting load as of `1461fb6e78`, we inaccurately check for a single user of the load and neglect to update the users of the output chain of the original load. As a result, we can emit a new load when the original load is kept and the new load can be reordered after a dependent store. This patch fixes those two issues. Fixes https://bugs.llvm.org/show_bug.cgi?id=47891	2020-10-27 16:49:38 -05:00
Victor Huang	2e1a737f46	[PowerPC][PCRelative] Turn on TLS support for PCRel by default Turn on TLS support for PCRel by default and update the test cases. Differential Revision: https://reviews.llvm.org/D88738 Reviewed by: stefanp, kamaub	2020-10-27 13:58:44 -05:00
Chen Zheng	00e573cadb	[LSR] fix typo in comments and rename for a new added hook.	2020-10-26 22:29:22 -04:00
Amy Kwan	803cc3aff2	[PowerPC] Implement Set Boolean Condition Instructions This patch implements the set boolean condition instructions introduced in POWER10. The set boolean condition instructions (set[n]bc[r]) are used during the following situations: - sign/zero/any extending i1 to an i32 or i64, - reg+reg, reg+imm or floating point comparisons being sign/zero extended to i32 or i64, - spilling CR bits (using the setnbc instruction) Differential Revision: https://reviews.llvm.org/D87705	2020-10-26 18:42:51 -05:00
Baptiste Saleil	edb27912a3	[PowerPC] Add intrinsics for MMA This patch adds support for MMA intrinsics. Authored by: Baptiste Saleil Reviewed By: #powerpc, bsaleil, amyk Differential Revision: https://reviews.llvm.org/D89345	2020-10-23 13:16:02 -05:00
Victor Huang	7a74bb899a	[PowerPC] Fix the Predicates for enabling pcrelative-memops and PLXVP/PSTXVP definitions In this patch, Predicates fix added for the following: * disable prefix-instrs will disable pcrelative-memops * set two predicates PairedVectorMemops and PrefixInstrs for PLXVP/PSTXVP definitions Differential Revision: https://reviews.llvm.org/D89727 Reviewed by: amyk, steven.zhang	2020-10-23 11:33:20 -05:00
Chen Zheng	1e0b6c1df0	[LSR] ignore profitable chain when reg num is not major cost. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D89665	2020-10-23 09:35:48 -04:00
Nicholas Guy	9a2d2bedb7	Add "SkipDead" parameter to TargetInstrInfo::DefinesPredicate Some instructions may be removable through processes such as IfConversion, however DefinesPredicate cannot be made aware of when this should be considered. This parameter allows DefinesPredicate to distinguish these removable instructions on a per-call basis, allowing for more fine-grained control from processes like ifConversion. Renames DefinesPredicate to ClobbersPredicate, to better reflect its purpose Differential Revision: https://reviews.llvm.org/D88494	2020-10-21 11:52:47 +01:00
Amy Kwan	6a946fd06f	[DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead. MULH is often expanded on targets. This patch removes the isMulhCheaperThanMulShift hook and uses isOperationLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D80485	2020-10-19 12:23:04 -05:00
Kai Luo	354d3106c6	[PowerPC] Skip combining (uint_to_fp x) if x is not simple type Current powerpc64le backend hits ``` Combining: t7: f64 = uint_to_fp t6 llc: llvm-project/llvm/include/llvm/CodeGen/ValueTypes.h:291: llvm::MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed. ``` This patch fixes it by skipping combination if `t6` is not simple type. Fixed https://bugs.llvm.org/show_bug.cgi?id=47660. Reviewed By: #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D88388	2020-10-19 05:23:46 +00:00
Albion Fung	d30155feaa	[PowerPC] Implementation of 128-bit Binary Vector Rotate builtins This patch implements 128-bit Binary Vector Rotate builtins for PowerPC10. Differential Revision: https://reviews.llvm.org/D86819	2020-10-16 18:03:22 -04:00
David Sherwood	47f2dc7e5f	[SVE][NFC] Replace some TypeSize comparisons in non-AArch64 Targets In most of lib/Target we know that we are not dealing with scalable types so it's perfectly fine to replace TypeSize comparison operators with their fixed width equivalents, making use of getFixedSize() and so on. Differential Revision: https://reviews.llvm.org/D89101	2020-10-15 09:01:21 +01:00
Ahsan Saghir	f3202b30b8	[PowerPC] Add assemble disassemble intrinsics for MMA This patch adds support for assemble disassemble intrinsics for MMA. Reviewed By: bsaleil, #powerpc Differential Revision: https://reviews.llvm.org/D88739	2020-10-13 13:21:58 -05:00
Simon Pilgrim	2c3e4a21f9	[PowerPC] ReplaceNodeResults - bail on funnel shifts and let generic legalizers deal with it Fixes regression raised on D88834 for 32-bit triple + 64-bit cpu cases (which apparently is a thing).	2020-10-10 19:13:16 +01:00
Fangrui Song	2bd4730850	[PowerPC] Fix signed overflow in decomposeMulByConstant after D88201 Caught by multipliers LONG_MAX (after +1) and LONG_MIN (after -1) in CodeGen/PowerPC/mul-const-i64.ll	2020-10-09 18:29:12 -07:00
Esme-Yi	e9fd8823ba	[DAGCombiner] Add decomposition patterns for Mul-by-Imm. Summary: This patch is derived from D87384. In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns: ``` mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M)) mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M)) ``` The conversion will be triggered if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`. Moreover, the conversion benefits from an ILP improvement since the instructions are independent. A case with a sequence like the following also benefits, since a shift instruction is saved. ``` res1 = a * 0x8800; res2 = a * 0x8080; ``` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88201	2020-10-09 08:51:40 +00:00
diggerlin	92bca12843	[AIX] add new option -mignore-xcoff-visibility SUMMARY: In the IBM compiler xlclang, there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html. We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default on AIX). With this option, llvm does not emit any visibility attribute to the ASM or XCOFF object file. The option only works on the AIX OS; on other OSes, using the option reports an unsupported option error. On AIX: 1.1 The option -mignore-xcoff-visibility is enabled by default if neither -fvisibility=* nor -mignore-xcoff-visibility is given explicitly in the clang command. 1.2 If -fvisibility=* is given explicitly but -mignore-xcoff-visibility is not, visibility attributes are generated. 1.3 If both -fvisibility=* and -mignore-xcoff-visibility are given explicitly, the option "-mignore-xcoff-visibility" wins and no visibility attribute is emitted. The option -mignore-xcoff-visibility has no effect on visibility attributes when compiling with the -emit-llvm option to generate LLVM IR. Reviewer: daltenty, Jason Liu Differential Revision: https://reviews.llvm.org/D87451	2020-10-08 09:34:58 -04:00
Chen Zheng	f05608707c	[PowerPC] implement target hook getTgtMemIntrinsic This patch can make pass recognize Powerpc related memory intrinsics. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D88373	2020-10-07 00:02:44 -04:00
Chen Zheng	0492dd91c4	[PowerPC] add more builtins for PPCTargetLowering::getTgtMemIntrinsic Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D88374	2020-10-06 23:48:33 -04:00
Craig Topper	1127662c6d	[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS. getNode handling for ISD:SETCC calls FoldSETCC which can canonicalize FP constants to the RHS. When this happens we should create the node with the FMF that was requested. By using FlagInserter we can ensure any calls to getNode/getSetcc during canonicalization will also get the flags. Differential Revision: https://reviews.llvm.org/D88063	2020-10-05 14:55:23 -07:00
Esme-Yi	e3475f5b91	[PowerPC] Add builtins for xvtdiv(dp|sp) and xvtsqrt(dp|sp). Summary: This patch implements the builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp. The instructions correspond to the following builtins: int vec_test_swdiv(vector double v1, vector double v2); int vec_test_swdivs(vector float v1, vector float v2); int vec_test_swsqrt(vector double v1); int vec_test_swsqrts(vector float v1); This patch depends on D88274, which fixes the bug in copying from CRRC to GPRC/G8RC. Reviewed By: steven.zhang, amyk Differential Revision: https://reviews.llvm.org/D88278	2020-10-04 16:24:20 +00:00
Esme-Yi	c4690b0077	[PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC. Summary: Currently we copy CRRC to GRC using a single MFOCRF, which copies the contents of CR field n (CR bits 4×n+32:4×n+35) into bits 4×n+32:4×n+35 of register GRC. That’s not correct because we expect the value of the destination register to equal the source, so we have to put the contents of the CR field in the lowest 4 bits. This patch adds a RLWINM after MFOCRF to achieve that. The problem came up when adding builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp, as posted in D88278. We need to move the outputs (in CR register) to GRC. However, the outputs of these instructions may not be in a fixed CR# register, so we can’t directly add a rotation instruction in the .td patterns, but need to wait until the CR register is determined. Then we confirmed this should be a bug in POST-RA PSEUDO PASS. Reviewed By: nemanjai, shchenz Differential Revision: https://reviews.llvm.org/D88274	2020-10-02 01:26:18 +00:00
jasonliu	78a9e62aa6	[XCOFF] Enable -fdata-sections on AIX Summary: Some design decisions worth noting: I've noticed a recent mailing list discussion about why string literals are not affected by -fdata-sections for ELF targets: http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html But on AIX, our linker cannot split mergeable strings like other targets. So I think it makes more sense for us to emit a separate csect for every mergeable string in -fdata-sections mode, as there might not be other ways for the linker to do garbage collection on unused mergeable strings. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D88339	2020-10-02 00:16:24 +00:00
Ahsan Saghir	66d2e3f495	[PowerPC] Add outer product instructions for MMA This patch adds outer product instructions for MMA, including related infrastructure, and their tests. Depends on D84968. Reviewed By: #powerpc, bsaleil, amyk Differential Revision: https://reviews.llvm.org/D88043	2020-09-30 18:06:49 -05:00
Zarko Todorovski	052c5bf40a	[PPC] Do not emit extswsli in 32BIT mode when using -mcpu=pwr9 It looks like in some circumstances when compiling with `-mcpu=pwr9` we create an EXTSWSLI node, which causes llc to fail. No such error occurs in pwr8 or lower. This occurs in 32BIT AIX and BE Linux. The cause seems to be that the default return in combineSHL is to create an EXTSWSLI node. Adding a check for whether we are in PPC64 before that fixes the issue. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D87046	2020-09-30 11:06:20 -04:00
Benjamin Kramer	2c394bd407	[PowerPC] Avoid unused variable warning in Release builds PPCFrameLowering.cpp:632:8: warning: unused variable 'isAIXABI' [-Wunused-variable]	2020-09-30 17:02:55 +02:00
Sean Fertile	dfb717da1f	[PowerPC] Remove support for VRSAVE save/restore/update. After removal of Darwin as a PowerPC subtarget, the VRSAVE save/restore/spill/update code is no longer needed by any supported subtarget, so remove it while keeping support for vrsave and related instruction aliases for inline asm. I've pre-committed tests to document the existing vrsave handling in relation to @llvm.eh.unwind.init and inline asm usage, as well as a test which shows a behaviour change on AIX related to returning vector type as we were wrongly emitting VRSAVE_UPDATE on AIX.	2020-09-30 10:05:53 -04:00
Baptiste Saleil	0156914275	[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types This patch legalizes the v256i1 and v512i1 types that will be used for MMA. It implements loads and stores of these types. v256i1 is a pair of VSX registers, so for this type, we load/store the two underlying registers. v512i1 is used for MMA accumulators. So in addition to loading and storing the 4 associated VSX registers, we generate instructions to prime (copy the VSX registers to the accumulator) after loading and unprime (copy the accumulator back to the VSX registers) before storing. This patch also adds the UACC register class that is necessary to implement the loads and stores. This class represents accumulators in their unprimed form and allows the distinction between primed and unprimed accumulators to avoid invalid copies of the VSX registers associated with primed accumulators. Differential Revision: https://reviews.llvm.org/D84968	2020-09-28 14:39:37 -05:00
Qiu Chaofan	40e86ca749	[PowerPC] Clean-up mayRaiseFPException bits According to POWER ISA, floating point instructions altering exception bits in FPSCR should be 'may raise FP exception'. (excluding those that read or write the whole FPSCR directly, like mffs/mtfsf) We need to model FPSCR well in future patches to handle the special case properly. Instructions added mayRaiseFPException: - fre(s)/frsqrte(s) - fmadd(s)/fmsub(s)/fnmadd(s)/fnmsub(s) - xscmpoqp/xscmpuqp/xscmpeqdp/xscmpgedp/xscmpgtdp - xscvdphp/xscvhpdp/xvcvhpsp/xvcvsphp/xsrqpxp - xsmaxcdp/xsmincdp/xsmaxjdp/xsminjdp Instructions removed mayRaiseFPException: - xstdivdp/xvtdiv(d\|s)p/xstsqrtdp/xvtsqrt(d\|s)p - xsabsdp/xsnabsdp/xvabs(d\|s)p/xvnabs(d\|s)p - xsnegdp/xscpsgndp/xvneg(d\|s)p/xvcpsgn(d\|s)p - xvcvsxwdp/xvcvuxwdp - xscvdpspn/xscvspdpn Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D87738	2020-09-28 18:22:12 +08:00
Amy Kwan	6f24774fc4	[NFC][PowerPC] Change PPCSubTarget (introduced from D87671) to Subtarget D87671 introduced PPCSubTarget in PPCISelDAGToDAG. This should have been Subtarget instead. This patch changes PPCSubTarget into Subtarget.	2020-09-26 17:53:51 -05:00
Baptiste Saleil	9b86b70094	[PowerPC] Add accumulator register class and instructions This patch adds the xxmfacc, xxmtacc and xxsetaccz instructions to manipulate accumulator registers. It also adds the ACC register class definition for the accumulator registers. Differential Revision: https://reviews.llvm.org/D84847	2020-09-25 12:25:13 -05:00
Amy Kwan	2e7117f847	[PowerPC] Implement the 128-bit vec_[all\|any]_[eq \| ne \| lt \| gt \| le \| ge] builtins in Clang/LLVM This patch implements the vec_[all\|any]_[eq \| ne \| lt \| gt \| le \| ge] builtins for vector signed/unsigned __int128. Differential Revision: https://reviews.llvm.org/D87910	2020-09-23 16:49:40 -04:00
Albion Fung	88cdbeab41	[PowerPC] Implement Vector signed/unsigned __int128 overloads for the comparison builtins This patch implements Vector signed/unsigned __int128 overloads for the comparison builtins. Differential Revision: https://reviews.llvm.org/D87804	2020-09-23 16:49:40 -04:00
Victor Huang	652a8f150d	[PowerPC][PCRelative] Thread Local Storage Support for Local Dynamic This patch is the initial support for the Local Dynamic Thread Local Storage model to produce code sequence and relocation correct to the ABI for the model when using PC relative memory operations. Differential Revision: https://reviews.llvm.org/D87721	2020-09-23 13:48:06 -05:00
Albion Fung	d7eb917a7c	[PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10. Differential: https://reviews.llvm.org/D87394#inline-815858	2020-09-23 01:18:14 -05:00
Stefanos Baziotis	a7873e5abc	Small fixes for "[LoopInfo] empty() -> isInnermost(), add isOutermost()"	2020-09-22 23:59:34 +03:00
Hubert Tong	b0f58aa116	[NFC] Replace tabs with spaces in PPCInstrPrefix.td	2020-09-22 14:23:32 -04:00
Amy Kwan	079757b551	[PowerPC] Implement Vector String Isolate Builtins in Clang/LLVM This patch implements the vector string isolate (predicate and non-predicate versions) builtins. The predicate builtins are custom selected within PPCISelDAGToDAG. Differential Revision: https://reviews.llvm.org/D87671	2020-09-22 11:31:44 -05:00
Amy Kwan	b3147058de	[PowerPC] Implement the 128-bit Vector Divide Extended Builtins in Clang/LLVM This patch implements the 128-bit vector divide extended builtins in Clang/LLVM. These builtins map to the vdivesq and vdiveuq instructions respectively. Differential Revision: https://reviews.llvm.org/D87729	2020-09-22 11:31:44 -05:00
Stefan Pintilie	7e78d89052	[PowerPC] Fix for compiler side issue in PCRelative Local Exec Stop combining loads and stores with PPCISD::ADD_TLS before we can merge the node with TLS_LOCAL_EXEC_MAT_ADDR. The issue is that TLS_LOCAL_EXEC_MAT_ADDR cannot be selected by itself and requires the previous ADD_TLS node that goes with it. However, we sometimes try to combine ADD_TLS with loads and stores that come after it. If this happens then the ADD_TLS is removed and TLS_LOCAL_EXEC_MAT_ADDR cannot be selected. While this bug fix will address the issue, it may not be ideal from a performance perspective as we may be able to add patterns to combine TLS_LOCAL_EXEC_MAT_ADDR with ADD_TLS with the load and store that comes after it all in one. However, this is beyond the scope of this patch. Reviewed By: NeHuang Differential Revision: https://reviews.llvm.org/D88030	2020-09-22 08:28:06 -05:00
Meera Nakrani	a3d0dce260	[ARM][TTI] Prevents constants in a min(max) or max(min) pattern from being hoisted when in a loop Changes TTI function getIntImmCostInst to take an additional Instruction parameter, which enables us to be able to check it is part of a min(max())/max(min()) pattern that will match SSAT. We can then mark the constant used as free to prevent it being hoisted so SSAT can still be generated. Required minor changes in some non-ARM backends to allow for the optional parameter to be included. Differential Revision: https://reviews.llvm.org/D87457	2020-09-22 11:54:10 +00:00
Baptiste Saleil	bb82135538	[PowerPC] Remove unnecessary patterns and types These patterns and type uses were added by mistake by commit `1372e23c7d`	2020-09-21 16:08:54 -05:00
Baptiste Saleil	1372e23c7d	[PowerPC] Add vector pair load/store instructions and vector pair register class This patch adds support for the lxvp, lxvpx, plxvp, stxvp, stxvpx and pstxvp instructions in the PowerPC backend. These instructions allow loading and storing VSX register pairs. This patch also adds the VSRp register class definition needed for these instructions. Differential Revision: https://reviews.llvm.org/D84359	2020-09-21 10:27:47 -05:00
Qiu Chaofan	1d782c2987	[PowerPC] Pass nofpexcept flag to custom lowered constrained ops This is a follow-up of D86605. For strict DAG FP node, if its FP exception behavior metadata is ignore, it should have nofpexcept flag. But during custom lowering, this flag isn't passed down. This is also seen on X86 target. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87390	2020-09-21 10:44:25 +08:00
Amy Kwan	37e7673c21	[PowerPC] Implement Move to VSR Mask builtins in LLVM/Clang This patch implements the vec_gen[b\|h\|w\|d\|q]m function prototypes in altivec.h in order to utilize the move to VSR with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82725	2020-09-18 18:16:14 -05:00
Amy Kwan	6f3c0991bf	[PowerPC] Add Set Boolean Condition Instruction Definitions and MC Tests This patch adds the instruction definitions and assembly/disassembly tests for the set boolean condition instructions. This also includes the negative, and reverse variants of the instruction. Differential Revision: https://reviews.llvm.org/D86252	2020-09-17 18:20:54 -05:00
Amy Kwan	2c3bc918db	[PowerPC] Implement Vector Count Mask Bits builtins in LLVM/Clang This patch implements the vec_cntm function prototypes in altivec.h in order to utilize the vector count mask bits instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82726	2020-09-17 18:20:53 -05:00
Simon Pilgrim	f026812110	InstCombiner.h - remove unnecessary KnownBits.h include. NFCI. Move the include down to cpp files with an implicit dependency.	2020-09-17 14:28:42 +01:00
Qiu Chaofan	ebfbdebe96	[PowerPC] Fix store-fptoi combine of f128 on Power8 llc would crash for (store (fptosi-f128-i32)) when -mcpu=pwr8, we should not generate FP_TO_(S\|U)INT_IN_VSR for f128 types at this time. This patch fixes it. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D86686	2020-09-17 10:21:35 +08:00


6407 Commits