llvm-project

Commit Graph

Author	SHA1	Message	Date
Brad Smith	88b368a1c4	[PowerPC] Set setMaxAtomicSizeInBitsSupported appropriately for 32-bit PowerPC in PPCTargetLowering Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D86165	2020-09-08 21:21:14 -04:00
Craig Topper	b1e68f885b	[SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact. This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130. This required adding a SDNodeFlags to SelectionDAG::getSetCC. Now we manage to constant fold some undefs during the initial getNode that we don't do in later DAG combines. Differential Revision: https://reviews.llvm.org/D87200	2020-09-08 15:27:21 -07:00
Qiu Chaofan	705271d9cd	[PowerPC] Expand constrained ppc_fp128 to i32 conversion Libcall __gcc_qtou is not available, which breaks some tests needing it. On PowerPC, we have code to manually expand the operation, this patch applies it to constrained conversion. To keep it strict-safe, it's using the algorithm similar to expandFP_TO_UINT. For constrained operations marking FP exception behavior as 'ignore', we should set the NoFPExcept flag. However, in some custom lowering the flag is missed. This should be fixed by future patches. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D86605	2020-09-05 13:16:20 +08:00
Nemanja Ivanovic	2771407584	[PowerPC] Do not legalize vector FDIV without VSX Quite a while ago, we legalized these nodes as we added custom handling for reciprocal estimates in the back end. We have since moved to target-independent combines but neglected to turn off legalization. As a result, we can now get selection failures on non-VSX subtargets as evidenced in the listed PR. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47373	2020-09-02 16:03:36 -05:00
Albion Fung	331dcc43ea	[PowerPC] Implemented Vector Load with Zero and Signed Extend Builtins This patch implements the Vector Load with Zero and Signed Extend builtins (lxvr_x for b, h, w, d), and adds the appropriate test cases for these builtins. The builtins utilize the vector load instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D82502#inline-797941	2020-08-28 11:28:58 -05:00
Roland Froese	b6d7ed469f	[PowerPC] Extend custom lower of vector truncate to handle wider input Current custom lowering of truncate vector handles a source of up to 128 bits, but that only uses one of the two shuffle vector operands. Extend it to use both operands to handle 256 bit sources. Differential Revision: https://reviews.llvm.org/D68035	2020-08-24 15:33:43 -04:00
Qiu Chaofan	41ba9d7723	[PowerPC] Support constrained vector fp/int conversion This patch makes these operations legal, and adds necessary codegen patterns. There are still some issues similar to D77033 for conversion from v1i128 type. But normal type tests synced in vector-constrained-fp-intrinsic pass successfully. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D83654	2020-08-24 10:10:27 +08:00
Qiu Chaofan	a5b7b8cce0	[PowerPC] Support constrained scalar sitofp/uitofp This patch adds support for constrained scalar int to fp operations on PowerPC. Besides, this also fixes the FP exception bit of FCFID* instructions. Reviewed By: steven.zhang, uweigand Differential Revision: https://reviews.llvm.org/D81669	2020-08-22 02:10:29 +08:00
Kamau Bridgeman	365f861c45	[PowerPC][PCRelative] Thread Local Storage Support for Initial Exec This patch is the initial support for the Initial Exec Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D81947	2020-08-21 10:13:11 -05:00
Kamau Bridgeman	b74b80bb2d	[PowerPC][PCRelative] Thread Local Storage Support for General Dynamic This patch is the initial support for the General Dynamic Thread Local Storage model to produce code sequence and relocations correct to the ABI for the model when using PC relative memory operations. Patch by: NeHuang Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D82315	2020-08-20 15:08:13 -05:00
Qiu Chaofan	131b3b9ed4	[PowerPC] Support constrained scalar fptosi/fptoui This patch adds support for constrained scalar fp to int operations on PowerPC. Besides, this fixes the FP exception bit of quad-precision convert & truncate instructions. Reviewed By: steven.zhang, uweigand Differential Revision: https://reviews.llvm.org/D81537	2020-08-20 13:29:43 +08:00
Albion Fung	3136cbe29e	[PowerPC] Implement Vector Shift Builtins This patch implements the builtins for the vector shifts (shl, srl, sra), and adds the appropriate test cases for these builtins. The builtins utilize the vector shift instructions introduced within ISA 3.1. Differential Revision: https://reviews.llvm.org/D83338	2020-08-12 18:26:58 -05:00
diggerlin	e9ac1495e2	[AIX][XCOFF] change the operand of branch instruction from symbol name to qualified symbol name for function declarations SUMMARY: 1. This patch removes setting the storage class in the function getXCOFFSection and in the constructor of class MCSectionXCOFF. The MCSectionXCOFF class has the members XCOFF::StorageMappingClass MappingClass; XCOFF::SymbolType Type; XCOFF::StorageClass StorageClass; and these attributes are only used in the XCOFFObjectWriter (the asm path does not need the StorageClass), yet we needed to get the values of StorageClass, Type and MappingClass every time before invoking getXCOFFSection. Actually, we can get the StorageClass of the MCSectionXCOFF from its delegated symbol. 2. We also change the operand of the branch instruction from the symbol name to the qualified symbol name. For example, change bl .foo extern .foo to bl .foo[PR] extern .foo[PR] 3. If there is a reference to, or an indirect call of, a function bar, we also add extern .bar[PR] Reviewers: Jason liu, Xiangling Liao Differential Revision: https://reviews.llvm.org/D84765	2020-08-11 15:26:19 -04:00
Kerry McLaughlin	85c7e89f3b	[CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize Changes the Offset arguments to both functions from int64_t to TypeSize & updates all uses of the functions to create the offset using TypeSize::Fixed() Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85220	2020-08-11 12:17:10 +01:00
Qiu Chaofan	dbcfbffc7a	[PowerPC] Add intrinsic to read or set FPSCR register This patch introduces two intrinsics: llvm.ppc.setflm and llvm.ppc.readflm. They read from or write to FPSCR register (floating-point status & control) which contains rounding mode and exception status. To ensure correctness of program, we need to prevent FP operations from being moved across these intrinsics (mffs/mtfsf instruction), so here I set them as scheduling boundaries. We can relax such restriction if FPSCR is modeled well in the future. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84914	2020-08-10 18:27:45 +08:00
Kamau Bridgeman	d8c6d083c9	[PowerPC][PCRelative] Set TLS unsupported with PC relative memops Introduce a fatal error if any thread local storage code is compiled using pc relative memory operations as well as a hidden override option `-enable-ppc-pcrel-tls` so that this support can be incrementally added if possible. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D85448	2020-08-07 10:56:24 -05:00
QingShan Zhang	55de46f3b2	[PowerPC] Support constrained fp operation for setcc The constrained fp operation fcmp was added by https://reviews.llvm.org/D69281. This patch is trying to add the support for PowerPC backend. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D81727	2020-08-07 05:16:36 +00:00
Jay Foad	28e322ea93	[PowerPC] Custom lowering for funnel shifts The custom lowering saves an instruction over the generic expansion, by taking advantage of the fact that PowerPC shift instructions are well defined in the shift-by-bitwidth case. Differential Revision: https://reviews.llvm.org/D83948	2020-08-04 16:30:49 +01:00
Qiu Chaofan	6a78a8dd37	[NFC] [PowerPC] Refactor fp/int conversion lowering For FP_TO_INT and INT_TO_FP lowering, we have direct-move and non-direct-move methods. But they share some conversion logic, so we can reduce redundant code by introducing new methods. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81818	2020-08-04 15:48:16 +08:00
Chen Zheng	45c46d180e	[PowerPC] mark r+i as legal address mode for vector type after pwr9 Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84735	2020-08-04 00:02:37 -04:00
Justin Hibbits	7e9153e940	PowerPC: Don't lower SELECT_CC to PPCISD::FSEL on SPE SPE doesn't have a fsel instruction, so don't try to lower to it. This fixes a "Cannot select: tN: f64 = PPCISD::FSEL tX, tY, tZ" error. Reviewed By: #powerpc, lkail Differential Revision: https://reviews.llvm.org/D77773	2020-07-31 22:52:47 -05:00
Justin Hibbits	914dbf4808	PowerPC: Fix SPE extloadf32 handling. The patterns were incorrect copies from the FPU code, and are unnecessary, since there's no extended load for SPE. Just let LLVM itself do the work by marking it expand. Reviewed By: #powerpc, lkail Differential Revision: https://reviews.llvm.org/D78670	2020-07-31 22:42:57 -05:00
Kang Zhang	a18953c1c0	[PowerPC] Fix RM operands for some instructions Summary: Some instructions have set the wrong [RM] flag, this patch is to fix it. Instructions x(v\|s)r(d\|s)pi[zmp]? and fri[npzm] use fixed rounding directions without referencing current rounding mode. Also, the SETRNDi, SETRND, BCLRn, MTFSFI, MTFSB0, MTFSB1, MTFSFb, MTFSFI, MTFSFI_rec, MTFSF, MTFSF_rec should also fix the RM flag. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D81360	2020-07-30 02:10:49 +00:00
Kang Zhang	802c043078	[PowerPC] Set v1i128 to expand for SETCC to avoid crash Summary: PPC only supports instruction selection of ISD::SETCC for v16i8, v8i16, v4i32, v2i64, v4f32 and v2f64. It does not support v1i128, so ISD::SETCC on v1i128 will crash. This patch sets v1i128 to expand to avoid the crash. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84238	2020-07-29 16:39:27 +00:00
jasonliu	f8ab66538c	[NFC][XCOFF] Use getFunctionEntryPointSymbol from TLOF to simplify logic Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D84693	2020-07-28 18:59:51 +00:00
Jinsong Ji	d28f86723f	Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `bf544fa1c3`. Fixed the typo in PPCInstrInfo.cpp.	2020-07-28 14:00:11 +00:00
Jinsong Ji	bf544fa1c3	Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `adffce7153`. This is breaking test-suite, revert while investigating.	2020-07-27 21:07:00 +00:00
Jinsong Ji	adffce7153	[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html no one is making use of QPX/A2Q/BGQ/BGP CNK anymore. This patch removes the support of QPX/A2Q in llvm, BGQ/BGP in clang, CNK support in openmp/polly. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D83915	2020-07-27 19:24:39 +00:00
Nemanja Ivanovic	cdead4f89c	[PowerPC][NFC] Fix an assert that cannot trip from `7d076e19e3` I mixed up the precedence of operators in the assert and thought I had it right since there was no compiler warning. This just adds the parentheses in the expression as needed.	2020-07-25 20:28:52 -04:00
Amy Kwan	739cd2638b	[PowerPC] Exploit the High Order Vector Multiply Instructions on Power10 This patch aims to exploit the following vector multiply high instructions on Power10. vmulhsw VRT, VRA, VRB vmulhsd VRT, VRA, VRB vmulhuw VRT, VRA, VRB vmulhud VRT, VRA, VRB Differential Revision: https://reviews.llvm.org/D82584	2020-07-24 20:57:57 -05:00
Nemanja Ivanovic	7d076e19e3	[PowerPC] Fix computation of offset for load-and-splat for permuted loads Unfortunately this is another regression from my canonicalization patch (`1fed131660`). The patch contained two implicit assumptions: 1. That we would have a permuted load only if we are loading a partial vector 2. That a partial vector load would necessarily be as wide as the splat However, assumption 2 is not correct since it is possible to do a wider load and only splat a half of it. This patch corrects this assumption by simply checking if the load is permuted and adjusting the offset if it is.	2020-07-24 15:38:46 -04:00
Amy Kwan	1dc1a3fb0c	[PowerPC] Implement low-order Vector Multiply, Modulus and Divide Instructions This patch aims to implement the low order vector multiply, divide and modulo instructions available on Power10. The patch involves legalizing the ISD nodes MUL, UDIV, SDIV, UREM and SREM for v2i64 and v4i32 vector types in order to utilize the following instructions: - Vector Multiply Low Doubleword: vmulld - Vector Modulus Word/Doubleword: vmodsw, vmoduw, vmodsd, vmodud - Vector Divide Word/Doubleword: vdivsw, vdivsd, vdivuw, vdivud Differential Revision: https://reviews.llvm.org/D82510	2020-07-23 17:18:36 -05:00
jasonliu	b98b1700ef	[XCOFF] Enable symbol alias for AIX Summary: AIX assembly's .set directive is not usable for aliasing purposes. We need to use the extra-label-at-definition strategy to generate symbol aliasing on AIX. Reviewed By: DiggerLin, Xiangling_L Differential Revision: https://reviews.llvm.org/D83252	2020-07-22 14:03:55 +00:00
Kai Luo	c3f9697f1f	[PowerPC] Fix wrong codegen when stack pointer has to realign performing dynalloc Current powerpc backend generates wrong code sequence if stack pointer has to realign if `-fstack-clash-protection` enabled. When probing dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes `NegSizeReg` as input and returns `FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated correctly, however code following `PREPARE_PROBED_ALLOCA` still uses value of `NegSizeReg`, which does not contain `ActualNegSize` if `MaxAlign > TargetAlign`, to calculate loop trip count and residual number of bytes. This patch is part of fix of https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84152	2020-07-22 06:35:12 +00:00
Chen Zheng	e8425b27fe	[PowerPC] add store (load float) pattern to isProfitableToHoist store (load float) can be optimized to store(load i32) in InstCombine pass. Add store (load float) to isProfitableToHoist to make sure we don't break the opt in InstCombine pass. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D82341	2020-07-21 20:55:13 -04:00
diggerlin	11546898e2	[AIX][XCOFF]emit extern linkage for the llvm intrinsic symbol SUMMARY: When we call memset, memcpy, memmove etc. (these are llvm intrinsic functions) in the C source code, llvm will generate IR like call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false) For the C source code bash> cat test_memset.call struct S{ int a; int b; }; extern struct S s; void bar() { memset(&s, s.b, s.b); } the generated IR is like %struct.S = type { i32, i32 } @s = external global %struct.S, align 4 ; Function Attrs: noinline nounwind optnone define void @bar() #0 { entry: %0 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4 %1 = trunc i32 %0 to i8 %2 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4 call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false) ret void } declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #1 If we want the AIX as assembler to compile this without -u, the output needs to have the following assembly code: .extern .memset (We do not output extern linkage for llvm intrinsic functions. Even if we output the extern linkage for an llvm intrinsic function, we should not emit .extern llvm.memset.p0i8.i32; instead we should emit .extern memset.) The same applies to other llvm builtin functions such as __floatdidf. Even if we do not call __floatdidf in the C source code (and the generated IR does not call __floatdidf either), the call is generated during LLVM optimization. The function is not in the function list of the Module, but we still need to emit extern .__floatdidf The solution: we record all the llvm intrinsic extern symbols in transformCallee(), and emit all these symbols in AsmPrinter::doFinalization(Module &M) Reviewers: jasonliu, Sean Fertile, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D78929	2020-07-21 16:03:04 -04:00
Fangrui Song	eafe7c14ea	[PowerPC] Fix combineVectorShuffle regression after D77448 Commit `1fed131660` assumed that NewShuffle (shuffle vector canonicalization result) will always be ShuffleVectorSDNode, which may be false (it may be a BITCAST node): ``` ... t12: v4i32 = scalar_to_vector t2 t15: v16i8 = bitcast t12 # LHS t17: v16i8 = vector_shuffle<u,u,u,u,u,u,u,u,0,1,2,3,u,u,u,u> t15, undef:v16i8 # SVN ``` Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D83617	2020-07-13 16:57:27 -07:00
Qiu Chaofan	b6912c879e	[PowerPC] Support constrained conversion in SPE target This patch adds support for constrained int/fp conversion between signed/unsigned i32 and f32/f64. Reviewed By: jhibbits Differential Revision: https://reviews.llvm.org/D82747	2020-07-13 12:18:36 +08:00
Lei Huang	90b1a710ae	[PowerPC] Enable default support of quad precision operations Summary: Remove option guarding support of quad precision operations. Reviewers: nemanjai, #powerpc, steven.zhang Reviewed By: nemanjai, #powerpc, steven.zhang Subscribers: qiucf, wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83437	2020-07-10 13:27:48 -05:00
Kai Luo	e2b93185b8	[PowerPC] Only make copies of registers on stack in variadic function when va_start is called On PPC64, for a variadic function, if va_start is not called, it won't access any variadic argument on stack, thus we can save stores of registers used to pass arguments. Differential Revision: https://reviews.llvm.org/D82361	2020-07-09 07:18:17 +00:00
Nemanja Ivanovic	1b1539712e	[PowerPC] Do not RAUW combined nodes in VECTOR_SHUFFLE legalization When legalizing shuffles, we make an attempt to combine it into a PPC specific canonical form that avoids a need for a swap. If the combine is successful, we RAUW the node and the custom legalization replaces the now dead node instead of the one it should replace. Remove that erroneous call to RAUW.	2020-07-06 22:09:28 -05:00
Amy Kwan	c13e3e2c2e	[PowerPC][Power10] Exploit the xxsplti32dx instruction when lowering VECTOR_SHUFFLE. This patch aims to exploit the xxsplti32dx XT, IX, IMM32 instruction when lowering VECTOR_SHUFFLEs. We implement lowerToXXSPLTI32DX when lowering vector shuffles to check if: - Element size is 4 bytes - The RHS is a constant vector (and constant splat of 4-bytes) - The shuffle mask is a suitable mask for the XXSPLTI32DX instruction where it is one of the 32 masks: <0, 4-7, 2, 4-7> <4-7, 1, 4-7, 3> Differential Revision: https://reviews.llvm.org/D83245	2020-07-06 20:28:38 -05:00
jasonliu	6d3ae365bd	[XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s) Summary: When a desired symbol name contains invalid character that the system assembler could not process, we need to emit .rename directive in assembly path in order for that desired symbol name to appear in the symbol table. Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L Differential Revision: https://reviews.llvm.org/D82481	2020-07-06 15:49:15 +00:00
Esme-Yi	0607c8df7f	[PowerPC] Legalize SREM/UREM directly on P9. Summary: As Bugzilla-35090 reported, the rationale for using custom lowering SREM/UREM should no longer be true. At the IR level, the div-rem-pairs pass performs the transformation where the remainder is computed from the result of the division when both are required. We should now be able to lower these directly on P9. And the pass also fixed the problem that divide is in a different block than the remainder. This is a patch to remove redundant code and make SREM/UREM legal directly on P9. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D82145	2020-07-06 11:47:31 +00:00
Guillaume Chatelet	87e2751cf0	[Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D83082	2020-07-03 08:06:43 +00:00
Kai Luo	03828e38c3	[PowerPC] Implement probing for dynamic stack allocation This patch is part of supporting `-fstack-clash-protection`. Mainly do such things compared to existing `lowerDynamicAlloc` - Added a new pseudo instruction PPC::PREPARE_PROBED_ALLOC to get actual frame pointer and final stack pointer. - Synthesize a loop to probe by blocks. - Use DYNAREAOFFSET to get MaxCallFrameSize which is calculated in prologepilog. Differential Revision: https://reviews.llvm.org/D81358	2020-07-03 05:36:40 +00:00
Nemanja Ivanovic	a701dc5510	[PowerPC] Remove undefs from splat input when changing shuffle mask As of `1fed131660`, we have code that changes shuffle masks so that we can put the shuffle in a canonical form that can be matched to a single instruction. However, it does not properly account for undef elements in the BUILD_VECTOR that is the RHS splat so we can end up with undefs where they shouldn't be. This patch converts the splat input with undefs to one without.	2020-07-02 12:26:56 -05:00
Guillaume Chatelet	8dbafd24d6	[Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82977	2020-07-02 11:28:02 +00:00
Anil Mahmud	c5b4f03b53	[PowerPC] Exploit xxspltiw and xxspltidp instructions Exploits the VSX Vector Splat Immediate Word and VSX Vector Splat Immediate Double Precision instructions: xxspltiw XT,IMM32 xxspltidp XT,IMM32 Differential Revision: https://reviews.llvm.org/D82911	2020-07-01 19:18:29 -05:00
Stefan Pintilie	b294e00fb0	[PowerPC] Fix for PC Relative call protocol The situation where the caller uses a TOC and the callee does not but is marked as clobbering the TOC (st_other=1) was not being compiled correctly if both functions were in the same object file. The call site where we had `callee` was missing a nop after the call. This is because it was assumed that since the two functions were in the same DSO they would be sharing a TOC. This is not the case if the callee uses PC Relative because in that case it may clobber the TOC. This patch makes sure that we add the cnop correctly so that the linker has a place to restore the TOC. Reviewers: sfertile, NeHuang, saghir Differential Revision: https://reviews.llvm.org/D81126	2020-07-01 07:08:41 -05:00
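
Note on `28e322ea93` (custom lowering for funnel shifts): the message says the lowering relies on PowerPC shifts being well defined in the shift-by-bitwidth case. The following is a minimal Python sketch of that idea, not the code from the patch. The helper names `slw`, `srw`, `fshl32` and `fshr32` are hypothetical models written for this note; they assume the 32-bit shift instructions read six bits of the shift amount and produce zero for amounts 32 to 63.

```python
MASK32 = 0xFFFFFFFF


def slw(value, amount):
    """Model of a 32-bit shift left: amounts 32..63 yield 0 (assumed semantics)."""
    amount &= 63
    return 0 if amount >= 32 else (value << amount) & MASK32


def srw(value, amount):
    """Model of a 32-bit logical shift right: amounts 32..63 yield 0 (assumed semantics)."""
    amount &= 63
    return 0 if amount >= 32 else (value & MASK32) >> amount


def fshl32(hi, lo, amount):
    """Funnel shift left: top 32 bits of (hi:lo) shifted left by amount mod 32."""
    amount %= 32
    # When amount == 0 the right shift is by 32, which the model defines as 0,
    # so no extra instruction is needed to guard the shift-by-bitwidth case.
    return slw(hi, amount) | srw(lo, 32 - amount)


def fshr32(hi, lo, amount):
    """Funnel shift right: low 32 bits of (hi:lo) shifted right by amount mod 32."""
    amount %= 32
    return slw(hi, 32 - amount) | srw(lo, amount)
```

With these models, `fshl32(0x12345678, 0x9ABCDEF0, 0)` returns the first operand unchanged. A generic expansion cannot assume a defined result for a shift by the full bit width and has to guard that case, which is presumably the instruction the commit message says is saved.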


1546 Commits