llvm-project

Commit Graph

Author	SHA1	Message	Date
Nemanja Ivanovic	f2588a28a8	[PowerPC] Recommit r340016 after fixing the reported issue The internal benchmark failure reported by Google was due to a missing check for the result type for the sign-extend and shift DAG. This commit adds the check and re-commits the patch. llvm-svn: 340734	2018-08-27 11:20:27 +00:00
Eric Christopher	3dc594c1e6	Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction" due to it causing a compiler crash on valid. This reverts commit r340016, testcase forthcoming. llvm-svn: 340315	2018-08-21 18:35:08 +00:00
Stefan Pintilie	39869ccf51	[PowerPC] Generate lxsd instead of the ld->mtvsrd sequence for vector loads This patch addresses: - Implementation within PPCISelLowering.cpp to check if we should use direct load into vector instructions (such as lxsd/lfd ) when the scalar_to_vector function is used; which will allow us to catch as many cases of the scalar_to_vector uses as possible to translate the ld->mtvsrd sequence into lxsd. - Test cases to exhibit the behaviour of emitting lxsd/lfd. Patch by amyk Differential revision: https://reviews.llvm.org/D49698 llvm-svn: 340037	2018-08-17 15:15:26 +00:00
Nemanja Ivanovic	39751276b0	[PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction Add a DAG combine for the PowerPC code generator to generate the Power9 extswsli extend sign and shift immediate instruction. Patch by RolandF. Differential revision: https://reviews.llvm.org/D49879 llvm-svn: 340016	2018-08-17 12:35:44 +00:00
Chandler Carruth	c73c0307fe	[MI] Change the array of `MachineMemOperand` pointers to be a generically extensible collection of extra info attached to a `MachineInstr`. The primary change here is cleaning up the APIs used for setting and manipulating the `MachineMemOperand` pointer arrays so chat we can change how they are allocated. Then we introduce an extra info object that using the trailing object pattern to attach some number of MMOs but also other extra info. The design of this is specifically so that this extra info has a fixed necessary cost (the header tracking what extra info is included) and everything else can be tail allocated. This pattern works especially well with a `BumpPtrAllocator` which we use here. I've also added the basic scaffolding for putting interesting pointers into this, namely pre- and post-instruction symbols. These aren't used anywhere yet, they're just there to ensure I've actually gotten the data structure types correct. I'll flesh out support for these in a subsequent patch (MIR dumping, parsing, the works). Finally, I've included an optimization where we store any single pointer inline in the `MachineInstr` to avoid the allocation overhead. This is expected to be the overwhelmingly most common case and so should avoid any memory usage growth due to slightly less clever / dense allocation when dealing with >1 MMO. This did require several ergonomic improvements to the `PointerSumType` to reasonably support the various usage models. This also has a side effect of freeing up 8 bits within the `MachineInstr` which could be repurposed for something else. The suggested direction here came largely from Hal Finkel. I hope it was worth it. ;] It does hopefully clear a path for subsequent extensions w/o nearly as much leg work. Lots of thanks to Reid and Justin for careful reviews and ideas about how to do all of this. Differential Revision: https://reviews.llvm.org/D50701 llvm-svn: 339940	2018-08-16 21:30:05 +00:00
Nemanja Ivanovic	5b9a4f8ee5	[PowerPC] Enhance the selection(ISD::VSELECT) of vector type To make ISD::VSELECT available(legal) so long as there are altivec instruction, otherwise it's default behavior is expanding. Use xxsel to match vselect if vsx is open, or use vsel. In order to do not write many patterns in td file, promote (for vector it's bitcast) all other type into v4i32 and only pattern match vselect of v4i32 into vsel or xxsel. Patch by wuzish Differential revision: https://reviews.llvm.org/D49531 llvm-svn: 339779	2018-08-15 15:30:36 +00:00
Nemanja Ivanovic	8b4bd09e22	[PowerPC] Don't run BV DAG Combine before legalization if it assumes legal types When trying to combine a DAG that builds a vector out of sign-extensions of vector extracts, the code assumes legal input types. Due to that, we have to disable this combine prior to legalization. In some cases, the DAG will look slightly different after legalization so account for that in the matching code. This is a fix for https://bugs.llvm.org/show_bug.cgi?id=38087 Differential Revision: https://reviews.llvm.org/D49080 llvm-svn: 339769	2018-08-15 12:58:13 +00:00
Zaara Syeda	b2595b988b	[PowerPC] Improve codegen for vector loads using scalar_to_vector This patch aims to improve the codegen for vector loads involving the scalar_to_vector (load X) sequence. Initially, ld->mv instructions were used for scalar_to_vector (load X), so this patch allows scalar_to_vector (load X) to utilize: LXSD and LXSDX for i64 and f64 LXSIWAX for i32 (sign extension to i64) LXSIWZX for i32 and f64 Committing on behalf of Amy Kwan. Differential Revision: https://reviews.llvm.org/D48950 llvm-svn: 339260	2018-08-08 15:20:43 +00:00
Nemanja Ivanovic	e1a525ed06	[PowerPC] Do not round values prior to converting to integer Adding the FP_ROUND nodes when combining FP_TO_[SU]INT of elements feeding a BUILD_VECTOR into an FP_TO_[SU]INT of the built vector loses precision. This patch removes the code that adds these nodes to true f64 operands. It also adds patterns required to ensure the code is still vectorized rather than converting individual elements and inserting into a vector. Fixes https://bugs.llvm.org/show_bug.cgi?id=38342 Differential Revision: https://reviews.llvm.org/D50121 llvm-svn: 338658	2018-08-02 00:03:22 +00:00
Craig Topper	2f60ef2c78	[DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to BuildSDIV/BuildUDIV/etc. The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation. llvm-svn: 338329	2018-07-30 23:22:00 +00:00
Craig Topper	a568a27dfa	[DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to BuildSDIVPow2. llvm-svn: 338303	2018-07-30 21:04:34 +00:00
Matt Arsenault	81920b0a25	DAG: Add calling convention argument to calling convention funcs This seems like a pretty glaring omission, and AMDGPU wants to treat kernels differently from other calling conventions. llvm-svn: 338194	2018-07-28 13:25:19 +00:00
Justin Hibbits	22e939a15b	Fix build failures from r337347, found by clang * Delete a no-longer-used override, and mark the other getRegisterTypeForCallingConv() as override. * SPE only supports i32, not i64, as the internal type, so simply remove the type check, so that DestReg and Opc are provably always set. GCC 6.4 did not warn about either of the above. llvm-svn: 337350	2018-07-18 05:19:25 +00:00
Justin Hibbits	d52990c71b	Introduce codegen for the Signal Processing Engine Summary: The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1, e500v2, and several e200 cores. This adds support targeting the e500v2, as this is more common than the e500v1, and is in SoCs still on the market. This patch is very intrusive because the SPE is binary incompatible with the traditional FPU. After discussing with others, the cleanest solution was to make both SPE and FPU features on top of a base PowerPC subset, so all FPU instructions are now wrapped with HasFPU predicates. Supported by this are: * Code generation following the SPE ABI at the LLVM IR level (calling conventions) * Single- and Double-precision math at the level supported by the APU. Still to do: * Vector operations * SPE intrinsics As this changes the Callee-saved register list order, one test, which tests the precise generated code, was updated to account for the new register order. Reviewed by: nemanjai Differential Revision: https://reviews.llvm.org/D44830 llvm-svn: 337347	2018-07-18 04:25:10 +00:00
Justin Hibbits	ceb3cd96f7	Add PowerPC e500(v2) core scheduler and directives. Differential Revision: https://reviews.llvm.org/D44828 llvm-svn: 337345	2018-07-18 04:24:49 +00:00
Stefan Pintilie	133acb22bb	[Power9] Add __float128 builtins for Rounding Operations Added __float128 support for a number of rounding operations: trunc rint nearbyint round floor ceil Differential Revision: https://reviews.llvm.org/D48415 llvm-svn: 336601	2018-07-09 20:38:40 +00:00
Stefan Pintilie	3d76326d24	[Power9] Add __float128 support for compare operations Added handling for the select f128. Differential Revision: https://reviews.llvm.org/D48294 llvm-svn: 336548	2018-07-09 13:36:14 +00:00
Stefan Pintilie	b351f09c9e	[Power9] Add __float128 library call for frem Power 9 does not have a hardware instruction for frem but we can call fmodf128. Differential Revision: https://reviews.llvm.org/D48552 llvm-svn: 336406	2018-07-06 02:47:02 +00:00
Lei Huang	5612b90694	[Power9] Add lib calls for float128 operations with no equivalent PPC instructions Map the following instructions to the proper float128 lib calls: pow[i], exp[2], log[2\|10], sin, cos, fmin, fmax Differential Revision: https://reviews.llvm.org/D48544 llvm-svn: 336361	2018-07-05 15:21:37 +00:00
Lei Huang	a855e17f09	[Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory, or in memory. This patch ensures that float128 members of non-homogenous aggregates are passed via VSX registers. This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128 to a new PPCISD node, BUILD_FP128. Differential Revision: https://reviews.llvm.org/D48308 llvm-svn: 336310	2018-07-05 06:21:37 +00:00
Lei Huang	d17c39ccaa	[Power9]Legalize and emit code for quad-precision convert from single-precision Legalize and emit code for quad-precision floating point operation conversion of single-precision value to quad-precision. Differential Revision: https://reviews.llvm.org/D47569 llvm-svn: 336307	2018-07-05 04:18:37 +00:00
Lei Huang	a26f3be454	[Power9] Implement float128 parameter passing and return values This patch enable parameter passing and return by value for float128 types. Passing aggregate/union which contain float128 members will be submitted in subsequent patches. Differential Revision: https://reviews.llvm.org/D47552 llvm-svn: 336306	2018-07-05 04:10:15 +00:00
Lei Huang	6270ab6ce4	[Power9]Legalize and emit code for round & convert quad-precision values Legalize and emit code for round & convert float128 to double precision and single precision. Differential Revision: https://reviews.llvm.org/D46997 llvm-svn: 336299	2018-07-04 21:59:16 +00:00
Strahinja Petrovic	bb2b00bb80	[PowerPC] Fix label address calculation for ppc32 This patch fixes calculating address of label on ppc32 (for -fPIC). Differential Revision: https://reviews.llvm.org/D46582 llvm-svn: 335043	2018-06-19 13:07:40 +00:00
Hiroshi Inoue	0f7f59f073	[PowerPC] fix trivial typos in comment, NFC llvm-svn: 334583	2018-06-13 08:54:13 +00:00
Amaury Sechet	8467411dad	Set ADDE/ADDC/SUBE/SUBC to expand by default Summary: They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while. Target that uses these opcodes are changed in order to ensure their behavior doesn't change. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D47422 llvm-svn: 333748	2018-06-01 13:21:33 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Lei Huang	c517e95bc6	[Power9]Legalize and emit code for truncate and convert QP to DW Legalize and emit code for: * xscvqpsdz : VSX Scalar truncate & Convert Quad-Precision to Signed Dword * xscvqpudz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Dword Differential Revision: https://reviews.llvm.org/D45553 llvm-svn: 331787	2018-05-08 18:23:31 +00:00
Lei Huang	c29229a644	[PowerPC] Unify handling for conversion of FP_TO_INT feeding a store Existing DAG combine only handles conversions for FP_TO_SINT: "{f32, f64} x { i32, i16 }" This patch simplifies the code to handle: "{ FP_TO_SINT, FP_TO_UINT } x { f64, f32 } x { i64, i32, i16, i8 }" Differential Revision: https://reviews.llvm.org/D46102 llvm-svn: 331778	2018-05-08 17:36:40 +00:00
Nemanja Ivanovic	61ffbf21cd	Commit r331416 breaks the big-endian PPC bot. On the big endian build, we actually encounter constants wider than 64-bits. Add the guard to prevent tripping the assert. llvm-svn: 331420	2018-05-03 01:04:13 +00:00
Nemanja Ivanovic	01e2e79abf	[PowerPC] Implement isMaskAndCmp0FoldingBeneficial Sinking the and closer to a compare against zero is beneficial on PPC as it allows us to emit record-form instructions. In the future, we may expand this to a larger set of operations that feed compares against zero since PPC has lots of record-form instructions. Differential revision: https://reviews.llvm.org/D46060 llvm-svn: 331416	2018-05-02 23:55:23 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Lei Huang	42ab1d3d03	[NFC] Move verificaiton check for f128 conversion into LowerINT_TO_FP() Move veriication check for legal conversions to f128 into LowerINT_TO_FP() and fix some indentations to match other sections of the code for readability. llvm-svn: 330138	2018-04-16 17:30:24 +00:00
Lei Huang	10367eb422	[Power9]Legalize and emit code for converting (Un)Signed DWord to Quad-Precision Legalize and emit code for: * xscvsdqp * xscvudqp Differential Revision: https://reviews.llvm.org/D45230 llvm-svn: 329931	2018-04-12 18:00:14 +00:00
Lei Huang	09fda63af0	[Power9]Legalize and emit code for quad-precision fma instructions Legalize and emit code for the following quad-precision fma: * xsmaddqp * xsnmaddqp * xsmsubqp * xsnmsubqp Differential Revision: https://reviews.llvm.org/D44843 llvm-svn: 329206	2018-04-04 16:43:50 +00:00
Craig Topper	2fa1436206	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
Lei Huang	be0afb0870	[Power9]Legalize and emit code for quad-precision convert from double-precision Legalize and emit code for quad-precision floating point operation xscvdpqp and add option to guard the quad precision operation support. Differential Revision: https://reviews.llvm.org/D44746 llvm-svn: 328558	2018-03-26 17:46:25 +00:00
David Blaikie	36a0f226b1	Fix layering by moving ValueTypes.h from CodeGen to IR ValueTypes.h is implemented in IR already. llvm-svn: 328397	2018-03-23 23:58:31 +00:00
David Blaikie	13e77db2df	Fix layering of MachineValueType.h by moving it from CodeGen to Support This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395	2018-03-23 23:58:25 +00:00
Craig Topper	c2dbd677bd	[PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything. If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all. I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it. Differential Revision: https://reviews.llvm.org/D44061 llvm-svn: 328017	2018-03-20 18:49:28 +00:00
Lei Huang	6d1596a98c	[PowerPC][Power9]Legalize and emit code for quad-precision add/div/mul/sub Legalize and emit code for quad-precision floating point operations: * xsaddqp * xssubqp * xsdivqp * xsmulqp Differential Revision: https://reviews.llvm.org/D44506 llvm-svn: 327878	2018-03-19 18:52:20 +00:00
Guozhi Wei	9c916584ba	[PPC] Avoid non-simple MVT in STBRX optimization PR35402 triggered this case. It bswap and stores a 48bit value, current STBRX optimization transforms it into STBRX. Unfortunately 48bit is not a simple MVT, there is no PPC instruction to support it, and it can't be automatically expanded by llvm, so caused a crash. This patch detects the non-simple MVT and returns early. Differential Revision: https://reviews.llvm.org/D44500 llvm-svn: 327651	2018-03-15 17:49:12 +00:00
Lei Huang	1f8da3ae19	[PowerPC][NFC] formatting-only fix llvm-svn: 327599	2018-03-15 03:06:44 +00:00
Craig Topper	80d3bb3b4b	[TargetLowering] Rename DAGCombinerInfo::isAfterLegalizeVectorOps to DAGCombiner::isAfterLegalizeDAG since that's what it checks. NFC The code checks Level == AfterLegalizeDAG which is the fourth and last of the possible DAG combine stages that we have. There is a Level called AfterLegalVectorOps, but that's the third DAG combine and it doesn't always run. A function called isAfterLegalVectorOps should imply it returns true in either of the DAG combines that runs after the legalize vector ops stage, but that's not what this function does. llvm-svn: 326832	2018-03-06 19:44:52 +00:00
Chih-Hung Hsieh	9f9e4681ac	[TLS] use emulated TLS if the target supports only this mode Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, missing -emulated-tls flag would generate wrong TLS code for targets that supports only this mode. Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag. Differential Revision: https://reviews.llvm.org/D42999 llvm-svn: 326341	2018-02-28 17:48:55 +00:00
Lei Huang	dfd41552f4	[PowerPC] Reduce stack frame for fastcc functions by only allocating parameter save area when needed Current implementation always allocates the parameter save area conservatively for fastcc functions. There is no reason to allocate the parameter save area if all the parameters can be passed via registers. Differential Revision: https://reviews.llvm.org/D42602 llvm-svn: 325581	2018-02-20 15:09:45 +00:00
Zaara Syeda	1f59ae311b	Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778	2018-01-30 16:17:22 +00:00
Tim Shen	7abe9887b0	[PPC] Avoid incorrect fp-i128-fp lowering. Summary: Fix an issue that's similar to what D41411 fixed: float(__int128(float_var)) shouldn't be optimized to xscvdpsxds + xscvsxdsp, as they mean (float)(int64_t)float_var. Reviewers: jtony, hfinkel, echristo Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D42400 llvm-svn: 323270	2018-01-23 22:06:57 +00:00
Zaara Syeda	c9dc7b451b	Revert [PowerPC] This reverts commit rL322721 Failing build bots. Revert the commit now. llvm-svn: 322748	2018-01-17 20:00:15 +00:00
Zaara Syeda	8e951fd2f6	[PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721	2018-01-17 18:22:55 +00:00
Nemanja Ivanovic	ebb23078e9	[PowerPC] Zero-extend the compare operand for ATOMIC_CMP_SWAP Part of the fix for https://bugs.llvm.org/show_bug.cgi?id=35812. This patch ensures that the compare operand for the atomic compare and swap is properly zero-extended to 32 bits if applicable. A follow-up commit will fix the extension for the SETCC node generated when expanding an ATOMIC_CMP_SWAP_WITH_SUCCESS. That will complete the bug fix. Differential Revision: https://reviews.llvm.org/D41856 llvm-svn: 322372	2018-01-12 14:58:41 +00:00
Craig Topper	d461aefe5f	[PowerPC] Add an ISD::TRUNCATE to the legalization for ppc_is_decremented_ctr_nonzero Summary: I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work. There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed. With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: nemanjai, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D41654 llvm-svn: 321959	2018-01-07 07:51:36 +00:00
Hiroshi Inoue	ca3cdd7f27	[PowerPC] fix a bug in TCO eligibility check If the callee and caller use different calling convensions, we cannot apply TCO if the callee requires arguments on stack; e.g. C calling convention and Fast CC use the same registers for parameter passing, but the stack offset is not necessarily same. This patch also recommit r319218 "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." by @sfertile since the problem reported in r320106 should be fixed. Differential Revision: https://reviews.llvm.org/D40893 llvm-svn: 321579	2017-12-30 08:09:04 +00:00
Tony Jiang	eba757e45c	[PowerPC] Fix parest build failure in SPEC2017. The build failure was caused by an assertion in pre-legalization DAGCombine: Combining: t6: ppcf128 = uint_to_fp t5 ... into: t20: f32 = PPCISD::FCFIDUS t19 which is clearly wrong since ppcf128 are definitely different type with f32 and we cannot change the node value type when do DAGCombine. The fix is don't handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and leave it to downstream to legalize it and expand it to small legal types. Differential Revision: https://reviews.llvm.org/D41411 llvm-svn: 321276	2017-12-21 15:42:50 +00:00
Matthias Braun	f1caa2833f	MachineFunction: Return reference from getFunction(); NFC The Function can never be nullptr so we can return a reference. llvm-svn: 320884	2017-12-15 22:22:58 +00:00
Nemanja Ivanovic	1794cdc481	Fix code causing fallthrough warnings in the PPC back end. llvm-svn: 320806	2017-12-15 11:47:48 +00:00
Matt Arsenault	7d7adf4f2e	TLI: Allow using PSV for intrinsic mem operands llvm-svn: 320756	2017-12-14 22:34:10 +00:00
Matt Arsenault	1117133687	DAG: Expose all MMO flags in getTgtMemIntrinsic Rather than adding more bits to express every MMO flag you could want, just directly use the MMO flags. Also fixes using a bunch of bool arguments to getMemIntrinsicNode. On AMDGPU, buffer and image intrinsics should always have MODereferencable set, but currently there is no way to do that directly during the initial intrinsic lowering. llvm-svn: 320746	2017-12-14 21:39:51 +00:00
Nemanja Ivanovic	50d37a1129	[PowerPC] Sign-extend negative constant stores Second part of https://reviews.llvm.org/D40348. Revision r318436 has extended all constants feeding a store to 64 bits to allow for CSE on the SDAG. However, negative constants were zero extended which made the constant being loaded appear to be a positive value larger than 16 bits. This resulted in long sequences to materialize such constants rather than simply a "load immediate". This patch just sign-extends those updated constants so that they remain 16-bit signed immediates if they started out that way. llvm-svn: 320368	2017-12-11 14:35:48 +00:00
Eric Christopher	a469acac03	Temporarily revert "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." It is causing sanitizer failures on llvm tests in a bootstrapped compiler. No bot link since it's currently down, but following up to get the bot up. This reverts commit r319218. llvm-svn: 320106	2017-12-07 22:26:19 +00:00
Sean Fertile	e200016ea9	[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions. Allow fastcc callees to be tail-called from ccc callers. Differential Revision: https://reviews.llvm.org/D40355 llvm-svn: 319218	2017-11-28 20:25:58 +00:00
David Blaikie	b3bde2ea50	Fix a bunch more layering of CodeGen headers that are in Target All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490	2017-11-17 01:07:10 +00:00
Guozhi Wei	433e8d3e04	[PPC] Change i32 constant in store instruction to i64 This patch changes all i32 constant in store instruction to i64 with truncation, to increase the chance that the referenced constant can be shared with other i64 constant. Differential Revision: https://reviews.llvm.org/D39352 llvm-svn: 318436	2017-11-16 18:27:34 +00:00
Sean Fertile	0f0837e84e	[PowerPC] Implement mayBeEmittedAsTailCall for PPC Implements TargetLowering callback 'mayBeEmittedAsTailCall' that enables CodeGenPrepare to duplicate returns when they might enable a tail-call. Differential Revision: https://reviews.llvm.org/D39777 llvm-svn: 318321	2017-11-15 18:58:27 +00:00
Sean Fertile	7b056b3048	[PowerPC] Split out the tailcall calling convention checks. NFC. Move the calling convention checks for tail-call eligibility for the 64-bit SysV ABI into a separate function. This is so that it can be shared with 'mayBeEmittedAsTailCall' in a subsequent change. llvm-svn: 318305	2017-11-15 16:53:41 +00:00
David Blaikie	3f833edc7c	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation. llvm-svn: 317647	2017-11-08 01:01:31 +00:00
Graham Yiu	5cd044e8c8	Use new vector insert half-word and byte instructions when we see insertelement on '8 x i16' and '16 x i8' types. Also extended existing lit testcase to cover these cases. Differential Revision: https://reviews.llvm.org/D34630 llvm-svn: 317613	2017-11-07 20:55:43 +00:00
Graham Yiu	52a52a6cab	Fix buildbot breakages from r317503. Add parentheses to assignment when using result as a condition. llvm-svn: 317508	2017-11-06 21:04:19 +00:00
Graham Yiu	030621bbcb	Adds code to PPC ISEL lowering to recognize byte inserts from vector_shuffles, and use P9 shift and vector insert byte instructions instead of vperm. Extends tests from vector insert half-word. Differential Revision: https://reviews.llvm.org/D34497 llvm-svn: 317503	2017-11-06 20:18:30 +00:00
Guozhi Wei	e3b8d9a312	[PPC] Use xxbrd to speed up bswap64 Power doesn't have bswap instructions, so llvm generates following code sequence for bswap64. rotldi 5, 3, 16 rotldi 4, 3, 8 rotldi 9, 3, 24 rotldi 10, 3, 32 rotldi 11, 3, 48 rotldi 12, 3, 56 rldimi 4, 5, 8, 48 rldimi 4, 9, 16, 40 rldimi 4, 10, 24, 32 rldimi 4, 11, 40, 16 rldimi 4, 12, 48, 8 rldimi 4, 3, 56, 0 But Power9 has vector bswap instructions, they can also be used to speed up scalar bswap intrinsic. With this patch, bswap64 can be translated to: mtvsrdd 34, 3, 3 xxbrd 34, 34 mfvsrld 3, 34 Differential Revision: https://reviews.llvm.org/D39510 llvm-svn: 317499	2017-11-06 19:09:38 +00:00
Graham Yiu	671526148c	Adds code to PPC ISEL lowering to recognize half-word inserts from vector_shuffles, and use P9 shift and vector insert instructions instead of vperm. Differential Revision: https://reviews.llvm.org/D34160 llvm-svn: 317111	2017-11-01 18:06:56 +00:00
Hiroshi Inoue	e3a3e3c9e9	[PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extended This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass. If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated. One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI. For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated. void int_func(int); void ii_test(int a) { if (a & 1) return int_func(a); } Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG. Differential Revision: https://reviews.llvm.org/D31319 llvm-svn: 315888	2017-10-16 04:12:57 +00:00
Matt Arsenault	f2db97d8fa	DAG: Add opcode and source type to isFPExtFree This is only currently used for mad/fma transforms. This is the only case where it should be used for AMDGPU, so add an opcode to be sure. llvm-svn: 315740	2017-10-13 19:55:45 +00:00
Hal Finkel	112a6bac72	[PowerPC] Don't use xscvdpspn on the P7 xscvdpspn was not introduced until the P8, so don't use it on the P7. Fixes a regression introduced in r288152. llvm-svn: 312612	2017-09-06 03:08:26 +00:00
Tony Jiang	61ef1c540c	[PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it more general. Commit on behalf of Graham Yiu (gyiu@ca.ibm.com) llvm-svn: 312547	2017-09-05 18:08:02 +00:00
Sean Fertile	00393cce3a	[PPC] Refine checks for emiting TOC restore nop and tail-call eligibility. For the medium and large code models we only need to check if a call crosses dso-boundaries when considering tail-call elgibility. Differential Revision: https://reviews.llvm.org/D34245 llvm-svn: 311353	2017-08-21 17:35:32 +00:00
Nemanja Ivanovic	979dcb6f09	[PowerPC] Don't crash on larger splats achieved through 1-byte splats We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch prevents crashing in that situation. Differential Revision: https://reviews.llvm.org/D35650 llvm-svn: 310358	2017-08-08 13:52:45 +00:00
Rafael Espindola	79e238afee	Delete Default and JITDefault code models IMHO it is an antipattern to have a enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911	2017-08-03 02:16:21 +00:00
Stefan Pintilie	873889ca16	[Power9] Exploit vector absolute difference instructions on Power 9 Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW) for byte, halfword and word. We should take advantage of these. Differential Revision: https://reviews.llvm.org/D34684 llvm-svn: 309876	2017-08-02 20:07:21 +00:00
Peter Collingbourne	081ffe2ff2	Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. This was a use-after-free waiting to happen. llvm-svn: 309159	2017-07-26 19:15:29 +00:00
Jonas Paulsson	024e319489	[SystemZ, LoopStrengthReduce] This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729	2017-07-21 11:59:37 +00:00
Nemanja Ivanovic	3c7e276d24	[PowerPC] Ensure displacements for DQ-Form instructions are multiples of 16 As outlined in the PR, we didn't ensure that displacements for DQ-Form instructions are multiples of 16. Since the instruction encoding encodes a quad-word displacement, a sub-16 byte displacement is meaningless and ends up being encoded incorrectly. Fixes https://bugs.llvm.org/show_bug.cgi?id=33671. Differential Revision: https://reviews.llvm.org/D35007 llvm-svn: 307934	2017-07-13 18:17:10 +00:00
Tony Jiang	acefbcf38e	[PPC CodeGen] Expand the bitreverse.i64 intrinsic. Differential Revision: https://reviews.llvm.org/D34908 Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093 llvm-svn: 307563	2017-07-10 18:11:23 +00:00
Lei Huang	168d14b143	[PowerPC] Reduce register pressure by not materializing a constant just for use as an index register for X-Form loads/stores. For this example: float test (int *arr) { return arr[2]; } We currently generate the following code: li r4, 8 lxsiwax f0, r3, r4 xscvsxdsp f1, f0 With this patch, we will now generate: addi r3, r3, 8 lxsiwax f0, 0, r3 xscvsxdsp f1, f0 Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204 Differential Revision: https://reviews.llvm.org/D35027 llvm-svn: 307553	2017-07-10 16:44:45 +00:00
Hiroshi Inoue	a86c920b1e	fix typos in comments and error messages; NFC llvm-svn: 307533	2017-07-10 12:44:25 +00:00
Lei Huang	317104183d	[PowerPC] NFC : Common up definitions of isIntS16Immediate and update parameter to int16_t llvm-svn: 307442	2017-07-07 21:12:35 +00:00
Tony Jiang	c260e0eb56	[PPC CodeGen] Expand the bitreverse.i32 intrinsic. Differential Revision: https://reviews.llvm.org/D33572 Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093 llvm-svn: 307413	2017-07-07 16:41:55 +00:00
Simon Pilgrim	6c1695c6f7	[PowerPC] Fix -Wimplicit-fallthrough warnings. NFCI. llvm-svn: 307382	2017-07-07 10:21:44 +00:00
Tony Jiang	9a91a18110	[Power9] Exploit vector integer extend instructions when indices aren't correct. This patch adds on to the exploitation added by https://reviews.llvm.org/D33510. This now catches build vector nodes where the inputs are coming from sign extended vector extract elements where the indices used by the vector extract are not correct. We can still use the new hardware instructions by adding a shuffle to move the elements to the correct indices. I introduced a new PPCISD node here because adding a vector_shuffle and changing the elements of the vector_extracts was getting undone by another DAG combine. Commit on behalf of Zaara Syeda (syzaara@ca.ibm.com) Differential Revision: https://reviews.llvm.org/D34009 llvm-svn: 307169	2017-07-05 16:00:38 +00:00
Eric Christopher	a5b50cd645	Tidy up some calls to getRegister for readability. llvm-svn: 305626	2017-06-17 02:25:49 +00:00
Lei Huang	f689f69fea	Test commit - NFC. Modified a comment to confirm commit access functionality. llvm-svn: 305402	2017-06-14 17:25:55 +00:00
Kit Barton	0b216305db	Test commit - NFC. Modified a comment to confirm commit access functionality. llvm-svn: 305309	2017-06-13 17:35:29 +00:00
NAKAMURA Takumi	3807ab24c6	PPCISelLowering.cpp: Fix warnings in r305214. [-Wdocumentation] llvm-svn: 305277	2017-06-13 07:34:32 +00:00
Tony Jiang	1a8eec141a	[PowerPC] Match vec_revb builtins to P9 instructions. Power9 has instructions that will reverse the bytes within an element for all sizes (half-word, word, double-word and quad-word). These can be used for the vec_revb builtins in altivec.h. However, we implement these to match vector shuffle nodes as that will cover both the builtins and vector shuffles that occur in the SDAG through other means. Differential Revision: https://reviews.llvm.org/D33690 llvm-svn: 305214	2017-06-12 18:24:36 +00:00
Tony Jiang	30a49d1a3d	[Power9] Added support for the modsw, moduw, modsd, modud hardware instructions. Note that if we need the result of both the divide and the modulo then we compute the modulo based on the result of the divide and not using the new hardware instruction. Commit on behalf of STEFAN PINTILIE. Differential Revision: https://reviews.llvm.org/D33940 llvm-svn: 305210	2017-06-12 17:58:42 +00:00
Sanjay Patel	d4765a38b4	[DAG] add helper to bind memop chains; NFCI This step is just intended to reduce code duplication rather than change any functionality. A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper. Differential Revision: https://reviews.llvm.org/D33649 llvm-svn: 305192	2017-06-12 14:41:48 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Zaara Syeda	3a7578c658	[PPC] Inline expansion of memcmp This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313	2017-05-31 17:12:38 +00:00
Tony Jiang	60c247de18	[PowerPC] Fix a performance bug for PPC::XXPERMDI. There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI Instruction, this patch recognizes them and does the selection to improve the PPC performance. Differential Revision: https://reviews.llvm.org/D33404 llvm-svn: 304298	2017-05-31 13:09:57 +00:00
Craig Topper	f6d4dc5b4a	[SelectionDAG] Set ISD::FPOWI to Expand by default Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie". This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default. Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits Differential Revision: https://reviews.llvm.org/D33530 llvm-svn: 304215	2017-05-30 15:27:55 +00:00

1 2 3 4 5 ...

1336 Commits