llvm-project

Commit Graph

Author	SHA1	Message	Date
Qiu Chaofan	f7294ac809	[PowerPC] Remove extra swap for extract+vperm on LE This is a simple fix on LE. On BE, vector shuffles are categorized into different ops. We may need more work to eliminate these in tablegen/pre-isel. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D101605	2021-05-07 13:48:08 +08:00
Victor Huang	bb113b9845	[AIX][TLS] Add support for TLSGD relocations to XCOFF objects - Add branch absolute relocation R_RBA, R_TLS relocation for the variable offset for the tlsgd model and R_TLSM for the region handle for the tlsgd model - Properly set the relocation fixed values for R_TLS and R_TLSM - Emit the TCEntry with the variant kind in the XCOFFStreamer Reviewed by: sfertile, nemanjai, DiggerLin Differential Revision: https://reviews.llvm.org/D100214	2021-05-06 09:01:47 -05:00
Ahsan Saghir	670736a904	[PowerPC] Prevent argument promotion of types with size greater than 128 bits This patch prevents argument promotion of types having type size greater than 128 bits. Fixes Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=49952 Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D101188	2021-05-04 12:09:25 -05:00
Amy Kwan	1998a08655	[PowerPC][NFC] Update atomic patterns to use the refactored load/store implementation This patch updates the scalar atomic patterns to use the refactored load/store implementation introduced in D93370. All existing test cases pass when the refactored patterns are utilized. Differential Revision: https://reviews.llvm.org/D94498	2021-05-04 10:46:45 -05:00
Zarko Todorovski	d98e5e02ad	[AIX] Remove unused vector registers from allocation order in the default AltiVec ABI The previous implementation of the default AltiVec ABI marked registers V20-V31 as reserved. This failed to prevent reserved VFRC registers being allocated. In this patch instead of marking the registers reserved we remove unallowed registers from the allocation order completely. This is a slight rework of an implementation by @nemanjai Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D100050	2021-05-03 13:50:51 -04:00
Daniil Fukalov	3489c2d7b1	[TTI] NFC: Change getTypeLegalizationCost to return InstructionCost. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: sdesmalen, kparzysz Differential Revision: https://reviews.llvm.org/D101533	2021-04-30 22:51:51 +03:00
Amy Kwan	64d951be61	[PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns. This patch introduces a new infrastructure that is used to select the load and store instructions in the PPC backend. The primary motivation is that the current implementation of selecting load/stores is dependent on the ordering of patterns in TableGen. Given this limitation, we are not able to easily and reliably generate the P10 prefixed load and store instructions (such as when the immediates fit within 34 bits). This refactoring is meant to provide us with more control over the patterns/different forms to exploit, as well as eliminating the dependency on pattern declaration order in TableGen. The idea of this refactoring is that it introduces a set of addressing modes that correspond to different instruction formats of a particular load and store instruction, along with a set of common flags that describe a load/store. Whenever a load/store instruction is being selected, we analyze the instruction and compute a set of flags for it. The computed flags are then used to select the most suitable load/store addressing mode. This patch is the first of a series of patches to be committed - it contains the initial implementation of the refactored load/store selection infrastructure and also updates P8/P9 patterns to adopt this infrastructure. The idea is that incremental patches will add more implementation and support, and eventually the old implementation will be removed. Differential Revision: https://reviews.llvm.org/D93370	2021-04-30 09:53:19 -05:00
Sidharth Baveja	70c433a184	[XCOFF][AIX] Add Global Variables Directly to TOC for 32 bit AIX Summary: This patch implements the backend implementation of adding global variables directly to the table of contents (TOC), rather than adding the address of the variable to the TOC. Currently, this patch will look for the "toc-data" attribute on symbols in the IR, and then add those symbols to the TOC. ATM, this is implemented for 32 bit AIX. Reviewers: sfertile Differential Revision: https://reviews.llvm.org/D101178	2021-04-30 14:48:02 +00:00
Victor Huang	ae3377c553	[AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects - Add new variantKinds for the symbol's variable offset and region handle - Print the proper relocation specifier @gd in the asm streamer when emitting the TC Entry for the variable offset for the symbol - Fix the switch section failure between the TC Entry of variable offset and region handle - Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property Reviewed by: sfertile Differential Revision: https://reviews.llvm.org/D100956	2021-04-29 13:18:59 -05:00
Qiu Chaofan	56d923efdb	[SPE] Support constrained float operations on SPE This patch enables support on SPE for constrained arithmetic and comparison operations. This fixes bugzilla 50070. One thing not covered is fcmp vs. fcmps on SPE. Some condition codes generate signaling comparisons while others do not. In this patch, all are considered signaling. So there might still be some issues when compiling from C code. Reviewed By: jhibbits Differential Revision: https://reviews.llvm.org/D101282	2021-04-29 16:34:10 +08:00
Qiu Chaofan	d5c2492455	[PowerPC] Fix SELECT_CC with i64 operand on PPC32 This patch fixes the infinite loop in legalization of PPC32 SELECT_CC with 64-bit operand.	2021-04-28 17:48:33 +08:00
Victor Huang	241c2da406	[AIX][Power10] Restrict prefixed instructions from crossing the 64byte boundary This patch adds the support to restrict prefixed instruction from crossing the 64 byte boundary: - Add the infrastructure to register a custom XCOFF streamer - Add a custom XCOFF streamer for PowerPC to allow us to intercept instructions as they are being emitted and align all 8 byte instructions to a 64 byte boundary if required by adding a 4 byte nop. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D101107	2021-04-27 11:55:18 -05:00
Zarko Todorovski	f818ec9dd1	[AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches In https://reviews.llvm.org/D92789 PPC64 checks were added that disallowed most VSX pattern matching. We enable some safe ones for 32bit in this patch. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D97503	2021-04-27 07:27:43 -04:00
Nemanja Ivanovic	6725b90a02	[PowerPC] Add vec_ctsl and vec_ctul to altivec.h These are added for compatibility with XLC. They are similar to vec_cts and vec_ctu except that the result is a doubleword vector regardless of the parameter type.	2021-04-23 11:03:38 -05:00
Sander de Smalen	f9a50f04ba	[TTI] NFC: Change getIntImmCost[Inst|Intrin] to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Differential Revision: https://reviews.llvm.org/D100565	2021-04-23 16:06:36 +01:00
Jay Foad	82d34fe2b3	Fix typo "beneficiates" in comments	2021-04-22 12:30:16 +01:00
Nemanja Ivanovic	092619cf6b	[PowerPC] Improve codegen for vector fp to int widening conversions We currently do not utilize instructions that convert single precision vectors to doubleword integer vectors. These conversions come up in code occasionally and this improvement allows us to open code some functions that need to be added to altivec.h.	2021-04-22 05:04:06 -05:00
Nemanja Ivanovic	03e7fefff8	[PowerPC] Canonicalize shuffles on big endian targets as well Extend shuffle canonicalization and conversion of shuffles fed by vectorized scalars to big endian subtargets. For big endian subtargets, loads and direct moves of scalars into vector registers put the data in the correct element for SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is narrower, the value still ends up in the wrong place - although a different wrong place than on little endian targets. This patch extends the combine that keeps values where they are if they feed a shuffle to big endian targets. Differential revision: https://reviews.llvm.org/D100478	2021-04-20 07:29:47 -05:00
Qiu Chaofan	2432d80d3b	[PowerPC] Use mtvsrdd to put callee-saved GPR into VSR This patch exploits mtvsrdd instruction (available in ISA3.0+) to save two callee-saved GPR registers into a single VSR, making it more efficient. Reviewed By: jsji, nemanjai Differential Revision: https://reviews.llvm.org/D62565	2021-04-20 16:43:24 +08:00
Qiu Chaofan	b820339752	[PowerPC] Support f128 under VSX This patch is the last one in backend to support fp128 type in pre-POWER9 subtargets with VSX, removing temporary option and updating remaining tests. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92374	2021-04-20 15:49:52 +08:00
Jinsong Ji	d88d8c5b86	[PowerPC] Disable relative lookup table converter pass for AIX XCOFF hasn't implemented lowerRelativeReference. So we need to disable new pass introduced by https://reviews.llvm.org/D94355 for AIX for now. Reviewed By: gulfem Differential Revision: https://reviews.llvm.org/D100584	2021-04-19 19:28:11 +00:00
Nick Desaulniers	c440b97d89	[TargetLowering] move "o" and "X" constraint handling to base class These constraints are machine agnostic; there's no reason to handle these per-arch. If arches don't support these constraints, then they will fail elsewhere during instruction selection. We don't need virtual calls to look these up; TargetLowering::getInlineAsmMemConstraint should only be overridden by architectures with additional unique memory constraints. Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D100416	2021-04-19 10:53:31 -07:00
Serge Guelton	d6de1e1a71	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). Throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalizes the check, with the following behavior: no attributes or attribute set to "false" => return false attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
Nemanja Ivanovic	ff769dd111	[PowerPC] Minor improvement for insert_vector_elt codegen For v2f64, all VSX subtargets can insert an element with a single XXPERMDI.	2021-04-16 18:52:37 -05:00
Stefan Pintilie	f28cb01be0	[PowerPC] Add ROP Protection Instructions for PowerPC There are four new PowerPC instructions that are introduced in Power 10. They are hashst, hashchk, hashstp, hashchkp. These instructions will be used for ROP Protection. This patch adds the four instructions. Reviewed By: nemanjai, amyk, #powerpc Differential Revision: https://reviews.llvm.org/D99375	2021-04-15 11:38:38 -05:00
Sander de Smalen	4f42d873c2	[TTI] NFC: Change getArithmeticInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100317	2021-04-14 17:20:36 +01:00
Sander de Smalen	1af35e77f4	[TTI] NFC: Change getVectorInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100315	2021-04-14 17:20:35 +01:00
Sander de Smalen	174e8f6c5e	[TTI] NFC: Change getShuffleCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100314	2021-04-14 17:20:35 +01:00
Sander de Smalen	14b934f8a6	[TTI] NFC: Change getCFInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D100313	2021-04-14 17:20:34 +01:00
Zarko Todorovski	6b7838b68c	[AIX] Allow safe for 32bit P8 VSX pattern matching Pull some of the safe for 32bit pattern matching for Pwr8 and above. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D97909	2021-04-14 08:12:48 -04:00
Nemanja Ivanovic	8be3181df6	[PowerPC] Fix incorrect subreg typo from `0148bf53f0`	2021-04-14 05:01:12 -05:00
Nemanja Ivanovic	0148bf53f0	[PowerPC] Use correct node to get a super register from a subreg The VSX tablegen file has some rather egregious uses of COPY_TO_REGCLASS even in situations where it needs to use SUBREG_TO_REG. While this produces correct code, it often doesn't allow the register coalescer to coalesce copies and the resulting code ends up being suboptimal. This patch just changes over patterns that should use SUBREG_TO_REG.	2021-04-13 19:52:21 -05:00
Sander de Smalen	03f47bdcb1	[TTI] NFC: Change get[Interleaved]MemoryOpCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100205	2021-04-13 14:21:02 +01:00
Sander de Smalen	db134e2428	[TTI] NFC: Change getCmpSelInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100203	2021-04-13 14:21:01 +01:00
Sander de Smalen	92d8421f49	[TTI] NFC: Change getCastInstrCost and getExtractWithExtendCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100199	2021-04-13 14:20:58 +01:00
Chen Zheng	80aa9b0f7b	[PowerPC] stop reverse mem op generation for some cases. We should consider the feeder user number when we do reverse memory operation transformation. Otherwise, we may get negative impact. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D100166	2021-04-12 22:41:28 -04:00
Qiu Chaofan	ece7345859	[PowerPC] Lower f128 SETCC/SELECT_CC as libcall if p9vector disabled XSCMPUQP is not available for pre-P9 subtargets. This patch will lower them into libcall for correct behavior on power7/power8. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92083	2021-04-12 10:33:32 +08:00
dfukalov	8f4b7e94a2	[AMDGPU][CostModel] Refine cost model for control-flow instructions. Added cost estimation for switch instruction, updated costs of branches, fixed phi cost. Had to increase `-amdgpu-unroll-threshold-if` default value since conditional branch cost (size) was corrected to higher value. Test renamed to "control-flow.ll". Removed redundant code in `X86TTIImpl::getCFInstrCost()` and `PPCTTIImpl::getCFInstrCost()`. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D96805	2021-04-10 09:20:24 +03:00
Mitch Phillips	1a2756b777	Revert "[PowerPC] Add ROP Protection Instructions for PowerPC" This reverts commit `16fe741c69`. Reason: Broke the UBSan buildbots. More information available in the phabricator review: https://reviews.llvm.org/D99375	2021-04-09 13:36:41 -07:00
Stefan Pintilie	5bca7cdafb	Add correct types to the xxsplti32dx pattern. Register types for xxsplti32dx in two td file patterns were incorrect. Fixed the two types and added a test case that was reduced from a larger failing test. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D100223	2021-04-09 14:11:34 -05:00
Stefan Pintilie	16fe741c69	[PowerPC] Add ROP Protection Instructions for PowerPC There are four new PowerPC instructions that are introduced in Power 10. They are hashst, hashchk, hashstp, hashchkp. These instructions will be used for ROP Protection. This patch adds the four instructions. Reviewed By: nemanjai, amyk, #powerpc Differential Revision: https://reviews.llvm.org/D99375	2021-04-09 12:09:01 -05:00
Chen Zheng	74e77295e7	[PowerPC] fixup killed flags for ri + addi to ri transformation Fixup killed flags if DefMI and MI are not in the same basic blocks. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D100023	2021-04-07 22:04:08 -04:00
Qiu Chaofan	033c9c2552	[PowerPC] Fix use check of swap-reduction This will fix swap-reduction in DAGISel for cases where COPY_TO_REGCLASS has multiple uses.	2021-04-07 15:55:52 +08:00
Amy Kwan	bd6033eca7	[PowerPC] Materialize 34-bit constants with pli directly Previously, 34-bit constants were materialized in selectI64Imm(), and we relied on td pattern matching to instead produce a pli. This becomes problematic as there is no guarantee that the 34-bit constant will reach the td pattern selection for pli. It is also possible for other transformations (such as complex bit permutations) to also produce and utilize the 34-bit constant materialized through selectI64Imm(). This patch instead produces pli on Power10 directly whenever the constant fits within 34-bits. Differential Revision: https://reviews.llvm.org/D99906	2021-04-06 13:38:11 -05:00
Nikita Popov	665065821e	[FastISel] Remove kill tracking This is a followup to D98145: As far as I know, tracking of kill flags in FastISel is just a compile-time optimization. However, I'm not actually seeing any compile-time regression when removing the tracking. This probably used to be more important in the past, before FastRA was switched to allocate instructions in reverse order, which means that it discovers kills as a matter of course. As such, the kill tracking doesn't really seem to serve a purpose anymore, and just adds additional complexity and potential for errors. This patch removes it entirely. The primary changes are dropping the hasTrivialKill() method and removing the kill arguments from the emitFast methods. The rest is mechanical fixup. Differential Revision: https://reviews.llvm.org/D98294	2021-04-03 15:50:13 +02:00
Shimin Cui	00c0c8c87d	[PowerPC] [MLICM] Enable hoisting of caller preserved registers on AIX On ppc64 Linux, MachineLICM will hoist caller preserved registers, including TOC loads of the global variable address, out of loops. This patch enables the same hoisting on AIX for both ppc64 and ppc32. Differential Revision: https://reviews.llvm.org/D99076	2021-03-31 12:46:25 -04:00
Sander de Smalen	2f6f249a49	NFC: Change getIntrinsicInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Depends on D97468 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D97469	2021-03-31 14:04:41 +01:00
Sander de Smalen	3ccbd4f3c7	NFC: Change getUserCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Depends on D97382 Reviewed By: ctetreau, paulwalker-arm Differential Revision: https://reviews.llvm.org/D97466	2021-03-31 10:13:09 +01:00
Tomas Matheson	a9968c0a33	[NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions Currently needsStackRealignment returns false if canRealignStack returns false. This means that the behavior of needsStackRealignment does not correspond to its name and description; a function might need stack realignment, but if it is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names: - shouldRealignStack - true if there is any reason the stack should be realigned - canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer) - hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable) Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier in a future change to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87	2021-03-30 17:31:39 +01:00
Albion Fung	e29bb074c6	[PowerPC] Exploit xxsplti32dx (constant materialization) for scalars This patch exploits the xxsplti32dx instruction available on Power10 in place of constant pool loads where xxspltidp would not be able to, usually because the immediate cannot fit into 32 bits. Differential Revision: https://reviews.llvm.org/D95458	2021-03-24 15:59:59 -04:00
Sander de Smalen	55d18b3cc2	[TTI] Return a TypeSize from getRegisterBitWidth. This patch changes the interface to take a RegisterKind, to indicate whether the register bitwidth of a scalar register, fixed-width vector register, or scalable vector register must be returned. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D98874	2021-03-24 14:45:13 +00:00
Stefan Pintilie	91f4c11133	[PowerPC] Add mprivileged option Add an option to tell the compiler that it can use privileged instructions. This patch only adds the option. Backend implementation will be added in a future patch. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D99193	2021-03-24 08:33:22 -05:00
Stefan Pintilie	0e4f5f3ea6	[PowerPC] Change option to mrop-protect In order to have the same option on PowerPC LLVM and PowerPC GCC the option will be changed from -mrop-protection to -mrop-protect. The feature will be off by default and turned on when the option is used. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D99185	2021-03-24 05:51:35 -05:00
Stefan Pintilie	b8f3c6d011	[PowerPC][NFC] Do not enter prefix selection if it cannot do better. Do not try to materialize a constant using prefix instructions if the selection using non prefix instructions was able to do it using a single non prefix instruction. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D98791	2021-03-22 09:17:52 -05:00
Qiu Chaofan	52f33f7953	[PowerPC] Enable redundant TOC save removal on AIX Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D97039	2021-03-22 14:29:22 +08:00
Nemanja Ivanovic	ea48bf8649	[PowerPC][NFC] Do not produce i64 constants in 32-bit mode There are some instances where we produce constants of type MVT::i64 unconditionally in the target DAG combines. This is not actually valid in 32-bit mode.	2021-03-19 22:54:47 -05:00
Anshil Gandhi	697f90ebfa	[NFC] [PowerPC] Determine Endianness in PPCTargetMachine The TargetMachine uses the triple to determine endianness. Just use that logic rather than replicating it in PPCSubtarget. Differential revision: https://reviews.llvm.org/D98674	2021-03-19 20:22:16 -05:00
Nemanja Ivanovic	a8697c57fa	[PowerPC] Fix the check for 16-bit signed field in peephole When a D-Form instruction is fed by an add-immediate, we attempt to merge the two immediates to form a single displacement so we can remove the add-immediate. However, we don't check whether the new displacement fits into a 16-bit signed immediate field early enough. Namely, we do a sign-extend from 16 bits first which will discard high bits and then we check whether the result is a 16-bit signed immediate. It of course will always be. Move the check prior to the sign extend to ensure we are checking the correct value. Fixes https://bugs.llvm.org/show_bug.cgi?id=49640	2021-03-19 07:15:53 -05:00
David Green	e2935dcfc4	[TTI] Add a Mask to getShuffleCost This adds an Mask ArrayRef to getShuffleCost, so that if an exact mask can be provided a more accurate cost can be provided by the backend. For example VREV costs could be returned by the ARM backend. This should be an NFC until then, laying the groundwork for that to be added. Differential Revision: https://reviews.llvm.org/D98206	2021-03-17 17:46:26 +00:00
Fangrui Song	5d44c92bf8	Change void getNoop(MCInst &NopInst) to MCInst getNop() Prefer (self-documenting) return values to output parameters (which are liable to be used). While here, rename Noop to Nop which is more widely used and improves consistency with hasEmitNops/setEmitNops/emitNop/etc.	2021-03-15 12:05:34 -07:00
Simon Pilgrim	f6524b4ada	[PPC] Fix UBSAN warning about out of range shift. NFCI.	2021-03-12 12:03:48 +00:00
Simon Pilgrim	400952980f	[PPC] Fix static analyzer / UBSAN warnings about out of range shifts. NFCI.	2021-03-12 10:34:35 +00:00
Stefan Pintilie	e021de0aab	[PowerPC] Exploit paddi instruction on Power 10 for constant materialization Starting with Power 10 the instruction paddi is available to use. The instruction allows for immediates that are 34 bits. This patch adds exploitation of the paddi instruction to allow us to materialize constants. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D93300	2021-03-11 08:37:49 -06:00
Qiu Chaofan	72c4cbd60e	[PowerPC] Fix multi-use case for swap reduction `4c973ae` implemented reduction of vector swap for lane-insensitive operations. This commit fixes it for checking number of uses of the vector operation.	2021-03-11 21:58:33 +08:00
Nikita Popov	2489cbaa80	[PowerPC] Fix infinite loop in peephole CR optimization (PR49509) If we encounter a degenerate select node where both operands are the same, then we can continue negating the condition while swapping operands, resulting in an infinite loop. Avoid this by bailing out if both operands are the same. Fixes https://bugs.llvm.org/show_bug.cgi?id=49509. Differential Revision: https://reviews.llvm.org/D98340	2021-03-11 14:25:22 +01:00
Amy Kwan	8b540c542c	[PowerPC] Implement patterns for PC-Rel zextload/extload byte loads This patch adds patterns to select the PC-Relative extloadi1 and zextloadi1 byte loads. Differential Revision: https://reviews.llvm.org/D98042	2021-03-10 12:18:13 -06:00
Qiu Chaofan	4c973ae51b	[PowerPC] Reduce symmetrical swaps for lane-insensitive vector ops This patch simplifies pattern (xxswap (vec-op (xxswap a) (xxswap b))) into (vec-op a b) if vec-op is lane-insensitive. The motivating case is ScalarToVector-VecOp-ExtractElement sequence on LE, but the peephole itself is not related to endianness, so BE may also benefit from this. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D97658	2021-03-10 15:21:32 +08:00
Albion Fung	9b6ac9e999	[P10] [Power PC] Exploiting new load rightmost vector element instructions. This pull request implements patterns to exploit the load rightmost vector element instructions for loading element 0 on little endian PowerPC subtargets into v8i16 and v16i8 vector registers for i16 and i8 data types. Differential Revision: https://reviews.llvm.org/D94816#inline-921403	2021-03-09 16:08:17 -05:00
Lei Huang	535a4192a9	[AIX][TLS] Generate 64-bit general-dynamic access code sequence Add support for the TLS general dynamic access model to assembly files on AIX 64-bit. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D98078	2021-03-08 16:41:25 -06:00
Masoud Ataei	820f508b08	[PowerPC] Removing _massv place holder Since P8 is the oldest machine supported by MASSV pass, _massv place holder is removed and the oldest version of MASSV functions is assumed. If the P9 vector specific is detected in the compilation process, the P8 prefix will be updated to P9. Differential Revision: https://reviews.llvm.org/D98064	2021-03-08 21:43:24 +00:00
Nemanja Ivanovic	b0f0115308	[AIX][TLS] Generate 32-bit general-dynamic access code sequence Adds support for the TLS general dynamic access model to assembly files on AIX 32-bit. To generate the correct code sequence when accessing a TLS variable `v`, we first create two TOC entry nodes, one for the variable offset, one for the region handle. These nodes are followed by a `PPCISD::TLSGD_AIX` node (new node introduced by this patch). The `PPCISD::TLSGD_AIX` node (`TLSGDAIX` pseudo instruction) is expanded to 2 copies (to put the variable offset and region handle in the right registers) and a call to `__tls_get_addr`. This patch also changes the way TC entries are generated in asm files. If the generated TC entry is for the region handle of a TLS variable, we add the `@m` relocation and the `.` prefix to the entry name. For example: ``` L..C0: .tc .v[TC],v[TL]@m -> region handle L..C1: .tc v[TC],v[TL] -> variable offset ``` Reviewed By: nemanjai, sfertile Differential Revision: https://reviews.llvm.org/D97948	2021-03-08 09:30:19 -06:00
Ahsan Saghir	acce401068	[PowerPC] Change target data layout for 16-byte stack alignment This changes the target data layout to make stack align to 16 bytes on Power10. Before this change, stack was being aligned to 32 bytes. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D96265	2021-03-08 08:13:08 -06:00
Sean Fertile	f0904a6208	[PowerPC][AIX] Handle variadic vector call operands. Patch adds support for passing vector call operands to variadic functions. Arguments which are fixed shadow GPRs and stack space even when they are passed in vector registers, while arguments passed through ellipses are passed in properly aligned GPRs if available and on the stack once all GPR argument registers are consumed. Differential Revision: https://reviews.llvm.org/D97956	2021-03-06 13:49:55 -05:00
Fangrui Song	3110187f1f	[MC][PowerPC] Support .reloc , BFD_RELOC_{NONE,16,32,64}, BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.	2021-03-05 21:31:45 -08:00
Zarko Todorovski	2b50ce1524	[PowerPC][AIX] Enable the default AltiVec ABI on AIX This patch adds support for the default AltiVec ABI for AIX. Vector registers 20 through 31 are marked as reserved and cannot be used in the default ABI. This patch adds handling for this case and also removes the default AltiVec ABI errors. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D96351	2021-03-05 12:46:27 -05:00
Jinsong Ji	cc21de6789	[PowerPC] Update Copy/Paste encodings according to ISA3.1 Copy-paste P9 insns were added back in 2016; however, it looks like the opcodes have changed in ISA3.1. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D97416	2021-03-05 17:05:50 +00:00
Chen Zheng	87bbf3d1f8	[XCOFF][DebugInfo] support DWARF for XCOFF for assembly output. Reviewed By: jasonliu Differential Revision: https://reviews.llvm.org/D95518	2021-03-04 21:07:52 -05:00
Jinsong Ji	7967221a72	[PowerPC] Disable more extended mne on AIX To avoid assembler errors. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D97418	2021-03-04 21:13:37 +00:00
Benjamin Kramer	e897feeb8a	[PPC] Silence unused variable warning in release builds. NFC.	2021-03-04 21:43:19 +01:00
Sean Fertile	aaeffbe007	[PowerPC][AIX] Handle variadic vector formal arguments. Patch adds support for passing vector arguments to variadic functions. Arguments which are fixed shadow GPRs and stack space even when they are passed in vector registers, while arguments passed through ellipses are passed in properly aligned GPRs if available and on the stack once all GPR argument registers are consumed. Differential Revision: https://reviews.llvm.org/D97485	2021-03-04 10:56:53 -05:00
Qiu Chaofan	72d4a41ba6	[PowerPC] Allow spilling GPR to VSR on AIX This patch enables spilling GPR to VSRs instead of stack under AIX ABI. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D97367	2021-03-03 13:32:39 +08:00
Victor Huang	1756b2adc9	[AIX][TLS] Generate TLS variables in assembly files This patch allows generating TLS variables in assembly files on AIX. Initialized and external uninitialized variables are generated with the .csect pseudo-op and local uninitialized variables are generated with the .comm/.lcomm pseudo-ops. The patch also adds a check to explicitly say that TLS is not yet supported on AIX. Reviewed by: daltenty, jasonliu, lei, nemanjai, sfertile Originally patched by: bsaleil Commandeered by: NeHuang Differential Revision: https://reviews.llvm.org/D96184	2021-03-02 18:22:48 -06:00
Amara Emerson	8a316045ed	[AArch64][GlobalISel] Enable use of the optsize predicate in the selector. To do this while supporting the existing functionality in SelectionDAG of using PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis dependencies to the instruction selector pass. Then, use the predicate to generate constant pool loads for f32 materialization, if we're targeting optsize/minsize. Differential Revision: https://reviews.llvm.org/D97732	2021-03-02 12:55:51 -08:00
Kazu Hirata	4ed47858ab	[llvm] Use llvm::drop_begin (NFC)	2021-02-22 20:17:16 -08:00
Jacques Pienaar	3bec7ed59e	Different fix for gcc bug Was still running into from definition of 'template<class T> struct llvm::DenseMapInfo' [-fpermissive] template <typename T> struct DenseMapInfo; ^	2021-02-19 16:41:00 -08:00
Leonard Chan	c77659e549	[llvm][IR] Do not place constants with static relocations in a mergeable section This patch provides two major changes: 1. Add getRelocationInfo to check if a constant will have static, dynamic, or no relocations. (Also rename the original needsRelocation to needsDynamicRelocation.) 2. Only allow a constant with no relocations (static or dynamic) to be placed in a mergeable section. This will allow unused symbols that contain static relocations and happen to fit in mergeable constant sections (.rodata.cstN) to instead be placed in unique-named sections if -fdata-sections is used and subsequently garbage collected by --gc-sections. See https://lists.llvm.org/pipermail/llvm-dev/2021-February/148281.html. Differential Revision: https://reviews.llvm.org/D95960	2021-02-18 15:39:00 -08:00
Sean Fertile	bb260b1ca7	[PowerPC][AIX] Add support for vector arg passing on the stack. Enable passing more vector arguments than available vector argument passing registers. Differential Revision: https://reviews.llvm.org/D96415	2021-02-18 13:32:40 -05:00
Baptiste Saleil	34dc1ccb96	[PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10 This patch generates the vinsw, vinsd, vinsblx, vinshlx, vinswlx, vinsdlx, vinsbrx, vinshrx, vinswrx and vinsdrx instructions for vector insertion on P10. Differential Revision: https://reviews.llvm.org/D94454	2021-02-18 14:17:47 +00:00
Stefan Pintilie	b80357d46e	[PowerPC] Add option for ROP Protection Added -mrop-protection for Power PC to turn on codegen that provides some protection from ROP attacks. The option is off by default and can be turned on for Power 8, Power 9 and Power 10. This patch is for the option only. The feature will be implemented by a later patch. Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D96512	2021-02-18 12:15:50 +00:00
Chen Zheng	5517923b1c	[XCOFF][NFC] make csect properties optional for getXCOFFSection We are going to support debug sections for XCOFF. So the csect properties are not necessary. This patch makes these properties optional. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D95931	2021-02-17 20:51:42 -05:00
Sidharth Baveja	cb2876800c	[PowerPC][AIX] Enable Shrinkwrapping on 32 and 64 bit AIX. Summary: Currently Shrinkwrap is not enabled on AIX. This patch enables shrink wrap on 32 and 64 bit AIX, and 64 bit ELF. Reviewed By: sfertile, nemanjai Differential Revision: https://reviews.llvm.org/D95094	2021-02-17 14:54:57 +00:00
Sean Fertile	4e127bce2d	[PowerPC] Handle FP physical register in inline asm constraint. Do not defer to the base class when the register constraint is a physical fpr. The base class will select SPILLTOVSRRC as the register class and register allocation will fail on subtargets without VSX registers. Differential Revision: https://reviews.llvm.org/D91629	2021-02-17 09:27:03 -05:00
Douglas Yung	0e3d7e6186	Fix gcc build after `de3a485d9` due to a gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92598 This should fix gcc based builders such as http://lab.llvm.org:8011/#/builders/76/builds/1683	2021-02-16 21:57:12 -08:00
Victor Huang	de3a485d9c	[NFC][PPC] Refactor TOC representation to allow several entries for the same symbol We currently represent TOC entries by an MCSymbol. This is not enough in some situations. For example, when accessing an initialized TLS variable v on AIX using the general dynamic model, we need to generate the two following entries for v: .tc .v[TC],v@m .tc v[TC],v One is for the region handle (with the @m relocation), the other is for the variable offset. This refactoring allows storing several entries for the same symbol with different VariantKind in the TOC. If the VariantKind is not specified, we default to VK_None. The AIX TLS implementation using this refactoring to generate the two entries will be posted in a subsequent patch. Patched By: bsaleil Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D96346	2021-02-16 21:32:16 +00:00
Nemanja Ivanovic	a5222aa085	[DAGCombine] Do not remove masking argument to FP16_TO_FP for some targets As of commit `284f2bffc9`, the DAG Combiner gets rid of the masking of the input to this node if the mask only keeps the bottom 16 bits. This is because the underlying library function does not use the high order bits. However, on PowerPC's ELFv2 ABI, it is the caller that is responsible for clearing the bits from the register. Therefore, the library implementation of __gnu_h2f_ieee will return an incorrect result if the bits aren't cleared. This combine is desired for ARM (and possibly other targets) so this patch adds a query to Target Lowering to check if this zeroing needs to be kept. Fixes: https://bugs.llvm.org/show_bug.cgi?id=49092 Differential revision: https://reviews.llvm.org/D96283	2021-02-09 06:33:48 -06:00
Simon Pilgrim	518af8df44	[PowerPC] Fix multiclass template parameter types. NFC. Fixes TableGen parser errors reported by D95874.	2021-02-06 15:39:26 +00:00
Craig Topper	11ef356d9e	[TargetLowering] Use Align in allowsMisalignedMemoryAccesses. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D96097	2021-02-04 19:22:06 -08:00
Chen Zheng	b0869a7d72	[PowerPC] [NFC] fix wording typos Post commit comments address for D92071.	2021-02-02 21:03:17 -05:00
Stefan Pintilie	288f762b6f	[PowerPC] Materialize 34 bit constants with pli on Power 10. NOTE: This patch was originally written by Anil Mahmud. His code has been rebased but otherwise left mostly unchanged. A new instruction on Power 10 allows for the materialization of 34 bit immediate values. This patch allows the compiler to take advantage of the new instruction in this situation. Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D92879	2021-02-02 09:49:22 -06:00
Craig Topper	94206f1f90	[PowerPC] Remove vnot_ppc and replace with the standard vnot. immAllOnesV has special support for looking through bitcasts automatically so isel patterns don't need to explicitly look for the bitconvert.	2021-01-31 19:41:33 -08:00
Kazu Hirata	627b5bda11	[llvm] Add missing header guards (NFC) Identified with llvm-header-guard.	2021-01-30 09:53:42 -08:00
Kazu Hirata	8ed1636184	[llvm] Use isa instead of dyn_cast (NFC)	2021-01-29 23:23:37 -08:00
Kazu Hirata	7925aa091d	[llvm] Populate SmallVector at construction time (NFC)	2021-01-28 22:21:14 -08:00
Albion Fung	2e470e03b4	[PowerPC][Power10] Fix XXSPLI32DX not correctly exploiting specific cases Some cases may be transformed into 32 bit splats before hitting the boolean statement, which may cause incorrect behaviour and provide XXSPLTI32DX with the incorrect values of splat. The condition was reversed so that the shortcut prevents this problem. Differential Revision: https://reviews.llvm.org/D95634	2021-01-28 15:17:32 -05:00
Nemanja Ivanovic	54e570d94a	[PowerPC] Do not emit XXSPLTI32DX for sub 64-bit constants If the APInt returned by BuildVectorSDNode::isConstantSplat() is narrower than 64 bits, the result produced by XXSPLTI32DX is incorrect. The result returned by the function appears to be incorrect and we'll investigate/fix it in a follow-up commit. However, since this causes miscompiles, we must temporarily disable emitting this instruction for such values.	2021-01-28 04:16:48 -06:00
Nemanja Ivanovic	8018f731f0	[PowerPC] Do not emit HW loop with half precision operations If a loop has any operations on half precision values, there will be calls to library functions on Power8. Even on Power9, there is a small subset of instructions that are actually supported for the type. This patch disables HW loops whenever any operations on the type are found (other than the handful of supported ones when compiling for Power9). Fixes a few PRs opened by Julia: https://bugs.llvm.org/show_bug.cgi?id=48785 https://bugs.llvm.org/show_bug.cgi?id=48786 https://bugs.llvm.org/show_bug.cgi?id=48519 Differential revision: https://reviews.llvm.org/D94980	2021-01-25 20:55:56 -06:00
Nemanja Ivanovic	1150bfa6bb	[PowerPC] Add missing negate for VPERMXOR on little endian subtargets This intrinsic is supposed to have the permute control vector complemented on little endian systems (as the ABI specifies and GCC implements). With the current code gen, the result vector is byte-reversed. Differential revision: https://reviews.llvm.org/D95004	2021-01-25 12:23:33 -06:00
QingShan Zhang	ffc3e800c6	[NFC] [DAGCombine] Correct the result for sqrt even if the iteration is zero For now, we correct the result for sqrt only if iteration > 0. This doesn't make sense as the two are not strictly related. Reviewed By: dmgreen, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D94480	2021-01-25 04:02:44 +00:00
Chen Zheng	0ed4cf4bf3	[PowerPC] support register pressure reduction in machine combiner. Reassociating some patterns to generate more fma instructions to reduce register pressure. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92071	2021-01-24 21:28:21 -05:00
Kazu Hirata	16baad8f4e	[llvm] Use pop_back_val (NFC)	2021-01-24 12:18:57 -08:00
Kazu Hirata	054444177b	[Target] Use llvm::append_range (NFC)	2021-01-24 12:18:56 -08:00
Kazu Hirata	e4847a7fcf	Revert "[Target] Use llvm::append_range (NFC)" This reverts commit `cc7a238286`. The X86WinEHState.cpp hunk seems to break certain builds.	2021-01-23 11:25:27 -08:00
Kazu Hirata	cc7a238286	[Target] Use llvm::append_range (NFC)	2021-01-23 10:56:31 -08:00
Stanislav Mekhanoshin	607bec0bb9	Change materializeFrameBaseRegister() to return register The only caller of this function is in the LocalStackSlotAllocation and it creates base register of class returned by the target's getPointerRegClass(). AMDGPU wants to use a different reg class here so let materializeFrameBaseRegister to just create and return whatever it wants. Differential Revision: https://reviews.llvm.org/D95268	2021-01-22 15:51:06 -08:00
Kazu Hirata	cfa241680f	[llvm] Don't include StringSwitch.h where unnecessary (NFC)	2021-01-21 19:59:48 -08:00
Qiu Chaofan	449f2f7140	[PowerPC] Duplicate inherited heuristic from base scheduler PowerPC has its custom scheduler heuristic. It calls parent classes' tryCandidate in override version, but the function returns void, so this way doesn't actually help. This patch duplicates code from base scheduler into PPC machine scheduler class, which does what we wanted. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D94464	2021-01-22 10:11:03 +08:00
Albion Fung	719b563ecf	[PowerPC][Power10] Exploit splat instruction xxsplti32dx in Power10 Exploits the instruction xxsplti32dx. It can be used to materialize any 64 bit scalar/vector splat by using two instances, one for the upper 32 bits and the other for the lower 32 bits. It should not materialize the cases which can be materialized by using the instruction xxspltidp. Differential Revision: https://reviews.llvm.org/D90173	2021-01-20 12:55:52 -05:00
Kazu Hirata	8857202489	[llvm] Use llvm::find (NFC)	2021-01-19 20:19:14 -08:00
Victor Huang	909d6c86ea	[PowerPC] Fix the check for the instruction using FRSP/XSRSP output register When performing peephole optimization to simplify the code, after removing the passed FRSP/XSRSP instruction we set any uses of that FRSP/XSRSP to the source of the FRSP/XSRSP. We find the machine instruction using the virtual register holding the FRSP/XSRSP result by searching all following instructions, and encounter an issue when the first use of the virtual register is a debug MI, causing: 1. the virtual register in the debug MI to be removed unexpectedly. 2. the virtual register used in the non-debug MI not to be replaced with the source of the FRSP/XSRSP, which leaves it in an undef state. This patch fixes the issue by only searching non-debug machine instructions using the virtual register holding FRSP/XSRSP results when the vr has only one non-debug use. Differential Revision: https://reviews.llvm.org/D94711 Reviewed by: nemanjai	2021-01-19 09:20:03 -06:00
Nemanja Ivanovic	61f69153e8	[PowerPC] Sign extend comparison operand for signed atomic comparisons As of `8dacca943a`, we sign extend the atomic loaded operand for signed subword comparisons. However, the assumption that the other operand is correctly sign extended doesn't always hold. This patch sign extends the other operand if it needs to be sign extended. This is a second fix for https://bugs.llvm.org/show_bug.cgi?id=30451 Differential revision: https://reviews.llvm.org/D94058	2021-01-18 21:19:25 -06:00
Sean Fertile	ead71a23ed	[PowerPC][AIX] Do not emit xxspltd mnemonic on AIX. A bug in the system assembler can assemble the xxspltd extended mnemonic into the wrong instruction (extracting the wrong element). Emit the full xxpermdi with all operands to work around the problem. Differential Revision: https://reviews.llvm.org/D94419	2021-01-18 09:25:31 -05:00
Tres Popp	3bd24574c7	Revert "[PowerPC] support register pressure reduction in machine combiner." This reverts commit `26a396c4ef`. See https://reviews.llvm.org/D92071 for a description of the issue.	2021-01-18 12:01:57 +01:00
Chen Zheng	26a396c4ef	[PowerPC] support register pressure reduction in machine combiner. Reassociating some patterns to generate more fma instructions to reduce register pressure. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92071	2021-01-17 23:56:13 -05:00
Kazu Hirata	7dc3575ef2	[llvm] Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-01-14 20:30:34 -08:00
Jinsong Ji	0f588ac03e	[PowerPC] Only use some extend mne if assembler is modern enough Legacy AIX assembly might not support all extended mnes, add one feature bit to control the generation in MC, and avoid generating them by default on AIX. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D94458	2021-01-14 20:36:10 +00:00
Esme-Yi	ff40fb07ad	[PowerPC] Try to fold sqrt/sdiv test results with the branch. Summary: The patch tries to fold a sqrt/sdiv test node, e.g. FTSQRT, XVTDIVDP, and the branch, i.e. br_cc, if they meet these patterns: (br_cc seteq, (truncateToi1 SWTestOp), 0) -> (BCC PRED_NU, SWTestOp) (br_cc seteq, (and SWTestOp, 2), 0) -> (BCC PRED_NE, SWTestOp) (br_cc seteq, (and SWTestOp, 4), 0) -> (BCC PRED_LE, SWTestOp) (br_cc seteq, (and SWTestOp, 8), 0) -> (BCC PRED_GE, SWTestOp) (br_cc setne, (truncateToi1 SWTestOp), 0) -> (BCC PRED_UN, SWTestOp) (br_cc setne, (and SWTestOp, 2), 0) -> (BCC PRED_EQ, SWTestOp) (br_cc setne, (and SWTestOp, 4), 0) -> (BCC PRED_GT, SWTestOp) (br_cc setne, (and SWTestOp, 8), 0) -> (BCC PRED_LT, SWTestOp) Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D94054	2021-01-14 02:15:19 +00:00
Jinsong Ji	93b54b7c67	[PowerPC][NFCI] PassSubtarget to ASMWriter Subtarget feature bits are needed to change instprinter's behavior based on feature bits. Most of the other popular targets were updated back in 2015, in https://reviews.llvm.org/rGb46d0234a6969 we should update it too. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D94449	2021-01-12 16:25:35 +00:00
Nemanja Ivanovic	3f7b4ce960	[PowerPC] Add support for embedded devices with EFPU2 PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision hardware floating point instructions. The single precision instructions efs* and evfs* are identical to the spe float instructions while efd* and evfd* instructions trigger a not implemented exception. This patch introduces a new command line option -mefpu2 which leads to single-hardware / double-software code generation. [1] Core reference: https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf Differential revision: https://reviews.llvm.org/D92935	2021-01-12 09:47:00 -06:00
Kazu Hirata	e5b4dbab04	[llvm] Simplify string comparisons (NFC) Identified with readability-string-compare.	2021-01-11 18:48:09 -08:00
Kazu Hirata	e3d3dbd339	[llvm] Ensure newlines at the end of files (NFC) This patch eliminates pesky "No newline at end of file" messages from git diff.	2021-01-10 09:24:57 -08:00
Kazu Hirata	b7c5e0b02c	[Target, Transforms] Use *Set::contains (NFC)	2021-01-08 18:39:54 -08:00
Fangrui Song	022cc6e343	[PowerPC] Delete dead Lower*	2021-01-06 21:58:40 -08:00
Fangrui Song	bfa6ca07a8	[PowerPC] Delete remnant Darwin ISelLowering code	2021-01-06 21:40:40 -08:00
Fangrui Song	01a2508aa5	[PowerPC] Delete remnant isOSDarwin references	2021-01-06 21:18:35 -08:00
Kit Barton	4bdab54826	[PPC] Remove old PPCSubTarget variable. The PPCSubTarget variable has been replaced with the Subtarget variable. This removes the remaining instances of PPCSubTarget as they are no longer necessary.	2021-01-06 17:44:07 -06:00
Stefan Pintilie	cb0c034edc	[PowerPC] Fix issue where vsrq is given incorrect shift vector The new Power10 instruction vsrq was being given the wrong shift vector. The original code assumed that the shift would be found in bits 121 to 127. This is not correct. The shift is found in bits 57 to 63. This can be fixed by swapping the first and second double words. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D94113	2021-01-06 05:56:09 -06:00
Christudasan Devadasan	d68458bd56	[GlobalISel] Base implementation for sret demotion. If the return values can't be lowered to registers SelectionDAG performs the sret demotion. This patch contains the basic implementation for the same in the GlobalISel pipeline. Furthermore, targets should bring relevant changes during lowerFormalArguments, lowerReturn and lowerCall to make use of this feature. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D92953	2021-01-06 10:30:50 +05:30
Qiu Chaofan	b6c8feb29f	[NFC] [PowerPC] Remove dead code in BUILD_VECTOR peephole The piece of code tries to use splat+shift to lower build_vector with repeating bit pattern. And immediate field of vector splat is only 5 bits (-16~15). It iterates over them one by one to find which shifts/rotates to number in build_vector. This patch removes code to try matching constant with algebraic right-shift because that's meaningless - any negative number's algebraic right-shift won't produce result smaller than itself. Besides, code (int)((unsigned)i >> j) means logical shift-right in C. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D93937	2021-01-05 11:35:00 +08:00
Kai Luo	f6515b0520	[PowerPC] Do not fold `cmp(d|w)` and `subf` instruction to `subf.` if `nsw` is not present In `PPCInstrInfo::optimizeCompareInstr` we seek opportunities to fold `cmp(d|w)` and `subf` into a `subf.`. However, if `subf.` overflows, `cr0` can't reflect the correct order, violating the semantics of `cmp(d|w)`. Fixed https://bugs.llvm.org/show_bug.cgi?id=47830. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D90156	2021-01-04 07:54:15 +00:00
Brandon Bergren	8f004471c2	[PowerPC] Add the LLVM triple for powerpcle [1/5] Add a triple for powerpcle--. This is a little-endian encoding of the 32-bit PowerPC ABI, useful in certain niche situations: 1) A loader such as the FreeBSD loader which will be loading a little endian kernel. This is required for PowerPC64LE to load properly in pseries VMs. Such a loader is implemented as a freestanding ELF32 LSB binary. 2) Userspace emulation of a 32-bit LE architecture such as x86 on 64-bit hosts such as PowerPC64LE with tools like box86 requires having a 32-bit LE toolchain and library set, as they operate by translating only the main binary and switching to native code when making library calls. 3) The Void Linux for PowerPC project is experimenting with running an entire powerpcle userland. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D93918	2021-01-02 12:17:22 -06:00
Kai Luo	f904d50c29	[PowerPC] Remaining KnownBits should be constant when performing non-sign comparison In `PPCTargetLowering::DAGCombineTruncBoolExt`, when checking if it's correct to perform the transformation for non-sign comparison, as the comment says ``` // This is neither a signed nor an unsigned comparison, just make sure // that the high bits are equal. ``` Origin check ``` if (Op1Known.Zero != Op2Known.Zero || Op1Known.One != Op2Known.One) return SDValue(); ``` is not strong enough. For example, ``` Op1Known = 111x000x; Op2Known = 111x000x; ``` Bit 4, besides bit 0, is still unknown and affects the final result. This patch fixes https://bugs.llvm.org/show_bug.cgi?id=48388. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D93092	2020-12-30 02:00:47 +00:00
Nemanja Ivanovic	7486de1b2e	[PowerPC] Provide patterns for permuted scalar to vector for pre-P8 We will emit these permuted nodes on all VSX little endian subtargets but don't have the patterns available to match them on subtargets that don't have direct moves. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47916	2020-12-29 06:49:25 -06:00
Nemanja Ivanovic	0a19fc3088	[PowerPC] Disable CTR loops containing operations on half-precision On subtargets prior to Power9, conversions to/from half precision are lowered to libcalls. This makes loops containing such operations invalid candidates for HW loops. Fixes: https://bugs.llvm.org/show_bug.cgi?id=48519	2020-12-29 05:12:50 -06:00
Nemanja Ivanovic	4f568fbd21	[PowerPC] Do not emit HW loop when TLS var accessed in PHI of loop exit If any PHI nodes in loop exit blocks have incoming values from the loop that are accesses of TLS variables with local dynamic or general dynamic TLS model, the address will be computed inside the loop. Since this includes a call to __tls_get_addr, this will in turn cause the CTR loops verifier to complain. Disable CTR loops in such cases. Fixes: https://bugs.llvm.org/show_bug.cgi?id=48527	2020-12-28 20:36:16 -06:00
Fangrui Song	f931290308	[PowerPC] Parse and ignore .machine glibc/sysdeps/powerpc/powerpc64 has .machine {altivec,power4,power5,power6,power7,power8} (.machine power9 is planned in sysdeps/powerpc/powerpc64/power9/strcmp.S). The diagnostic is not useful anyway so just delete it.	2020-12-28 12:20:40 -08:00
Nemanja Ivanovic	e73f885c98	[PowerPC] Remove redundant COPY_TO_REGCLASS introduced by `8a58f21f5b`	2020-12-28 09:26:51 -06:00
Kamau Bridgeman	8a58f21f5b	[PowerPC][Power10] Exploit store rightmost vector element instructions Using the store rightmost vector element instructions to do vector element extraction and store. The rightmost vector element on little endian is the zeroth vector element, with these patterns that element can be extracted and stored in one instruction for all vector types. Differential Revision: https://reviews.llvm.org/D89195	2020-12-22 12:06:43 -05:00
Nemanja Ivanovic	ba1202a1e4	[PowerPC] Restore stack ptr from base ptr when available On subtargets that have a red zone, we will copy the stack pointer to the base pointer in the prologue prior to updating the stack pointer. There are no other updates to the base pointer after that. This suggests that we should be able to restore the stack pointer from the base pointer rather than loading it from the back chain or adding the frame size back to either the stack pointer or the frame pointer. This came about because functions that call setjmp need to restore the SP from the FP because the back chain might have been clobbered (see https://reviews.llvm.org/D92906). However, if the stack is realigned, the restored SP might be incorrect (which is what caused the failures in the two ASan test cases). This patch was tested quite extensively both with sanitizer runtimes and general code. Differential revision: https://reviews.llvm.org/D93327	2020-12-22 05:44:03 -06:00
Fangrui Song	d9a0c40bce	[MC] Split MCContext::createTempSymbol, default AlwaysAddSuffix to true, and add comments CanBeUnnamed is rarely false. Splitting to a createNamedTempSymbol makes the intention clearer and matches the direction of reverted r240130 (to drop the unneeded parameters). No behavior change.	2020-12-21 14:04:13 -08:00
Fangrui Song	d33abc337c	Migrate MCContext::createTempSymbol call sites to AlwaysAddSuffix=true Most call sites set AlwaysAddSuffix to true. The two use cases do not really need false and can be more consistent with other temporary symbol usage.	2020-12-21 14:04:13 -08:00

6630 Commits