Commit Graph

1659 Commits

Author SHA1 Message Date
Sean Fertile d52bb6d099 [PowerPC][AIX] ByVal formal argument support: passing on the stack.
Adds support for passing a ByVal formal argument completely on the stack
(ie after all argument registers are exhausted).

Differential Revision: https://reviews.llvm.org/D78263
2020-04-20 12:04:59 -04:00
Stefan Pintilie b771c4a842 [PowerPC][Future] More support for PCRel addressing for global values
Add initial support for PC Relative addressing for global values that
require GOT indirect addressing. This patch adds PCRelative support for
global addresses that may not be known at link time and may require
access through the GOT.

Differential Revision: https://reviews.llvm.org/D76064
2020-04-17 11:06:13 -05:00
Stefan Pintilie 18b6050324 [PowerPC][Future] Initial support for PC Relative addressing for global values
This patch adds PC Relative support for global values that are known at link
time. If a global value requires access through the global offset table (GOT)
it is not covered in this patch.

Differential Revision: https://reviews.llvm.org/D75280
2020-04-16 12:45:22 -05:00
Chris Bowler bee6c234ed [AIX][PowerPC] Implement caller byval arguments in stack memory
Differential Revision: https://reviews.llvm.org/D77578
2020-04-15 17:57:31 -04:00
Craig Topper 113f37a1f9 [CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase
Differential Revision: https://reviews.llvm.org/D77995
2020-04-13 13:50:15 -07:00
Nemanja Ivanovic 512600e3c0 [PowerPC] Handle f16 as a storage type only
The PPC back end currently crashes (fails to select) with f16 input. This patch
expands it on subtargets prior to ISA 3.0 (Power9) and uses the HW conversions
on Power9.

Fixes https://bugs.llvm.org/show_bug.cgi?id=39865

Differential revision: https://reviews.llvm.org/D68237
2020-04-11 07:34:47 -05:00
Kai Luo b7d5229d78 [PowerPC] Update alignment for ReuseLoadInfo in LowerFP_TO_INTForReuse
In LowerFP_TO_INTForReuse, when emitting `stfiwx`, alignment of 4 is
set for the `MachineMemOperand`, but RLI(ReuseLoadInfo)'s alignment is
not updated for following loads.

It's related to failed alignment check reported in
https://bugs.llvm.org/show_bug.cgi?id=45297

Differential Revision: https://reviews.llvm.org/D77624
2020-04-10 05:49:19 +00:00
jasonliu 085689d44c [PPC][AIX] Implement variadic function handling in LowerFormalArguments_AIX
Summary:
This patch adds support for handling of variadic functions for AIX.
This includes ensuring that use and consume correct type of
va_list (char *va_list) for AIX.

Authored by: ZarkoCA

Reviewers: cebowleratibm, sfertile, jasonliu

Reviewed by: jasonliu

Differential Revision: https://reviews.llvm.org/D76130
2020-04-09 16:49:44 +00:00
Stefan Pintilie 75828ef615 [PowerPC][Future] Initial support for PCRel addressing for constant pool loads
Add initial support for PC Relative addressing for constant pool loads.
This includes adding a new relocation for @pcrel and adding a new PowerPC flag
to identify PC relative addressing.

Differential Revision: https://reviews.llvm.org/D74486
2020-04-09 11:17:23 -05:00
Sean Fertile d0b57b41f4 [PowerPC][AIX][NFC] Replace deprecated getByValAlign call.
Replace call to deprecated 'getByValAlign()' with
'getNonZeroByValAlign()'.
2020-04-08 13:27:39 -04:00
Matt Arsenault 84aa58cbe2 CodeGen: Use Register in TargetLowering 2020-04-08 12:10:58 -04:00
Sean Fertile 8abfd2c3bb [PowerPC][AIX] Enable passing byval formal arguments in multiple registers.
Any or all the argument registers can be used to pass a byval formal
argument, with the limitation that the argument must fit in the
available registers (ie: is not split between registers and stack).

Differential Revision: https://reviews.llvm.org/D76902
2020-04-08 11:16:33 -04:00
Stefan Pintilie 6c4b40def7 [PowerPC][Future] Add Support For Functions That Do Not Use A TOC.
On PowerPC most functions require a valid TOC pointer.

This is the case because either the function itself needs to use this
pointer to access the TOC or because other functions that are called
from that function expect a valid TOC pointer in the register R2.
The main exception to this is leaf functions that do not access the TOC
since they are guaranteed not to need a valid TOC pointer.

This patch introduces a feature that will allow more functions to not
require a valid TOC pointer in R2.

Differential Revision: https://reviews.llvm.org/D73664
2020-04-08 08:07:35 -05:00
Chris Bowler d6ea82d11c [AIX][PPC] Implement by-val caller arguments in multiple registers
Differential Revision: https://reviews.llvm.org/D76380
2020-04-06 11:06:51 -04:00
jasonliu d65557d15d [NFC][XCOFF][AIX] Refactor get/setContainingCsect
Summary:
For current architect, we always require setContainingCsect to be
called on every MCSymbol got used in XCOFF context.
This is very hard to achieve because symbols gets created everywhere
 and other MCSymbol types(ELF, COFF) do not have similar rules.
It's very easy to miss setting the containing csect, and we would
need to add a lot of XCOFF specialized code around some common code area.

This patch intendeds to do
1. Rely on getFragment().getParent() to get csect from labels.
2. Only use get/setRepresentedCsect (was get/setContainingCsect)
if symbol itself represents a csect.

Reviewers: DiggerLin, hubert.reinterpretcast, daltenty

Differential Revision: https://reviews.llvm.org/D77080
2020-04-03 13:33:12 +00:00
Guillaume Chatelet 1dffa2550b [Alignment][NFC] Transition to MachineFrameInfo::getObjectAlign()
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77215
2020-04-01 14:08:28 +00:00
Guillaume Chatelet c7468c1696 [Alignment][NFC] Use Align in SelectionDAG::getMemIntrinsicNode
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, nemanjai, hiraditya, kbarton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77149
2020-04-01 09:32:05 +00:00
Kai Luo 8eb40e41f6 [PowerPC] Don't generate ST_VSR_SCAL_INT if power8-vector is disabled
Summary:
In https://bugs.llvm.org/show_bug.cgi?id=45297, it fails selecting
instructions for `PPCISD::ST_VSR_SCAL_INT`. The reason it generate the
`PPCISD::ST_VSR_SCAL_INT` with `-power8-vector` in IR is PPC's
combiner checks `hasP8Altivec` rather than `hasP8Vector`. This patch
should resolve PR45297.

Differential Revision: https://reviews.llvm.org/D76773
2020-04-01 02:15:25 +00:00
Guillaume Chatelet c9d5c19597 [Alignment][NFC] Transitionning more getMachineMemOperand call sites
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, Jim, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77121
2020-03-31 08:36:18 +00:00
Guillaume Chatelet bdf77209b9 [Alignment][NFC] Use Align version of getMachineMemOperand
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jyknight, sdardis, nemanjai, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, jfb, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77059
2020-03-30 15:46:27 +00:00
Guillaume Chatelet 74eac9031a [Alignment][NFC] MachineMemOperand::getAlign/getBaseAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dschuff, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, jrtc27, atanasyan, jfb, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76925
2020-03-27 15:49:13 +00:00
Guillaume Chatelet b727aabcb8 [Alignment][NFC] Use llvmTargetFrameLowering::getStackAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Reviewed By: courbet

Subscribers: wuzish, arsenm, jyknight, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, fedor.sergeev, jrtc27, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76613
2020-03-26 18:15:53 +00:00
QingShan Zhang 1ef7bf4121 [PowerPC] Improve the way legalize mul for v8i16 and add pattern to match mul + add
We can legalize the operation MUL for v8i16 with instruction (vmladduhm A, B, 0)
if altivec enabled. Now, it is set as custom and expand it later, which is not
the right way. And then, we can add the pattern to match the mul + add with (vmladduhm A, B, C)

Reviewed By: Nemanjai

Differential Revision: https://reviews.llvm.org/D76751
2020-03-26 04:46:49 +00:00
Sean Fertile 3282d875d6 [PowerPC][AIX] ByVal formal arguments in a single register.
Adds support for passing ByVal formal arguments as long as they fit in a
single register.

Differential Revision: https://reviews.llvm.org/D76401
2020-03-25 11:09:40 -04:00
Chen Zheng 9d07d91fb6 [PowerPC] fix a typo in commit 3f85134d71
Implement target hook isProfitableToHoist - typo fix.
2020-03-24 01:56:15 -04:00
Chen Zheng 3f85134d71 [PowerPC] implement target hook isProfitableToHoist
On Powerpc fma is faster than fadd + fmul for some types,
(PPCTargetLowering::isFMAFasterThanFMulAndFAdd). we should implement target
hook isProfitableToHoist to prevent simplifyCFGpass from breaking fma
pattern by hoisting fmul to predecessor block.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D76207
2020-03-19 00:17:25 -04:00
Chen Zheng aacf022cd5 [PowerPC] add IR level isFMAFasterThanFMulAndFAdd - NFC
And also refactor legacy MIR level isFMAFasterThanFMulAndFAdd.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D76265
2020-03-18 23:24:40 -04:00
Chris Bowler c21866476e [PowerPC][AIX] Implement by-val caller arguments in a single register.
This is the first of a series of patches that adds caller support for
by-value arguments. This patch add support for arguments that are passed in a
single GPR.

There are 3 limitation cases:
-The by-value argument is larger than a single register.
-There are no remaining GPRs even though the by-value argument would
otherwise fit in a single GPR.
-The by-value argument requires alignment greater than register width.

Future patches will be required to add support for these cases as well
as for the callee handling (in LowerFormalArguments_AIX) that
corresponds to this work.

Differential Revision: https://reviews.llvm.org/D75863
2020-03-18 10:57:28 -04:00
QingShan Zhang 0b126eec6d [NFC][PowerPC] Simplify the logic in lower select_cc
The logic in select_cc is messy and hard to follow. This is a NFC patch to simplify the logic.

Differential Revision: https://reviews.llvm.org/D75834
2020-03-17 03:47:39 +00:00
QingShan Zhang d0fb34dc09 [PowerPC] Replace the PPCISD:: SExtVElems with ISD::SIGN_EXTEND_INREG to leverage the combine rules
The PPCISD::SExtVElems was added by commit https://reviews.llvm.org/D34009. However,
we have another ISD node ISD::SIGN_EXTEND_INREG that perfectly match the semantics
of SExtVElems. And the DAGCombiner has some combine rules for SIGN_EXTEND_INREG
that produce better code.

Differential Revision: https://reviews.llvm.org/D70771
2020-03-13 07:28:28 +00:00
Zarko Todorovski d688312660 [PowerPC][AIX] Implement formal arguments passed in stack memory.
This patch is the callee side counterpart for https://reviews.llvm.org/D73209.
It removes the fatal error when we pass more formal arguments than available
registers.

Differential Revision: https://reviews.llvm.org/D74225
2020-03-12 11:48:00 -04:00
Xiangling Liao 3e53bf5781 [PowerPC32] Fix the `setcc` inconsistent result type problem
Summary:
On 32-bit PPC target[AIX and BE], when we convert an `i64` to `f32`, a `setcc` operand expansion is needed. The expansion will set the result type of expanded `setcc` operation based on if the subtarget use CRBits or not. If the subtarget does use the CRBits, like AIX and BE, then it will set the result type to `i1`, leading to an inconsistency with original `setcc` result type[i32].
And the reason why it crashed underneath is because we don't set result type of setcc consistent in those two places.

This patch fixes this problem by setting original setcc opnode result type also with `getSetCCResultType`  interface.

Reviewers: sfertile, cebowleratibm, hubert.reinterpretcast, Xiangling_L

Reviewed By: sfertile

Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75702
2020-03-12 10:50:37 -04:00
Qiu Chaofan 096d545376 [PowerPC] Add strict-fp intrinsic to FP arithmetic
This patch adds basic strict-fp intrinsics support to PowerPC backend,
including basic arithmetic operations (add/sub/mul/div).

Reviewed By: steven.zhang, andrew.w.kaylor

Differential Revision: https://reviews.llvm.org/D63916
2020-03-12 17:02:54 +08:00
Chris Bowler c7b6fa8f4b [AIX] Extend int arguments to register width when passed in stack memory.
This is a follow up to the previous patch: [AIX] Implement caller
arguments passed in stack memory.

This corrects a defect in AIX 64-bit where an i32 is written to the
stack with stw (4 bytes) rather than the expected std (8 bytes.) Integer
arguments pass on the stack as images of their register representation.

I also took the opportunity to tidy up some of the calling convention
AIX tests I added in my last commit. This patch adds the missed assembly
expected output for the stack arg int case, which would have caught this
problem.

Differential Revision: https://reviews.llvm.org/D75126
2020-03-05 11:49:16 -05:00
Xiangling Liao e7375e9932 [AIX] Remove whitelist checking for ExternalSymbolSDNodes
Allow all ExternalSymbolSDNode on AIX, and rely on the linker error to find
symbols which we don't have definitions from any library/compiler-rt.

Differential Revision: https://reviews.llvm.org/D75075
2020-02-26 10:09:25 -05:00
Kang Zhang b083d7a346 [PowerPC] Fix the unexpected modification caused by D62993 in LowerSELECT_CC for power9
Summary:
The patch D62993 : `[PowerPC] Emit scalar min/max instructions with unsafe fp math`
has modified the functionality when `Subtarget.hasP9Vector() && (!HasNoInfs || !HasNoNaNs)`,
 this modification is not expected.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D74701
2020-02-26 02:59:03 +00:00
Craig Topper 735d27dc40 [SelectionDAG][PowerPC][AArch64][X86][ARM] Add chain input and output the ISD::FLT_ROUNDS_
This node reads the rounding control which means it needs to be ordered properly with operations that change the rounding control. So it needs to be chained to maintain order.

This patch adds a chain input and output to the node and connects it to the chain in SelectionDAGBuilder. I've update all in-tree targets to connect their chain through their lowering code.

Differential Revision: https://reviews.llvm.org/D75132
2020-02-25 16:58:23 -08:00
Qiu Chaofan 87c773082a [PowerPC] Exploit VSX rounding instrs for rint
Exploit native VSX rounding instruction, x(v|s)r(d|s)pic, which does
rounding using current rounding mode.

According to C standard library, rint may raise INEXACT exception while
nearbyint won't.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D72685
2020-02-13 20:59:50 +08:00
Craig Topper eeb63944e4 [LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results.
Remove code from LegalizeTypes that allowed this to work.

We were already using BUILD_PAIR for this in some places so this
standardizes on a single way to do this.
2020-02-08 09:52:31 -08:00
Guillaume Chatelet f85d3408e6 [NFC] Introduce an API for MemOp
Summary: This patch introduces an API for MemOp in order to simplify and tighten the client code.

Reviewers: courbet

Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73964
2020-02-07 11:32:27 +01:00
Chris Bowler b373ec8ce7 [AIX] Implement caller arguments passed in stack memory.
This patch implements the caller side of placing function call arguments
in stack memory. This removes the current limitation where LLVM on AIX
will report fatal error when arguments can't be contained in registers.

There is a particular oddity that a float argument that passes in a
register and also in stack memory requires that the caller initialize
both. From what AIX "ABI" documentation I have it's not clear that this
needs to be done, however, it is necessary for compatibility with the
AIX XL compiler so I think it's best to implement it the same way.

Note a later patch will follow to address the callee side.

Differential Revision: https://reviews.llvm.org/D73209
2020-02-06 12:07:34 -05:00
Guillaume Chatelet b8144c0536 [NFC] Encapsulate MemOp logic
Summary:
This patch simply introduces functions instead of directly accessing the fields.
This helps introducing additional check logic. A second patch will add simplifying functions.

Reviewers: courbet

Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73945
2020-02-04 10:36:26 +01:00
Guillaume Chatelet 333f2ad8b8 [Alignment][NFC] Use Align for getMemcpy/Memmove/Memset
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73885
2020-02-03 17:13:19 +01:00
Guillaume Chatelet 3c89b75f23 [NFC] Introduce a type to model memory operation
Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code.

Reviewers: courbet

Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73785
2020-01-31 17:29:01 +01:00
Guillaume Chatelet 805c157e8a [Alignment][NFC] Deprecate Align::None()
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.

Reviewers: xbolva00, courbet, bollu

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73099
2020-01-24 12:53:58 +01:00
Sean Fertile 9aa816a816 [PowerPC] Collect some CallLowering arguments into a struct. [NFC]
Collect the calling convention and a number of boolean arguments into a
structure to slightly reduces the number of arguments passed around between
LowerCall_<Subtarget>, FinishCall and a few of the helpers. Also
calulates if a call is indirect once using the exisitng helper and caches the
result replacing several instances where we duplicated the logic determining if
a call is indirect.
2020-01-22 16:55:27 -05:00
Fangrui Song 8e1f0974c2 [PowerPC] Delete PPCSubtarget::isDarwin and isDarwinABI
http://lists.llvm.org/pipermail/llvm-dev/2018-August/125614.html developers have agreed to remove Darwin support from POWER backends.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D72067
2020-01-21 09:54:44 -08:00
Michael Liao 6d0d86a64d [DAG] Add helper for creating constant vector index with correct type. NFC. 2020-01-18 01:23:36 -05:00
Nemanja Ivanovic 9c64f04df8 [PowerPC] Legalize saturating vector add/sub
These intrinsics and the corresponding ISD nodes were recently added. PPC has
instructions that do this for vectors. Legalize them and add patterns to emit
the satuarting instructions.

Differential revision: https://reviews.llvm.org/D71940
2020-01-15 07:00:38 -06:00
Xiangling Liao 25a8aec7f3 [AIX] ExternalSymbolSDNode lowering
For memcpy/memset/memmove etc., replace ExternalSymbolSDNode with a
MCSymbolSDNode, which have a prefix dot before function name as entry
point symbol.

Differential Revision: https://reviews.llvm.org/D70718
2020-01-14 09:39:02 -05:00
jasonliu dfed052fb3 [AIX] Allow vararg calls when all arguments reside in registers
Summary:
This patch pushes the AIX vararg unimplemented error diagnostic later
and allows vararg calls so long as all the arguments can be passed in register.
This patch extends the AIX calling convention implementation to initialize
GPR(s) for vararg float arguments. On AIX, both GPR(s) and FPR are allocated
for floating point arguments. The GPR(s) are only initialized for vararg calls,
otherwise the callee is expected to retrieve the float argument in the FPR.

f64 in AIX PPC32 requires special handling in order to allocated and
initialize 2 GPRs. This is performed with bitcast, SRL, truncation to
initialize one GPR for the MSW and bitcast, truncations to initialize
the other GPR for the LSW.

A future patch will follow to add support for arguments passed on the stack.

Patch provided by: cebowleratibm

Reviewers: sfertile, ZarkoCA, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D71013
2020-01-10 17:33:35 +00:00
Matt Arsenault 255cc5a760 CodeGen: Use LLT instead of EVT in getRegisterByName
Only PPC seems to be using it, and only checks some simple cases and
doesn't distinguish between FP. Just switch to using LLT to simplify
use from GlobalISel.
2020-01-09 17:37:52 -05:00
Jinsong Ji 24ee4edee8 [PowerPC][NFC] Rename record instructions to use _rec suffix instead of o
We use o suffix to indicate record form instuctions,
(as it is similar to dot '.' in mne?)

This was fine before, as we did not support XO-form.
However, with https://reviews.llvm.org/D66902,
we now have XO-form support.

It becomes confusing now to still use 'o' for record form,
and it is weird to have something like 'Oo' .

This patch rename all 'o' instructions to use '_rec' instead.
Also rename `isDot` to `isRecordForm`.

Reviewed By: #powerpc, hfinkel, nemanjai, steven.zhang, lkail

Differential Revision: https://reviews.llvm.org/D70758
2020-01-06 22:27:07 +00:00
Reid Kleckner 9c2b72821b Move tail call disabling code to target independent code
When the "disable-tail-calls" attribute was added, checks were added for
it in various backends. Now this code has proliferated, and it is
something the target is responsible for checking. Move that
responsibility back to the ISels (fast, global, and SD).

There's no major functionality change, except for targets that never
implemented this check.

This LLVM attribute was originally added in
d9699bc7bd (2015).

Reviewers: echristo, MaskRay

Differential Revision: https://reviews.llvm.org/D72118
2020-01-03 11:27:41 -08:00
Sean Fertile 479e9406c2 [PowerPC][AIX] Enable sret arguments.
Removes the fatal error for sret arguments and adds lit testing.

Differential Revision: https://reviews.llvm.org/D71504
2020-01-02 19:31:01 -05:00
Nemanja Ivanovic 781b78a361 [PowerPC] Only legalize FNEARBYINT with unsafe fp math
Commit 0f0330a787 legalized these nodes on PPC without consideration of
unsafe math which means that we get inexact exceptions raised for nearbyint.
Since this doesn't conform to the standard, switch this legalization to depend
on unsafe fp math.
2020-01-02 13:45:54 -06:00
Jinsong Ji fcbf05bbdc [PowerPC][NFC] Fix clang-tidy warning
Reported by
https://results.llvm-merge-guard.org/amd64_debian_testing_clang8-726/clang-tidy.txt

/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:11672:10:
warning: invalid case style for variable 'isEQ'
[readability-identifier-naming]
    bool isEQ = (MI.getOpcode() == PPC::ANDI_rec_1_EQ_BIT ||
         ^~~~
         IsEq
/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:11679:14:
warning: invalid case style for variable 'dl'
[readability-identifier-naming]
    DebugLoc dl = MI.getDebugLoc();
             ^~
             Dl
2019-12-31 16:24:40 +00:00
Nemanja Ivanovic 0f0330a787 [PowerPC] Legalize rounding nodes
VSX provides a full complement of rounding instructions yet we somehow ended up
with some of them legal and others not. This just legalizes all of the FP
rounding nodes and the FP -> int rounding nodes with unsafe math.

Differential revision: https://reviews.llvm.org/D69949
2019-12-30 08:03:53 -06:00
Nemanja Ivanovic a9ad65a2b3 [PowerPC] Change default for unaligned FP access for older subtargets
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40554

Some CPU's trap to the kernel on unaligned floating point access and there are
kernels that do not handle the interrupt. The program then fails with a SIGBUS
according to the PR. This just switches the default for unaligned access to only
allow it on recent server CPUs that are known to allow this.

Differential revision: https://reviews.llvm.org/D71954
2019-12-28 11:20:52 -06:00
Fangrui Song 7a7334663c Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821
Intrinsic has incorrect argument type!
  i32 (i32*)* @llvm.setjmp

*wipes tear*
2019-12-27 00:00:14 -08:00
QingShan Zhang 6d5e35e89d [Power9] Remove the PPCISD::XXREVERSE as it has completely the same semantics of ISD::BSWAP
The custom node PPCISD::XXREVERSE has completely the same semantics of generic node ISD::BSWAP.
We need to clean up it as we have the combine rules for bswap in the base class, while nothing for xxreverse.

Differential Revision: https://reviews.llvm.org/D70657
2019-12-23 07:44:33 +00:00
Kai Luo 9681dc9627 [PowerPC] Exploit `vrl(b|h|w|d)` to perform vector rotation
Summary:
Currently, we set legalization action of `ISD::ROTL` vectors as
`Expand` in `PPCISelLowering`. However, we can exploit `vrl(b|h|w|d)`
to lower `ISD::ROTL` directly.

Differential Revision: https://reviews.llvm.org/D71324
2019-12-23 03:04:43 +00:00
Fangrui Song e8054f0933 [PPC32] Emit R_PPC_PLTREL24 for calls to dso_local ifunc
static void *ifunc(void) __attribute__((ifunc("resolver")));
  void foo() { ifunc(); }

The relocation produced by the ifunc() call:

1. gcc -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
2. gcc -msecure-plt -PIE => R_PPC_PLTREL24 r_addend=0x8000
3. clang -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
4. clang -msecure-plt -fPIE => R_PPC_REL24

4 is incorrect. The R_PPC_REL24 needs a call stub due to ifunc. If this
relocation is mixed with other R_PPC_PLTREL24(r_addend=0x8000) in a
function, both GNU ld and lld (after D71621 fix) may produce a wrong
result.

This patch fixes 4 to use R_PPC_PLTREL24, which matches GCC.
Both GNU ld and lld (after D71621) will be happy.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D71649
2019-12-20 11:32:02 -08:00
Justin Hibbits d3aeac8e20 [PowerPC] Only use PLT annotations if using PIC relocation model
Summary:
The default static (non-PIC, non-PIE) model for 32-bit powerpc does not
use @PLT annotations and relocations in GCC.  LLVM shouldn't use @PLT
annotations either, because it breaks secure-PLT linking with (some
versions of?) GNU LD.

Update the available-externally.ll test to reflect that default mode should be
the same as the static relocation, by using the same check prefix.

Reviewed by:    sfertile
Differential Revision: https://reviews.llvm.org/D70570
2019-12-19 09:27:13 -06:00
Stefan Pintilie ec3d6f3ecb [PowerPC][NFC] Refactor splat of constant to vector.
Refactor the splatting of a constant to a vector so that common code is used
both for Power9 and Power8.

Patch by: Anil Mahmud

Differential Revision: https://reviews.llvm.org/D71481
2019-12-18 12:43:19 -06:00
Nemanja Ivanovic a5da8d90da [PowerPC] Add missing legalization for vector BSWAP
We somehow missed doing this when we were working on Power9 exploitation.
This just adds the missing legalization and cost for producing the vector
intrinsics.

Differential revision: https://reviews.llvm.org/D70436
2019-12-17 19:07:34 -06:00
Sean Fertile 93faa237da [PowerPC] Add Support for indirect calls on AIX.
Extends the desciptor-based indirect call support for 32-bit codegen,
and enables indirect calls for AIX.

In-depth Description:
In a function descriptor based ABI, a function pointer points at a
descriptor structure as opposed to the function's entry point. The
descriptor takes the form of 3 pointers: 1 for the function's entry
point, 1 for the TOC anchor of the module containing the function
definition, and 1 for the environment pointer:

struct FunctionDescriptor {
  void *EntryPoint;
  void *TOCAnchor;
  void *EnvironmentPointer;
};

An indirect call has several steps of loading the the information from
the descriptor into the proper registers for setting up the call. Namely
it has to:

1) Save the caller's TOC pointer into the TOC save slot in the linkage
   area, and then load the callee's TOC pointer into the TOC register
   (GPR 2 on AIX).

2) Load the function descriptor's entry point into the count register.

3) Load the environment pointer into the environment pointer register
   (GPR 11 on AIX).

4) Perform the call by branching on count register.

5) Restore the caller's TOC pointer after returning from the indirect call.

A couple important caveats to the above:

- There is no way to directly load a value from memory into the count register.
  Instead we populate the count register by loading the entry point address into
  a gpr and then moving the gpr to the count register.

- The TOC restore has to come immediately after the branch on count register
  instruction (i.e., the 1st instruction executed after we return from the
  call). This is an implementation limitation. We could, in theory, schedule
  the restore elsewhere as long as no uses of the TOC pointer fall in between
  the call and the restore; however, to keep it simple, we insert a pseudo
  instruction that represents both the indirect branch instruction and the
  load instruction that restores the caller's TOC from the linkage area. As
  they flow through the compiler as a single pseudo instruction, nothing can be
  inserted between them and the caller's TOC is then valid at any use.

Differtential Revision: https://reviews.llvm.org/D70724
2019-12-13 20:07:00 -05:00
Reid Kleckner 5d986953c8 [IR] Split out target specific intrinsic enums into separate headers
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
  Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
  object file size.
- Incremental step towards decoupling target intrinsics.

The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.

Part of PR34259

Reviewers: efriedma, echristo, MaskRay

Reviewed By: echristo, MaskRay

Differential Revision: https://reviews.llvm.org/D71320
2019-12-11 18:02:14 -08:00
QingShan Zhang eba7cbd3d0 [NFC][PowerPC] Remove the dead conditions in the if(cond) 2019-12-11 09:57:06 +00:00
Huihui Zhang 6507e13589 [NFC] Add { } to silence compiler warning [-Wmissing-braces].
../llvm/lib/Target/PowerPC/PPCISelLowering.cpp:5371:37: warning: suggest braces around initialization of subobject [-Wmissing-braces]
  std::array<EVT, 2> ReturnTypes = {MVT::Other, MVT::Glue};
                                    ^~~~~~~~~~~~~~~~~~~~~
                                    {                    }
2019-12-09 17:19:34 -08:00
Jinsong Ji 3d41a58eac [PowerPC][NFC] Rename ANDI(S)o8 to ANDI(S)8o
Summary:
This is found during https://reviews.llvm.org/D70758
All the other record forms are having suffix o at the end.
ANDIo8 and ANDISo8 are the only two that put o before 8.

This patch rename them to be consistent with others.

Reviewers: #powerpc, hfinkel, nemanjai, lei, steven.zhang, echristo, jhibbits, joerg

Reviewed By: jhibbits

Subscribers: wuzish, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70928
2019-12-09 19:21:34 +00:00
Sean Fertile c78726fae0 [PowerPC] Refactor FinishCall. [NFC]
Refactor FinishCall to be more easily understandable as a precursor to
implementing indirect calls for AIX. The refactor tries to group similar
code together at the cost of some code duplication. The high level
overview of the refactor:

- Adds a number of helper functions for things like:
  * Determining if a call is indirect.
  * What the Opcode for a call is.
  * Transforming the callee for a direct function call.
  * Extracting the Chain operand from a CallSeqStart node.
  * Building the operands of the call.

- Adds helpers for building the indirect call DAG nodes
  (excluding the call instruction itself which is created in
  `FinishCall`).

- Removes PrepareCall, which has been subsumed by the
  helpers.

- Rename 'InFlag' to 'Glue'.

- FinishCall has been refactored to:
  1) Set TOC pointer usage on the DAG for the TOC based
     subtargets.
  2) Calculate if a call is indirect.
  3) Determine the Opcode to use for the call
     instruction.
  4) Transform the Callee for direct calls, or build
     the DAG nodes for indirect calls.
  5) Buildup the call operands.
  6) Emit the call instruction.
  7) If needed, emit the callSeqEnd Node and
     finish lowering by calling `LowerCallResult`

Differential Revision: https://reviews.llvm.org/D70126
2019-12-09 12:40:15 -05:00
Sean Fertile 26ab827c24 [PowerPC][AIX] Add support for lowering int/float/double formal arguments.
This patch adds LowerFormalArguments_AIX, support is added for lowering
int, float, and double formal arguments into general purpose and
floating point registers only.

The aix calling convention testcase have been redone to test for caller
and callee functionality in the same lit test.

Patch by Zarko Todorovski!

Differential Revision: https://reviews.llvm.org/D69578
2019-11-29 12:46:53 -05:00
Stefan Pintilie dcceab1a0a [PowerPC] Add new Future CPU for PowerPC in LLVM
This is a continuation of D70262
The previous patch as listed above added the future CPU in clang. This patch
adds the future CPU in the PowerPC backend. At this point the patch simply
assumes that a future CPU will have the same characteristics as pwr9. Those
characteristics may change with later patches.

Differential Revision: https://reviews.llvm.org/D70333
2019-11-27 14:30:06 -06:00
jasonliu 7707d8aa9d [XCOFF][AIX] Check linkage on the function, and two fixes for comments
This is a follow up commit to address post-commit comment in D70443

Differential revision: https://reviews.llvm.org/D70443
2019-11-26 16:09:31 +00:00
Kit Barton 85e4f5bcf6 [PowerPC] Rename DarwinDirective to CPUDirective (NFC)
Summary:
This patch renames the DarwinDirective (used to identify which CPU was defined)
to CPUDirective. It also adds the getCPUDirective() method and replaces all uses
of getDarwinDirective() with getCPUDirective().

Once this patch lands and downstream users of the getDarwinDirective() method
have switched to the getCPUDirective() method, the old getDarwinDirective()
method will be removed.

Reviewers: nemanjai, hfinkel, power-llvm-team, jsji, echristo, #powerpc, jhibbits

Reviewed By: hfinkel, jsji, jhibbits

Subscribers: hiraditya, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70352
2019-11-25 14:26:08 -06:00
Nemanja Ivanovic 7fbaa8097e [PowerPC] Fix VSX clobbers of CSR registers
If an inline asm statement clobbers a VSX register that overlaps with a
callee-saved Altivec register or FPR, we will not record the clobber and will
therefore violate the ABI. This is clearly a bug so this patch fixes it.

Differential revision: https://reviews.llvm.org/D68576
2019-11-25 11:41:34 -06:00
jasonliu 906ecae2ed [AIX][XCOFF] Generate undefined symbol in symbol table for external function call
Summary:
This patch sets up the infrastructure for

 1. Associate MCSymbolXCOFF with an MCSectionXCOFF when it could not
    get implicitly associated.
 2. Generate undefined symbols. The patch itself generates undefined symbol
    for external function call only. Generate undefined symbol for external
    global variable and external function descriptors will be handled in
    separate patch(s) after this is land.

Differential Revision: https://reviews.llvm.org/D70443
2019-11-25 15:02:01 +00:00
QingShan Zhang a4cc895aee [PowerPC] Implement the vector extend sign instruction pattern match
Power9 has instructions to implement the semantics of SIGN_EXTEND_INREG for vector type.
Mark it as legal and add the match pattern.

Differential Revision: https://reviews.llvm.org/D69601
2019-11-22 08:58:27 +00:00
Xiangling Liao ca33727abe [AIX] Lowering jump table, constant pool and block address in asm
This patch lowering jump table, constant pool and block address in assembly.
1. On AIX, jump table index is always relative;
2. Put CPI and JTI into ReadOnlySection until we support unique data sections;
3. Create the temp symbol for block address symbol;
4. Update MIR testcases and add related assembly part;

Differential Revision: https://reviews.llvm.org/D70243
2019-11-20 10:27:15 -05:00
Matt Arsenault b696b9dba7 DAG: Add function context to isFMAFasterThanFMulAndFAdd
AMDGPU needs to know the FP mode for the function to answer this
correctly when this is removed from the subtarget.

AArch64 had to make this more complicated by using this from an IR
hook, so add an IR typed overload.
2019-11-19 19:25:26 +05:30
Nemanja Ivanovic 9af28400d6 [PowerPC] Option for enabling absolute jumptables with command line
This option allows the user to specify the use of absolute jumptables instead
of relative which is the default on most PPC subtargets.

Patch by Kamauu Bridgeman

Differential revision: https://reviews.llvm.org/D69108
2019-11-07 19:33:15 -06:00
Xiangling Liao 5c9bdc79e1 [AIX] Lowering CPI/JTI/BA to MIR
Enable lowering of constant pool index, jump table index, and bloack address to MIR on AIX.

Differential Revision: https://reviews.llvm.org/D69264
2019-10-30 11:21:37 -04:00
Nemanja Ivanovic 25a41ad242 [PowerPC] Emit scalar fp min/max instructions
VSX provides floating point minimum and maximum instructions that conform
to IEEE semantics. This legalizes the respective nodes and emits VSX code
for them. Furthermore, on Power9 cores we have xsmaxcdp and xsmincdp
instructions that conform to language semantics for the conditional operator
even in the presence of NaNs.

Differential revision: https://reviews.llvm.org/D62993
2019-10-28 19:13:33 -05:00
Sean Fertile 582e3c09d4 [AIX] Refactor AIX Call Lowering to use CCState. NFCI.
This patch reworks the AIX call lowering to use CCState. Some defensive errors
are added in this patch to protect from emitting bad code for calling convention
logic that has not been implemented by design. The use of CCState follows the
precedent of other targets and enables the reuse of calling convention logic in
LowerFormalArguments, which will be rewritten to also use CCState in a late
patch.

Patch by Chris Bowler.

Differential Revision: https://reviews.llvm.org/D69101
2019-10-28 12:44:22 -04:00
Xiangling Liao ee68f1ec67 [NFC] Replace 'isDarwin' with 'IsDarwin'
Summary: Replace 'isDarwin' with 'IsDarwin' based on LLVM naming convention.

Differential Revision: https://reviews.llvm.org/D68336

llvm-svn: 373852
2019-10-06 14:44:22 +00:00
Matt Arsenault f24ac13aaa TLI: Remove DAG argument from getRegisterByName
Replace with the MachineFunction. X86 is the only user, and only uses
it for the function. This removes one obstacle from using this in
GlobalISel. The other is the more tolerable EVT argument.

The X86 use of the function seems questionable to me. It checks hasFP,
before frame lowering.

llvm-svn: 373292
2019-10-01 01:44:39 +00:00
Guillaume Chatelet 18f805a7ea [Alignment][NFC] Remove unneeded llvm:: scoping on Align types
llvm-svn: 373081
2019-09-27 12:54:21 +00:00
Benjamin Kramer 1b38002c7d Move classes into anonymous namespaces. NFC.
llvm-svn: 372495
2019-09-22 09:28:47 +00:00
Jinsong Ji e065e5f12a [NFC][PowerPC] Refactor classifyGlobalReference
We always(and only) check the NLP flag after calling
classifyGlobalReference to see whether it is accessed
indirectly.

Refactor to code to use isGVIndirectSym instead.

llvm-svn: 372417
2019-09-20 18:21:07 +00:00
Guillaume Chatelet 35b4b403b4 [Alignment][NFC] Use Align::None instead of 1
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67704

llvm-svn: 372230
2019-09-18 15:40:20 +00:00
Nemanja Ivanovic 1461fb6e78 [PowerPC] Exploit single instruction load-and-splat for word and doubleword
We currently produce a load, followed by (possibly a move for integers and) a
splat as separate instructions. VSX has always had a splatting load for
doublewords, but as of Power9, we have it for words as well. This patch just
exploits these instructions.

Differential revision: https://reviews.llvm.org/D63624

llvm-svn: 372139
2019-09-17 16:45:20 +00:00
Graham Hunter 1a9195d817 [SVE][MVT] Fixed-length vector MVT ranges
* Reordered MVT simple types to group scalable vector types
    together.
  * New range functions in MachineValueType.h to only iterate over
    the fixed-length int/fp vector types.
  * Stopped backends which don't support scalable vector types from
    iterating over scalable types.

Reviewers: sdesmalen, greened

Reviewed By: greened

Differential Revision: https://reviews.llvm.org/D66339

llvm-svn: 372099
2019-09-17 10:19:23 +00:00
Nemanja Ivanovic e63c676825 [PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
Add the missing piece of r372029.
Somehow when the patch for review D61961 was committed, only the test case
went in and the code didn't. This of course caused all kinds of build bot
breaks.
This patch just adds the code for that patch.

Author: Lei Huang
Differential revision: https://reviews.llvm.org/D61961

llvm-svn: 372043
2019-09-16 22:54:52 +00:00
Craig Topper 36e04d14e9 [PowerPC] Remove the SPE4RC register class and instead add f32 to the GPRC register class.
Summary:
Since the SPE4RC register class contains an identical set of registers
and an identical spill size to the GPRC class its slightly confusing
the tablegen emitter. It's preventing the GPRC_and_GPRC_NOR0 synthesized
register class from inheriting VTs and AltOrders from GPRC or GPRC_NOR0.
This is because SPE4C is found first in the super register class list
when inheriting these properties and it doesn't set the VTs or
AltOrders the same way as GPRC or GPRC_NOR0.

This patch replaces all uses of GPE4RC with GPRC and allows GPRC and
GPRC_NOR0 to contain f32.

The test changes here are because the AltOrders are being inherited
to GPRC_NOR0 now.

Found while trying to determine if getCommonSubClass needs to take
a VT argument. It was originally added to support fp128 on x86-64,
I've changed some things about that so that it might be needed
anymore. But a PowerPC test crashed without it and I think its
due to this subclass issue.

Reviewers: jhibbits, nemanjai, kbarton, hfinkel

Subscribers: wuzish, nemanjai, mehdi_amini, hiraditya, kbarton, MaskRay, dexonsmith, jsji, shchenz, steven.zhang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67513

llvm-svn: 371779
2019-09-12 22:07:35 +00:00
Guillaume Chatelet 3729b17cff [Alignment][NFC] Use llvm::Align for TargetLowering::getPrefLoopAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Reviewed By: courbet

Subscribers: wuzish, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67386

llvm-svn: 371511
2019-09-10 12:00:43 +00:00
Guillaume Chatelet b6722af068 [Alignment] Use Align for TargetLowering::MinStackArgumentAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67288

llvm-svn: 371498
2019-09-10 09:01:18 +00:00
Craig Topper 5ebd0a6e88 [SelectionDAG] Remove ISD::FP_ROUND_INREG
I don't think anything in tree creates this node. So all of this
code appears to be dead.

Code coverage agrees
http://lab.llvm.org:8080/coverage/coverage-reports/llvm/coverage/Users/buildslave/jenkins/workspace/clang-stage2-coverage-R/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp.html

Differential Revision: https://reviews.llvm.org/D67312

llvm-svn: 371431
2019-09-09 17:54:44 +00:00
Guillaume Chatelet ad1cea0dda [Alignment][NFC] Use Align with TargetLowering::setPrefFunctionAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: nemanjai, javed.absar, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67267

llvm-svn: 371212
2019-09-06 15:03:49 +00:00
Guillaume Chatelet 9fcf066d0c [Alignment][NFC] Use Align with TargetLowering::setPrefLoopAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67278

llvm-svn: 371210
2019-09-06 14:51:15 +00:00
Guillaume Chatelet 4fc3ad9e13 [Alignment][NFC] Use Align with TargetLowering::setMinFunctionAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jyknight, sdardis, nemanjai, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67229

llvm-svn: 371200
2019-09-06 12:48:34 +00:00
Guillaume Chatelet aff45e4b23 [LLVM][Alignment] Make functions using log of alignment explicit
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:

 - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
 - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
 - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,

Reviewers: lattner, thegameg, courbet

Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65945

llvm-svn: 371045
2019-09-05 10:00:22 +00:00
Roland Froese b4051e57b1 [PowerPC] Expand v1i128 smin
The smin opcode and friends for v1i128 are incorrectly marked as legal for PPC.
Change them to expand.

Differential Revision: https://reviews.llvm.org/D64960

llvm-svn: 369797
2019-08-23 19:04:47 +00:00
Sean Fertile 5f85a7b1cf [PowerPC] Add combined ELF ABI and 32/64 bit queries to the subtarget. [NFC]
A lot of places in the code combine checks for both ABI (SVR4/Darwin/AIX) and
addressing mode (64-bit vs 32-bit). In an attempt to make some of the code more
readable I've added a couple functions that combine checking for the ELF abi and
64-bit/32-bit code at once. As we add more AIX support I intend to add similar
functions for the AIX ABI.

Differential Revision: https://reviews.llvm.org/D65814

llvm-svn: 369658
2019-08-22 15:11:28 +00:00
Daniel Sanders 0c47611131 Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041
2019-08-15 19:22:08 +00:00
Jason Liu 8fc095d453 [AIX] Add call lowering for parameters that could pass onto FPRs
Summary:
This patch adds call lowering functionality to enable passing
parameters onto floating point registers when needed.

Differential Revision: https://reviews.llvm.org/D63654

llvm-svn: 368855
2019-08-14 14:13:11 +00:00
Xiangling Liao a8c624a1c4 [AIX]Lowering global address for 32/64bit small/large code models
This patch implements global address lowering for 32/64 bit with small/large code models.
    1.For 32bit large code model on AIX, there are newly added pseudo opcode LWZtocL & ADDIStocHA32, the support of which on MC layer will be
       provided by future patches.
    2.The default code model on AIX should be small code model.
    3.Since AIX does not have medium code model, "report_fatal_error" when users specify it.

    Differential Revision: https://reviews.llvm.org/D63547

llvm-svn: 368744
2019-08-13 20:29:01 +00:00
Qiu Chaofan 4fb99a3330 [PowerPC] Fix ICE when truncating some vectors
The legalizer would hit an assertion on PowerPC platform when truncating
a vector whose size is not power of 2.  This patch is to add a check to
prevent vectors with such odd-size elements from being custom lowered.

Reviewed By: Hal Finkel

Differential Revision: https://reviews.llvm.org/D65261

llvm-svn: 368654
2019-08-13 07:53:29 +00:00
Guillaume Chatelet c97a3d15d2 [LLVM][Alignment] Introduce Alignment Type
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jfb, jakehehrlich

Reviewed By: jfb

Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65514

llvm-svn: 367828
2019-08-05 11:02:05 +00:00
Zi Xuan Wu 66c320908b recommit:[PowerPC] Eliminate loads/swap feeding swap/store for vector type by using big-endian load/store
In PowerPC, there is instruction to load vector in big endian element order when it's in little endian target. 
So we can combine vector load + reverse into big endian load to eliminate the swap instruction.
Also combine vector reverse + store into big endian store.

Differential Revision: https://reviews.llvm.org/D65063

llvm-svn: 367516
2019-08-01 05:26:02 +00:00
Zi Xuan Wu 54d446f70e revert r367382 because buildbot failure
llvm-svn: 367388
2019-07-31 07:03:42 +00:00
Zi Xuan Wu e85f6bf66c [PowerPC] Eliminate loads/swap feeding swap/store for vector type by using big-endian load/store
In PowerPC, there is instruction to load vector in big endian element order when it's in little endian target. 
So we can combine vector load + reverse into big endian load to eliminate the swap instruction.
Also combine vector reverse + store into big endian store.

llvm-svn: 367382
2019-07-31 02:56:00 +00:00
Jason Liu 8dd563ef4b [NFC][PowerPC]Change ADDIStocHA to ADDIStocHA8 to follow 64-bit naming convention
Summary:

Since we are planning to add ADDIStocHA for 32bit in later patch, we decided
 to change 64bit one first to follow naming convention with 8 behind opcode.

Patch by: Xiangling_L

Differential Revision: https://reviews.llvm.org/D64814

llvm-svn: 366731
2019-07-22 19:55:33 +00:00
Justin Hibbits 5214956eaa PowerPC/SPE: Fix load/store handling for SPE
Summary:
Pointed out in a comment for D49754, register spilling will currently
spill SPE registers at almost any offset.  However, the instructions
`evstdd` and `evldd` require a) 8-byte alignment, and b) a limit of 256
(unsigned) bytes from the base register, as the offset must fix into a
5-bit offset, which ranges from 0-31 (indexed in double-words).

The update to the register spill test is taken partially from the test
case shown in D49754.

Additionally, pointed out by Kei Thomsen, globals will currently use
evldd/evstdd, though the offset isn't known at compile time, so may
exceed the 8-bit (unsigned) offset permitted.  This fixes that as well,
by forcing it to always use evlddx/evstddx when accessing globals.

Part of the patch contributed by Kei Thomsen.

Reviewers: nemanjai, hfinkel, joerg

Subscribers: kbarton, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D54409

llvm-svn: 366318
2019-07-17 12:30:04 +00:00
David Tenty a2681296e0 [NFC]Fix IR/MC depency issue for function descriptor SDAG implementation
Summary: llvm/IR/GlobalValue.h can't be included in MC, that creates a circular dependency between MC and IR libraries. This circular dependency is causing an issue for build system that enforce layering.

Author: Xiangling_L

Reviewers: sfertile, jasonliu, hubert.reinterpretcast, gribozavr

Reviewed By: gribozavr

Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64445

llvm-svn: 365701
2019-07-10 22:13:55 +00:00
Fangrui Song 1f333562de [PowerPC] Support constraint code "ww"
Summary:
"ww" and "ws" are both constraint codes for VSX vector registers that
hold scalar double data. "ww" is preferred for float while "ws" is
preferred for double.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D64119

llvm-svn: 365106
2019-07-04 04:44:42 +00:00
Roman Lebedev c4b83a6054 [Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)
Summary:
This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]].
In middle-end, we'd want to prefer the form with two adds - D63992,
but as this diff shows, not every target will prefer that pattern.

Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars,
but only X86 prefer that same pattern for vectors.

Here i'm adding a new TLI hook, always defaulting to the inc-of-add,
but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars.

Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel

Reviewed By: efriedma

Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64090

llvm-svn: 365010
2019-07-03 09:41:35 +00:00
Jinsong Ji 157b073fa5 [PowerPC][HTM] Fix disassembling buffer overflow for tabortdc and others
This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751
llvm-mc aborted when disassembling tabortdc.

This patch try to clean up TM related DAGs.

* Fixes the problem by remove explicit output of cr0, and put it as implicit def.
* Update int_ppc_tbegin pattern to accommodate the implicit def of cr0.
* Update the TCHECK operand and int_ppc_tcheck accordingly.
* Add some builtin test and disassembly tests.
* Remove unused CRRC0/crrc0

Differential Revision: https://reviews.llvm.org/D61935

llvm-svn: 364544
2019-06-27 14:11:31 +00:00
Nemanja Ivanovic 8265e8ff36 [PowerPC] Mark FCOPYSIGN legal for FP vectors
This was just an omission in the back end. We have had the instructions for both
single and double precision for a few HW generations, but never got around to
legalizing these.

Differential revision: https://reviews.llvm.org/D63634

llvm-svn: 364373
2019-06-26 01:48:57 +00:00
Matt Arsenault e3a676e9ad CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

llvm-svn: 364191
2019-06-24 15:50:29 +00:00
Justin Hibbits 1d1cf30b73 PowerPC: Optimize SPE double parameter calling setup
Summary:
SPE passes doubles the same as soft-float, in register pairs as i32
types.  This is all handled by the target-independent layer.  However,
this is not optimal when splitting or reforming the doubles, as it
pushes to the stack and loads from, on either side.

For instance, to pass a double argument to a function, assuming the
double value is in r5, the sequence currently looks like this:

    evstdd      5, X(1)
    lwz         3, X(1)
    lwz         4, X+4(1)

Likewise, to form a double into r5 from args in r3 and r4:

    stw         3, X(1)
    stw         4, X+4(1)
    evldd       5, X(1)

This optimizes the fence to use SPE instructions.  Now, to pass a double
to a function:

    mr          4, 5
    evmergehi   3, 5, 5

And to form a double into r5 from args in r3 and r4:

    evmergelo   5, 3, 4

This is comparable to the way that gcc generates the double splits.

This also fixes a bug with expanding builtins to libcalls, where the
LowerCallTo() code path was generating intermediate illegal type nodes.

Reviewers: nemanjai, hfinkel, joerg

Subscribers: kbarton, jfb, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D54583

llvm-svn: 363526
2019-06-17 03:15:23 +00:00
Kang Zhang 2d51adcb57 [PowerPC] Set the innermost hot loop to align 32 bytes
Summary:
If the nested loop is an innermost loop, prefer to a 32-byte alignment, so that
we can decrease cache misses and branch-prediction misses. Actual alignment of
 the loop will depend on the hotness check and other logic in alignBlocks.

The old code will only align hot loop to 32 bytes when the LoopSize larger than
16 bytes and smaller than 32 bytes, this patch will align the innermost hot loop
 to 32 bytes not only for the hot loop whose size is 16~32 bytes.

Reviewed By: steven.zhang, jsji

Differential Revision: https://reviews.llvm.org/D61228

llvm-svn: 363495
2019-06-15 15:10:24 +00:00
Simon Pilgrim 4e0648a541 [TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123)
As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space.

This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them.

If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores.

Differential Revision: https://reviews.llvm.org/D63075

llvm-svn: 363179
2019-06-12 17:14:03 +00:00
Sam Parker c5ef502ee8 [CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
    
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774
2019-06-07 07:35:30 +00:00
Nemanja Ivanovic ef4a3aa549 [PowerPC] Exploit the vector min/max instructions
Use the PPC vector min/max instructions for computing the corresponding
operation as these should be faster than the compare/select sequences
we currently emit.

Differential revision: https://reviews.llvm.org/D47332

llvm-svn: 362759
2019-06-06 23:49:01 +00:00
Jason Liu 60ec248148 [AIX] Implement function descriptor on SDAG
Summary:
(1) Function descriptor on AIX
On AIX, a called routine may have 2 distinct symbols associated with it:
 * A function descriptor (Name)
 * A function entry point (.Name)

The descriptor structure on AIX is the same as those in the ELF V1 ABI:
 * The address of the entry point of the function.
 * The TOC base address for the function.
 * The environment pointer.

The descriptor symbol uses the same name as the source level function in C.
The function entry point is analogous to the symbol we would generate for a
 function in a non-descriptor-based ABI, except that it is renamed by
prepending a ".".

Which symbol gets referenced depends on the context:
 * Taking the address of the function references the descriptor symbol.
 * Calling the function references the entry point symbol.

(2) Speaking of implementation on AIX, for direct function call target, we
 create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to
 replace original TargetGlobalAddress SDNode. Then down the path, we can
 take advantage of this MCSymbol.

Patch by: Xiangling_L

Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara

Differential Revision: https://reviews.llvm.org/D62532

llvm-svn: 362735
2019-06-06 19:13:36 +00:00
Jason Liu 0338b88861 [AIX] Implement call lowering with parameters could pass onto GPRs
Summary:
This patch implements SDAG call lowering on AIX for functions
which only have parameters that could fit into GPRs.

Reviewers: hubert.reinterpretcast, syzaara

Differential Revision: https://reviews.llvm.org/D62823

llvm-svn: 362708
2019-06-06 14:36:43 +00:00
Jason Liu 8e1d921bb3 Implement call lowering without parameters on AIX
Summary:dd
This patch implements call lowering for calls without parameters
on AIX as initial support.

Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma

Differential Revision: https://reviews.llvm.org/D61948

llvm-svn: 361669
2019-05-24 20:54:35 +00:00
Chen Zheng 9970665f60 [PowerPC] [ISEL] select x-form instruction for unaligned offset
Differential Revision: https://reviews.llvm.org/D62173

llvm-svn: 361346
2019-05-22 02:57:31 +00:00
Chen Zheng c4c407a0eb [PowerPC] use more meaningful name - NFC
llvm-svn: 361218
2019-05-21 03:54:42 +00:00
Lei Huang 1ac6e9636c [PowerPC] custom lower `v2f64 fpext v2f32`
Reduces scalarization overhead via custom lowering of v2f64 fpext v2f32.

eg. For the following IR
  %0 = load <2 x float>, <2 x float>* %Ptr, align 8
  %1 = fpext <2 x float> %0 to <2 x double>
  ret <2 x double> %1

Pre custom lowering:
  ld r3, 0(r3)
  mtvsrd f0, r3
  xxswapd vs34, vs0
  xscvspdpn f0, vs0
  xxsldwi vs1, vs34, vs34, 3
  xscvspdpn f1, vs1
  xxmrghd vs34, vs0, vs1

After custom lowering:
  lfd f0, 0(r3)
  xxmrghw vs0, vs0, vs0
  xvcvspdp vs34, vs0

Differential Revision: https://reviews.llvm.org/D57857

llvm-svn: 360429
2019-05-10 14:04:06 +00:00
Nemanja Ivanovic b4f028f0f3 [PowerPC] Use the two-constant NR algorithm for refining estimates
The single-constant algorithm produces infinities on a lot of denormal values.
The precision of the two-constant algorithm is actually sufficient across the
range of denormals. We will switch to that algorithm for now to avoid the
infinities on denormals. In the future, we will re-evaluate the algorithm to
find the optimal one for PowerPC.

Differential revision: https://reviews.llvm.org/D60037

llvm-svn: 360144
2019-05-07 13:48:03 +00:00
Nemanja Ivanovic 70afe4f7e1 [PowerPC] Fix erroneous condition for converting uint-to-fp vector conversion
A condition for exiting the legalization of v4i32 conversion to v2f64 through
extract/convert/build erroneously checks for the extract having type i32.
This is not adequate as smaller extracts are actually legalized to i32 as well.
Furthermore, an early exit is missing which means that we only check that
both extracts are from the same vector if that check fails.
As a result, both cases in the included test case fail - the first gets a
select error and the second generates incorrect code.

The culprit commit is r274535.

llvm-svn: 360043
2019-05-06 13:35:49 +00:00
Simon Pilgrim aa49be4926 Avoid cppcheck operator precedence warnings. NFCI.
Prefer ((X & Y) ? A : B) to (X & Y ? A : B)

llvm-svn: 359884
2019-05-03 13:50:38 +00:00
Kang Zhang 1a0d6d6899 [NFC][PowerPC] Return early if the element type is not byte-sized in combineBVOfConsecutiveLoads
Summary:
Based on the Eli Friedman's comments in https://reviews.llvm.org/D60811 , we'd better return early if the element type is not byte-sized in `combineBVOfConsecutiveLoads`.

Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D61076

llvm-svn: 359764
2019-05-02 08:15:13 +00:00
Sjoerd Meijer 180f1ae57c [TargetLowering] Change getOptimalMemOpType to take a function attribute list
The MachineFunction wasn't used in getOptimalMemOpType, but more importantly,
this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType.

This is the groundwork for the changes in D59766 and D59787, that allows
implementation of TTI::getMemcpyCost.

Differential Revision: https://reviews.llvm.org/D59785

llvm-svn: 359537
2019-04-30 08:38:12 +00:00
Roland Froese 728e139700 [PowerPC] Try harder to avoid load/move-to VSR for partial vector loads
Change the PPCISelLowering.cpp function that decides to avoid update form in
favor of partial vector loads to know about newer load types and to not be
confused by the chain operand.

Differential Revision: https://reviews.llvm.org/D60102

llvm-svn: 359504
2019-04-29 21:08:35 +00:00
Joerg Sonnenberger 8372b467f1 [PowerPC] Allow using initial-exec TLS with PIC
Using initial-exec TLS variables is a reasonable performance
optimisation for system libraries. Use the correct PIC mechanism to get
hold of the GOT to avoid text relocations.

Differential Revision: https://reviews.llvm.org/D61026

llvm-svn: 359146
2019-04-24 22:12:22 +00:00
Sean Fertile 526633deea Add period at end of comment.
llvm-svn: 359144
2019-04-24 21:51:30 +00:00
Kang Zhang 009a21d2fd [PowerPC] Fix wrong ElemSIze when calling isConsecutiveLS()
Summary:
This issue from the bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177

When the two operands for BUILD_VECTOR are same, we will get assert error.
llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&):
Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) &&
"The loads cannot be both consecutive and reverse consecutive."' failed.

This error caused by the wrong ElemSIze when calling isConsecutiveLS(). We
should use `getScalarType().getStoreSize();` to get the ElemSize instread of
 `getScalarSizeInBits() / 8`.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60811

llvm-svn: 358644
2019-04-18 07:24:15 +00:00
Evandro Menezes 85bd3978ae [IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

llvm-svn: 357731
2019-04-04 22:40:06 +00:00
Kang Zhang 05f78b35ae [PowerPC] Add the support for __builtin_setrnd()
Summary:
PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function will use the least significant two bits of integer argument to set the floating point rounding mode.
double __builtin_setrnd(int mode);
The effective values for mode are:
0 - round to nearest
1 - round to zero
2 - round to +infinity
3 - round to -infinity
Note that the mode argument will modulo 4, so if the int argument is greater than 3, it will only use the least significant two bits of the mode. Namely, builtin_setrnd(102)) is equal to builtin_setrnd(2).

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D59405

llvm-svn: 357241
2019-03-29 08:45:24 +00:00
Zi Xuan Wu 1445b77e8c [PowerPC] Strength reduction of multiply by a constant by shift and add/sub in place
A shift and add/sub sequence combination is faster in place of a multiply by constant. 
Because the cycle or latency of multiply is not huge, we only consider such following
worthy patterns.

```
(mul x, 2^N + 1) => (add (shl x, N), x)
(mul x, -(2^N + 1)) => -(add (shl x, N), x)
(mul x, 2^N - 1) => (sub (shl x, N), x)
(mul x, -(2^N - 1)) => (sub x, (shl x, N))
```

And the cycles or latency is subtarget-dependent so that we need consider the
subtarget to determine to do or not do such transformation. 
Also data type is considered for different cycles or latency to do multiply.

Differential Revision: https://reviews.llvm.org/D58950

llvm-svn: 357233
2019-03-29 03:08:39 +00:00
Simon Pilgrim 77482120da Fix for ABS legalization on PPC buildbot.
llvm-svn: 356498
2019-03-19 18:55:46 +00:00
Simon Pilgrim a56f2822d0 [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect
These changes are related to PR37743 and include:

    SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.

    Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.

    Add promoting the integer ABS node in the LegalizeIntegerType.

    Expand-based legalization of integer result for the ABS nodes.

    Expand-based legalization of ABS vector operations.

    Add some integer abs testcases for different typesizes for Thumb arch

    Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
        tmp = (SRA, Hi, 31)
        Lo = (UADDO tmp, Lo)
        Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
        Lo = (XOR tmp, Lo)

    The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
        (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).

    Change integer abs testcases for codegen with the ABS node support for AArch64.
        Indicate that the ABS is legal for the i64 type when the NEON is supported.
        Change the integer abs testcases to show changing of codegen.

    Add combine and legalization of ABS nodes for Thumb arch.

    Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.

For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743

Patch by: @ikulagin (Ivan Kulagin)

Differential Revision: https://reviews.llvm.org/D49837

llvm-svn: 356468
2019-03-19 16:24:55 +00:00
Adhemerval Zanella 664c1ef528 [TargetLowering] Add code size information on isFPImmLegal. NFC
This allows better code size for aarch64 floating point materialization
in a future patch.

Reviewers: evandro

Differential Revision: https://reviews.llvm.org/D58690

llvm-svn: 356389
2019-03-18 18:40:07 +00:00
Roland Froese 732fe22454 [PowerPC] Avoid scalarization of vector truncate
The PowerPC code generator currently scalarizes vector truncates that would fit in a vector register, resulting in vector extracts, scalar operations, and vector merges. This patch custom lowers a vector truncate that would fit in a register to a vector shuffle instead.

Differential Revision: https://reviews.llvm.org/D56507

llvm-svn: 353724
2019-02-11 17:29:14 +00:00
Reid Kleckner 85e72c3d56 [PPC] Include tablegenerated PPCGenCallingConv.inc once
Move the CC analysis implementation to its own .cpp file instead of
duplicating it and artificually using functions in PPCISelLowering.cpp
and PPCFastISel.cpp. Follow-up to the same change done for X86, ARM, and
AArch64.

llvm-svn: 352444
2019-01-29 00:30:35 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Kang Zhang 9d78c60bf4 [PowerPC] Fix machine verify pass error for PATCHPOINT pseudo instruction that bad machine code
Summary:
For SDAG, we pretend patchpoints aren't special at all until we emit the code for the pseudo.
Then the verifier runs and it seems like we have a use of an undefined register (the register will 
be reserved later, but the verifier doesn't know that).

So this patch call setUsesTOCBasePtr before emit the code for the pseudo, so verifier can know 
X2 is a reserved register.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D56148

llvm-svn: 350165
2018-12-30 15:13:51 +00:00
Nemanja Ivanovic 0f7715afe1 [PowerPC] Complete the custom legalization of vector int to fp conversion
A recent patch has added custom legalization of vector conversions of
v2i16 -> v2f64. This just rounds it out for other types where the input vector
has an illegal (narrower) type than the result vector. Specifically, this will
handle the following conversions:

v2i8 -> v2f64
v4i8 -> v4f32
v4i16 -> v4f32

Differential revision: https://reviews.llvm.org/D54663

llvm-svn: 350155
2018-12-29 13:40:48 +00:00
Zi Xuan Wu 5187444345 [NFC] clang-format functions related to r350113
llvm-svn: 350114
2018-12-28 02:45:17 +00:00
Zi Xuan Wu a02a3feecf [PowerPC] Fix assert from machine verify pass that atomic pseudo expanding causes mismatched register class
For atomic value operand which less than 4 bytes need to be masked. 
And the related operation to calculate the newvalue can be done in 32 bit gprc. 
So just use gprc for mask and value calculation.

Differential Revision: https://reviews.llvm.org/D56077

llvm-svn: 350113
2018-12-28 02:12:55 +00:00
Kang Zhang d501a1e596 [PowerPC] Fix the bug of ISD::ADDE to set its second return type to glue
Summary:
This patch is to fix the bug imported by rL341634.
In above submit , the the return type of ISD::ADDE is 
14224: SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i64), 
but in fact, the second return type of ISD::ADDE should be 
MVT::Glue not MVT::i64.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D55977

llvm-svn: 350061
2018-12-25 03:29:51 +00:00
Simon Pilgrim af1ab22a76 [PPC] Always use the version of computeKnownBits that returns a value. NFCI.
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.

llvm-svn: 349903
2018-12-21 14:32:39 +00:00
Kewen Lin a6247e7cf4 [PowerPC]Exploit P9 vabsdu for unsigned vselect patterns
For type v4i32/v8ii16/v16i8, do following transforms:
  (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b)
  (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b)
  (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b)
  (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b)

Differential Revision: https://reviews.llvm.org/D55812

llvm-svn: 349599
2018-12-19 03:04:07 +00:00
Kewen Lin 3dac1252da [PowerPC] Improve vec_abs on P9
Improve the current vec_abs support on P9, generate ISD::ABS node for vector types,
combine ABS node to VABSD node for some special cases to make use of P9 VABSD* insns,
do custom lowering to vsub(vneg later)+vmax if it has no combination opportunity.

Differential Revision: https://reviews.llvm.org/D54783

llvm-svn: 349437
2018-12-18 03:16:43 +00:00
QingShan Zhang 8b7653db72 [NFC] [PowerPC] add an routine in PPCTargetLowering to determine if a global is accessed as got-indirect or not.
In theory, we should let the PPC target to determine how to lower the TOC Entry for globals. 
And the PPCTargetLowering requires this query to do some optimization for TOC_Entry. 

Differential Revision: https://reviews.llvm.org/D54925

llvm-svn: 348108
2018-12-03 03:32:16 +00:00
Nemanja Ivanovic 5cf902ccd4 [PowerPC] Do not use vectors to codegen bswap with Altivec turned off
We have efficient codegen on P9 for lowering bswap that involves moving
the value into a vector reg and moving it back. However, the check under
which we custom lowered it did not adequately reflect the actual requirements.
It required only that the subtarget be an implementation of ISA 3.0 since all
compliant implementations have to provide the vector instructions.
However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9
(i.e. don't emit vector code, don't have to save vector regs for context
switch). So we should require the correct features for this lowering.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39334

llvm-svn: 347376
2018-11-21 02:53:50 +00:00
Nemanja Ivanovic 9b393909e2 [PowerPC] Don't combine to bswap store on 1-byte truncating store
Turns out that there was no check for a store that truncates down
to a single byte when combining a (store (bswap...)) into a byte-swapping
store. This patch just adds that check.

Fixes https://bugs.llvm.org/show_bug.cgi?id=39478.

llvm-svn: 347288
2018-11-20 04:42:31 +00:00
Zi Xuan Wu 6a3c279d1c [PowerPC] Enhance the selection(ISD::VSELECT) of vector type
To make ISD::VSELECT available(legal) so long as there are altivec instruction, otherwise it's default behavior is expanding,
which is legalized at type-legalization phase. Use xxsel to match vselect if vsx is open, or use vsel.


Differential Revision: https://reviews.llvm.org/D49531

llvm-svn: 346824
2018-11-14 02:34:45 +00:00
Reid Kleckner 4dc0b1ac60 Fix clang -Wimplicit-fallthrough warnings across llvm, NFC
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
   of only 'break'.

We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
   doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
   the outer case.

I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.

Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu

Differential Revision: https://reviews.llvm.org/D53950

llvm-svn: 345882
2018-11-01 19:54:45 +00:00
Li Jia He 03170a904f [PowerPC] Support constraint 'wi' in asm
From the gcc manual, we can see that the specific limit of wi inline asm is “FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS”. The link is https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Machine-Constraints.html#Machine-Constraints. We should accept this constraint.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D53265

llvm-svn: 345810
2018-11-01 02:35:17 +00:00
Li Jia He f6fb752fe8 [PowerPC] Fix some missed optimization opportunities in combineSetCC
For both operands are bool, short, int, long, long long, add the following optimization.
1. 0-x == y --> x+y ==0
2. 0-x != y --> x+y != 0

Review: nemanjai
Differential Revision: https://reviews.llvm.org/D53360

llvm-svn: 345366
2018-10-26 06:48:53 +00:00
Nemanja Ivanovic 6a74bfba20 [PowerPC] Keep vector int to fp conversions in vector domain
At present a v2i16 -> v2f64 convert is implemented by extracts to scalar,
scalar converts, and merge back into a vector. Use vector converts instead,
with the int data permuted into the proper position and extended if necessary.

Patch by RolandF.

Differential revision: https://reviews.llvm.org/D53346

llvm-svn: 345361
2018-10-26 03:19:13 +00:00
Stefan Pintilie 927e8bf316 [Power9] Add __float128 support in the backend for bitcast to a i128
Add support to allow bit-casting from f128 to i128 and then
extracting 64 bits from the result.

Differential Revision: https://reviews.llvm.org/D49507

llvm-svn: 345053
2018-10-23 17:11:36 +00:00
QingShan Zhang bc1586352e [PowerPC] Fix the assert of ISD::SIGN_EXTEND_INREG when type is v2i16 and v2i8
For ISD::SIGN_EXTEND_INREG operation of v2i16 and v2i8 types will cause assert because they are registered as custom operation. 
So that the type legalization phase will enter the custom hook, which do not handle ISD::SIGN_EXTEND_INREG operation and fall throw into unreachable assert.

Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52449

llvm-svn: 344109
2018-10-10 02:33:48 +00:00
Nemanja Ivanovic 87873d04c3 [PowerPC] Implement hasBitPreservingFPLogic for types that can be supported
This is the PPC-specific non-controversial part of
https://reviews.llvm.org/D44548 that simply enables this combine for PPC
since PPC has these instructions.
This commit will allow the target-independent portion to be truly target
independent.

llvm-svn: 344077
2018-10-09 20:35:15 +00:00
QingShan Zhang cae9425a3c [PowerPC] Fix the assert of combineBVOfConsecutiveLoads when element num is 1
Building a vector out of multiple loads can be converted to a load of the vector type if the loads are consecutive.
But the special condition is that the element number is 1, such as <1 x i128>. So just early exit to fix the assert.

Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52072

llvm-svn: 342611
2018-09-20 03:09:15 +00:00
Strahinja Petrovic 488fd4e625 [PowerPC] Fix label address calculation for ppc64
This patch fixes calculating address of label for non-pic ppc64.

Differential Revision: https://reviews.llvm.org/D50965

llvm-svn: 342368
2018-09-17 11:03:40 +00:00
Lion Yang c68f78d5d8 [PowerPC] Fix the calling convention for i1 arguments on PPC32
Summary:
Integer types smaller than i32 must be extended to i32 by default.
The feature "crbits" introduced at r202451 handles i1 as a special case,
but it did not extend properly.
The caller was, therefore, passing i1 stack arguments by writing 0/1 to
the first byte of the 4-byte stack object and callee was
reading the first byte for the value.

"crbits" is enabled if the optimization level is greater than 1,
which is very common in "release builds".
Such discrepancies with ABI specification also introduces
potential incompatibility with programs or libraries
built with other compilers e.g. GCC.

Fixes PR38661

Reviewers: hfinkel, cuviper

Subscribers: sylvestre.ledru, glaubitz, nagisa, nemanjai, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D51108

llvm-svn: 342288
2018-09-14 21:26:05 +00:00
QingShan Zhang abbb894ff5 [PowerPC] Combine ADD to ADDZE
On the ppc64le platform, if ir has the following form,

define i64 @addze1(i64 %x, i64 %z) local_unnamed_addr #0 {
entry:
  %cmp = icmp ne i64 %z, CONSTANT      (-32767 <= CONSTANT <= 32768)
  %conv1 = zext i1 %cmp to i64
  %add = add nsw i64 %conv1, %x
  ret i64 %add
}
we can optimize it to the form below.

                                when C == 0
                            --> addze X, (addic Z, -1))
                           /
add X, (zext(setne Z, C))--
                           \    when -32768 <= -C <= 32767 && C != 0
                            --> addze X, (addic (addi Z, -C), -1)

Patch By: HLJ2009 (Li Jia He)
Differential Revision: https://reviews.llvm.org/D51403
Reviewed By: Nemanjai 

llvm-svn: 341634
2018-09-07 07:56:05 +00:00
Nemanja Ivanovic 5d06f17b8a [PowerPC] Revert commit r339779
This commit has caused failures in some internal benchmarks. Temporarily
reverting this patch until the issue can be diagnosed and fixed.

llvm-svn: 340740
2018-08-27 13:20:42 +00:00
Nemanja Ivanovic f2588a28a8 [PowerPC] Recommit r340016 after fixing the reported issue
The internal benchmark failure reported by Google was due to a missing
check for the result type for the sign-extend and shift DAG. This commit
adds the check and re-commits the patch.

llvm-svn: 340734
2018-08-27 11:20:27 +00:00
Eric Christopher 3dc594c1e6 Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction" due to it causing a compiler crash on valid.
This reverts commit r340016, testcase forthcoming.

llvm-svn: 340315
2018-08-21 18:35:08 +00:00
Stefan Pintilie 39869ccf51 [PowerPC] Generate lxsd instead of the ld->mtvsrd sequence for vector loads
This patch addresses:

- Implementation within PPCISelLowering.cpp to check if we should use direct
load into vector instructions (such as lxsd/lfd ) when the scalar_to_vector
function is used; which will allow us to catch as many cases of the
scalar_to_vector uses as possible to translate the ld->mtvsrd sequence into
lxsd.

- Test cases to exhibit the behaviour of emitting lxsd/lfd.

Patch by amyk

Differential revision: https://reviews.llvm.org/D49698

llvm-svn: 340037
2018-08-17 15:15:26 +00:00
Nemanja Ivanovic 39751276b0 [PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction
Add a DAG combine for the PowerPC code generator to generate the Power9 extswsli
extend sign and shift immediate instruction.

Patch by RolandF.

Differential revision: https://reviews.llvm.org/D49879

llvm-svn: 340016
2018-08-17 12:35:44 +00:00
Chandler Carruth c73c0307fe [MI] Change the array of `MachineMemOperand` pointers to be
a generically extensible collection of extra info attached to
a `MachineInstr`.

The primary change here is cleaning up the APIs used for setting and
manipulating the `MachineMemOperand` pointer arrays so chat we can
change how they are allocated.

Then we introduce an extra info object that using the trailing object
pattern to attach some number of MMOs but also other extra info. The
design of this is specifically so that this extra info has a fixed
necessary cost (the header tracking what extra info is included) and
everything else can be tail allocated. This pattern works especially
well with a `BumpPtrAllocator` which we use here.

I've also added the basic scaffolding for putting interesting pointers
into this, namely pre- and post-instruction symbols. These aren't used
anywhere yet, they're just there to ensure I've actually gotten the data
structure types correct. I'll flesh out support for these in
a subsequent patch (MIR dumping, parsing, the works).

Finally, I've included an optimization where we store any single pointer
inline in the `MachineInstr` to avoid the allocation overhead. This is
expected to be the overwhelmingly most common case and so should avoid
any memory usage growth due to slightly less clever / dense allocation
when dealing with >1 MMO. This did require several ergonomic
improvements to the `PointerSumType` to reasonably support the various
usage models.

This also has a side effect of freeing up 8 bits within the
`MachineInstr` which could be repurposed for something else.

The suggested direction here came largely from Hal Finkel. I hope it was
worth it. ;] It does hopefully clear a path for subsequent extensions
w/o nearly as much leg work. Lots of thanks to Reid and Justin for
careful reviews and ideas about how to do all of this.

Differential Revision: https://reviews.llvm.org/D50701

llvm-svn: 339940
2018-08-16 21:30:05 +00:00
Nemanja Ivanovic 5b9a4f8ee5 [PowerPC] Enhance the selection(ISD::VSELECT) of vector type
To make ISD::VSELECT available(legal) so long as there are altivec instruction,
otherwise it's default behavior is expanding.
Use xxsel to match vselect if vsx is open, or use vsel.

In order to do not write many patterns in td file, promote (for vector it's
bitcast) all other type into v4i32 and only pattern match vselect of v4i32 into
vsel or xxsel.

Patch by wuzish
Differential revision: https://reviews.llvm.org/D49531

llvm-svn: 339779
2018-08-15 15:30:36 +00:00
Nemanja Ivanovic 8b4bd09e22 [PowerPC] Don't run BV DAG Combine before legalization if it assumes legal types
When trying to combine a DAG that builds a vector out of sign-extensions of
vector extracts, the code assumes legal input types. Due to that, we have to
disable this combine prior to legalization.
In some cases, the DAG will look slightly different after legalization so
account for that in the matching code.

This is a fix for https://bugs.llvm.org/show_bug.cgi?id=38087

Differential Revision: https://reviews.llvm.org/D49080

llvm-svn: 339769
2018-08-15 12:58:13 +00:00
Zaara Syeda b2595b988b [PowerPC] Improve codegen for vector loads using scalar_to_vector
This patch aims to improve the codegen for vector loads involving the
scalar_to_vector (load X) sequence. Initially, ld->mv instructions were used
for scalar_to_vector (load X), so this patch allows scalar_to_vector (load X)
to utilize:

LXSD and LXSDX for i64 and f64
LXSIWAX for i32 (sign extension to i64)
LXSIWZX for i32 and f64

Committing on behalf of Amy Kwan.
Differential Revision: https://reviews.llvm.org/D48950

llvm-svn: 339260
2018-08-08 15:20:43 +00:00
Nemanja Ivanovic e1a525ed06 [PowerPC] Do not round values prior to converting to integer
Adding the FP_ROUND nodes when combining FP_TO_[SU]INT of elements
feeding a BUILD_VECTOR into an FP_TO_[SU]INT of the built vector
loses precision. This patch removes the code that adds these nodes
to true f64 operands. It also adds patterns required to ensure
the code is still vectorized rather than converting individual
elements and inserting into a vector.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38342

Differential Revision: https://reviews.llvm.org/D50121

llvm-svn: 338658
2018-08-02 00:03:22 +00:00
Craig Topper 2f60ef2c78 [DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to BuildSDIV/BuildUDIV/etc.
The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation.

llvm-svn: 338329
2018-07-30 23:22:00 +00:00
Craig Topper a568a27dfa [DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to BuildSDIVPow2.
llvm-svn: 338303
2018-07-30 21:04:34 +00:00
Matt Arsenault 81920b0a25 DAG: Add calling convention argument to calling convention funcs
This seems like a pretty glaring omission, and AMDGPU
wants to treat kernels differently from other calling
conventions.

llvm-svn: 338194
2018-07-28 13:25:19 +00:00
Justin Hibbits 22e939a15b Fix build failures from r337347, found by clang
* Delete a no-longer-used override, and mark the other
getRegisterTypeForCallingConv() as override.
* SPE only supports i32, not i64, as the internal type, so simply remove
the type check, so that DestReg and Opc are provably always set.

GCC 6.4 did not warn about either of the above.

llvm-svn: 337350
2018-07-18 05:19:25 +00:00
Justin Hibbits d52990c71b Introduce codegen for the Signal Processing Engine
Summary:
The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1,
e500v2, and several e200 cores.  This adds support targeting the e500v2,
as this is more common than the e500v1, and is in SoCs still on the
market.

This patch is very intrusive because the SPE is binary incompatible with
the traditional FPU.  After discussing with others, the cleanest
solution was to make both SPE and FPU features on top of a base PowerPC
subset, so all FPU instructions are now wrapped with HasFPU predicates.

Supported by this are:
* Code generation following the SPE ABI at the LLVM IR level (calling
conventions)
* Single- and Double-precision math at the level supported by the APU.

Still to do:
* Vector operations
* SPE intrinsics

As this changes the Callee-saved register list order, one test, which
tests the precise generated code, was updated to account for the new
register order.

Reviewed by: nemanjai
Differential Revision: https://reviews.llvm.org/D44830

llvm-svn: 337347
2018-07-18 04:25:10 +00:00
Justin Hibbits ceb3cd96f7 Add PowerPC e500(v2) core scheduler and directives.
Differential Revision: https://reviews.llvm.org/D44828

llvm-svn: 337345
2018-07-18 04:24:49 +00:00
Stefan Pintilie 133acb22bb [Power9] Add __float128 builtins for Rounding Operations
Added __float128 support for a number of rounding operations:

trunc
rint
nearbyint
round
floor
ceil

Differential Revision: https://reviews.llvm.org/D48415

llvm-svn: 336601
2018-07-09 20:38:40 +00:00
Stefan Pintilie 3d76326d24 [Power9] Add __float128 support for compare operations
Added handling for the select f128.

Differential Revision: https://reviews.llvm.org/D48294

llvm-svn: 336548
2018-07-09 13:36:14 +00:00
Stefan Pintilie b351f09c9e [Power9] Add __float128 library call for frem
Power 9 does not have a hardware instruction for frem but we can call fmodf128.

Differential Revision: https://reviews.llvm.org/D48552

llvm-svn: 336406
2018-07-06 02:47:02 +00:00
Lei Huang 5612b90694 [Power9] Add lib calls for float128 operations with no equivalent PPC instructions
Map the following instructions to the proper float128 lib calls:
  pow[i], exp[2], log[2|10], sin, cos, fmin, fmax

Differential Revision: https://reviews.llvm.org/D48544

llvm-svn: 336361
2018-07-05 15:21:37 +00:00
Lei Huang a855e17f09 [Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg
Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory,
or in memory. This patch ensures that float128 members of non-homogenous
aggregates are passed via VSX registers.

This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128
to a new PPCISD node, BUILD_FP128.

Differential Revision: https://reviews.llvm.org/D48308

llvm-svn: 336310
2018-07-05 06:21:37 +00:00
Lei Huang d17c39ccaa [Power9]Legalize and emit code for quad-precision convert from single-precision
Legalize and emit code for quad-precision floating point operation conversion of
single-precision value to quad-precision.

Differential Revision: https://reviews.llvm.org/D47569

llvm-svn: 336307
2018-07-05 04:18:37 +00:00
Lei Huang a26f3be454 [Power9] Implement float128 parameter passing and return values
This patch enable parameter passing and return by value for float128 types.
Passing aggregate/union which contain float128 members will be submitted in
subsequent patches.

Differential Revision: https://reviews.llvm.org/D47552

llvm-svn: 336306
2018-07-05 04:10:15 +00:00
Lei Huang 6270ab6ce4 [Power9]Legalize and emit code for round & convert quad-precision values
Legalize and emit code for round & convert float128 to double precision and
single precision.

Differential Revision: https://reviews.llvm.org/D46997

llvm-svn: 336299
2018-07-04 21:59:16 +00:00
Strahinja Petrovic bb2b00bb80 [PowerPC] Fix label address calculation for ppc32
This patch fixes calculating address of label on ppc32 (for -fPIC).

Differential Revision: https://reviews.llvm.org/D46582

llvm-svn: 335043
2018-06-19 13:07:40 +00:00
Hiroshi Inoue 0f7f59f073 [PowerPC] fix trivial typos in comment, NFC
llvm-svn: 334583
2018-06-13 08:54:13 +00:00
Amaury Sechet 8467411dad Set ADDE/ADDC/SUBE/SUBC to expand by default
Summary:
They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while.

Target that uses these opcodes are changed in order to ensure their behavior doesn't change.

Reviewers: efriedma, craig.topper, dblaikie, bkramer

Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D47422

llvm-svn: 333748
2018-06-01 13:21:33 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Lei Huang c517e95bc6 [Power9]Legalize and emit code for truncate and convert QP to DW
Legalize and emit code for:

  * xscvqpsdz : VSX Scalar truncate & Convert Quad-Precision to Signed Dword
  * xscvqpudz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Dword

Differential Revision: https://reviews.llvm.org/D45553

llvm-svn: 331787
2018-05-08 18:23:31 +00:00
Lei Huang c29229a644 [PowerPC] Unify handling for conversion of FP_TO_INT feeding a store
Existing DAG combine only handles conversions for FP_TO_SINT:
"{f32, f64} x { i32, i16 }"

This patch simplifies the code to handle:
"{ FP_TO_SINT, FP_TO_UINT } x { f64, f32 } x { i64, i32, i16, i8 }"

Differential Revision: https://reviews.llvm.org/D46102

llvm-svn: 331778
2018-05-08 17:36:40 +00:00
Nemanja Ivanovic 61ffbf21cd Commit r331416 breaks the big-endian PPC bot. On the big endian build, we
actually encounter constants wider than 64-bits. Add the guard to prevent
tripping the assert.

llvm-svn: 331420
2018-05-03 01:04:13 +00:00
Nemanja Ivanovic 01e2e79abf [PowerPC] Implement isMaskAndCmp0FoldingBeneficial
Sinking the and closer to a compare against zero is beneficial on PPC as it
allows us to emit record-form instructions. In the future, we may expand this
to a larger set of operations that feed compares against zero since PPC has
lots of record-form instructions.

Differential revision: https://reviews.llvm.org/D46060

llvm-svn: 331416
2018-05-02 23:55:23 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Lei Huang 42ab1d3d03 [NFC] Move verificaiton check for f128 conversion into LowerINT_TO_FP()
Move veriication check for legal conversions to f128 into LowerINT_TO_FP()
and fix some indentations to match other sections of the code for readability.

llvm-svn: 330138
2018-04-16 17:30:24 +00:00
Lei Huang 10367eb422 [Power9]Legalize and emit code for converting (Un)Signed DWord to Quad-Precision
Legalize and emit code for:

  * xscvsdqp
  * xscvudqp

Differential Revision: https://reviews.llvm.org/D45230

llvm-svn: 329931
2018-04-12 18:00:14 +00:00
Lei Huang 09fda63af0 [Power9]Legalize and emit code for quad-precision fma instructions
Legalize and emit code for the following quad-precision fma:

  * xsmaddqp
  * xsnmaddqp
  * xsmsubqp
  * xsnmsubqp

Differential Revision: https://reviews.llvm.org/D44843

llvm-svn: 329206
2018-04-04 16:43:50 +00:00
Craig Topper 2fa1436206 [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
Lei Huang be0afb0870 [Power9]Legalize and emit code for quad-precision convert from double-precision
Legalize and emit code for quad-precision floating point operation xscvdpqp
and add option to guard the quad precision operation support.

Differential Revision: https://reviews.llvm.org/D44746

llvm-svn: 328558
2018-03-26 17:46:25 +00:00
David Blaikie 36a0f226b1 Fix layering by moving ValueTypes.h from CodeGen to IR
ValueTypes.h is implemented in IR already.

llvm-svn: 328397
2018-03-23 23:58:31 +00:00
David Blaikie 13e77db2df Fix layering of MachineValueType.h by moving it from CodeGen to Support
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)

llvm-svn: 328395
2018-03-23 23:58:25 +00:00
Craig Topper c2dbd677bd [PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code
I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything.

If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all.

I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it.

Differential Revision: https://reviews.llvm.org/D44061

llvm-svn: 328017
2018-03-20 18:49:28 +00:00
Lei Huang 6d1596a98c [PowerPC][Power9]Legalize and emit code for quad-precision add/div/mul/sub
Legalize and emit code for quad-precision floating point operations:

  * xsaddqp
  * xssubqp
  * xsdivqp
  * xsmulqp

Differential Revision: https://reviews.llvm.org/D44506

llvm-svn: 327878
2018-03-19 18:52:20 +00:00
Guozhi Wei 9c916584ba [PPC] Avoid non-simple MVT in STBRX optimization
PR35402 triggered this case. It bswap and stores a 48bit value, current STBRX optimization transforms it into STBRX. Unfortunately 48bit is not a simple MVT, there is no PPC instruction to support it, and it can't be automatically expanded by llvm, so caused a crash.

This patch detects the non-simple MVT and returns early.

Differential Revision: https://reviews.llvm.org/D44500

llvm-svn: 327651
2018-03-15 17:49:12 +00:00
Lei Huang 1f8da3ae19 [PowerPC][NFC] formatting-only fix
llvm-svn: 327599
2018-03-15 03:06:44 +00:00
Craig Topper 80d3bb3b4b [TargetLowering] Rename DAGCombinerInfo::isAfterLegalizeVectorOps to DAGCombiner::isAfterLegalizeDAG since that's what it checks. NFC
The code checks Level == AfterLegalizeDAG which is the fourth and last of the possible DAG combine stages that we have.

There is a Level called AfterLegalVectorOps, but that's the third DAG combine and it doesn't always run.

A function called isAfterLegalVectorOps should imply it returns true in either of the DAG combines that runs after the legalize vector ops stage, but that's not what this function does.

llvm-svn: 326832
2018-03-06 19:44:52 +00:00
Chih-Hung Hsieh 9f9e4681ac [TLS] use emulated TLS if the target supports only this mode
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.

Differential Revision: https://reviews.llvm.org/D42999

llvm-svn: 326341
2018-02-28 17:48:55 +00:00
Lei Huang dfd41552f4 [PowerPC] Reduce stack frame for fastcc functions by only allocating parameter save area when needed
Current implementation always allocates the parameter save area conservatively
for fastcc functions. There is no reason to allocate the parameter save area if
all the parameters can be passed via registers.

Differential Revision: https://reviews.llvm.org/D42602

llvm-svn: 325581
2018-02-20 15:09:45 +00:00
Zaara Syeda 1f59ae311b Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This recommits r322721 reverted due to sanitizer memory leak build bot failures.

Original commit message:
This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.

Differential Revision: https://reviews.llvm.org/D38413

llvm-svn: 323778
2018-01-30 16:17:22 +00:00
Tim Shen 7abe9887b0 [PPC] Avoid incorrect fp-i128-fp lowering.
Summary:
Fix an issue that's similar to what D41411 fixed:
  float(__int128(float_var)) shouldn't be optimized to xscvdpsxds +
  xscvsxdsp, as they mean (float)(int64_t)float_var.

Reviewers: jtony, hfinkel, echristo

Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits, kbarton

Differential Revision: https://reviews.llvm.org/D42400

llvm-svn: 323270
2018-01-23 22:06:57 +00:00
Zaara Syeda c9dc7b451b Revert [PowerPC] This reverts commit rL322721
Failing build bots. Revert the commit now.

llvm-svn: 322748
2018-01-17 20:00:15 +00:00
Zaara Syeda 8e951fd2f6 [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.

Differential Revision: https://reviews.llvm.org/D38413

llvm-svn: 322721
2018-01-17 18:22:55 +00:00
Nemanja Ivanovic ebb23078e9 [PowerPC] Zero-extend the compare operand for ATOMIC_CMP_SWAP
Part of the fix for https://bugs.llvm.org/show_bug.cgi?id=35812.
This patch ensures that the compare operand for the atomic compare and swap
is properly zero-extended to 32 bits if applicable.
A follow-up commit will fix the extension for the SETCC node generated when
expanding an ATOMIC_CMP_SWAP_WITH_SUCCESS. That will complete the bug fix.

Differential Revision: https://reviews.llvm.org/D41856

llvm-svn: 322372
2018-01-12 14:58:41 +00:00
Craig Topper d461aefe5f [PowerPC] Add an ISD::TRUNCATE to the legalization for ppc_is_decremented_ctr_nonzero
Summary:
I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work.

There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed.

With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: nemanjai, llvm-commits, kbarton

Differential Revision: https://reviews.llvm.org/D41654

llvm-svn: 321959
2018-01-07 07:51:36 +00:00
Hiroshi Inoue ca3cdd7f27 [PowerPC] fix a bug in TCO eligibility check
If the callee and caller use different calling convensions, we cannot apply TCO if the callee requires arguments on stack; e.g. C calling convention and Fast CC use the same registers for parameter passing, but the stack offset is not necessarily same.

This patch also recommit r319218 "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." by @sfertile since the problem reported in r320106 should be fixed.

Differential Revision: https://reviews.llvm.org/D40893

llvm-svn: 321579
2017-12-30 08:09:04 +00:00
Tony Jiang eba757e45c [PowerPC] Fix parest build failure in SPEC2017.
The build failure was caused by an assertion in pre-legalization DAGCombine:

Combining: t6: ppcf128 = uint_to_fp t5
... into: t20: f32 = PPCISD::FCFIDUS t19

which is clearly wrong since ppcf128 are definitely different type with f32 and
we cannot change the node value type when do DAGCombine. The fix is don't
handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and
leave it to downstream to legalize it and expand it to small legal types.

Differential Revision: https://reviews.llvm.org/D41411

llvm-svn: 321276
2017-12-21 15:42:50 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Nemanja Ivanovic 1794cdc481 Fix code causing fallthrough warnings in the PPC back end.
llvm-svn: 320806
2017-12-15 11:47:48 +00:00
Matt Arsenault 7d7adf4f2e TLI: Allow using PSV for intrinsic mem operands
llvm-svn: 320756
2017-12-14 22:34:10 +00:00
Matt Arsenault 1117133687 DAG: Expose all MMO flags in getTgtMemIntrinsic
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.

On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.

llvm-svn: 320746
2017-12-14 21:39:51 +00:00
Nemanja Ivanovic 50d37a1129 [PowerPC] Sign-extend negative constant stores
Second part of https://reviews.llvm.org/D40348.
Revision r318436 has extended all constants feeding a store to 64 bits
to allow for CSE on the SDAG. However, negative constants were zero extended
which made the constant being loaded appear to be a positive value larger than
16 bits. This resulted in long sequences to materialize such constants
rather than simply a "load immediate". This patch just sign-extends those
updated constants so that they remain 16-bit signed immediates if they started
out that way.

llvm-svn: 320368
2017-12-11 14:35:48 +00:00
Eric Christopher a469acac03 Temporarily revert "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions."
It is causing sanitizer failures on llvm tests in a bootstrapped compiler. No bot link since it's currently down, but following up to get the bot up.

This reverts commit r319218.

llvm-svn: 320106
2017-12-07 22:26:19 +00:00
Sean Fertile e200016ea9 [PowerPC] Allow tail calls of fastcc functions from C CallingConv functions.
Allow fastcc callees to be tail-called from ccc callers.

Differential Revision: https://reviews.llvm.org/D40355

llvm-svn: 319218
2017-11-28 20:25:58 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Guozhi Wei 433e8d3e04 [PPC] Change i32 constant in store instruction to i64
This patch changes all i32 constant in store instruction to i64 with truncation, to increase the chance that the referenced constant can be shared with other i64 constant.

Differential Revision: https://reviews.llvm.org/D39352

llvm-svn: 318436
2017-11-16 18:27:34 +00:00
Sean Fertile 0f0837e84e [PowerPC] Implement mayBeEmittedAsTailCall for PPC
Implements TargetLowering callback 'mayBeEmittedAsTailCall' that enables
CodeGenPrepare to duplicate returns when they might enable a tail-call.

Differential Revision: https://reviews.llvm.org/D39777

llvm-svn: 318321
2017-11-15 18:58:27 +00:00
Sean Fertile 7b056b3048 [PowerPC] Split out the tailcall calling convention checks. NFC.
Move the calling convention checks for tail-call eligibility for the 64-bit
SysV ABI into a separate function. This is so that it can be shared with
'mayBeEmittedAsTailCall' in a subsequent change.

llvm-svn: 318305
2017-11-15 16:53:41 +00:00
David Blaikie 3f833edc7c Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Graham Yiu 5cd044e8c8 Use new vector insert half-word and byte instructions when we see insertelement on '8 x i16' and '16 x i8' types. Also extended existing lit testcase to cover these cases.
Differential Revision: https://reviews.llvm.org/D34630

llvm-svn: 317613
2017-11-07 20:55:43 +00:00
Graham Yiu 52a52a6cab Fix buildbot breakages from r317503. Add parentheses to assignment when using result as a condition.
llvm-svn: 317508
2017-11-06 21:04:19 +00:00
Graham Yiu 030621bbcb Adds code to PPC ISEL lowering to recognize byte inserts from vector_shuffles, and use P9 shift and vector insert byte instructions instead of vperm. Extends tests from vector insert half-word.
Differential Revision: https://reviews.llvm.org/D34497

llvm-svn: 317503
2017-11-06 20:18:30 +00:00
Guozhi Wei e3b8d9a312 [PPC] Use xxbrd to speed up bswap64
Power doesn't have bswap instructions, so llvm generates following code sequence for bswap64.

  rotldi   5, 3, 16
  rotldi   4, 3, 8
  rotldi   9, 3, 24
  rotldi   10, 3, 32
  rotldi   11, 3, 48
  rotldi   12, 3, 56
  rldimi 4, 5, 8, 48
  rldimi 4, 9, 16, 40
  rldimi 4, 10, 24, 32
  rldimi 4, 11, 40, 16
  rldimi 4, 12, 48, 8
  rldimi 4, 3, 56, 0

But Power9 has vector bswap instructions, they can also be used to speed up scalar bswap intrinsic. With this patch, bswap64 can be translated to:

  mtvsrdd 34, 3, 3
  xxbrd 34, 34
  mfvsrld 3, 34

Differential Revision: https://reviews.llvm.org/D39510

llvm-svn: 317499
2017-11-06 19:09:38 +00:00
Graham Yiu 671526148c Adds code to PPC ISEL lowering to recognize half-word inserts from vector_shuffles, and use P9 shift and vector insert instructions instead of vperm.
Differential Revision: https://reviews.llvm.org/D34160

llvm-svn: 317111
2017-11-01 18:06:56 +00:00
Hiroshi Inoue e3a3e3c9e9 [PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extended
This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass.
If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated.
One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI. 
For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated.

void int_func(int);
void ii_test(int a) {
    if (a & 1) return int_func(a);
}

Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG.

Differential Revision: https://reviews.llvm.org/D31319

llvm-svn: 315888
2017-10-16 04:12:57 +00:00
Matt Arsenault f2db97d8fa DAG: Add opcode and source type to isFPExtFree
This is only currently used for mad/fma transforms.
This is the only case where it should be used for AMDGPU,
so add an opcode to be sure.

llvm-svn: 315740
2017-10-13 19:55:45 +00:00
Hal Finkel 112a6bac72 [PowerPC] Don't use xscvdpspn on the P7
xscvdpspn was not introduced until the P8, so don't use it on the P7. Fixes a
regression introduced in r288152.

llvm-svn: 312612
2017-09-06 03:08:26 +00:00
Tony Jiang 61ef1c540c [PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it more general.
Commit on behalf of Graham Yiu (gyiu@ca.ibm.com)

llvm-svn: 312547
2017-09-05 18:08:02 +00:00
Sean Fertile 00393cce3a [PPC] Refine checks for emiting TOC restore nop and tail-call eligibility.
For the medium and large code models we only need to check if a call crosses
dso-boundaries when considering tail-call elgibility.

Differential Revision: https://reviews.llvm.org/D34245

llvm-svn: 311353
2017-08-21 17:35:32 +00:00
Nemanja Ivanovic 979dcb6f09 [PowerPC] Don't crash on larger splats achieved through 1-byte splats
We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will
produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch
prevents crashing in that situation.

Differential Revision: https://reviews.llvm.org/D35650

llvm-svn: 310358
2017-08-08 13:52:45 +00:00