Commit Graph

1553 Commits

Author SHA1 Message Date
Albion Fung d7eb917a7c [PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins
This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10.

Differential: https://reviews.llvm.org/D87394#inline-815858
2020-09-23 01:18:14 -05:00
Qiu Chaofan 1d782c2987 [PowerPC] Pass nofpexcept flag to custom lowered constrained ops
This is a follow-up of D86605. For strict DAG FP node, if its FP
exception behavior metadata is ignore, it should have nofpexcept flag.
But during custom lowering, this flag isn't passed down.

This is also seen on X86 target.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87390
2020-09-21 10:44:25 +08:00
Qiu Chaofan ebfbdebe96 [PowerPC] Fix store-fptoi combine of f128 on Power8
llc would crash for (store (fptosi-f128-i32)) when -mcpu=pwr8, we should
not generate FP_TO_(S|U)INT_IN_VSR for f128 types at this time. This
patch fixes it.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D86686
2020-09-17 10:21:35 +08:00
Albion Fung 05aa997d51 [PowerPC] Implement __int128 vector divide operations
This patch implements __int128 vector divide operations for ISA3.1.

Differential Revision: https://reviews.llvm.org/D85453
2020-09-15 15:19:35 -04:00
Kamau Bridgeman c0f199e566 [PowerPC] Implement Thread Local Storage Support for Local Exec
This patch is the initial support for the Local Exec Thread Local
Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.

Patch by: Kamau Bridgeman

Differential Revision: https://reviews.llvm.org/D83404
2020-09-14 14:16:28 -05:00
Qiu Chaofan 6afb279100 [PowerPC] [FPEnv] Disable strict FP mutation by default
22a0edd0 introduced a config IsStrictFPEnabled, which controls the
strict floating point mutation (transforming some strict-fp operations
into non-strict in ISel). This patch disables the mutation by default
since we've finished PowerPC strict-fp enablement in backend.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87222
2020-09-10 13:28:09 +08:00
Qiu Chaofan 88ff4d2ca1 [PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering
In standard C library, both rint and nearbyint returns rounding result
in current rounding mode. But nearbyint never raises inexact exception.
On PowerPC, x(v|s)r(d|s)pic may modify FPSCR XX, raising inexact
exception. So we can't select constrained fnearbyint into xvrdpic.

One exception here is xsrqpi, which will not raise inexact exception, so
fnearbyint f128 is okay here.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87220
2020-09-09 22:40:58 +08:00
Brad Smith 88b368a1c4 [PowerPC] Set setMaxAtomicSizeInBitsSupported appropriately for 32-bit PowerPC in PPCTargetLowering
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D86165
2020-09-08 21:21:14 -04:00
Craig Topper b1e68f885b [SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact.:
This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130.

This required adding a SDNodeFlags to SelectionDAG::getSetCC.

Now we manage to contant fold some stuff undefs during the
initial getNode that we don't do in later DAG combines.

Differential Revision: https://reviews.llvm.org/D87200
2020-09-08 15:27:21 -07:00
Qiu Chaofan 705271d9cd [PowerPC] Expand constrained ppc_fp128 to i32 conversion
Libcall __gcc_qtou is not available, which breaks some tests needing
it. On PowerPC, we have code to manually expand the operation, this
patch applies it to constrained conversion. To keep it strict-safe,
it's using the algorithm similar to expandFP_TO_UINT.

For constrained operations marking FP exception behavior as 'ignore',
we should set the NoFPExcept flag. However, in some custom lowering
the flag is missed. This should be fixed by future patches.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D86605
2020-09-05 13:16:20 +08:00
Nemanja Ivanovic 2771407584 [PowerPC] Do not legalize vector FDIV without VSX
Quite a while ago, we legalized these nodes as we added custom
handling for reciprocal estimates in the back end. We have since
moved to target-independent combines but neglected to turn off
legalization. As a result, we can now get selection failures on
non-VSX subtargets as evidenced in the listed PR.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=47373
2020-09-02 16:03:36 -05:00
Albion Fung 331dcc43ea [PowerPC] Implemented Vector Load with Zero and Signed Extend Builtins
This patch implements the builtins for Vector Load with Zero and Signed Extend Builtins (lxvr_x for b, h, w, d), and adds the appropriate test cases for these builtins. The builtins utilize the vector load instructions itnroduced with ISA 3.1.

Differential Revision: 	https://reviews.llvm.org/D82502#inline-797941
2020-08-28 11:28:58 -05:00
Roland Froese b6d7ed469f [PowerPC] Extend custom lower of vector truncate to handle wider input
Current custom lowering of truncate vector handles a source of up to 128 bits, but that only uses one of the two shuffle vector operands. Extend it to use both operands to handle 256 bit sources.

Differential Revision: https://reviews.llvm.org/D68035
2020-08-24 15:33:43 -04:00
Qiu Chaofan 41ba9d7723 [PowerPC] Support constrained vector fp/int conversion
This patch makes these operations legal, and add necessary codegen
patterns.

There's still some issue similar to D77033 for conversion from v1i128
type. But normal type tests synced in vector-constrained-fp-intrinsic
are passed successfully.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D83654
2020-08-24 10:10:27 +08:00
Qiu Chaofan a5b7b8cce0 [PowerPC] Support constrained scalar sitofp/uitofp
This patch adds support for constrained scalar int to fp operations on
PowerPC. Besides, this also fixes the FP exception bit of FCFID*
instructions.

Reviewed By: steven.zhang, uweigand

Differential Revision: https://reviews.llvm.org/D81669
2020-08-22 02:10:29 +08:00
Kamau Bridgeman 365f861c45 [PowerPC][PCRelative] Thread Local Storage Support for Initial Exec
This patch is the initial support for the Intial Exec Thread Local
Local Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D81947
2020-08-21 10:13:11 -05:00
Kamau Bridgeman b74b80bb2d [PowerPC][PCRelative] Thread Local Storage Support for General Dynamic
This patch is the initial support for the General Dynamic Thread Local
Local Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.

Patch by: NeHuang

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D82315
2020-08-20 15:08:13 -05:00
Qiu Chaofan 131b3b9ed4 [PowerPC] Support constrained scalar fptosi/fptoui
This patch adds support for constrained scalar fp to int operations on
PowerPC. Besides, this fixes the FP exception bit of quad-precision
convert & truncate instructions.

Reviewed By: steven.zhang, uweigand

Differential Revision: https://reviews.llvm.org/D81537
2020-08-20 13:29:43 +08:00
Albion Fung 3136cbe29e [PowerPC] Implement Vector Shift Builtins
This patch implements the builtins for the vector shifts (shl, srl, sra), and
adds the appropriate test cases for these builtins. The builtins utilize the
vector shift instructions introduced within ISA 3.1.

Differential Revision: https://reviews.llvm.org/D83338
2020-08-12 18:26:58 -05:00
diggerlin e9ac1495e2 [AIX][XCOFF] change the operand of branch instruction from symbol name to qualified symbol name for function declarations
SUMMARY:

1. in the patch  , remove setting storageclass in function .getXCOFFSection and construct function of class MCSectionXCOFF
there are

XCOFF::StorageMappingClass MappingClass;
XCOFF::SymbolType Type;
XCOFF::StorageClass StorageClass;
in the MCSectionXCOFF class,
these attribute only used in the XCOFFObjectWriter, (asm path do not need the StorageClass)

we need get the value of StorageClass, Type,MappingClass before we invoke the getXCOFFSection every time.

actually , we can get the StorageClass of the MCSectionXCOFF  from it's delegated symbol.

2. we also change the oprand of branch instruction from symbol name to qualify symbol name.
for example change
bl .foo
extern .foo
to
bl .foo[PR]
extern .foo[PR]

3. and if there is reference indirect call a function bar.
we also add
  extern .bar[PR]

Reviewers:  Jason liu, Xiangling Liao

Differential Revision: https://reviews.llvm.org/D84765
2020-08-11 15:26:19 -04:00
Kerry McLaughlin 85c7e89f3b [CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize
Changes the Offset arguments to both functions from int64_t to TypeSize
& updates all uses of the functions to create the offset using TypeSize::Fixed()

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85220
2020-08-11 12:17:10 +01:00
Qiu Chaofan dbcfbffc7a [PowerPC] Add intrinsic to read or set FPSCR register
This patch introduces two intrinsics: llvm.ppc.setflm and
llvm.ppc.readflm. They read from or write to FPSCR register
(floating-point status & control) which contains rounding mode and
exception status.

To ensure correctness of program, we need to prevent FP operations from
being moved across these intrinsics (mffs/mtfsf instruction), so here I
set them as scheduling boundaries. We can relax such restriction if
FPSCR is modeled well in the future.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D84914
2020-08-10 18:27:45 +08:00
Kamau Bridgeman d8c6d083c9 [PowerPC][PCRelative] Set TLS unsupported with PC relative memops
Introduce a fatal error if any thread local storage code is compiled
using pc relative memory operations as well as a hidden override
option `-enable-ppc-pcrel-tls` so that this support can be incrementally
added if possible.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D85448
2020-08-07 10:56:24 -05:00
QingShan Zhang 55de46f3b2 [PowerPC] Support constrained fp operation for setcc
The constrained fp operation fcmp was added by https://reviews.llvm.org/D69281.
This patch is trying to add the support for PowerPC backend.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D81727
2020-08-07 05:16:36 +00:00
Jay Foad 28e322ea93 [PowerPC] Custom lowering for funnel shifts
The custom lowering saves an instruction over the generic expansion, by
taking advantage of the fact that PowerPC shift instructions are well
defined in the shift-by-bitwidth case.

Differential Revision: https://reviews.llvm.org/D83948
2020-08-04 16:30:49 +01:00
Qiu Chaofan 6a78a8dd37 [NFC] [PowerPC] Refactor fp/int conversion lowering
For FP_TO_INT and INT_TO_FP lowering, we have direct-move and
non-direct-move methods. But they share some conversion logic, so we can
reduce redundant code by introducing new methods.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81818
2020-08-04 15:48:16 +08:00
Chen Zheng 45c46d180e [PowerPC] mark r+i as legal address mode for vector type after pwr9
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D84735
2020-08-04 00:02:37 -04:00
Justin Hibbits 7e9153e940 PowerPC: Don't lower SELECT_CC to PPCISD::FSEL on SPE
SPE doesn't have a fsel instruction, so don't try to lower to it.

This fixes a "Cannot select: tN: f64 = PPCISD::FSEL tX, tY, tZ" error.

Reviewed By: #powerpc, lkail
Differential Revision: https://reviews.llvm.org/D77773
2020-07-31 22:52:47 -05:00
Justin Hibbits 914dbf4808 PowerPC: Fix SPE extloadf32 handling.
The patterns were incorrect copies from the FPU code, and are
unnecessary, since there's no extended load for SPE.  Just let LLVM
itself do the work by marking it expand.

Reviewed By: #powerpc, lkail
Differential Revision: https://reviews.llvm.org/D78670
2020-07-31 22:42:57 -05:00
Kang Zhang a18953c1c0 [PowerPC] Fix RM operands for some instructions
Summary:
Some instructions have set the wrong [RM] flag, this patch is to fix it.

Instructions x(v|s)r(d|s)pi[zmp]? and fri[npzm] use fixed rounding
directions without referencing current rounding mode.

Also, the SETRNDi, SETRND, BCLRn, MTFSFI, MTFSB0, MTFSB1, MTFSFb,
MTFSFI, MTFSFI_rec, MTFSF, MTFSF_rec should also fix the RM flag.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D81360
2020-07-30 02:10:49 +00:00
Kang Zhang 802c043078 [PowerPC] Set v1i128 to expand for SETCC to avoid crash
Summary:
PPC only supports the instruction selection for v16i8, v8i16, v4i32,
v2i64, v4f32 and v2f64 for ISD::SETCC, don't support the v1i128, so
v1i128 for ISD::SETCC will crash.

This patch is to set v1i128 to expand to avoid crash.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D84238
2020-07-29 16:39:27 +00:00
jasonliu f8ab66538c [NFC][XCOFF] Use getFunctionEntryPointSymbol from TLOF to simplify logic
Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D84693
2020-07-28 18:59:51 +00:00
Jinsong Ji d28f86723f Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit bf544fa1c3.

Fixed the typo in PPCInstrInfo.cpp.
2020-07-28 14:00:11 +00:00
Jinsong Ji bf544fa1c3 Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit adffce7153.

This is breaking test-suite, revert while investigation.
2020-07-27 21:07:00 +00:00
Jinsong Ji adffce7153 [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support
Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html
no one is making use of QPX/A2Q/BGQ/BGP CNK anymore.

This patch remove the support of QPX/A2Q in llvm, BGQ/BGP in clang,
CNK support in openmp/polly.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83915
2020-07-27 19:24:39 +00:00
Nemanja Ivanovic cdead4f89c [PowerPC][NFC] Fix an assert that cannot trip from 7d076e19e3
I mixed up the precedence of operators in the assert and thought I
had it right since there was no compiler warning. This just
adds the parentheses in the expression as needed.
2020-07-25 20:28:52 -04:00
Amy Kwan 739cd2638b [PowerPC] Exploit the High Order Vector Multiply Instructions on Power10
This patch aims to exploit the following vector multiply high instructions on Power10.
vmulhsw VRT, VRA, VRB
vmulhsd VRT, VRA, VRB
vmulhuw VRT, VRA, VRB
vmulhud VRT, VRA, VRB

Differential Revision: https://reviews.llvm.org/D82584
2020-07-24 20:57:57 -05:00
Nemanja Ivanovic 7d076e19e3 [PowerPC] Fix computation of offset for load-and-splat for permuted loads
Unfortunately this is another regression from my canonicalization patch
(1fed131660). The patch contained two implicit assumptions:
1. That we would have a permuted load only if we are loading a partial vector
2. That a partial vector load would necessarily be as wide as the splat

However, assumption 2 is not correct since it is possible to do a wider
load and only splat a half of it. This patch corrects this assumption by
simply checking if the load is permuted and adjusting the offset if it is.
2020-07-24 15:38:46 -04:00
Amy Kwan 1dc1a3fb0c [PowerPC] Implement low-order Vector Multiply, Modulus and Divide Instructions
This patch aims to implement the low order vector multiply, divide and modulo
instructions available on Power10.

The patch involves legalizing the ISD nodes MUL, UDIV, SDIV, UREM and SREM for
v2i64 and v4i32 vector types in order to utilize the following instructions:
- Vector Multiply Low Doubleword: vmulld
- Vector Modulus Word/Doubleword: vmodsw, vmoduw, vmodsd, vmodud
- Vector Divide Word/Doubleword: vdivsw, vdivsd, vdivuw, vdivud

Differential Revision: https://reviews.llvm.org/D82510
2020-07-23 17:18:36 -05:00
jasonliu b98b1700ef [XCOFF] Enable symbol alias for AIX
Summary:
AIX assembly's .set directive is not usable for aliasing purpose.
We need to use extra-label-at-defintion strategy to generate symbol
aliasing on AIX.

Reviewed By: DiggerLin, Xiangling_L

Differential Revision: https://reviews.llvm.org/D83252
2020-07-22 14:03:55 +00:00
Kai Luo c3f9697f1f [PowerPC] Fix wrong codegen when stack pointer has to realign performing dynalloc
Current powerpc backend generates wrong code sequence if stack pointer
has to realign if `-fstack-clash-protection` enabled. When probing
dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes
`NegSizeReg` as input and returns
`FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated
correctly, however code following `PREPARE_PROBED_ALLOCA` still uses
value of `NegSizeReg`, which does not contain `ActualNegSize` if
`MaxAlign > TargetAlign`, to calculate loop trip count and residual
number of bytes.

This patch is part of fix of
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84152
2020-07-22 06:35:12 +00:00
Chen Zheng e8425b27fe [PowerPC] add store (load float*) pattern to isProfitableToHoist
store (load float*) can be optimized to store(load i32*) in InstCombine pass.

Add store (load float*) to isProfitableToHoist to make sure we don't break
the opt in InstCombine pass.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D82341
2020-07-21 20:55:13 -04:00
diggerlin 11546898e2 [AIX][XCOFF]emit extern linkage for the llvm intrinsic symbol
SUMMARY:

when we call memset, memcopy,memmove etc(this are llvm intrinsic function) in the c source code. the llvm will generate IR
like call call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
for c source code
bash> cat test_memset.call

struct S{
 int a;
 int b;
};
extern struct  S s;
void bar() {
  memset(&s, s.b, s.b);
}
like

%struct.S = type { i32, i32 }
@s = external global %struct.S, align 4
; Function Attrs: noinline nounwind optnone
define void @bar() #0 {
entry:
  %0 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  %1 = trunc i32 %0 to i8
  %2 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
  ret void
}
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #1
If we want to let the aix as assembly compile pass without -u
it need to has following assembly code.
.extern .memset
(we do not output extern linkage for llvm instrinsic function.
even if we output the extern linkage for llvm intrinsic function, we should not out .extern llvm.memset.p0i8.i32,
instead of we should emit .extern memset)

for other llvm buildin function floatdidf . even if we do not call these function floatdidf in the c source code(the generated IR also do not the call __floatdidf . the function call
was generated in the LLVM optimized.
the function is not in the functions list of Module, but we still need to emit extern .__floatdidf

The solution for it as :
We record all the lllvm intrinsic extern symbol when transformCallee(), and emit all these symbol in the AsmPrinter::doFinalization(Module &M)

Reviewers:  jasonliu, Sean Fertile, hubert.reinterpretcast,

Differential Revision: https://reviews.llvm.org/D78929
2020-07-21 16:03:04 -04:00
Fangrui Song eafe7c14ea [PowerPC] Fix combineVectorShuffle regression after D77448
Commit 1fed131660 assumed that NewShuffle (shuffle vector
canonicalization result) will always be ShuffleVectorSDNode, which may
be false (it may be a BITCAST node):

```
...
t12: v4i32 = scalar_to_vector t2
t15: v16i8 = bitcast t12  # LHS
t17: v16i8 = vector_shuffle<u,u,u,u,u,u,u,u,0,1,2,3,u,u,u,u> t15, undef:v16i8  # SVN
```

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D83617
2020-07-13 16:57:27 -07:00
Qiu Chaofan b6912c879e [PowerPC] Support constrained conversion in SPE target
This patch adds support for constrained int/fp conversion between
signed/unsigned i32 and f32/f64.

Reviewed By: jhibbits

Differential Revision: https://reviews.llvm.org/D82747
2020-07-13 12:18:36 +08:00
Lei Huang 90b1a710ae [PowerPC] Enable default support of quad precision operations
Summary: Remove option guarding support of quad precision operations.

Reviewers: nemanjai, #powerpc, steven.zhang

Reviewed By: nemanjai, #powerpc, steven.zhang

Subscribers: qiucf, wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83437
2020-07-10 13:27:48 -05:00
Kai Luo e2b93185b8 [PowerPC] Only make copies of registers on stack in variadic function when va_start is called
On PPC64, for a variadic function, if va_start is not called, it won't
access any variadic argument on stack, thus we can save stores of
registers used to pass arguments.

Differential Revision: https://reviews.llvm.org/D82361
2020-07-09 07:18:17 +00:00
Nemanja Ivanovic 1b1539712e [PowerPC] Do not RAUW combined nodes in VECTOR_SHUFFLE legalization
When legalizing shuffles, we make an attempt to combine it into
a PPC specific canonical form that avoids a need for a swap. If the
combine is successful, we RAUW the node and the custom legalization
replaces the now dead node instead of the one it should replace.
Remove that erroneous call to RAUW.
2020-07-06 22:09:28 -05:00
Amy Kwan c13e3e2c2e [PowerPC][Power10] Exploit the xxsplti32dx instruction when lowering VECTOR_SHUFFLE.
This patch aims to exploit the xxsplti32dx XT, IX, IMM32 instruction when lowering VECTOR_SHUFFLEs.
We implement lowerToXXSPLTI32DX when lowering vector shuffles to check if:
- Element size is 4 bytes
- The RHS is a constant vector (and constant splat of 4-bytes)
- The shuffle mask is a suitable mask for the XXSPLTI32DX instruction where it is one of the 32 masks:
<0, 4-7, 2, 4-7>
<4-7, 1, 4-7, 3>

Differential Revision: https://reviews.llvm.org/D83245
2020-07-06 20:28:38 -05:00
jasonliu 6d3ae365bd [XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s)
Summary:

When a desired symbol name contains invalid character that the
system assembler could not process, we need to emit .rename
directive in assembly path in order for that desired symbol name
to appear in the symbol table.

Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L

Differential Revision: https://reviews.llvm.org/D82481
2020-07-06 15:49:15 +00:00