Commit Graph

1600 Commits

Author SHA1 Message Date
Kazu Hirata 7dc3575ef2 [llvm] Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-01-14 20:30:34 -08:00
Nemanja Ivanovic 3f7b4ce960 [PowerPC] Add support for embedded devices with EFPU2
PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision
hardware floating point instructions. The single precision instructions efs*
and evfs* are identical to the spe float instructions while efd* and evfd*
instructions trigger a not implemented exception.

This patch introduces a new command line option -mefpu2 which leads to
single-hardware / double-software code generation.

[1] Core reference:
  https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf

Differential revision: https://reviews.llvm.org/D92935
2021-01-12 09:47:00 -06:00
Fangrui Song 022cc6e343 [PowerPC] Delete dead Lower* 2021-01-06 21:58:40 -08:00
Fangrui Song bfa6ca07a8 [PowerPC] Delete remnant Darwin ISelLowering code 2021-01-06 21:40:40 -08:00
Qiu Chaofan b6c8feb29f [NFC] [PowerPC] Remove dead code in BUILD_VECTOR peephole
The piece of code tries to use splat+shift to lower build_vector with
repeating bit pattern. And immediate field of vector splat is only 5
bits (-16~15). It iterates over them one by one to find which
shifts/rotates to number in build_vector.

This patch removes code to try matching constant with algebraic
right-shift because that's meaningless - any negative number's algebraic
right-shift won't produce result smaller than itself. Besides, code
(int)((unsigned)i >> j) means logical shift-right in C.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D93937
2021-01-05 11:35:00 +08:00
Kai Luo f904d50c29 [PowerPC] Remaining KnownBits should be constant when performing non-sign comparison
In `PPCTargetLowering::DAGCombineTruncBoolExt`, when checking if it's correct to perform the transformation for non-sign comparison, as the comment says
```
      // This is neither a signed nor an unsigned comparison, just make sure
      // that the high bits are equal.
```
Origin check
```
      if (Op1Known.Zero != Op2Known.Zero || Op1Known.One != Op2Known.One)
        return SDValue();
```
is not strong enough. For example,
```
Op1Known = 111x000x;
Op2Known = 111x000x;
```
Bit 4, besides bit 0, is still unknown and affects the final result.

This patch fixes https://bugs.llvm.org/show_bug.cgi?id=48388.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D93092
2020-12-30 02:00:47 +00:00
Reid Kleckner 0985a8bfea Fix left shift overflow UB in PPC backend on LLP64 platforms 2020-12-19 17:46:09 -08:00
Baptiste Saleil c2892978e9 [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_
On PPC, the vector pair instructions are independent from MMA.
This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names.
We also move the vector pair type/intrinsic/builtin tests to their own files.

Differential Revision: https://reviews.llvm.org/D91974
2020-12-17 13:19:27 -05:00
QingShan Zhang ebdd20f430 Expand the fp_to_int/int_to_fp/fp_round/fp_extend as libcall for fp128
X86 and AArch64 expand it as libcall inside the target. And PowerPC also
want to expand them as libcall for P8. So, propose an implement in the
legalizer to common the logic and remove the code for X86/AArch64 to
avoid the duplicate code.

Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D91331
2020-12-17 07:59:30 +00:00
QingShan Zhang 08e287aaf3 [PowerPC][FP128] Fix the incorrect signature for math library call
The runtime library has two family library implementation for ppc_fp128 and fp128.
For IBM Long double(ppc_fp128), it is suffixed with 'l', i.e(sqrtl). For
IEEE Long double(fp128), it is suffixed with "ieee128" or "f128".
We miss to map several libcall for IEEE Long double.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D91675
2020-12-14 07:52:56 +00:00
Zarko Todorovski ce4040a43d [PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern matching built vectors
Some of the pattern matching in PPCInstrVSX.td and node lowering involving vectors assumes 64bit mode.  This patch disables some of the unsafe pattern matching and lowering of BUILD_VECTOR in 32bit mode.

Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D92789
2020-12-12 15:28:28 -05:00
diggerlin 997d286f2d [AIX][XCOFF] emit traceback table for function in aix
SUMMARY:
 1. added a new option -xcoff-traceback-table to control whether generate traceback table for function.
 2. implement the functionality of emit traceback table of a function.

Reviewers: hubert.reinterpretcast, Jason Liu
Differential Revision: https://reviews.llvm.org/D92398
2020-12-11 17:50:25 -05:00
Stefan Pintilie 2812c15156 [PowerPC] Fix missing nop after call to weak callee.
Weak functions can be replaced by other functions at link time. Previously it
was assumed that no matter what the weak callee function was replaced with it
would still share the same TOC as the caller. This is no longer true as a weak
callee with a TOC setup can be replaced by another function that was compiled
with PC Relative and does not have a TOC at all.

This patch makes sure that all calls to functions defined as weak from a caller
that has a valid TOC have a nop after the call to allow a place for the linker
to restore the TOC.

Reviewed By: NeHuang

Differential Revision: https://reviews.llvm.org/D91983
2020-12-08 09:38:44 -06:00
Qiu Chaofan efdd463050 [PowerPC] Fix chain for i1-to-fp operation
A simple SELECT is used for converting i1 to floating types on ppc32,
but in constrained cases, the chain is not handled properly. This patch
will fix that.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92365
2020-12-07 10:38:56 +08:00
QingShan Zhang c25b039e21 [PowerPC] Fix the regression caused by commit 9c588f53fc
Add a TypeLegal check for MVT::i1 and add the test.
2020-12-04 10:22:13 +00:00
QingShan Zhang 9bf0fea372 [PowerPC] Add the hw sqrt test for vector type v4f32/v2f64
PowerPC ISA support the input test for vector type v4f32 and v2f64.
Replace the software compare with hw test will improve the perf.

Reviewed By: ChenZheng

Differential Revision: https://reviews.llvm.org/D90914
2020-12-03 03:19:18 +00:00
Qiu Chaofan ffa2dce590 [PowerPC] Fix FLT_ROUNDS_ on little endian
In lowering of FLT_ROUNDS_, FPSCR content will be moved into FP register
and then GPR, and then truncated into word.

For subtargets without direct move support, it will store and then load.
The load address needs adjustment (+4) only on big-endian targets. This
patch fixes it on using generic opcodes on little-endian and subtargets
with direct-move.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D91845
2020-12-02 17:16:32 +08:00
QingShan Zhang 47f784ace6 [PowerPC] Promote the i1 to i64 for SINT_TO_FP/FP_TO_SINT
i1 is the native type for PowerPC if crbits is enabled. However, we need
to promote the i1 to i64 as we didn't have the pattern for i1.

Reviewed By: Qiu Chao Fang

Differential Revision: https://reviews.llvm.org/D92067
2020-12-02 05:37:45 +00:00
QingShan Zhang 4d83aba422 [DAGCombine] Adding a hook to improve the precision of fsqrt if the input is denormal
For now, we will hardcode the result as 0.0 if the input is denormal or 0. That will
have the impact the precision. As the fsqrt added belong to the cold path of the
cmp+branch, it won't impact the performance for normal inputs for PowerPC, but improve
the precision if the input is denormal.

Reviewed By: Spatel

Differential Revision: https://reviews.llvm.org/D80974
2020-11-27 02:10:55 +00:00
Zarko Todorovski 6d648e69c0 [AIX] Add support for non var_arg extended vector ABI calling convention on AIX
This patch enables passing non variadic vector type parameters on the caller and callee side and vector return on AIX that are passed in vector registers only.

So far, support is enabled for only the AIX extended Altivec ABI Calling convention.

Reviewed By: sfertile, DiggerLin

Differential Revision: https://reviews.llvm.org/D86476
2020-11-26 12:03:51 -05:00
Simon Pilgrim 0637dfe88b [DAG] Legalize abs(x) -> smax(x,sub(0,x)) iff smax/sub are legal
If smax() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method.

This is also what PowerPC has been doing for its abs implementation, so it lets us get rid of a load of custom lowering code there (and which was never updated when they added smax lowering).

Alive2: https://alive2.llvm.org/ce/z/xRk3cD

Differential Revision: https://reviews.llvm.org/D92095
2020-11-25 15:03:03 +00:00
QingShan Zhang 9c588f53fc [DAGCombine] Add hook to allow target specific test for sqrt input
PowerPC has instruction ftsqrt/xstsqrtdp etc to do the input test for software square root.
LLVM now tests it with smallest normalized value using abs + setcc. We should add hook to
target that has test instructions.

Reviewed By: Spatel, Chen Zheng, Qiu Chao Fang

Differential Revision: https://reviews.llvm.org/D80706
2020-11-25 05:37:15 +00:00
QingShan Zhang fa42f08b26 [PowerPC][FP128] Fix the incorrect calling convention for IEEE long double on Power8
For now, we are using the GPR to pass the arguments/return value for fp128 on Power8,
which is incorrect. It should be VSR. The reason why we do it this way is that,
we are setting the fp128 as illegal which make LLVM try to emulate it with i128 on
Power8. So, we need to correct it as legal.

Reviewed By: Nemanjai

Differential Revision: https://reviews.llvm.org/D91527
2020-11-25 01:43:48 +00:00
Zarko Todorovski c92f29b05e [AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs.
Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX.
The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc.

Reviewed By: Xiangling_L, DiggerLin

Differential Revision: https://reviews.llvm.org/D89684
2020-11-24 18:17:53 -05:00
Sean Fertile 4f5355ee73 [PowerPC] Don't reuse an illegal typed load for int_to_fp conversion.
When the operand to an (s/u)int_to_fp node is an illegally typed load we
cannot reuse the load address since we can not build a proper dependancy
chain. The legalized loads will use a different chain output then the
illegal load. If we reuse the load address then we will build a
conversion node that uses the chain of the illegal load and operations
which modify the memory address in the other dependancy chain can be
scheduled before the floating point load which feeds the conversion.

Differential Revision: https://reviews.llvm.org/D91265
2020-11-24 15:45:33 -05:00
Craig Topper a7eae62a42 [SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version.
The default version only works if the returned node has a single
result. The X86 and PowerPC versions support multiple results
and allow a single result to be returned from a node with
multiple outputs. And allow a single result that is not result 0
of the node.

Also replace the Mips version since the new version should work
for it. The original version handled multiple results, but only
if the new node and original node had the same number of results.

Differential Revision: https://reviews.llvm.org/D91846
2020-11-20 10:06:53 -08:00
Simon Pilgrim 5f3a8074a4 [PPC] Fix dead store value clang static analyzer warning. NFCI.
Simplify the SplatBits 2-byte -> 4-byte 'splat'.
2020-11-17 16:27:45 +00:00
Baptiste Saleil 3f78605a8c [PowerPC] Add paired vector load and store builtins and intrinsics
This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs.

Differential Revision: https://reviews.llvm.org/D90799
2020-11-13 12:35:10 -06:00
Qiu Chaofan 3204ffeade [PowerPC] [NFC] Rename VCMPo to VCMP_rec
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D90581
2020-11-03 11:10:59 +08:00
Nemanja Ivanovic 5459d08795 [PowerPC] Fix single-use check and update chain users for ld-splat
When converting a BUILD_VECTOR or VECTOR_SHUFFLE to a splatting load
as of 1461fb6e78, we inaccurately check
for a single user of the load and neglect to update the users of the
output chain of the original load. As a result, we can emit a new
load when the original load is kept and the new load can be reordered
after a dependent store. This patch fixes those two issues.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47891
2020-10-27 16:49:38 -05:00
Victor Huang 2e1a737f46 [PowerPC][PCRelative] Turn on TLS support for PCRel by default
Turn on TLS support for PCRel by default and update the test cases.

Differential Revision: https://reviews.llvm.org/D88738
Reviewed by: stefanp, kamaub
2020-10-27 13:58:44 -05:00
Amy Kwan 6a946fd06f [DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead.
MULH is often expanded on targets.
This patch removes the isMulhCheaperThanMulShift hook and uses
isOperationLegalOrCustom instead.

Differential Revision: https://reviews.llvm.org/D80485
2020-10-19 12:23:04 -05:00
Kai Luo 354d3106c6 [PowerPC] Skip combining (uint_to_fp x) if x is not simple type
Current powerpc64le backend hits
```
Combining: t7: f64 = uint_to_fp t6
llc: llvm-project/llvm/include/llvm/CodeGen/ValueTypes.h:291: llvm::MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed.
```
This patch fixes it by skipping combination if `t6` is not simple type.
Fixed https://bugs.llvm.org/show_bug.cgi?id=47660.

Reviewed By: #powerpc, steven.zhang

Differential Revision: https://reviews.llvm.org/D88388
2020-10-19 05:23:46 +00:00
Albion Fung d30155feaa [PowerPC] Implementation of 128-bit Binary Vector Rotate builtins
This patch implements 128-bit Binary Vector Rotate builtins for PowerPC10.

Differential Revision: https://reviews.llvm.org/D86819
2020-10-16 18:03:22 -04:00
David Sherwood 47f2dc7e5f [SVE][NFC] Replace some TypeSize comparisons in non-AArch64 Targets
In most of lib/Target we know that we are not dealing with scalable
types so it's perfectly fine to replace TypeSize comparison operators
with their fixed width equivalents, making use of getFixedSize()
and so on.

Differential Revision: https://reviews.llvm.org/D89101
2020-10-15 09:01:21 +01:00
Ahsan Saghir f3202b30b8 [PowerPC] Add assemble disassemble intrinsics for MMA
This patch adds support for assemble disassemble intrinsics
for MMA.

Reviewed By: bsaleil, #powerpc

Differential Revision: https://reviews.llvm.org/D88739
2020-10-13 13:21:58 -05:00
Simon Pilgrim 2c3e4a21f9 [PowerPC] ReplaceNodeResults - bail on funnel shifts and let generic legalizers deal with it
Fixes regression raised on D88834 for 32-bit triple + 64-bit cpu cases (which apparently is a thing).
2020-10-10 19:13:16 +01:00
Fangrui Song 2bd4730850 [PowerPC] Fix signed overflow in decomposeMulByConstant after D88201
Caught by multipliers LONG_MAX (after +1) and LONG_MIN (after -1) in CodeGen/PowerPC/mul-const-i64.ll
2020-10-09 18:29:12 -07:00
Esme-Yi e9fd8823ba [DAGCombiner] Add decomposition patterns for Mul-by-Imm.
Summary: This patch is derived from D87384.
In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns:
```
  mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M))
  mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M))
```
The conversion will be trigged if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`.
More over, the conversion benefits from an ILP improvement since the instructions are independent. A case with the sequence like following also gets benefit since a shift instruction is saved.

```
*res1 = a * 0x8800;
*res2 = a * 0x8080;
```

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D88201
2020-10-09 08:51:40 +00:00
Chen Zheng 0492dd91c4 [PowerPC] add more builtins for PPCTargetLowering::getTgtMemIntrinsic
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D88374
2020-10-06 23:48:33 -04:00
Craig Topper 1127662c6d [SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS.
getNode handling for ISD:SETCC calls FoldSETCC which can canonicalize
FP constants to the RHS. When this happens we should create the node
with the FMF that was requested. By using FlagInserter when can ensure
any calls to getNode/getSetcc during canonicalization will also get the flags.

Differential Revision: https://reviews.llvm.org/D88063
2020-10-05 14:55:23 -07:00
Zarko Todorovski 052c5bf40a [PPC] Do not emit extswsli in 32BIT mode when using -mcpu=pwr9
It looks like in some circumstances when compiling with `-mcpu=pwr9` we create an EXTSWSLI node when which causes llc to fail. No such error occurs in pwr8 or lower.

This occurs in 32BIT AIX and BE Linux. the cause seems to be that the default return in combineSHL is to create an EXTSWSLI node.  Adding a check for whether we are in PPC64 before that fixes the issue.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D87046
2020-09-30 11:06:20 -04:00
Sean Fertile dfb717da1f [PowerPC] Remove support for VRSAVE save/restore/update.
After removal of Darwin as a PowerPC subtarget, the VRSAVE
save/restore/spill/update code is no longer needed by any supported
subtarget, so remove it while keeping support for vrsave and related instruction
aliases for inline asm. I've pre-commited tests to document the existing vrsave
handling in relation to @llvm.eh.unwind.init and inline asm usage, as
well as a test which shows a beahviour change on AIX related to
returning vector type as we were wrongly emiting VRSAVE_UPDATE on AIX.
2020-09-30 10:05:53 -04:00
Baptiste Saleil 0156914275 [PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types
This patch legalizes the v256i1 and v512i1 types that will be used for MMA.

It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two
underlying registers. v512i1 is used for MMA accumulators. So in addition to
loading and storing the 4 associated VSX registers, we generate instructions to
prime (copy the VSX registers to the accumulator) after loading and unprime
(copy the accumulator back to the VSX registers) before storing.

This patch also adds the UACC register class that is necessary to implement the
loads and stores. This class represents accumulator in their unprimed form and
allow the distinction between primed and unprimed accumulators to avoid invalid
copies of the VSX registers associated with primed accumulators.

Differential Revision: https://reviews.llvm.org/D84968
2020-09-28 14:39:37 -05:00
Amy Kwan 2e7117f847 [PowerPC] Implement the 128-bit vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins in Clang/LLVM
This patch implements the vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins for vector signed/unsigned __int128.

Differential Revision: https://reviews.llvm.org/D87910
2020-09-23 16:49:40 -04:00
Albion Fung 88cdbeab41 [PowerPC] Implement Vector signed/unsigned __int128 overloads for the comparison builtins
This patch implements Vector signed/unsigned __int128 overloads for the comparison builtins.

Differential Revision: https://reviews.llvm.org/D87804
2020-09-23 16:49:40 -04:00
Victor Huang 652a8f150d [PowerPC][PCRelative] Thread Local Storage Support for Local Dynamic
This patch is the initial support for the Local Dynamic Thread Local Storage
model to produce code sequence and relocation correct to the ABI for the model
when using PC relative memory operations.

Differential Revision: https://reviews.llvm.org/D87721
2020-09-23 13:48:06 -05:00
Albion Fung d7eb917a7c [PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins
This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10.

Differential: https://reviews.llvm.org/D87394#inline-815858
2020-09-23 01:18:14 -05:00
Qiu Chaofan 1d782c2987 [PowerPC] Pass nofpexcept flag to custom lowered constrained ops
This is a follow-up of D86605. For strict DAG FP node, if its FP
exception behavior metadata is ignore, it should have nofpexcept flag.
But during custom lowering, this flag isn't passed down.

This is also seen on X86 target.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87390
2020-09-21 10:44:25 +08:00
Qiu Chaofan ebfbdebe96 [PowerPC] Fix store-fptoi combine of f128 on Power8
llc would crash for (store (fptosi-f128-i32)) when -mcpu=pwr8, we should
not generate FP_TO_(S|U)INT_IN_VSR for f128 types at this time. This
patch fixes it.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D86686
2020-09-17 10:21:35 +08:00