Commit Graph

6272 Commits

Author SHA1 Message Date
Jinsong Ji adffce7153 [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support
Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html
no one is making use of QPX/A2Q/BGQ/BGP CNK anymore.

This patch remove the support of QPX/A2Q in llvm, BGQ/BGP in clang,
CNK support in openmp/polly.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83915
2020-07-27 19:24:39 +00:00
jasonliu c25f61cf6a [XCOFF][AIX] Handle llvm.used and llvm.compiler.used global array
For now, just return and do nothing when we see llvm.used and
llvm.compiler.used global array.
Hopefully, we could come up with a good solution later to prevent
linker from eliminating symbols in llvm.used array.

Reviewed By: DiggerLin, daltenty

Differential Revision: https://reviews.llvm.org/D84363
2020-07-27 15:28:32 +00:00
biplmish 825ed2d43d [PowerPC] Add Vector Extract Double Instruction Definitions and MC tests.
This patch adds the td definitions and asm/disasm tests for the following instructions:

Vector Extract Double Left Index - vextdubvlx, vextduhvlx, vextduwvlx, vextddvlx
Vector Extract Double Right Index - vextdubvrx, vextduhvrx, vextduwvrx, vextddvrx

Differential Revision: https://reviews.llvm.org/D84384
2020-07-26 23:56:19 -05:00
Nemanja Ivanovic cdead4f89c [PowerPC][NFC] Fix an assert that cannot trip from 7d076e19e3
I mixed up the precedence of operators in the assert and thought I
had it right since there was no compiler warning. This just
adds the parentheses in the expression as needed.
2020-07-25 20:28:52 -04:00
Amy Kwan 739cd2638b [PowerPC] Exploit the High Order Vector Multiply Instructions on Power10
This patch aims to exploit the following vector multiply high instructions on Power10.
vmulhsw VRT, VRA, VRB
vmulhsd VRT, VRA, VRB
vmulhuw VRT, VRA, VRB
vmulhud VRT, VRA, VRB

Differential Revision: https://reviews.llvm.org/D82584
2020-07-24 20:57:57 -05:00
Amy Kwan 74790a5dde [PowerPC] Implement Truncate and Store VSX Vector Builtins
This patch implements the `vec_xst_trunc` function in altivec.h in  order to
utilize the Store VSX Vector Rightmost [byte | half | word | doubleword] Indexed
instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82467
2020-07-24 19:22:39 -05:00
Nemanja Ivanovic 7d076e19e3 [PowerPC] Fix computation of offset for load-and-splat for permuted loads
Unfortunately this is another regression from my canonicalization patch
(1fed131660). The patch contained two implicit assumptions:
1. That we would have a permuted load only if we are loading a partial vector
2. That a partial vector load would necessarily be as wide as the splat

However, assumption 2 is not correct since it is possible to do a wider
load and only splat a half of it. This patch corrects this assumption by
simply checking if the load is permuted and adjusting the offset if it is.
2020-07-24 15:38:46 -04:00
Amy Kwan 1dc1a3fb0c [PowerPC] Implement low-order Vector Multiply, Modulus and Divide Instructions
This patch aims to implement the low order vector multiply, divide and modulo
instructions available on Power10.

The patch involves legalizing the ISD nodes MUL, UDIV, SDIV, UREM and SREM for
v2i64 and v4i32 vector types in order to utilize the following instructions:
- Vector Multiply Low Doubleword: vmulld
- Vector Modulus Word/Doubleword: vmodsw, vmoduw, vmodsd, vmodud
- Vector Divide Word/Doubleword: vdivsw, vdivsd, vdivuw, vdivud

Differential Revision: https://reviews.llvm.org/D82510
2020-07-23 17:18:36 -05:00
Amy Kwan 5f11027395 [PowerPC][Power10] Fix vins*vlx instructions to have i32 arguments.
Previously, the vins*vlx instructions were incorrectly defined with i64 as the
second argument. This patches fixes this issue by correcting the second argument
of the vins*vlx instructions/intrinsics to be i32.

Differential Revision: https://reviews.llvm.org/D84277
2020-07-22 17:58:14 -05:00
Amy Kwan 08b4a50e39 [PowerPC][Power10] Fix the Test LSB by Byte (xvtlsbb) Builtins Implementation
The implementation of the xvtlsbb builtins/intrinsics were not correct as the
intrinsics previously used i1 as an argument type. This patch changes the i1
argument type used in these intrinsics to be i32 instead, as having the second
as an i1 can lead to issues in the backend.

Differential Revision: https://reviews.llvm.org/D84291
2020-07-22 13:27:05 -05:00
Stefan Pintilie a60251d739 [PowerPC] Add linker opt for PC Relative GOT indirect accesses
A linker optimization is available on PowerPC for GOT indirect PCRelative loads.

The idea is that we can mark a usual GOT indirect load:

pld 3, vec@got@pcrel(0), 1
lwa 3, 4(3)

With a relocation to say that if we don't need to go through the GOT we can let
the linker further optimize this and replace a load with a nop.

  pld 3, vec@got@pcrel(0), 1
.Lpcrel1:
.reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
  lwa 3, 4(3)

This patch adds the logic that allows the compiler to add the R_PPC64_PCREL_OPT.

Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D79864
2020-07-22 09:08:23 -05:00
jasonliu b98b1700ef [XCOFF] Enable symbol alias for AIX
Summary:
AIX assembly's .set directive is not usable for aliasing purpose.
We need to use extra-label-at-defintion strategy to generate symbol
aliasing on AIX.

Reviewed By: DiggerLin, Xiangling_L

Differential Revision: https://reviews.llvm.org/D83252
2020-07-22 14:03:55 +00:00
Sebastian Neubauer 2a6c871596 [InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target specific
intrinsics. Having target specific code in general passes was noted as
an area for improvement for a long time.

D81728 moves most target specific code out of the InstCombine pass.
Applying the target specific combinations in an extra pass would
probably result in inferior optimizations compared to the current
fixed-point iteration, therefore the InstCombine pass resorts to newly
introduced functions in the TargetTransformInfo when it encounters
unknown intrinsics.
The patch should not have any effect on generated code (under the
assumption that code never uses intrinsics from a foreign target).

This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic

A few target specific parts are left in the InstCombine folder, where
it makes sense to share code. The largest left-over part in
InstCombineCalls.cpp is the code shared between arm and aarch64.

This allows to move about 3000 lines out from InstCombine to the targets.

Differential Revision: https://reviews.llvm.org/D81728
2020-07-22 15:59:49 +02:00
Chen Zheng 36f9fe2d34 [PowerPC] fixupIsDeadOrKill start and end in different block fixing
In fixupIsDeadOrKill, we assume StartMI and EndMI not exist in same
basic block, so we add an assertion in that function. This is wrong
before RA, as before RA the true definition may exist in another
block through copy like instructions.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D83365
2020-07-22 06:27:13 -04:00
Kai Luo c3f9697f1f [PowerPC] Fix wrong codegen when stack pointer has to realign performing dynalloc
Current powerpc backend generates wrong code sequence if stack pointer
has to realign if `-fstack-clash-protection` enabled. When probing
dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes
`NegSizeReg` as input and returns
`FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated
correctly, however code following `PREPARE_PROBED_ALLOCA` still uses
value of `NegSizeReg`, which does not contain `ActualNegSize` if
`MaxAlign > TargetAlign`, to calculate loop trip count and residual
number of bytes.

This patch is part of fix of
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84152
2020-07-22 06:35:12 +00:00
Kai Luo 8912252252 [PowerPC] Fix wrong codegen when stack pointer has to realign in prologue
Current powerpc backend generates wrong code sequence if stack pointer
has to realign if -fstack-clash-protection enabled. When probing in
prologue, backend should generate a subtraction instruction rather
than a `stux` instruction to realign the stack pointer.

This patch is part of fix of
https://bugs.llvm.org/show_bug.cgi?id=46759.

Differential Revision: https://reviews.llvm.org/D84218
2020-07-22 06:35:12 +00:00
Kang Zhang 9bbf0ecff3 [PowerPC] Fix the implicit operands in PredicateInstruction()
Summary:
In the function `PPCInstrInfo::PredicateInstruction()`, we will replace
non-Predicate Instructions to Predicate Instruction. But we forget add
the new implicit operands the new Predicate Instruction needed. This
patch is to fix this.

Reviewed By: jsji, efriedma

Differential Revision: https://reviews.llvm.org/D82390
2020-07-22 05:51:03 +00:00
Chen Zheng e8425b27fe [PowerPC] add store (load float*) pattern to isProfitableToHoist
store (load float*) can be optimized to store(load i32*) in InstCombine pass.

Add store (load float*) to isProfitableToHoist to make sure we don't break
the opt in InstCombine pass.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D82341
2020-07-21 20:55:13 -04:00
Amy Kwan 1eb279d2a8 [PowerPC][Power10] Add Vector Multiply/Mod/Divide Instruction Definitions and MC Tests
This patch adds the td definitions and asm/disasm tests for the following instructions:
- Vector Multiply Low Doubleword: vmulld
- Vector Modulus Word/Doubleword: vmodsw, vmoduw, vmodsd, vmodud
- Vector Divide Word/Doubleword: vdivsw, vdivuw, vdivsd, vdivud
- Vector Multiply High Word/Doubleword: vmulhsw, vmulhsd, vmulhuw, vmulhud
- Vector Divide Extended Word/Doubleword: vdivesw, vdiveuw, vdivesd, vdiveud

Differential Revision: https://reviews.llvm.org/D82929
2020-07-21 18:05:35 -05:00
diggerlin 11546898e2 [AIX][XCOFF]emit extern linkage for the llvm intrinsic symbol
SUMMARY:

when we call memset, memcopy,memmove etc(this are llvm intrinsic function) in the c source code. the llvm will generate IR
like call call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
for c source code
bash> cat test_memset.call

struct S{
 int a;
 int b;
};
extern struct  S s;
void bar() {
  memset(&s, s.b, s.b);
}
like

%struct.S = type { i32, i32 }
@s = external global %struct.S, align 4
; Function Attrs: noinline nounwind optnone
define void @bar() #0 {
entry:
  %0 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  %1 = trunc i32 %0 to i8
  %2 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
  ret void
}
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #1
If we want to let the aix as assembly compile pass without -u
it need to has following assembly code.
.extern .memset
(we do not output extern linkage for llvm instrinsic function.
even if we output the extern linkage for llvm intrinsic function, we should not out .extern llvm.memset.p0i8.i32,
instead of we should emit .extern memset)

for other llvm buildin function floatdidf . even if we do not call these function floatdidf in the c source code(the generated IR also do not the call __floatdidf . the function call
was generated in the LLVM optimized.
the function is not in the functions list of Module, but we still need to emit extern .__floatdidf

The solution for it as :
We record all the lllvm intrinsic extern symbol when transformCallee(), and emit all these symbol in the AsmPrinter::doFinalization(Module &M)

Reviewers:  jasonliu, Sean Fertile, hubert.reinterpretcast,

Differential Revision: https://reviews.llvm.org/D78929
2020-07-21 16:03:04 -04:00
Kang Zhang d37befdfe5 [PowerPC] Remove the redundant implicit operands in ppc-early-ret pass
Summary:
In the `ppc-early-ret` pass, we have use `BuildMI` and `copyImplicitOps` when the branch instructions can do the early return. But the two functions will add implicit operands twice, this is not correct.

This patch is to remove the redundant implicit operands in `ppc-early-ret pass`.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D76042
2020-07-19 07:01:45 +00:00
Albion Fung c273563552 [PowerPC][Power10] Add 128-bit Binary Integer Operation instruction definitions and MC Tests
This patch adds the instruction definitions and MC tests for the 128-bit Binary
Integer Operation instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D83516
2020-07-16 17:16:43 -05:00
Amy Kwan fc55308628 [PowerPC][Power10] Fix VINS* (vector insert byte/half/word) instructions to have i32 arguments.
Previously, the vins* intrinsic was incorrectly defined to have its second and
third argument arguments as an i64. This patch fixes the second and third
argument of the vins* instruction and intrinsic to have i32s instead.

Differential Revision: https://reviews.llvm.org/D83497
2020-07-16 00:30:24 -05:00
Logan Smith a19461d9e1 [NFC] Add 'override' keyword where missing in include/ and lib/.
This fixes warnings raised by Clang's new -Wsuggest-override, in preparation for enabling that warning in the LLVM build. This patch also removes the virtual keyword where redundant, but only in places where doing so improves consistency within a given file. It also removes a couple unnecessary virtual destructor declarations in derived classes where the destructor inherited from the base class is already virtual.

Differential Revision: https://reviews.llvm.org/D83709
2020-07-14 09:47:29 -07:00
Amy Kwan 62f5ba624b [PowerPC][Power10] Implement Test LSB by Byte Builtins in LLVM/Clang
This patch implements builtins for the Test LSB by Byte instruction introduced
in Power10.

Differential Revision: https://reviews.llvm.org/D82431
2020-07-13 22:47:47 -05:00
Kai Luo d4e7d126b0 [PowerPC] Generate CFI directives when probing in prologue
Add missing CFI directives when probing in prologue if
`stack-clash-protection` is enabled.

Differential Revision: https://reviews.llvm.org/D83276
2020-07-14 02:56:12 +00:00
Fangrui Song eafe7c14ea [PowerPC] Fix combineVectorShuffle regression after D77448
Commit 1fed131660 assumed that NewShuffle (shuffle vector
canonicalization result) will always be ShuffleVectorSDNode, which may
be false (it may be a BITCAST node):

```
...
t12: v4i32 = scalar_to_vector t2
t15: v16i8 = bitcast t12  # LHS
t17: v16i8 = vector_shuffle<u,u,u,u,u,u,u,u,0,1,2,3,u,u,u,u> t15, undef:v16i8  # SVN
```

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D83617
2020-07-13 16:57:27 -07:00
Qiu Chaofan b6912c879e [PowerPC] Support constrained conversion in SPE target
This patch adds support for constrained int/fp conversion between
signed/unsigned i32 and f32/f64.

Reviewed By: jhibbits

Differential Revision: https://reviews.llvm.org/D82747
2020-07-13 12:18:36 +08:00
Jinsong Ji 3e3acc1cc7 [PowerPC][MachinePipeliner] Enable pipeliner if hasInstrSchedModel
P9 is the only one with InstrSchedModel, but we may have more in the
future, we should not hardcoded it to P9, check hasInstrSchedModel
instead.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83590
2020-07-11 02:24:12 +00:00
Sidharth Baveja e541e1b757 [NFC] Separate Peeling Properties into its own struct (re-land after minor fix)
Summary:
This patch separates the peeling specific parameters from the UnrollingPreferences,
and creates a new struct called PeelingPreferences. Functions which used the
UnrollingPreferences struct for peeling have been updated to use the PeelingPreferences struct.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel), anhtuyen (Anh Tuyen Tran), nikic (Nikita Popov)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-10 18:39:30 +00:00
Lei Huang 90b1a710ae [PowerPC] Enable default support of quad precision operations
Summary: Remove option guarding support of quad precision operations.

Reviewers: nemanjai, #powerpc, steven.zhang

Reviewed By: nemanjai, #powerpc, steven.zhang

Subscribers: qiucf, wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83437
2020-07-10 13:27:48 -05:00
Albion Fung 5ffec46720 [PowerPC][Power10] Add Instruction definition/MC Tests for Load/Store Rightmost VSX Vector
This patch adds the instruction definitions and the assembly/disassembly
tests for the Load/Store VSX Vector Rightmose instructions.

Differential Revision: https://reviews.llvm.org/D83364
2020-07-09 17:06:03 -05:00
Eric Christopher ce1e4853b5 Temporarily Revert "[PowerPC] Split s34imm into two types"
as it was failing in Release+Asserts mode with an assert.

This reverts commit bd20680311.
2020-07-09 13:36:32 -07:00
Stefan Pintilie bd20680311 [PowerPC] Split s34imm into two types
Currently the instruction paddi always takes s34imm as the type for the
34 bit immediate. However, the PC Relative form of the instruction should
not produce the same fixup as the non PC Relative form.
This patch splits the s34imm type into s34imm and s34imm_pcrel so that two
different fixups can be emitted.

Reviewed By: kamaub, nemanjai

Differential Revision: https://reviews.llvm.org/D83255
2020-07-09 11:28:32 -05:00
Kai Luo e2b93185b8 [PowerPC] Only make copies of registers on stack in variadic function when va_start is called
On PPC64, for a variadic function, if va_start is not called, it won't
access any variadic argument on stack, thus we can save stores of
registers used to pass arguments.

Differential Revision: https://reviews.llvm.org/D82361
2020-07-09 07:18:17 +00:00
Nikita Popov 0b39d2d752 Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit 0369dc98f9.

Many failing tests.
2020-07-08 21:43:32 +02:00
Sidharth Baveja 0369dc98f9 [NFC] Separate Peeling Properties into its own struct
Summary:
This patch makes the peeling properties of the loop accessible by other loop transformations.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-08 18:59:59 +00:00
Anh Tuyen Tran 6965af43e6 Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit fead250b43.
2020-07-08 18:58:05 +00:00
Anh Tuyen Tran fead250b43 [NFC] Separate Peeling Properties into its own struct
Summary:
This patch makes the peeling properties of the loop accessible by other loop transformations.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-08 18:56:03 +00:00
Biplob Mishra 62ba48b45f [PowerPC] Implement Vector Replace Builtins in LLVM
Provide the LLVM intrinsics needed to implement vector replace element
builtins in altivec.h which will be added in a subsequent patch.

Differential Revision: https://reviews.llvm.org/D83308
2020-07-07 12:22:52 -05:00
Nemanja Ivanovic 1b1539712e [PowerPC] Do not RAUW combined nodes in VECTOR_SHUFFLE legalization
When legalizing shuffles, we make an attempt to combine it into
a PPC specific canonical form that avoids a need for a swap. If the
combine is successful, we RAUW the node and the custom legalization
replaces the now dead node instead of the one it should replace.
Remove that erroneous call to RAUW.
2020-07-06 22:09:28 -05:00
Amy Kwan c13e3e2c2e [PowerPC][Power10] Exploit the xxsplti32dx instruction when lowering VECTOR_SHUFFLE.
This patch aims to exploit the xxsplti32dx XT, IX, IMM32 instruction when lowering VECTOR_SHUFFLEs.
We implement lowerToXXSPLTI32DX when lowering vector shuffles to check if:
- Element size is 4 bytes
- The RHS is a constant vector (and constant splat of 4-bytes)
- The shuffle mask is a suitable mask for the XXSPLTI32DX instruction where it is one of the 32 masks:
<0, 4-7, 2, 4-7>
<4-7, 1, 4-7, 3>

Differential Revision: https://reviews.llvm.org/D83245
2020-07-06 20:28:38 -05:00
jasonliu 6d3ae365bd [XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s)
Summary:

When a desired symbol name contains invalid character that the
system assembler could not process, we need to emit .rename
directive in assembly path in order for that desired symbol name
to appear in the symbol table.

Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L

Differential Revision: https://reviews.llvm.org/D82481
2020-07-06 15:49:15 +00:00
Esme-Yi 0607c8df7f [PowerPC] Legalize SREM/UREM directly on P9.
Summary: As Bugzilla-35090 reported, the rationale for using custom lowering SREM/UREM should no longer be true. At the IR level, the div-rem-pairs pass performs the transformation where the remainder is computed from the result of the division when both a required. We should now be able to lower these directly on P9. And the pass also fixed the problem that divide is in a different block than the remainder. This is a patch to remove redundant code and make SREM/UREM legal directly on P9.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D82145
2020-07-06 11:47:31 +00:00
Kai Luo c352e0885a [PowerPC] Implement probing for prologue
This patch is part of supporting `-fstack-clash-protection`. Implemented
probing when emitting prologue.

Differential Revision: https://reviews.llvm.org/D81460
2020-07-04 03:07:08 +00:00
Lei Huang e359ab1eca [PowerPC][NFC] Fix indentation 2020-07-03 16:47:24 -05:00
Biplob Mishra 0939e04e41 [PowerPC] Implement Vector Insert Builtins in LLVM/Clang
Implements vec_insertl() and vec_inserth().

Differential Revision: https://reviews.llvm.org/D82365
2020-07-03 15:30:41 -05:00
Sean Fertile 484a36b97d Enable basepointer for AIX.
Differential Revision: https://reviews.llvm.org/D82030
2020-07-03 11:55:49 -04:00
Guillaume Chatelet 87e2751cf0 [Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D83082
2020-07-03 08:06:43 +00:00
Kai Luo 03828e38c3 [PowerPC] Implement probing for dynamic stack allocation
This patch is part of supporting `-fstack-clash-protection`. Mainly do
such things compared to existing `lowerDynamicAlloc`

- Added a new pseudo instruction PPC::PREPARE_PROBED_ALLOC to get
  actual frame pointer and final stack pointer.
- Synthesize a loop to probe by blocks.
- Use DYNAREAOFFSET to get MaxCallFrameSize which is calculated in
  prologepilog.

Differential Revision: https://reviews.llvm.org/D81358
2020-07-03 05:36:40 +00:00
Kai Luo d8921a8005 [PowerPC][NFC] Prevent unused error when assertion is disabled. 2020-07-03 04:23:19 +00:00
Kai Luo 40e9e0826b [PowerPC][NFC] Refactor lowerDynamicAlloc
When performing dynamic stack allocation, calculation of frame pointer
and actual negsize can be separated. This patch refactors
`lowerDynamicAlloc` in preparation of supporting
`-fstack-clash-protection` which also has to calculate actual frame
pointer and negsize.

Differential Revision: https://reviews.llvm.org/D81354
2020-07-03 03:33:24 +00:00
Biplob Mishra ca464639a1 [PowerPC] Implement Vector Blend Builtins in LLVM/Clang
Implements vec_blendv()

Differential Revision: https://reviews.llvm.org/D82774
2020-07-02 16:52:52 -05:00
Amy Kwan 6076fc698d [PowerPC]Add Vector Insert Instruction Definitions and MC Test
Adds td definitions and asm/disasm tests for the following instructions:

  VINSBVLX
  VINSBVRX
  VINSHVLX
  VINSHVRX
  VINSWVLX
  VINSWVRX
  VINSBLX
  VINSBRX
  VINSHLX
  VINSHRX
  VINSWLX
  VINSWRX
  VINSDLX
  VINSDRX
  VINSW
  VINSD

Differential Revision: https://reviews.llvm.org/D83052
2020-07-02 15:49:16 -05:00
Biplob Mishra 286073484f [PowerPC]Implement Vector Permute Extended Builtin
Implements vector permute builtin: vec_permx()

Differential Revision: https://reviews.llvm.org/D82869
2020-07-02 14:53:18 -05:00
Nemanja Ivanovic a701dc5510 [PowerPC] Remove undefs from splat input when changing shuffle mask
As of 1fed131660, we have code that
changes shuffle masks so that we can put the shuffle in a canonical
form that can be matched to a single instruction. However, it
does not properly account for undef elements in the BUILD_VECTOR
that is the RHS splat so we can end up with undefs where they
shouldn't be. This patch converts the splat input with undefs to
one without.
2020-07-02 12:26:56 -05:00
Guillaume Chatelet 8dbafd24d6 [Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82977
2020-07-02 11:28:02 +00:00
Biplob Mishra 88874f0746 [PowerPC]Implement Vector Shift Double Bit Immediate Builtins
Implement Vector Shift Double Bit Immediate Builtins in LLVM/Clang.
  * vec_sldb ();
  * vec_srdb ();

Differential Revision: https://reviews.llvm.org/D82440
2020-07-01 20:34:53 -05:00
Lei Huang 99c4207d42 [PowerPC][NFC] Update doc for FeatureISA3_1/FeatureISA3_0 definitions 2020-07-01 19:36:19 -05:00
Anil Mahmud c5b4f03b53 [PowerPC] Exploit xxspltiw and xxspltidp instructions
Exploits the VSX Vector Splat Immediate Word and
VSX Vector Splat Immediate Double Precision instructions:

  xxspltiw XT,IMM32
  xxspltidp XT,IMM32

Differential Revision: https://reviews.llvm.org/D82911
2020-07-01 19:18:29 -05:00
James Y Knight 4b0aa5724f Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
Before this instruction supported output values, it fit fairly
naturally as a terminator. However, being a terminator while also
supporting outputs causes some trouble, as the physreg->vreg COPY
operations cannot be in the same block.

Modeling it as a non-terminator allows it to be handled the same way
as invoke is handled already.

Most of the changes here were created by auditing all the existing
users of MachineBasicBlock::isEHPad() and
MachineBasicBlock::hasEHPadSuccessor(), and adding calls to
isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.

Reviewed By: nickdesaulniers, void

Differential Revision: https://reviews.llvm.org/D79794
2020-07-01 12:51:50 -04:00
Stefan Pintilie b294e00fb0 [PowerPC] Fix for PC Relative call protocol
The situation where the caller uses a TOC and the callee does not
but is marked as clobbers the TOC (st_other=1) was not being compiled
correctly if both functions where in the same object file.

The call site where we had `callee` was missing a nop after the call.
This is because it was assumed that since the two functions where in
the same DSO they would be sharing a TOC. This is not the case if the
callee uses PC Relative because in that case it may clobber the TOC.
This patch makes sure that we add the cnop correctly so that the
linker has a place to restore the TOC.

Reviewers: sfertile, NeHuang, saghir

Differential Revision: https://reviews.llvm.org/D81126
2020-07-01 07:08:41 -05:00
Guillaume Chatelet 28de229bc6 [Alignment][NFC] Migrate MachineFrameInfo::CreateStackObject to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82894
2020-07-01 07:28:11 +00:00
Kit Barton 4c2c6c7cc1 [PPC][NFC] Replace TM with Subtarget->getTargetMachine() in preparation for GlobalISel.
There are two uses of TM (instance of TargetMachine) when checking options.
These will not work once we enable GlobalISel. This patch replaces those uses of
TM with Subtarget->getTargetMachine().
2020-06-30 17:19:24 -05:00
Amy Kwan 73377c4597 [PowerPC][Power10] Add Vector Splat Imm/Permute/Blend/Shift Double Bit Imm Definitions and MC Tests
This patch adds the td definitions and asm/disasm tests for the
following instructions:

XXSPLTIW
XXSPLTIDP
XXSPLTI32DX
XXPERMX
XXBLENDVB
XXBLENDVH
XXBLENDVW
XXBLENDVD
VSLDBI
VSRDBI

Differential Revision: https://reviews.llvm.org/D82896
2020-06-30 16:07:21 -05:00
Matt Arsenault d9f0c3663f PPC: Don't store function in PPCFunctionInfo
Continue migrating targets from depending on the MachineFunction
during the initial construction.
2020-06-30 16:08:51 -04:00
Guillaume Chatelet a976ea3209 [Alignment][NFC] Migrate PPC, X86 and XCore backends to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82779
2020-06-30 08:08:45 +00:00
Lei Huang af9cc2d2af [PowerPC] Fix FeatureISA3_1 def in PPC.td to imply FeatureISA3_0. 2020-06-29 16:13:02 -05:00
Nemanja Ivanovic d2533d96e1 [PowerPC] Fix crash for shuffle canonicalization with elt 0 from RHS
Commit 1fed131660 assumed that shuffle vector canonicalization will
always ensure that the shuffle mask will be ordered so that element
zero comes from the LHS vector. However there is code out there for
which this is not the case. This patch simply removes that unsafe
assumption and makes the code work regardless of the source of the
first element.
2020-06-29 12:26:08 -05:00
Nemanja Ivanovic 57ad8f4730 [PowerPC] Don't combine SCALAR_TO_VECTOR without VSX
Most of the patterns for PPCISD::SCALAR_TO_VECTOR_PERMUTED require
VSX. So don't emit them if the subtarget doesn't have VSX.
This resolves the issue reported on
https://reviews.llvm.org/rG1fed131660b2c5d3ea7007e273a7a5da80699445
2020-06-29 09:48:57 -05:00
Guillaume Chatelet 368a5e3a66 [Alignment][NFC] migrate DataLayout::getPreferredAlignment
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82752
2020-06-29 11:24:36 +00:00
Amy Kwan fa0da7ec6a [PowerPC] Add support for llvm.ppc.dcbt, llvm.ppc.dcbtst, llvm.ppc.isync intrinsics
This patch adds LLVM intrinsics for the dcbt (Data Cache Block Touch),
dcbtst (Data Cache Block Touch for Store) and isync (Instruction
Synchronize) instructions.

The intrinsic for dcbt and dcbst in this patch are named llvm.ppc.dcbt.with.hint
and llvm.ppc.dcbtst.with.hint respectively as there already exists an intrinsic
for llvm.ppc.dcbt and llvm.ppc.dcbtst. However, the original variants of the
intrinsics do not accept the TH immediate field, whereas these variants do.

Differential Revision: https://reviews.llvm.org/D79633
2020-06-26 13:02:18 -05:00
Kit Barton 5ca75130f5 [PPC][NFC] Add Subtarget and replace all uses of PPCSubTarget with Subtarget.
Summary:
In preparation for GlobalISel, PPCSubTarget needs to be renamed to Subtarget as there places in GlobalISel that assume the presence of the variable Subtarget.
This patch introduces the variable Subtarget, and replaces all existing uses of PPCSubTarget with Subtarget. A subsequent patch will remove the definiton of
PPCSubTarget, once any downstream users have the opportunity to rename any uses they have.

Reviewers: hfinkel, nemanjai, jhibbits, #powerpc, echristo, lkail

Reviewed By: #powerpc, echristo, lkail

Subscribers: echristo, lkail, wuzish, nemanjai, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81623
2020-06-26 11:23:38 -05:00
Guillaume Chatelet fdc7c7fb87 [Alignment][NFC] Migrate TTI::getInterleavedMemoryOpCost to Align
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82573
2020-06-26 11:00:53 +00:00
Amy Kwan e0c02dc980 [PowerPC][Power10] Implement centrifuge, vector gather every nth bit, vector evaluate Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:

unsigned long long __builtin_cfuged (unsigned long long, unsigned long long);
vector unsigned long long vec_cfuge (vector unsigned long long, vector unsigned long long);
unsigned long long vec_gnb (vector unsigned __int128, const unsigned int);
vector unsigned char vec_ternarylogic (vector unsigned char, vector unsigned char, vector unsigned char, const unsigned int);
vector unsigned short vec_ternarylogic (vector unsigned short, vector unsigned short, vector unsigned short, const unsigned int);
vector unsigned int vec_ternarylogic (vector unsigned int, vector unsigned int, vector unsigned int, const unsigned int);
vector unsigned long long vec_ternarylogic (vector unsigned long long, vector unsigned long long, vector unsigned long long, const unsigned int);
vector unsigned __int128 vec_ternarylogic (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128, const unsigned int);

Differential Revision: https://reviews.llvm.org/D80970
2020-06-25 21:34:41 -05:00
Zarko Todorovski e504a23b63 [NFC][PPC][AIX] Add stack frame layout diagram to PPCISelLowering.cpp
Summary:
This NFC patch adds a diagram of the AIX ABI stack frame layout.

Based on https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/assembler/idalangref_runtime_process.html

Reviewers: sfertile, cebowleratibm, hubert.reinterpretcast, Xiangling_L

Reviewed By: sfertile

Subscribers: wuzish, nemanjai, hiraditya, kbarton, llvm-commits

Tags: #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D82408
2020-06-25 09:41:42 -04:00
Simon Pilgrim 1815b77c3e LiveIntervals.h.h - reduce AliasAnalysis.h include to forward declaration. NFC.
Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.
2020-06-25 14:22:21 +01:00
Amy Kwan d82f26cc4b [PowerPC][Power10] Implement Count Leading/Trailing Zeroes Builtins under bit Mask in LLVM/Clang
This patch implements builtins for the following prototypes:

unsigned long long __builtin_cntlzdm (unsigned long long, unsigned long long)
unsigned long long __builtin_cnttzdm (unsigned long long, unsigned long long)
vector unsigned long long vec_cntlzm (vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_cnttzm (vector unsigned long long, vector unsigned long long)

Differential Revision: https://reviews.llvm.org/D80941
2020-06-24 16:03:45 -05:00
Jinsong Ji 81b2d1d112 [NFC][PowerPC] Fix some typos in MachineCombiner comments 2020-06-24 20:40:57 +00:00
Eli Friedman a2caa3b614 Remove GlobalValue::getAlignment().
This function is deceptive at best: it doesn't return what you'd expect.
If you have an arbitrary GlobalValue and you want to determine the
alignment of that pointer, Value::getPointerAlignment() returns the
correct value.  If you want the actual declared alignment of a function
or variable, GlobalObject::getAlignment() returns that.

This patch switches all the users of GlobalValue::getAlignment to an
appropriate alternative.

Differential Revision: https://reviews.llvm.org/D80368
2020-06-23 19:13:42 -07:00
Chen Zheng 7ab05d9a60 [PowerPC] fold addi's imm operand to its imm form consumer's displacement
This patch adds a function to do following transformation:

%0:g8rc_and_g8rc_nox0 = ADDI8 %5:g8rc_and_g8rc_nox0, 144
STD killed %7:g8rc, 16, %0:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8)

------>

STD killed %7:g8rc, 160, %5:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8)

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81723
2020-06-23 06:28:18 -04:00
Amy Kwan 19df9e2959 [PowerPC][Power10] Implement VSX PCV Generate Operations in LLVM/Clang
This patch implements builtins for the following prototypes for the VSX Permute
Control Vector Generate with Mask Instructions:

vector unsigned char vec_genpcvm (vector unsigned char, const int);
vector unsigned short vec_genpcvm (vector unsigned short, const int);
vector unsigned int vec_genpcvm (vector unsigned int, const int);
vector unsigned long long vec_genpcvm (vector unsigned long long, const int);

Differential Revision: https://reviews.llvm.org/D81774
2020-06-22 21:09:34 -05:00
Amy Kwan cc95635b1b [PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:
```
vector signed char vec_clrl (vector signed char a, unsigned int n);
vector unsigned char vec_clrl (vector unsigned char a, unsigned int n);
vector signed char vec_clrr (vector signed char a, unsigned int n);
vector signed char vec_clrr (vector unsigned char a, unsigned int n);
```

Differential Revision: https://reviews.llvm.org/D81707
2020-06-20 18:29:16 -05:00
Nemanja Ivanovic 1fed131660 [PowerPC] Canonicalize shuffles to match more single-instruction masks on LE
We currently miss a number of opportunities to emit single-instruction
VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although
this in itself is not a huge performance opportunity since loading the permute
vector for a VPERM can always be pulled out of loops, producing such merge
instructions is useful to downstream optimizations.
Since VPERM is essentially opaque to all subsequent optimizations, we want to
avoid it as much as possible. Other permute instructions have semantics that can
be reasoned about much more easily in later optimizations.

This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector
  (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the
  corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that
  anyone interested in improving this further can get the info for their code

Differential revision: https://reviews.llvm.org/D77448
2020-06-18 21:54:22 -05:00
Amy Kwan c45c161130 [PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang
This patch implements builtins for the following prototypes:

vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long);
vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b);
unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
unsigned long long __builtin_pextd (unsigned long long, unsigned long long);

Revision Depends on D80758

Differential Revision: https://reviews.llvm.org/D80935
2020-06-18 16:23:56 -05:00
Kang Zhang 58e19d465a [PowerPC] Don't convert Loop to CTR Loop for fp128 BinaryOperator
Summary:
For PPC BinaryOperator of fp128 will become libcall, we shouldn't
convert loop to CTR loop if the loop contain libCall.

But currently, in the PPCTTIImpl::mightUseCTR() function, we only deal
with BinaryOperator for ppc_fp128, don't deal with the fp128.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81353
2020-06-18 02:54:19 +00:00
Esme-Yi ad6024e29f [PowerPC] Custom lower rotl v1i128 to vector_shuffle.
Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can’t be matched to a single HW instruction xxswapd as expected.
In fact the case matches the idiom of rotate. We have MatchRotate to handle an ‘or’ of two operands and generate a rot[lr] if the case matches the idiom of rotate. While PPC doesn’t support ROTL v1i128. We can custom lower ROTL v1i128 to the vector_shuffle. The vector_shuffle will be matched to a single HW instruction during the phase of instruction selection.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81076
2020-06-18 01:32:23 +00:00
Benjamin Kramer df9a51dab3 Remove global std::strings. NFCI. 2020-06-17 14:29:42 +02:00
Kang Zhang c2574dc9f7 [NFC]][PowerPC] Remove unused intrinsic for old CTR loop pass
Summary:

In the patch D62907 the PPC CTRLoops pass has been replaced by Generic
Hardware Loop pass, and it has imported some new intrinsic for Generic
Hardware Loop.

The old intrinsic used in PPC CTRLoops int_ppc_mtctr and
int_ppc_is_decremented_ctr_nonzero is been replaced by
int_set_loop_iterations and loop_decrement.

This patch is to remove above unused two instrinsic.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D81539
2020-06-17 07:06:46 +00:00
Ahsan Saghir 37e72f47a4 [PowerPC] Add -m[no-]power10-vector clang and llvm option
Summary: This patch adds command line option for enabling power10-vector support.

Reviewers: hfinkel, nemanjai, lei, amyk, #powerpc

Reviewed By: lei, amyk, #powerpc

Subscribers: wuzish, kbarton, hiraditya, shchenz, cfe-commits, llvm-commits

Tags: #llvm, #clang, #powerpc

Differential Revision: https://reviews.llvm.org/D80758
2020-06-16 14:47:35 -05:00
Nick Desaulniers 2d8e105db6 [PPCAsmPrinter] support 'L' output template for memory operands
Summary:
L is meant to support the second word used by 32b calling conventions for 64b arguments.

This is required for build 32b PowerPC Linux kernels after upstream
commit 334710b1496a ("powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'")

Thanks for the report from @nathanchance, and reference to GCC's
implementation from @segher.

Fixes: pr/46186
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1044

Reviewers: echristo, hfinkel, MaskRay

Reviewed By: MaskRay

Subscribers: MaskRay, wuzish, nemanjai, hiraditya, kbarton, steven.zhang, llvm-commits, segher, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81767
2020-06-15 14:31:44 -07:00
Davide Italiano e51e82745e [Target/PPC] Fold inside an assertion.
Pointed out by dblaikie.
2020-06-15 12:08:57 -07:00
Rahul Joshi 72d20b9604 [LLVM] Change isa<> to a variadic function template
Change isa<> to a variadic function template, so that it can be used to test against one of multiple types as follows:
   isa<Type0, Type1, Type2>(Val)

Differential Revision: https://reviews.llvm.org/D81045
2020-06-15 18:46:57 +00:00
Davide Italiano 5cb44196aa [Target/PPC] Silence an unused variable warning. NFC. 2020-06-15 11:05:01 -07:00
Stefan Pintilie 57c9dc0521 [PowerPC] Do not add the relocation addend to the instruction encoding
We should not be adding the relocation addend to the instruction encoding.
This patch removes that and sets those bits to zero.

Differential Revision: https://reviews.llvm.org/D81082
2020-06-15 09:51:34 -05:00
Sam Parker 2596da3174 [CostModel] getCFInstrCost in getUserCost.
Have BasicTTI call the base implementation so that both agree on the
default behaviour, which the default being a cost of '1'. This has
required an X86 specific implementation as it seems to be very
reliant on those instructions being free. Changes are also made to
AMDGPU so that their implementations distinguish between cost kinds,
so that the unrolling isn't affected. PowerPC also has its own
implementation to prevent changes to the reg-usage vectorizer test.

The cost model test changes now reflect that ret instructions are not
generally free.

Differential Revision: https://reviews.llvm.org/D79164
2020-06-15 09:28:46 +01:00
Chen Zheng bd7096b977 [PowerPC] fma chain break to expose more ILP
This patch tries to reassociate two patterns related to FMA to expose
more ILP on PowerPC.

// Pattern 1:
//   A =  FADD X,  Y          (Leaf)
//   B =  FMA  A,  M21,  M22  (Prev)
//   C =  FMA  B,  M31,  M32  (Root)
// -->
//   A =  FMA  X,  M21,  M22
//   B =  FMA  Y,  M31,  M32
//   C =  FADD A,  B

// Pattern 2:
//   A =  FMA  X,  M11,  M12  (Leaf)
//   B =  FMA  A,  M21,  M22  (Prev)
//   C =  FMA  B,  M31,  M32  (Root)
// -->
//   A =  FMUL M11,  M12
//   B =  FMA  X,  M21,  M22
//   D =  FMA  A,  M31,  M32
//   C =  FADD B,  D

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D80175
2020-06-15 00:00:04 -04:00
Kang Zhang 74abe50071 [PowerPC] Add some InstAlias for mtspr/mfspr instructions
Summary:

We have defined MTSPR/MFSPR and MTSPR8/MFSPR8, but we only defined
mtspr/mfspr InstAlias for some MTSPR/MFSPR.
This patch is to add the InstAlias definitions for MTSPR8/MFSPR8,
and add the some new mtspr/mfspr InstAlias we may use.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77531
2020-06-15 02:43:13 +00:00
Chen Zheng 163162a0a4 [PowerPC] fold a bug for rlwinm folding when with full mask.
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D81006
2020-06-14 21:27:01 -04:00
Qiu Chaofan 13edcd696e [PowerPC] Support constrained rounding operations
This patch adds handling of constrained FP intrinsics about round,
truncate and extend for PowerPC target, with necessary tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D64193
2020-06-14 23:43:31 +08:00
Qiu Chaofan 7315d221a2 [PowerPC] Exploit vnmsubfp instruction
On PowerPC, we have vnmsubfp Altivec instruction for fnmsub operation on
v4f32 type. Default pattern for this instruction never works since we
don't have legal fneg for v4f32 when VSX disabled.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80617
2020-06-14 23:19:17 +08:00
Masoud Ataei 2d038370bb DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked
Here, I am proposing to add an special case for massv powf4/powd2 function (SIMD counterpart of powf/pow function in MASSV library) in MASSV pass to get later optimizations like conversion from pow(x,0.75) and pow(x,0.25) for double and single precision to sequence of sqrt's in the DAGCombiner in vector float case. My reason for doing this is: the optimized pow(x,0.75) and pow(x,0.25) for double and single precision to sequence of sqrt's is faster than powf4/powd2 on P8 and P9.

In case MASSV functions is called, and if the exponent of pow is 0.75 or 0.25, we will get the sequence of sqrt's and if exponent is not 0.75 or 0.25 we will get the appropriate MASSV function.

Reviewed By: steven.zhang

Tags: #LLVM #PowerPC

Differential Revision: https://reviews.llvm.org/D80744
2020-06-12 10:02:16 -04:00
Chen Zheng 9b6e86a1a5 [PowerPC] refactor convertToImmediateForm - NFC
This is a NFC patch to make convertToImmediateForm a light wrapper
for converting xform and imm form instructions on PowerPC.

Reviewed By: Steven.zhang

Differential Revision: https://reviews.llvm.org/D80907
2020-06-12 03:57:54 -04:00
diggerlin c6be3ea524 [NFC] clean up the AsmPrinter::emitLinkage for AIX part
SUMMARY:

Since we deal with aix emitLinkage in the PPCAIXAsmPrinter::emitLinkage() in the patch https://reviews.llvm.org/D75866. It do not go to AsmPrinter::emitLinkage() any more, we clean up some aix related code in the AsmPrinter::emitLinkage()

Reviewers:  Jason liu

Differential Revision: https://reviews.llvm.org/D81613
2020-06-11 13:33:51 -04:00
Sam Parker fa8bff0cd1 [CostModel] Unify getArithmeticInstrCost
Add the remaining arithmetic opcodes into the generic implementation
of getUserCost and then call this from getInstructionThroughput. Most
of the backends have been modified to return the base implementation
for cost kinds other RecipThroughput. The outlier here is AMDGPU
which already uses getArithmeticInstrCost for all the cost kinds.
This change means that most of the opcodes can be removed from that
backends implementation of getUserCost.

Differential Revision: https://reviews.llvm.org/D80992
2020-06-10 09:08:45 +01:00
diggerlin edd819c757 [AIX] supporting the visibility attribute for aix assembly
SUMMARY:

in the aix assembly , it do not have .hidden and .protected directive.
in current llvm. if a function or a variable which has visibility attribute, it will generate something like the .hidden or .protected , it can not recognize by aix as.
in aix assembly, the visibility attribute are support in the pseudo-op like
.extern Name [ , Visibility ]
.globl Name [, Visibility ]
.weak Name [, Visibility ]

in this patch, we implement the visibility attribute for the global variable, function or extern function .

for example.

extern __attribute__ ((visibility ("hidden"))) int
  bar(int* ip);
__attribute__ ((visibility ("hidden"))) int b = 0;
__attribute__ ((visibility ("hidden"))) int
  foo(int* ip){
   return (*ip)++;
}
the visibility of .comm linkage do not support , we will have a separate patch for it.
we have the unsupported cases ("default" and "internal") , we will implement them in a a separate patch for it.

Reviewers: Jason Liu ,hubert.reinterpretcast,James Henderson

Differential Revision: https://reviews.llvm.org/D75866
2020-06-09 16:15:06 -04:00
Sam Parker 37289615c0 [NFCI][CostModel] Unify getCmpSelInstrCost
Add cases for icmp, fcmp and select into the switch statement of the
generic getUserCost implementation with getInstructionThroughput then
calling into it. The BasicTTI and backend implementations have be set
to return a default value (1) when a cost other than throughput is
being queried.

Differential Revision: https://reviews.llvm.org/D80550
2020-06-09 07:41:22 +01:00
Kang Zhang e3546c78ca [NFC][PowerPC] Remove the redundant InstAlias for OR instruction
Summary:
We have handle the InstAlias for OR instructions, but we handle it
agagin in PPCInstPrinter.cpp.
This patch is to Remove the redundant InstAlias for OR instruction.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80502
2020-06-09 03:32:27 +00:00
Chen Zheng 8aa52b19a7 [APInt] set all bits for getBitsSetWithWrap if loBit == hiBit
differentiate getBitsSetWithWrap & getBitsSet when loBit == hiBit
getBitsSetWithWrap sets all bits;
getBitsSet does nothing.

Reviewed By: lkail, RKSimon, lebedev.ri

Differential Revision: https://reviews.llvm.org/D81325
2020-06-08 22:55:24 -04:00
Anil Mahmud 246d106094 [PowerPC] Fix pattern for DCBFL/DCBFLP instrinsics.
The previous implementation used "asm parser only" pseudo-instructions in their
output patterns. Those are not meant to emit code and will caused crashes when
built with -filetype=obj.

Differential Revision: https://reviews.llvm.org/D80151
2020-06-08 20:54:59 -05:00
Anil Mahmud c9790d54f8 [PowerPC] Remove extra instruction left by emitRLDICWhenLoweringJumpTables
The function emitRLDICWhenLoweringJumpTables in PPCMIPeephole.cpp
was supposed to convert a pair of RLDICL and RLDICR to a single RLDIC,
but it was leaving out the RLDICL instruction. This PR fixes the bug.

Differential Revision: https://reviews.llvm.org/D78063
2020-06-08 20:43:56 -05:00
Stefan Pintilie b4036329f1 [PowerPC] Fix incorrect PC Relative relocations for Big Endian
Fix the incorrect PC Relative relocations for Big Endian for 34 bit offsets.
The offset should be zero for both BE and LE in this situation.

Differential Revision: https://reviews.llvm.org/D81033
2020-06-08 20:29:43 -05:00
Sam Parker 772349de88 [PPC] Try to fix builbots
Attempt to handle unsupported types, such as structs, in
getMemoryOpCost. The backend now checks for a supported type and
calls into BasicTTI as a fallback. BasicTTI will now also perform
the same check and will default to an expensive cost of 4 for 'Other'
MVTs.

Differential Revision: https://reviews.llvm.org/D80984
2020-06-08 09:13:37 +01:00
Guillaume Chatelet 1778564f91 [Alignment][NFC] Migrate the rest of backends
Summary: This is a followup on D81196

Reviewers: courbet

Subscribers: arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81278
2020-06-08 07:17:20 +00:00
QingShan Zhang 3f0cc7ac5e [NFC] Remove the extra ; to avoid the warning of build compiler 2020-06-08 03:51:05 +00:00
Nemanja Ivanovic a56d057dfe [PowerPC] Do not assume operand of ADDI is an immediate
After pseudo-expansion, we may end up with ADDI (add immediate)
instructions where the operand is not an immediate but a relocation.
For such instructions, attempts to get the immediate result in
assertion failures for obvious reasons.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45432
2020-06-07 22:18:31 -05:00
QingShan Zhang f8eabd6d01 [Power9] Add addi post-ra scheduling heuristic
The instruction addi is usually used to post increase the loop indvar, which looks like this:

label_X:
 load x, base(i)
 ...
 y = op x
 ...
 i = addi i, 1
 goto label_X

However, for PowerPC, if there are too many vsx instructions that between y = op x and  i = addi i, 1,
it will use all the hw resource that block the execution of i = addi, i, 1, which result in the stall
of the load instruction in next iteration. So, a heuristic is added to move the addi as early as possible
to have the load hide the latency of vsx instructions, if other heuristic didn't apply to avoid the starve.

Reviewed By: jji

Differential Revision: https://reviews.llvm.org/D80269
2020-06-08 01:31:07 +00:00
Stefan Pintilie 8dbf5a9501 [PowerPC] Remove extra nop after notoc call
Calls that are marked as @notoc do not require the extra nop after the call
for the TOC restore.

Differential Revision: https://reviews.llvm.org/D81081
2020-06-05 06:47:44 -05:00
Sam Parker 9303546b42 [CostModel] Unify getMemoryOpCost
Use getMemoryOpCost from the generic implementation of getUserCost
and have getInstructionThroughput return the result of that for loads
and stores.

This also means that the X86 implementation of getUserCost can be
removed with the functionality folded into its getMemoryOpCost.

Differential Revision: https://reviews.llvm.org/D80984
2020-06-05 10:13:38 +01:00
Qiu Chaofan 7a001a2d92 [PowerPC] Require nsz flag for c-a*b to FNMSUB
On PowerPC, FNMSUB (both VSX and non-VSX version) means -(a*b-c). But
the backend used to generate these instructions regardless whether nsz
flag exists or not. If a*b-c==0, such transformation changes sign of
zero.

This patch introduces PPC specific FNMSUB ISD opcode, which may help
improving combined FMA code sequence.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D76585
2020-06-04 16:41:27 +08:00
David Tenty d20fdcabf8 [AIX] Update data directives for AIX assembly
Summary:
The standard data emission directives (e.g. .short, .long) in the AIX assembler
have the unintended consequence of aligning their output to the natural byte
boundary. This cause problems because we aren't expecting behavior from the
Data*bitsDirectives, so the final alignment of data isn't correct in some cases
on AIX.

This patch updated the Data*bitsDirectives to use .vbyte pseudo-ops instead to emit the
data, since we will emit the .align directives as needed. We update the existing
testcases and add a test for emission of struct data.

Reviewers: hubert.reinterpretcast, Xiangling_L, jasonliu

Reviewed By: hubert.reinterpretcast, jasonliu

Subscribers: wuzish, nemanjai, hiraditya, kbarton, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80934
2020-06-03 10:55:59 -04:00
QingShan Zhang a462561cee [NFC][PowerPC] Remove unused node PPCISD::VMADDFP and PPCISD::VNMSUBFP
These two nodes were added by 69caef2b78 in 2005
and they are not used by PowerPC backend anymore. And the ISD::FMA is a prefer
way for VMADDFP if we really want to create that node. For VNMSUBFP, we will
also add a more generic node FNMSUB in D76585 if we really want it.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D80429
2020-06-03 06:36:30 +00:00
Li Rong Yi 3101601b54 [PowerPC] Exploit vabsd on P9
Summary: Exploit vabsd* for for absolute difference of vectors on P9,
for example:
void foo (char *restrict p, char *restrict q, char *restrict t)
{
  for (int i = 0; i < 16; i++)
     t[i] = abs (p[i] - q[i]);
}
this case should be matched to the HW instruction vabsdub.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80271
2020-06-01 02:30:27 +00:00
Zequan Wu 80e107ccd0 Add NoMerge MIFlag to avoid MIR branch folding
Let the codegen recognized the nomerge attribute and disable branch folding when the attribute is given

Differential Revision: https://reviews.llvm.org/D79537
2020-05-29 12:31:06 -07:00
Xiangling Liao 26604d06b6 [AIX] Emit AvailableExternally Linkage on AIX
Since on AIX, our strategy is to not use -u to suppress any undefined
symbols, we need to emit .extern for the symbols with AvailableExternally
linkage.

Differential Revision: https://reviews.llvm.org/D80642
2020-05-29 13:12:59 -04:00
Lei Huang 2368bf52cd [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-27 13:14:25 -05:00
Lei Huang 559845f8fe Revert "[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm"
This reverts commit 7eb666b155.
2020-05-27 09:40:21 -05:00
Lei Huang 7eb666b155 [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-26 13:48:22 -05:00
Sean Fertile 3e62289f42 [PowerPC][NFC] Add colon to TODO's and fix indentation. 2020-05-26 13:33:32 -04:00
Sean Fertile d6c8736287 [PowerPC][AIX] Spill CSRs to the ABI specified stack offsets.
Extend the CSR save/restore insertion code to support both 32-bit and
64-bit AIX.

Differential Revision: https://reviews.llvm.org/D79252
2020-05-26 12:24:29 -04:00
Nemanja Ivanovic 099a875f28 [PowerPC] Unaligned FP default should apply to scalars only
As reported in PR45186, we could be in a situation where we don't
want to handle unaligned memory accesses for FP scalars but still
have VSX (which allows unaligned access for vectors). Change the
default to only apply to scalars.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45186
2020-05-26 10:19:06 -05:00
Sam Parker 8aaabadece [CostModel] Unify getCastInstrCost
Add the remaining cast instruction opcodes to the base implementation
of getUserCost and directly return the result. This allows
getInstructionThroughput to return getUserCost for the casts. This
has required changes to PPC and SystemZ because they implement
getUserCost and/or getCastInstrCost with adjustments for vector
operations. Adjusts have also been made in the remaining backends
that implement the method so that they still produce a cost of zero
or one for cost kinds other than throughput.

Differential Revision: https://reviews.llvm.org/D79848
2020-05-26 11:29:57 +01:00
Nemanja Ivanovic 793cc518b9 [PowerPC] Prevent legalization loop from promoting SELECT_CC from v4i32 to v4i32
As reported in https://bugs.llvm.org/show_bug.cgi?id=45709 we can hit an
infinite loop in legalization since we set the legalization action for
ISD::SELECT_CC for all fixed length vector types to Promote. Without some
different legalization action for the type being promoted to, the legalizer
simply loops. Since we don't have patterns to match the node, the right
legalization action should be Expand.

Differential revision: https://reviews.llvm.org/D79854
2020-05-25 20:09:07 -05:00
Stefan Pintilie 5a4bcec8db [PowerPC][NFC] Split PPCELFStreamer::emitInstruction
Split off PPCELFStreamer::emitPrefixedInstruction from
PPCELFStreamer::emitInstruction.

Differential Revision: https://reviews.llvm.org/D79626
2020-05-25 06:48:58 -05:00
Kang Zhang 86e3abc9e6 [PowerPC] Add some InstAlias definitions
Summary:
This patch add the InstAlias definitions for below instructions.

ADDI ADDIS ADDI8 ADDIS8
RLWINM8
ISEL ISEL8
OR OR_rec ORI ORI8 XORI8
CNTLZW8 CNTLZW8_rec
TEND TSR
RFEBB
NOR NOR_rec
MTCRF
SUBF SUBF_rec SUBFC SUBFC_rec
RLDICL_32_64
TW

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77559
2020-05-24 14:05:28 +00:00
Amy Kwan b631f86ac5 [TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.

Currently in DAG Combine, it will transform mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.

This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.

Differential Revision: https://reviews.llvm.org/D78271
2020-05-23 16:47:12 -05:00
Craig Topper 7392820f98 [Align] Remove operations on MaybeAlign that asserted that it had a defined value.
If the caller needs to reponsible for making sure the MaybeAlign
has a value, then we should just make the caller convert it to an Align
with operator*.

I explicitly deleted the relational comparison operators that
were being inherited from Optional. It's unclear what the meaning
of two MaybeAligns were one is defined and the other isn't
should be. So make the caller reponsible for defining the behavior.

I left the ==/!= operators from Optional. But now that exposed a
weird quirk that ==/!= between Align and MaybeAlign required the
MaybeAlign to be defined. But now we use the operator== from
Optional that takes an Optional and the Value.

Differential Revision: https://reviews.llvm.org/D80455
2020-05-22 21:54:28 -07:00
Fangrui Song 0840d725c4 [MC] Change MCCFIInstruction::createDefCfaOffset to cfiDefCfaOffset which does not negate Offset
The negative Offset has caused a bunch of problems and confused quite a
few call sites. Delete the unneeded negation and fix all call sites.
2020-05-22 17:07:11 -07:00
Fangrui Song 7e49dc6184 [MC] Change MCCFIInstruction::createDefCfa to cfiDefCfa which does not negate Offset
The negative Offset has caused a bunch of problems and confused quite a
few call sites. Delete the unneeded negation and fix all call sites.
2020-05-22 15:47:26 -07:00
Ahsan Saghir a28e9f1208 [PowerPC] Add support for vmsumudm
This patch adds support for Vector Multiply-Sum Unsigned Doubleword Modulo
instruction; vmsumudm.

Differential Revision: https://reviews.llvm.org/D80294
2020-05-22 14:35:13 -05:00
Nemanja Ivanovic 1a493b0fa5 [PowerPC] Add missing handling for half precision
The fix for PR39865 took care of some of the handling for half precision
but it missed a number of issues that still exist. This patch fixes the
remaining issues that cause crashes in the PPC back end.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45776

Differential revision: https://reviews.llvm.org/D79283
2020-05-22 07:50:11 -05:00
Chen Zheng 8086cdd1b0 [PowerPC] add more high latency opcodes for machine combiner pass
Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80097
2020-05-21 02:39:20 -04:00
Sam Parker fb3ba38021 [CostModel] Remove getExtCost
This has not been implemented by any backends which appear to cover
the functionality through getCastInstrCost. Sink what there is in the
default implementation into BasicTTI.

Differential Revision: https://reviews.llvm.org/D78922
2020-05-21 07:18:06 +01:00
Sam Parker 8cc911fa5b [NFCI][CostModel] Refactor getIntrinsicInstrCost
Combine the two API calls into one by introducing a structure to hold
the relevant data. This has the added benefit of moving the boiler
plate code for arguments and flags, into the constructors. This is
intended to be a non-functional change, but the complicated web of
logic involved here makes it very hard to guarantee.

Differential Revision: https://reviews.llvm.org/D79941
2020-05-20 11:59:08 +01:00
Florian Hahn bcbd26bfe6 [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.

The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-05-20 10:53:40 +01:00
Kang Zhang 3f376ecad0 [PowerPC] Enable machine verification for 3 passes
Summary:
For PowerPC, there are 3 passes has disabled the machine verification.
```
PPCTargetMachine.cpp:    addPass(&LiveVariablesID, false);
PPCTargetMachine.cpp:    addPass(createPPCEarlyReturnPass(), false);
PPCTargetMachine.cpp:  addPass(createPPCBranchSelectionPass(), false);
```
This patch is to enable machine verification for above three passes.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D79840
2020-05-20 09:40:25 +00:00
Matt Arsenault 4dad4914f7 CodeGen: Use Register 2020-05-19 17:56:55 -04:00
Lei Huang 2e6e27583c [PowerPC][NFC] Cleanup load/store spilling code
Summary: Cleanup and commonize code used for spilling to the stack.

Reviewers: stefanp, nemanjai, #powerpc, kamaub

Reviewed By: nemanjai, #powerpc, kamaub

Subscribers: kamaub, hiraditya, wuzish, shchenz, llvm-commits, kbarton

Tags: #llvm, #powerpc

Differential Revision: https://reviews.llvm.org/D79736
2020-05-19 14:57:32 -05:00
Simon Pilgrim cdafe59f95 TargetLoweringObjectFile.h - remove unnecessary includes. NFCI.
Replace with forward declarations and move includes down to source files where required.

I also needed to move the TargetLoweringObjectFile::SectionForGlobal wrapper implementation down into TargetLoweringObjectFile.cpp
2020-05-19 09:28:13 +01:00
Chen Zheng a6be4d17e3 [PowerPC-QPX] adjust operands order of qpx fma instructions.
convert
  %3 = QVFMADD %2, %0, %1, implicit $rm
to
  %3 = QVFMADD %2, %1, %0, implicit $rm

Reviewed By: hfinkel, steven.zhang

Differential Revision: https://reviews.llvm.org/D78986
2020-05-18 22:59:51 -04:00