Commit Graph

5525 Commits

Author SHA1 Message Date
Chen Zheng c4c407a0eb [PowerPC] use more meaningful name - NFC
llvm-svn: 361218
2019-05-21 03:54:42 +00:00
Fangrui Song ad7199f3e6 [PowerPC] Support .reloc *, R_PPC{,64}_NONE, *
This can be used to create references among sections. When --gc-sections
is used, the referenced section will be retained if the origin section
is retained.

llvm-svn: 360990
2019-05-17 06:04:11 +00:00
Fangrui Song e18a6ad0b8 [MC][PowerPC] Clean up PPCAsmBackend
Replace the member variable Target with Triple
Use Triple instead of TheTarget.getName() to dispatch on 32-bit/64-bit.
Delete redundant parameters

llvm-svn: 360986
2019-05-17 05:44:26 +00:00
Fangrui Song 3e92df3e39 Add Triple::isPPC64()
llvm-svn: 360864
2019-05-16 08:31:22 +00:00
Richard Trieu ee6ced196d [PowerPC] Create a TargetInfo header. NFC
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header.
This fixes a layering problem.

llvm-svn: 360731
2019-05-15 00:09:58 +00:00
Lei Huang 22561972af [PowerPC] Custom lower known CR bit spills
For known CRBit spills, CRSET/CRUNSET, it is more efficient to load and spill
the known value instead of extracting the bit.

eg. This sequence is currently used to spill a CRUNSET:
    crclr   4*cr5+lt
    mfocrf  r3,4
    rlwinm  r3,r3,20,0,0
    stw     r3,132(r1)

This patch custom lower it to:
    li  r3,0
    stw r3,132(r1)

Differential Revision: https://reviews.llvm.org/D61754

llvm-svn: 360677
2019-05-14 14:27:06 +00:00
Richard Trieu 4bdb136b0f [PowerPC] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc.  Merging them together will fix this.  For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.

llvm-svn: 360502
2019-05-11 02:33:18 +00:00
Lei Huang 1ac6e9636c [PowerPC] custom lower `v2f64 fpext v2f32`
Reduces scalarization overhead via custom lowering of v2f64 fpext v2f32.

eg. For the following IR
  %0 = load <2 x float>, <2 x float>* %Ptr, align 8
  %1 = fpext <2 x float> %0 to <2 x double>
  ret <2 x double> %1

Pre custom lowering:
  ld r3, 0(r3)
  mtvsrd f0, r3
  xxswapd vs34, vs0
  xscvspdpn f0, vs0
  xxsldwi vs1, vs34, vs34, 3
  xscvspdpn f1, vs1
  xxmrghd vs34, vs0, vs1

After custom lowering:
  lfd f0, 0(r3)
  xxmrghw vs0, vs0, vs0
  xvcvspdp vs34, vs0

Differential Revision: https://reviews.llvm.org/D57857

llvm-svn: 360429
2019-05-10 14:04:06 +00:00
Alina Sbirlea f31eba6494 [MemorySSA] Teach LoopSimplify to preserve MemorySSA.
Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.
Update tests.

Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60833

llvm-svn: 360270
2019-05-08 17:05:36 +00:00
Nemanja Ivanovic b4f028f0f3 [PowerPC] Use the two-constant NR algorithm for refining estimates
The single-constant algorithm produces infinities on a lot of denormal values.
The precision of the two-constant algorithm is actually sufficient across the
range of denormals. We will switch to that algorithm for now to avoid the
infinities on denormals. In the future, we will re-evaluate the algorithm to
find the optimal one for PowerPC.

Differential revision: https://reviews.llvm.org/D60037

llvm-svn: 360144
2019-05-07 13:48:03 +00:00
Simon Pilgrim c5ac14eef8 Fix uninitialized variable warning. NFCI.
This also fixes a scan-build "array subscript is undefined" warning.

llvm-svn: 360128
2019-05-07 10:30:22 +00:00
Nemanja Ivanovic 70afe4f7e1 [PowerPC] Fix erroneous condition for converting uint-to-fp vector conversion
A condition for exiting the legalization of v4i32 conversion to v2f64 through
extract/convert/build erroneously checks for the extract having type i32.
This is not adequate as smaller extracts are actually legalized to i32 as well.
Furthermore, an early exit is missing which means that we only check that
both extracts are from the same vector if that check fails.
As a result, both cases in the included test case fail - the first gets a
select error and the second generates incorrect code.

The culprit commit is r274535.

llvm-svn: 360043
2019-05-06 13:35:49 +00:00
Simon Pilgrim aa49be4926 Avoid cppcheck operator precedence warnings. NFCI.
Prefer ((X & Y) ? A : B) to (X & Y ? A : B)

llvm-svn: 359884
2019-05-03 13:50:38 +00:00
Kang Zhang 1a0d6d6899 [NFC][PowerPC] Return early if the element type is not byte-sized in combineBVOfConsecutiveLoads
Summary:
Based on the Eli Friedman's comments in https://reviews.llvm.org/D60811 , we'd better return early if the element type is not byte-sized in `combineBVOfConsecutiveLoads`.

Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D61076

llvm-svn: 359764
2019-05-02 08:15:13 +00:00
David L. Jones fccb505f0f Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element"
This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313.

llvm-svn: 359648
2019-05-01 05:01:03 +00:00
Sjoerd Meijer 180f1ae57c [TargetLowering] Change getOptimalMemOpType to take a function attribute list
The MachineFunction wasn't used in getOptimalMemOpType, but more importantly,
this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType.

This is the groundwork for the changes in D59766 and D59787, that allows
implementation of TTI::getMemcpyCost.

Differential Revision: https://reviews.llvm.org/D59785

llvm-svn: 359537
2019-04-30 08:38:12 +00:00
Roland Froese 728e139700 [PowerPC] Try harder to avoid load/move-to VSR for partial vector loads
Change the PPCISelLowering.cpp function that decides to avoid update form in
favor of partial vector loads to know about newer load types and to not be
confused by the chain operand.

Differential Revision: https://reviews.llvm.org/D60102

llvm-svn: 359504
2019-04-29 21:08:35 +00:00
Simon Pilgrim 2755b73ba0 Fix operator precedence warning. NFCI.
Reported in https://www.viva64.com/en/b/0629/

llvm-svn: 359469
2019-04-29 17:04:14 +00:00
Nick Desaulniers 7ab164c4a4 [AsmPrinter] refactor to support %c w/ GlobalAddress'
Summary:
Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when
printing the address of a MachineOperand::MO_GlobalAddress. Move that
handling into a new overriden method in each base class. A virtual
method was added to the base class for handling the generic case.

Refactors a few subclasses to support the target independent %a, %c, and
%n.

The patch also contains small cleanups for AVRAsmPrinter and
SystemZAsmPrinter.

It seems that NVPTXTargetLowering is possibly missing some logic to
transform GlobalAddressSDNodes for
TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended
inline assembly asm constraints.

Fixes:
- https://bugs.llvm.org/show_bug.cgi?id=41402
- https://github.com/ClangBuiltLinux/linux/issues/449

Reviewers: echristo, void

Reviewed By: void

Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60887

llvm-svn: 359337
2019-04-26 18:45:04 +00:00
Roland Froese 4b17772b9e [PowerPC] Update P9 vector costs for insert/extract element
The PPC vector cost model values for insert/extract element reflect older
processors that lacked vector insert/extract and move-to/move-from VSR
instructions.  Update getVectorInstrCost to give appropriate values for when
the newer instructions are present.

Differential Revision: https://reviews.llvm.org/D60160

llvm-svn: 359313
2019-04-26 16:14:17 +00:00
Joerg Sonnenberger 8372b467f1 [PowerPC] Allow using initial-exec TLS with PIC
Using initial-exec TLS variables is a reasonable performance
optimisation for system libraries. Use the correct PIC mechanism to get
hold of the GOT to avoid text relocations.

Differential Revision: https://reviews.llvm.org/D61026

llvm-svn: 359146
2019-04-24 22:12:22 +00:00
Sean Fertile 526633deea Add period at end of comment.
llvm-svn: 359144
2019-04-24 21:51:30 +00:00
Kang Zhang 009a21d2fd [PowerPC] Fix wrong ElemSIze when calling isConsecutiveLS()
Summary:
This issue from the bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177

When the two operands for BUILD_VECTOR are same, we will get assert error.
llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&):
Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) &&
"The loads cannot be both consecutive and reverse consecutive."' failed.

This error caused by the wrong ElemSIze when calling isConsecutiveLS(). We
should use `getScalarType().getStoreSize();` to get the ElemSize instread of
 `getScalarSizeInBits() / 8`.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60811

llvm-svn: 358644
2019-04-18 07:24:15 +00:00
Nick Desaulniers a2077bab40 [AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC
Summary:
None of these derived classes do anything that the base class cannot.
If we remove these case statements, then the base class can handle them
just fine.

Reviewers: peter.smith, echristo

Reviewed By: echristo

Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60803

llvm-svn: 358603
2019-04-17 18:22:48 +00:00
Sean Fertile 8d856488a8 Add slbfee instruction.
llvm-svn: 358425
2019-04-15 17:08:43 +00:00
Kang Zhang 2446f843ae [PowerPC] Add initialization for some ppc passes
Summary:

Some llc debug options need pass-name as the parameters.
But if we use the pass-name ppc-early-ret, we will get below error:
llc test.ll -stop-after ppc-early-ret
LLVM ERROR: "ppc-early-ret" pass is not registered.
Below pass-names have the pass is not registered error:
ppc-ctr-loops
ppc-ctr-loops-verify
ppc-loop-preinc-prep
ppc-toc-reg-deps
ppc-vsx-copy
ppc-early-ret
ppc-vsx-fma-mutate
ppc-vsx-swaps
ppc-reduce-cr-ops
ppc-qpx-load-splat
ppc-branch-coalescing
ppc-branch-select

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60248

llvm-svn: 358271
2019-04-12 09:59:40 +00:00
Eric Christopher b6926bdcff Revert "[PowerPC] Add initialization for some ppc passes"
This reverts commit 6f8f98ce8d as it
is breaking nearly every bot.

llvm-svn: 358260
2019-04-12 07:16:58 +00:00
Kang Zhang 6f8f98ce8d [PowerPC] Add initialization for some ppc passes
Summary:

Some llc debug options need pass-name as the parameters.
But if we use the pass-name ppc-early-ret, we will get below error:
llc test.ll -stop-after ppc-early-ret
LLVM ERROR: "ppc-early-ret" pass is not registered.
Below pass-names have the pass is not registered error:
ppc-ctr-loops
ppc-ctr-loops-verify
ppc-loop-preinc-prep
ppc-toc-reg-deps
ppc-vsx-copy
ppc-early-ret
ppc-vsx-fma-mutate
ppc-vsx-swaps
ppc-reduce-cr-ops
ppc-qpx-load-splat
ppc-branch-coalescing
ppc-branch-select

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60248

llvm-svn: 358256
2019-04-12 06:35:15 +00:00
Zi Xuan Wu ac79ef8f0e [PowerPC] More precise exploitation of P9 maddld instruction when operands are constant
There are 3 operands of maddld, (add (mul %1, %2), %3) and sometimes
they are constant. If there is constant operand, it takes extra li to 
materialize the operand, and one more extra register too. So it's not 
profitable to use maddld to optimize mul-add pattern.

Differential Revision: https://reviews.llvm.org/D60181

llvm-svn: 358253
2019-04-12 05:21:31 +00:00
Nick Desaulniers 5277b3ff25 [AsmPrinter] refactor to remove remove AsmVariant. NFC
Summary:
The InlineAsm::AsmDialect is only required for X86; no architecture
makes use of it and as such it gets passed around between arch-specific
and general code while being unused for all architectures but X86.

Since the AsmDialect is queried from a MachineInstr, which we also pass
around, remove the additional AsmDialect parameter and query for it deep
in the X86AsmPrinter only when needed/as late as possible.

This refactor should help later planned refactors to AsmPrinter, as this
difference in the X86AsmPrinter makes it harder to make AsmPrinter more
generic.

Reviewers: craig.topper

Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60488

llvm-svn: 358101
2019-04-10 16:38:43 +00:00
Hiroshi Inoue 30d3c58b81 [PowerPC] fix trivial typos in comment, NFC
llvm-svn: 357981
2019-04-09 08:40:02 +00:00
Chen Zheng 19ce6719bc [PowerPC] initialize SchedModel according to platform.
Differential Revision: https://reviews.llvm.org/D60177

llvm-svn: 357962
2019-04-09 01:25:25 +00:00
Evandro Menezes 85bd3978ae [IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

llvm-svn: 357731
2019-04-04 22:40:06 +00:00
Stefan Pintilie fa6cd5ceb9 [PowerPC] Fix reversed bit issue in DCMX mask for "xvtstdcdp" and "xvtstdcsp" P9 implementation
Did experiments on power 9 machine, checked the outputs for NaN & Infinity+
cases with corresponding DCMX bit set. Confirmed the DCMX mask bit for NaN and
infinity+ are reversed.

This patch fixes the issue.

Patch by Victor Huang.

Differential Revision: https://reviews.llvm.org/D59384

llvm-svn: 357494
2019-04-02 16:56:01 +00:00
Kang Zhang 05f78b35ae [PowerPC] Add the support for __builtin_setrnd()
Summary:
PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function will use the least significant two bits of integer argument to set the floating point rounding mode.
double __builtin_setrnd(int mode);
The effective values for mode are:
0 - round to nearest
1 - round to zero
2 - round to +infinity
3 - round to -infinity
Note that the mode argument will modulo 4, so if the int argument is greater than 3, it will only use the least significant two bits of the mode. Namely, builtin_setrnd(102)) is equal to builtin_setrnd(2).

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D59405

llvm-svn: 357241
2019-03-29 08:45:24 +00:00
Zi Xuan Wu 1445b77e8c [PowerPC] Strength reduction of multiply by a constant by shift and add/sub in place
A shift and add/sub sequence combination is faster in place of a multiply by constant. 
Because the cycle or latency of multiply is not huge, we only consider such following
worthy patterns.

```
(mul x, 2^N + 1) => (add (shl x, N), x)
(mul x, -(2^N + 1)) => -(add (shl x, N), x)
(mul x, 2^N - 1) => (sub (shl x, N), x)
(mul x, -(2^N - 1)) => (sub x, (shl x, N))
```

And the cycles or latency is subtarget-dependent so that we need consider the
subtarget to determine to do or not do such transformation. 
Also data type is considered for different cycles or latency to do multiply.

Differential Revision: https://reviews.llvm.org/D58950

llvm-svn: 357233
2019-03-29 03:08:39 +00:00
QingShan Zhang 5321dcd608 [NFC][PowerPC] Custom PowerPC specific machine-scheduler
This patch lays the groundwork for extending the generic machine scheduler by providing a PPC-specific implementation.
There are no functional changes as this is an incremental patch that simply provides the necessary overrides which just
encapsulate the behavior of the generic scheduler. Subsequent patches will add specific behavior.

Differential Revision: https://reviews.llvm.org/D59284

llvm-svn: 357047
2019-03-27 03:50:16 +00:00
Guozhi Wei 330dcd9dab [PPC] Refactor PPCBranchSelector.cpp
This patch splits the huge function PPCBranchSelector.cpp:runOnMachineFunction into several smaller functions.

No functional change.

Differential Revision: https://reviews.llvm.org/D59623

llvm-svn: 357033
2019-03-26 21:27:38 +00:00
Stefan Pintilie e1d79a87c6 [PowerPC] Remove UseVSXReg
The UseVSXReg flag can be safely removed and the code cleaned up.

Patch By: Yi-Hong Liu

Differential Revision: https://reviews.llvm.org/D58685

llvm-svn: 357028
2019-03-26 20:28:21 +00:00
Simon Pilgrim 77482120da Fix for ABS legalization on PPC buildbot.
llvm-svn: 356498
2019-03-19 18:55:46 +00:00
Simon Pilgrim a56f2822d0 [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect
These changes are related to PR37743 and include:

    SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.

    Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.

    Add promoting the integer ABS node in the LegalizeIntegerType.

    Expand-based legalization of integer result for the ABS nodes.

    Expand-based legalization of ABS vector operations.

    Add some integer abs testcases for different typesizes for Thumb arch

    Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
        tmp = (SRA, Hi, 31)
        Lo = (UADDO tmp, Lo)
        Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
        Lo = (XOR tmp, Lo)

    The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
        (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).

    Change integer abs testcases for codegen with the ABS node support for AArch64.
        Indicate that the ABS is legal for the i64 type when the NEON is supported.
        Change the integer abs testcases to show changing of codegen.

    Add combine and legalization of ABS nodes for Thumb arch.

    Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.

For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743

Patch by: @ikulagin (Ivan Kulagin)

Differential Revision: https://reviews.llvm.org/D49837

llvm-svn: 356468
2019-03-19 16:24:55 +00:00
Adhemerval Zanella 664c1ef528 [TargetLowering] Add code size information on isFPImmLegal. NFC
This allows better code size for aarch64 floating point materialization
in a future patch.

Reviewers: evandro

Differential Revision: https://reviews.llvm.org/D58690

llvm-svn: 356389
2019-03-18 18:40:07 +00:00
Jinsong Ji 9dc2c1d564 Set useful flags for vector imm setting instructions
Vector imm setting instructions like XXLXORz/XXLXORspz/XXLXORdpz
Should behave like LI8.

We should set corresponding flags to allow rematerialization and other
opts in LICM, RA, Scheduling etc.

Differential Revision: https://reviews.llvm.org/D58645

llvm-svn: 355948
2019-03-12 18:27:09 +00:00
Jinsong Ji 06bee01d2b [NFC][PowerPC]Assert when trying to generate directmove below P8.
This was found when we generated COPY from G8RC to F8RC in
EmitInstrWithCustomInserter without checking proper architecture,
we silently generated mtvsrd, which require P8 and up.

This is a NFC patch to add assert when we call copyPhysReg, in case
someone accidentally generate COPY between G8RC to F8RC for P7 and
below.

llvm-svn: 355920
2019-03-12 14:01:29 +00:00
Jinsong Ji c6063e83d5 [NFC][PowerPC] Add comment for PPCAsmPrinter::printOperand
Patch by Yi-Hong Lyu

llvm-svn: 355848
2019-03-11 17:57:49 +00:00
Stanislav Mekhanoshin e98944ed47 Use bitset for assembler predicates
AMDGPU target run out of Subtarget feature flags hitting the limit of 64.
AssemblerPredicates uses at most uint64_t for their representation.
At the same time CodeGen has exhausted this a long time ago and switched
to a FeatureBitset with the current limit of 192 bits.

This patch completes transition to the bitset for feature bits extending
it to asm matcher and MC code emitter.

Differential Revision: https://reviews.llvm.org/D59002

llvm-svn: 355839
2019-03-11 17:04:35 +00:00
Zi Xuan Wu 428dcd5c3f [PowerPC] Remove the override of isMachineVerifierClean() to open machine verifier
After fix all asserts found by machine verifier in PowerPC target with following patches, 
we can activate machine verifier as default.

rL293769, rL348566, rL349030, rL349029, rL350113, rL350111, 
rL350799, rL350165, rL355378, rL352174, rL354762, rL350115

It's also found in PR#27456, https://bugs.llvm.org/show_bug.cgi?id=27456

Differential Revision: https://reviews.llvm.org/D59011

llvm-svn: 355798
2019-03-11 03:31:09 +00:00
Jinsong Ji de3348ae3f [PowerPC] Run clang format to avoid compiling warning.
llvm-svn: 355623
2019-03-07 18:55:21 +00:00
Guozhi Wei 11308bdb43 [PPC] Adjust the computed branch offset for the possible shorter distance
In file PPCBranchSelector.cpp we tend to over estimate code size due to large
alignment and inline assembly. Usually it causes larger computed branch offset,
it is not big problem. But sometimes it may also causes smaller computed branch
offset than actual branch offset. If the offset is close to the limit of
encoding, it may cause problem at run time.
Following is a simplified example.

           actual        estimated
           address        address
 ...
bne Far      100            10c
.p2align 4
Near:        110            110
 ...
Far:        8108           8108

Actual offset:    0x8108 - 0x100 = 0x8008
Computed offset:  0x8108 - 0x10c = 0x7ffc

The computed offset is at most ((1 << alignment) - 4) bytes smaller than actual
offset. So we add this number to the offset for safety.

Differential Revision: https://reviews.llvm.org/D57718

llvm-svn: 355529
2019-03-06 18:22:22 +00:00
Strahinja Petrovic 94fccc93de [PowerPC] Add secure plt support for TLS symbols
This patch supports secure plt mode for TLS symbols.

Differential Revision: https://reviews.llvm.org/D45520

llvm-svn: 355513
2019-03-06 15:00:10 +00:00