Commit Graph

5589 Commits

Author SHA1 Message Date
Kai Luo 619e39bc72 [NFC][PowerPC] Fixed unused variable 'NewInstr'.
llvm-svn: 365433
2019-07-09 03:33:04 +00:00
Kai Luo 1931ed73c3 [PowerPC][Peephole] Combine extsw and sldi after instruction selection
Summary:
`extsw` and `sldi` are supposed to be combined if they are in the same
BB in instruction selection phase. This patch handles the case where
extsw and sldi are not in the same BB.

Differential Revision: https://reviews.llvm.org/D63806

llvm-svn: 365430
2019-07-09 02:55:08 +00:00
Chen Zheng 25ab27e6ef [PowerPC][NFC] remove redundant function isVFReg().
llvm-svn: 365429
2019-07-09 02:48:30 +00:00
Benjamin Kramer 05eebaa949 [PowerPC] Fold another unused variable into assertion. NFC.
llvm-svn: 365237
2019-07-05 19:58:39 +00:00
Benjamin Kramer 31f6b13e83 [PowerPC] Fold variable into assert. NFC.
Avoids a warning in Release builds.

llvm-svn: 365236
2019-07-05 19:46:48 +00:00
Benjamin Kramer 049230b4d2 [PowerPC] Remove unused variable. NFC.
llvm-svn: 365235
2019-07-05 19:28:02 +00:00
Nemanja Ivanovic 6c9a392c8e [PowerPC] Move TOC save to prologue when profitable
The indirect call sequence on PPC requires that the TOC base register be saved
prior to the indirect call and restored after the call since the indirect call
may branch to a global entry point in another DSO which will update the TOC
base. Over the last couple of years, we have improved this to:

- be able to hoist TOC saves from loops (with changes to MachineLICM)
- avoid multiple saves when one dominates the other[s]

However, it is still possible to have multiple TOC saves dynamically in the
execution path if there is no dominance relationship between them.

This patch moves the TOC save to the prologue when one of the TOC saves is in a
block that post-dominates entry (i.e. it cannot be avoided) or if it is in a
block that is hotter than entry.

Differential revision: https://reviews.llvm.org/D63803

llvm-svn: 365232
2019-07-05 18:38:09 +00:00
QingShan Zhang 63e62006cf [NFC][PowerPC] Make the PowerPC scheduling strategy feature only control the strategy instead of the scheduler.
llvm-svn: 365110
2019-07-04 07:43:51 +00:00
Fangrui Song 1f333562de [PowerPC] Support constraint code "ww"
Summary:
"ww" and "ws" are both constraint codes for VSX vector registers that
hold scalar double data. "ww" is preferred for float while "ws" is
preferred for double.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D64119

llvm-svn: 365106
2019-07-04 04:44:42 +00:00
Roman Lebedev c4b83a6054 [Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457)
Summary:
This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]].
In middle-end, we'd want to prefer the form with two adds - D63992,
but as this diff shows, not every target will prefer that pattern.

Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars,
but only X86 prefer that same pattern for vectors.

Here i'm adding a new TLI hook, always defaulting to the inc-of-add,
but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars.

Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel

Reviewed By: efriedma

Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64090

llvm-svn: 365010
2019-07-03 09:41:35 +00:00
Chen Zheng dfdccbb26b [PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop.
Differential Revision: https://reviews.llvm.org/D63477

llvm-svn: 364993
2019-07-03 01:49:03 +00:00
QingShan Zhang 7fdb3a293b [PowerPC] Implement the areMemAccessesTriviallyDisjoint hook
After implemented this hook, we will model the memory dependency in the scheduling dependency graph more precise,
and will have more opportunity to reorder the load/stores, as they didn't have the dependency at some condition

Differential Revision: https://reviews.llvm.org/D63804

llvm-svn: 364886
2019-07-02 03:28:52 +00:00
Jordan Rupprecht 351b7e7b24 Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element
This reverts r364557 (git commit 9f7f5858fe)

This crashes as reported on the commit thread. Repro instructions TBD.

llvm-svn: 364876
2019-07-01 23:29:46 +00:00
Brad Smith 4b733ca617 Default to Secure PLT on PPC for musl libc.
This matches the default settings of clang.

llvm-svn: 364675
2019-06-28 19:48:31 +00:00
Zi Xuan Wu 588a170970 [NFC][PowerPC] Move XS*QP series instruction apart from XS*QPO series in position of td file
llvm-svn: 364620
2019-06-28 02:51:03 +00:00
Kai Luo c6fe8436e8 [PowerPC][NFC] Use `|=` to update `Simplified` flag
llvm-svn: 364617
2019-06-28 01:38:42 +00:00
Jinsong Ji c627aa2fa9 [PowerPC][NFC] Remove unused (and unsupported) fusion feature bits.
FeatureFusion bits was first introduced in
https://reviews.llvm.org/rL253724. for add/load integer fusion for P8.
The only use of `hasFusion` was https://reviews.llvm.org/rL255319.

However, this was removed later in https://reviews.llvm.org/rL280440.

So, there is NO any reference to fusion in code now.

Leaving it there is misleading and confusing, so remove it for now.
We can alwasy add back if we ever support fusion in the future.

llvm-svn: 364581
2019-06-27 19:35:11 +00:00
Roland Froese 9f7f5858fe Recommit [PowerPC] Update P9 vector costs for insert/extract element
Recommit patch D60160 after regression fix patch D63463.

llvm-svn: 364557
2019-06-27 16:20:24 +00:00
Jinsong Ji 157b073fa5 [PowerPC][HTM] Fix disassembling buffer overflow for tabortdc and others
This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751
llvm-mc aborted when disassembling tabortdc.

This patch try to clean up TM related DAGs.

* Fixes the problem by remove explicit output of cr0, and put it as implicit def.
* Update int_ppc_tbegin pattern to accommodate the implicit def of cr0.
* Update the TCHECK operand and int_ppc_tcheck accordingly.
* Add some builtin test and disassembly tests.
* Remove unused CRRC0/crrc0

Differential Revision: https://reviews.llvm.org/D61935

llvm-svn: 364544
2019-06-27 14:11:31 +00:00
Kang Zhang 490bc46541 [NFC][PowerPC] Improve the for loop in Early Return
Summary:

In `PPCEarlyReturn.cpp`
```
183       for (MachineFunction::iterator I = MF.begin(); I != MF.end();) {
184         MachineBasicBlock &B = *I++;
185         if (processBlock(B))
186           Changed = true;
187       }
```
Above code can be improved to:
```
184       for (MachineFunction::iterator I = MF.begin(), E = MF.end(); I != E;) {
185         MachineBasicBlock &B = *I++;
186         Changed |= processBlock(B);
187       }
```

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D63800

llvm-svn: 364496
2019-06-27 03:39:09 +00:00
Kai Luo d6a8bc7a12 [PowerPC] Fixed missing change flag of emitRLDICWhenLoweringJumpTables
PPCMIPeephole::emitRLDICWhenLoweringJumpTables should return a bool
value to indicate optimization is conducted or not.

Differential Revision: https://reviews.llvm.org/D63801

llvm-svn: 364383
2019-06-26 05:25:16 +00:00
Nemanja Ivanovic 8265e8ff36 [PowerPC] Mark FCOPYSIGN legal for FP vectors
This was just an omission in the back end. We have had the instructions for both
single and double precision for a few HW generations, but never got around to
legalizing these.

Differential revision: https://reviews.llvm.org/D63634

llvm-svn: 364373
2019-06-26 01:48:57 +00:00
Kai Luo 174b4ff781 [PowerPC][NFC] Move peephole optimization of RLDICR into a method.
llvm-svn: 364372
2019-06-26 01:34:37 +00:00
Fangrui Song 96a192ea53 [PPC32] Support PLT calls for -msecure-plt -fpic
Summary:
In Secure PLT ABI, -fpic is similar to -fPIC. The differences are that:

* -fpic stores the address of _GLOBAL_OFFSET_TABLE_ in r30, while -fPIC stores .got2+0x8000.
* -fpic uses an addend of 0 for R_PPC_PLTREL24, while -fPIC uses 0x8000.

Reviewers: hfinkel, jhibbits, joerg, nemanjai, spetrovic

Reviewed By: jhibbits

Subscribers: adalava, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63563

llvm-svn: 364324
2019-06-25 15:56:32 +00:00
Nemanja Ivanovic 47b7d13459 [PowerPC] Emit XXSEL for vec_sel and code that has the same pattern
As pointed out in https://bugs.llvm.org/show_bug.cgi?id=41777
we do not emit a vector select even when the pretty much asks for one.
This patch changes that.

Differential revision: https://reviews.llvm.org/D61658

llvm-svn: 364289
2019-06-25 10:46:13 +00:00
Clement Courbet 3bc5ad551a [ExpandMemCmp] Move all options to TargetTransformInfo.
Split off from D60318.

llvm-svn: 364281
2019-06-25 08:04:13 +00:00
Matt Arsenault e3a676e9ad CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

llvm-svn: 364191
2019-06-24 15:50:29 +00:00
Hubert Tong 6f3222ed94 [NFC] Fix indentation in PPCAsmPrinter.cpp
After r248261, the indentation switches, inside a namespace definition,
between indenting and not indenting one level in for that namespace; the
abomination occurs in the middle of a class definition. Fix that.

llvm-svn: 364133
2019-06-22 16:03:29 +00:00
Hubert Tong d801cb1f54 [PowerPC][NFC] Move comment to the relevant function
A comment that applies to a virtual destructor was placed on a class
constructor. Move the comment to where it belongs.

llvm-svn: 364132
2019-06-22 16:02:02 +00:00
Jinsong Ji 8b1abe568e [PowerPC][NFC] Fix comments for AltVSXFMARel mapping.
llvm-svn: 363987
2019-06-20 21:36:06 +00:00
Chen Zheng c5b918de58 [NFC] move some hardware loop checking code to a common place for other using.
Differential Revision: https://reviews.llvm.org/D63478

llvm-svn: 363758
2019-06-19 01:26:31 +00:00
Justin Hibbits 1d1cf30b73 PowerPC: Optimize SPE double parameter calling setup
Summary:
SPE passes doubles the same as soft-float, in register pairs as i32
types.  This is all handled by the target-independent layer.  However,
this is not optimal when splitting or reforming the doubles, as it
pushes to the stack and loads from, on either side.

For instance, to pass a double argument to a function, assuming the
double value is in r5, the sequence currently looks like this:

    evstdd      5, X(1)
    lwz         3, X(1)
    lwz         4, X+4(1)

Likewise, to form a double into r5 from args in r3 and r4:

    stw         3, X(1)
    stw         4, X+4(1)
    evldd       5, X(1)

This optimizes the fence to use SPE instructions.  Now, to pass a double
to a function:

    mr          4, 5
    evmergehi   3, 5, 5

And to form a double into r5 from args in r3 and r4:

    evmergelo   5, 3, 4

This is comparable to the way that gcc generates the double splits.

This also fixes a bug with expanding builtins to libcalls, where the
LowerCallTo() code path was generating intermediate illegal type nodes.

Reviewers: nemanjai, hfinkel, joerg

Subscribers: kbarton, jfb, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D54583

llvm-svn: 363526
2019-06-17 03:15:23 +00:00
Kang Zhang 2d51adcb57 [PowerPC] Set the innermost hot loop to align 32 bytes
Summary:
If the nested loop is an innermost loop, prefer to a 32-byte alignment, so that
we can decrease cache misses and branch-prediction misses. Actual alignment of
 the loop will depend on the hotness check and other logic in alignBlocks.

The old code will only align hot loop to 32 bytes when the LoopSize larger than
16 bytes and smaller than 32 bytes, this patch will align the innermost hot loop
 to 32 bytes not only for the hot loop whose size is 16~32 bytes.

Reviewed By: steven.zhang, jsji

Differential Revision: https://reviews.llvm.org/D61228

llvm-svn: 363495
2019-06-15 15:10:24 +00:00
Jinsong Ji bbab7acedf [PowerPC][NFC] Comments update and remove some unused def
llvm-svn: 363461
2019-06-14 21:33:51 +00:00
Jinsong Ji c9e3dbb0a5 [PowerPC][NFC] Format comments in P9InstrResrouce.td
llvm-svn: 363423
2019-06-14 17:04:24 +00:00
Simon Pilgrim 4e0648a541 [TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123)
As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space.

This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them.

If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores.

Differential Revision: https://reviews.llvm.org/D63075

llvm-svn: 363179
2019-06-12 17:14:03 +00:00
Jinsong Ji ef2d6d99c0 [PowerPC] Enable MachinePipeliner for P9 with -ppc-enable-pipeliner
Implement necessary target hooks to enable MachinePipeliner for P9 only.
The pass is off by default, can be enabled with -ppc-enable-pipeliner for P9.

Differential Revision: https://reviews.llvm.org/D62164

llvm-svn: 363085
2019-06-11 17:40:39 +00:00
Tom Stellard 4b0b26199b Revert CMake: Make most target symbols hidden by default
This reverts r362990 (git commit 374571301d)

This was causing linker warnings on Darwin:

ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)'
from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol
'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&),
std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)'
means the weak symbol cannot be overridden at runtime. This was likely caused by different translation
units being compiled with different visibility settings.

llvm-svn: 363028
2019-06-11 03:21:13 +00:00
Tom Stellard 374571301d CMake: Make most target symbols hidden by default
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.

A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.

This patch reduces the number of public symbols in libLLVM.so by about
25%.  This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so

One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.

Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278

Reviewers: chandlerc, beanz, mgorny, rnk, hans

Reviewed By: rnk, hans

Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54439

llvm-svn: 362990
2019-06-10 22:12:56 +00:00
Jinsong Ji 9c7f93e914 [PowerPC][HTM]Fix $zero is not a GPRC register for builtin_ttest
This was found during HTM cleanup.
Adding a test for builtin_ttest would expose following issue.

*** Bad machine code: Illegal physical register for instruction ***
 - function:    test10
 - basic block: %bb.0 entry (0xf0e57497b58)
 - instruction: %5:crrc0 = TABORTWCI 0, $zero, 0
 - operand 2:   $zero
  $zero is not a GPRC register.
LLVM ERROR: Found 1 machine code errors.

Differential Revision: https://reviews.llvm.org/D63079

llvm-svn: 362974
2019-06-10 19:04:14 +00:00
Sam Parker c5ef502ee8 [CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
    
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774
2019-06-07 07:35:30 +00:00
Nemanja Ivanovic ef4a3aa549 [PowerPC] Exploit the vector min/max instructions
Use the PPC vector min/max instructions for computing the corresponding
operation as these should be faster than the compare/select sequences
we currently emit.

Differential revision: https://reviews.llvm.org/D47332

llvm-svn: 362759
2019-06-06 23:49:01 +00:00
Jason Liu 60ec248148 [AIX] Implement function descriptor on SDAG
Summary:
(1) Function descriptor on AIX
On AIX, a called routine may have 2 distinct symbols associated with it:
 * A function descriptor (Name)
 * A function entry point (.Name)

The descriptor structure on AIX is the same as those in the ELF V1 ABI:
 * The address of the entry point of the function.
 * The TOC base address for the function.
 * The environment pointer.

The descriptor symbol uses the same name as the source level function in C.
The function entry point is analogous to the symbol we would generate for a
 function in a non-descriptor-based ABI, except that it is renamed by
prepending a ".".

Which symbol gets referenced depends on the context:
 * Taking the address of the function references the descriptor symbol.
 * Calling the function references the entry point symbol.

(2) Speaking of implementation on AIX, for direct function call target, we
 create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to
 replace original TargetGlobalAddress SDNode. Then down the path, we can
 take advantage of this MCSymbol.

Patch by: Xiangling_L

Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara

Differential Revision: https://reviews.llvm.org/D62532

llvm-svn: 362735
2019-06-06 19:13:36 +00:00
Dmitri Gribenko 5438cc6910 Remove unused PPC.h includes under llvm/lib/Target/PowerPC.
llvm-svn: 362718
2019-06-06 16:47:06 +00:00
Jason Liu 0338b88861 [AIX] Implement call lowering with parameters could pass onto GPRs
Summary:
This patch implements SDAG call lowering on AIX for functions
which only have parameters that could fit into GPRs.

Reviewers: hubert.reinterpretcast, syzaara

Differential Revision: https://reviews.llvm.org/D62823

llvm-svn: 362708
2019-06-06 14:36:43 +00:00
Dmitri Gribenko 6fc4c1cc54 Include what you use in PPCFrameLowering.h
llvm-svn: 362590
2019-06-05 08:58:00 +00:00
Nemanja Ivanovic 7c842fadf1 [PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible
Generally speaking, we lower to an optimal rotate sequence for nodes visible in
the SDAG. However, there are instances where the two rotates are not visible at
ISEL time - most notably those in a very common sequence when lowering switch
statements to jump tables.

A common situation is a switch on a 32-bit integer. This value has to have the
upper 32 bits cleared and because jump table offsets are word offsets, the value
needs to be shifted left by 2 bits. We currently emit the clear and the left
shift as two separate instructions, but this is not needed as we can lower it to
a single RLDIC.

This patch just cleans that up.

Differential revision: https://reviews.llvm.org/D60402

llvm-svn: 362576
2019-06-05 02:36:40 +00:00
Jinsong Ji 3144d7a2da [PowerPC] P9 Scheduling Model: dispatching rule fixes
This is to address some of the problems in existing P9 resource modeling,
especially about the dispatching rules.

Instead of using a hypothetical DISPATCHER , we try to use the number of
actual dispatch slots, and define SchedWriteRes to model dispatch rules,
then update instruction classes according to dispatch rules.

All the dispatch rules and instruction classes update are made according
to POWER9 User Manual.

Differential Revision: https://reviews.llvm.org/D61873

llvm-svn: 362509
2019-06-04 15:22:23 +00:00
Dmitri Gribenko 454fc77872 Include what you use in PPCRegisterInfo.cpp
llvm-svn: 362495
2019-06-04 12:55:00 +00:00
Dmitri Gribenko 73a15d4b78 Include what you use in PPC.h
llvm-svn: 362477
2019-06-04 09:16:35 +00:00