Commit Graph

6472 Commits

Author SHA1 Message Date
David Green e2935dcfc4 [TTI] Add a Mask to getShuffleCost
This adds an Mask ArrayRef to getShuffleCost, so that if an exact mask
can be provided a more accurate cost can be provided by the backend.
For example VREV costs could be returned by the ARM backend. This should
be an NFC until then, laying the groundwork for that to be added.

Differential Revision: https://reviews.llvm.org/D98206
2021-03-17 17:46:26 +00:00
Fangrui Song 5d44c92bf8 Change void getNoop(MCInst &NopInst) to MCInst getNop()
Prefer (self-documenting) return values to output parameters (which are
liable to be used).
While here, rename Noop to Nop which is more widely used and improves
consistency with hasEmitNops/setEmitNops/emitNop/etc.
2021-03-15 12:05:34 -07:00
Simon Pilgrim f6524b4ada [PPC] Fix UBSAN warning about out of range shift. NFCI. 2021-03-12 12:03:48 +00:00
Simon Pilgrim 400952980f [PPC] Fix static analyzer / UBSAN warnings about out of range shifts. NFCI. 2021-03-12 10:34:35 +00:00
Stefan Pintilie e021de0aab [PowerPC] Exploit paddi instruction on Power 10 for constant materialization
Starting with Power 10 the instruction paddi is available to use.
The instruction allows for immediates that are 34 bits.

This patch adds exploitation of the paddi instruction to allow us
to materialize constants.

Reviewed By: lei, amyk

Differential Revision: https://reviews.llvm.org/D93300
2021-03-11 08:37:49 -06:00
Qiu Chaofan 72c4cbd60e [PowerPC] Fix multi-use case for swap reduction
4c973ae implemented reduction of vector swap for lane-insensitive
operations. This commit fixes it for checking number of uses of the
vector operation.
2021-03-11 21:58:33 +08:00
Nikita Popov 2489cbaa80 [PowerPC] Fix infinite loop in peephole CR optimization (PR49509)
If we encounter a degenerate select node where both operands are
the same, then we can continue negating the condition while swapping
operands, resulting in an infinite loop. Avoid this by bailing out
if both operands are the same.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49509.

Differential Revision: https://reviews.llvm.org/D98340
2021-03-11 14:25:22 +01:00
Amy Kwan 8b540c542c [PowerPC] Implement patterns for PC-Rel zextload/extload byte loads
This patch adds patterns to select the PC-Relative extloadi1 and zextloadi1 byte loads.

Differential Revision: https://reviews.llvm.org/D98042
2021-03-10 12:18:13 -06:00
Qiu Chaofan 4c973ae51b [PowerPC] Reduce symmetrical swaps for lane-insensitive vector ops
This patch simplifies pattern (xxswap (vec-op (xxswap a) (xxswap b)))
into (vec-op a b) if vec-op is lane-insensitive. The motivating case
is ScalarToVector-VecOp-ExtractElement sequence on LE, but the
peephole itself is not related to endianness, so BE may also benefit
from this.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97658
2021-03-10 15:21:32 +08:00
Albion Fung 9b6ac9e999 [P10] [Power PC] Exploiting new load rightmost vector element instructions.
This pull request implements patterns to exploit the load rightmost vector
element instructions for loading element 0 on little endian PowerPC subtargets
into v8i16 and v16i8 vector registers for i16 and i8 data types.

Differential Revision: https://reviews.llvm.org/D94816#inline-921403
2021-03-09 16:08:17 -05:00
Lei Huang 535a4192a9 [AIX][TLS] Generate 64-bit general-dynamic access code sequence
Add support for the TLS general dynamic access model to assembly
files on AIX 64-bit.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D98078
2021-03-08 16:41:25 -06:00
Masoud Ataei 820f508b08 [PowerPC] Removing _massv place holder
Since P8 is the oldest machine supported by MASSV pass,
_massv place holder is removed and the oldest version of
MASSV functions is assumed. If the P9 vector specific is
detected in the compilation process, the P8 prefix will
be updated to P9.

Differential Revision: https://reviews.llvm.org/D98064
2021-03-08 21:43:24 +00:00
Nemanja Ivanovic b0f0115308 [AIX][TLS] Generate 32-bit general-dynamic access code sequence
Adds support for the TLS general dynamic access model to
assembly files on AIX 32-bit.

To generate the correct code sequence when accessing a TLS variable
`v`, we first create two TOC entry nodes, one for the variable offset, one
for the region handle. These nodes are followed by a `PPCISD::TLSGD_AIX`
node (new node introduced by this patch).
The `PPCISD::TLSGD_AIX` node (`TLSGDAIX` pseudo instruction) is
expanded to 2 copies (to put the variable offset and region handle in
the right registers) and a call to `__tls_get_addr`.

This patch also changes the way TC entries are generated in asm files.
If the generated TC entry is for the region handle of a TLS variable,
we add the `@m` relocation and the `.` prefix to the entry name.
For example:

```
L..C0:
  .tc .v[TC],v[TL]@m -> region handle
L..C1:
  .tc v[TC],v[TL] -> variable offset
```

Reviewed By: nemanjai, sfertile

Differential Revision: https://reviews.llvm.org/D97948
2021-03-08 09:30:19 -06:00
Ahsan Saghir acce401068 [PowerPC] Change target data layout for 16-byte stack alignment
This changes the target data layout to make stack align to 16 bytes
on Power10. Before this change, stack was being aligned to 32 bytes.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D96265
2021-03-08 08:13:08 -06:00
Sean Fertile f0904a6208 [PowePC][AIX] Handle variadic vector call operands.
Patch adds support for passing vector call operands to variadic
functions. Arguments which are fixed shadow GPRs and stack space even
when they are passed in vector registers, while arguments passed through
ellipses are passed in properly aligned GPRs if available and on the
stack once all GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97956
2021-03-06 13:49:55 -05:00
Fangrui Song 3110187f1f [MC][PowerPC] Support .reloc *, BFD_RELOC_{NONE,16,32,64}, *
BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.
2021-03-05 21:31:45 -08:00
Zarko Todorovski 2b50ce1524 [PowerPC][AIX] Enable the default AltiVec ABI on AIX
This patch adds support for the default AltiVec ABI for AIX.

Vector registers 20 through 31 are marked as reserved and cannot
be used in the default ABI. This patch adds handling for this case
and also remove the default AltiVec ABI errors.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D96351
2021-03-05 12:46:27 -05:00
Jinsong Ji cc21de6789 [PowerPC] Update Copy/Paste encodings according to ISA3.1
Copy-paste P9 insns were added back in 2016,
however, looks like the opcodes has changed in ISA3.1.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D97416
2021-03-05 17:05:50 +00:00
Chen Zheng 87bbf3d1f8 [XCOFF][DebugInfo] support DWARF for XCOFF for assembly output.
Reviewed By: jasonliu

Differential Revision: https://reviews.llvm.org/D95518
2021-03-04 21:07:52 -05:00
Jinsong Ji 7967221a72 [PowerPC] Disable more extended mne on AIX
To avoid assembler errors.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D97418
2021-03-04 21:13:37 +00:00
Benjamin Kramer e897feeb8a [PPC] Silence unused variable warning in release builds. NFC. 2021-03-04 21:43:19 +01:00
Sean Fertile aaeffbe007 [PowerPC][AIX] Handle variadic vector formal arguments.
Patch adds support for passing vector arguments to variadic functions.
Arguments which are fixed shadow GPRs and stack space even when they are
passed in vector registers, while arguments passed through ellipses are
passed in(properly aligned GPRs if available and on the stack once all
GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97485
2021-03-04 10:56:53 -05:00
Qiu Chaofan 72d4a41ba6 [PowerPC] Allow spilling GPR to VSR on AIX
This patch enables spilling GPR to VSRs instead of stack under AIX ABI.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97367
2021-03-03 13:32:39 +08:00
Victor Huang 1756b2adc9 [AIX][TLS] Generate TLS variables in assembly files
This patch allows generating TLS variables in assembly files on AIX.
Initialized and external uninitialized variables are generated with the
.csect pseudo-op and local uninitialized variables are generated with
the .comm/.lcomm pseudo-ops. The patch also adds a check to
explicitly say that TLS is not yet supported on AIX.

Reviewed by: daltenty, jasonliu, lei, nemanjai, sfertile
Originally patched by: bsaleil
Commandeered by: NeHuang

Differential Revision: https://reviews.llvm.org/D96184
2021-03-02 18:22:48 -06:00
Amara Emerson 8a316045ed [AArch64][GlobalISel] Enable use of the optsize predicate in the selector.
To do this while supporting the existing functionality in SelectionDAG of using
PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis
dependencies to the instruction selector pass.

Then, use the predicate to generate constant pool loads for f32 materialization,
if we're targeting optsize/minsize.

Differential Revision: https://reviews.llvm.org/D97732
2021-03-02 12:55:51 -08:00
Kazu Hirata 4ed47858ab [llvm] Use llvm::drop_begin (NFC) 2021-02-22 20:17:16 -08:00
Jacques Pienaar 3bec7ed59e Different fix for gcc bug
Was still running into

from definition of 'template<class T> struct llvm::DenseMapInfo'
[-fpermissive]
 template <typename T> struct DenseMapInfo;
                               ^
2021-02-19 16:41:00 -08:00
Leonard Chan c77659e549 [llvm][IR] Do not place constants with static relocations in a mergeable section
This patch provides two major changes:

1. Add getRelocationInfo to check if a constant will have static, dynamic, or
   no relocations. (Also rename the original needsRelocation to needsDynamicRelocation.)
2. Only allow a constant with no relocations (static or dynamic) to be placed
   in a mergeable section.

This will allow unused symbols that contain static relocations and happen to
fit in mergeable constant sections (.rodata.cstN) to instead be placed in
unique-named sections if -fdata-sections is used and subsequently garbage collected
by --gc-sections.

See https://lists.llvm.org/pipermail/llvm-dev/2021-February/148281.html.

Differential Revision: https://reviews.llvm.org/D95960
2021-02-18 15:39:00 -08:00
Sean Fertile bb260b1ca7 [PowerPC][AIX] Add support for vector arg passing on the stack.
Enable passing more vector arguments then available vector
argument passing registers.

Differential Revision: https://reviews.llvm.org/D96415
2021-02-18 13:32:40 -05:00
Baptiste Saleil 34dc1ccb96 [PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10
This patch generates the vinsw, vinsd, vinsblx, vinshlx, vinswlx, vinsdlx,
vinsbrx, vinshrx, vinswrx and vinsdrx instructions for vector insertion on P10.

Differential Revision: https://reviews.llvm.org/D94454
2021-02-18 14:17:47 +00:00
Stefan Pintilie b80357d46e [PowerPC] Add option for ROP Protection
Added -mrop-protection for Power PC to turn on codegen that provides some
protection from ROP attacks.

The option is off by default and can be turned on for Power 8, Power 9 and
Power 10.

This patch is for the option only. The feature will be implemented by a later
patch.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D96512
2021-02-18 12:15:50 +00:00
Chen Zheng 5517923b1c [XCOFF][NFC] make csect properties optional for getXCOFFSection
We are going to support debug sections for XCOFF. So the csect
properties are not necessary. This patch makes these properties
optional.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D95931
2021-02-17 20:51:42 -05:00
Sidharth Baveja cb2876800c [PowerPC][AIX] Enable Shrinkwrapping on 32 and 64 bit AIX.
Summary:
Currently Shrinkwrap is not enabled on AIX.
This patch enables shrink wrap on 32 and 64 bit AIX, and 64 bit ELF.

Reviewed By: sfertile, nemanjai

Differential Revision: https://reviews.llvm.org/D95094
2021-02-17 14:54:57 +00:00
Sean Fertile 4e127bce2d [PowerPC] Handle FP physical register in inline asm constraint.
Do not defer to the base class when the register constraint is a
physical fpr. The base class will select SPILLTOVSRRC as the register
class and register allocation will fail on subtargets without VSX
registers.

Differential Revision: https://reviews.llvm.org/D91629
2021-02-17 09:27:03 -05:00
Douglas Yung 0e3d7e6186 Fix gcc build after de3a485d9 due to a gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92598
This should fix gcc based builders such as http://lab.llvm.org:8011/#/builders/76/builds/1683
2021-02-16 21:57:12 -08:00
Victor Huang de3a485d9c [NFC][PPC] Refactor TOC representation to allow several entries for the same symbol
We currently represent TOC entries by an MCSymbol. This is not enough in some situations.
For example, when accessing an initialized TLS variable v on AIX using the general dynamic
model, we need to generate the two following entries for v:

.tc .v[TC],v@m
.tc v[TC],v

One is for the region handle (with the @m relocation), the other is for the variable offset.
This refactoring allows storing several entries for the same symbol with different VariantKind
in the TOC. If the VariantKind is not specified, we default to VK_None.

The AIX TLS implementation using this refactoring to generate the two entries will be posted
in a subsequent patch.

Patched By: bsaleil
Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D96346
2021-02-16 21:32:16 +00:00
Nemanja Ivanovic a5222aa085 [DAGCombine] Do not remove masking argument to FP16_TO_FP for some targets
As of commit 284f2bffc9, the DAG Combiner gets rid of the masking of the
input to this node if the mask only keeps the bottom 16 bits. This is because
the underlying library function does not use the high order bits. However, on
PowerPC's ELFv2 ABI, it is the caller that is responsible for clearing the bits
from the register. Therefore, the library implementation of __gnu_h2f_ieee will
return an incorrect result if the bits aren't cleared.

This combine is desired for ARM (and possibly other targets) so this patch adds
a query to Target Lowering to check if this zeroing needs to be kept.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=49092

Differential revision: https://reviews.llvm.org/D96283
2021-02-09 06:33:48 -06:00
Simon Pilgrim 518af8df44 [PowerPC] Fix multiclass template parameter types. NFC.
Fixes TableGen parser errors reported by D95874.
2021-02-06 15:39:26 +00:00
Craig Topper 11ef356d9e [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Chen Zheng b0869a7d72 [PowerPC] [NFC] fix wording typos
Post commit comments address for D92071.
2021-02-02 21:03:17 -05:00
Stefan Pintilie 288f762b6f [PowerPC] Materialize 34 bit constants with pli on Power 10.
NOTE: This patch was originally written by Anil Mahmud. His code has been
rebased but otherwise left mostly unchanged.

A new instructon on Power 10 allows for the materialization of 34 bit
immediate values. This patch allows the compiler to take advantage of
the new instruction in this situation.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D92879
2021-02-02 09:49:22 -06:00
Craig Topper 94206f1f90 [PowerPC] Remove vnot_ppc and replace with the standard vnot.
immAllOnesV has special support for looking through bitcasts
automatically so isel patterns don't need to explicitly look
for the bitconvert.
2021-01-31 19:41:33 -08:00
Kazu Hirata 627b5bda11 [llvm] Add missing header guards (NFC)
Identified with llvm-header-guard.
2021-01-30 09:53:42 -08:00
Kazu Hirata 8ed1636184 [llvm] Use isa instead of dyn_cast (NFC) 2021-01-29 23:23:37 -08:00
Kazu Hirata 7925aa091d [llvm] Populate SmallVector at construction time (NFC) 2021-01-28 22:21:14 -08:00
Albion Fung 2e470e03b4 [PowerPC][Power10] Fix XXSPLI32DX not correctly exploiting specific cases
Some cases may be transformed into 32 bit splats before hitting the boolean statement, which may cause incorrect behaviour and provide XXSPLTI32DX with the incorrect values of splat. The condition was reversed so that the shortcut prevents this problem.

Differential Revision: https://reviews.llvm.org/D95634
2021-01-28 15:17:32 -05:00
Nemanja Ivanovic 54e570d94a [PowerPC] Do not emit XXSPLTI32DX for sub 64-bit constants
If the APInt returned by BuildVectorSDNode::isConstantSplat() is narrower than
64 bits, the result produced by XXSPLTI32DX is incorrect. The result returned
by the function appears to be incorrect and we'll investigate/fix it in a
follow-up commit. However, since this causes miscompiles, we must
temporarily disable emitting this instruction for such values.
2021-01-28 04:16:48 -06:00
Nemanja Ivanovic 8018f731f0 [PowerPC] Do not emit HW loop with half precision operations
If a loop has any operations on half precision values, there will be calls to
library functions on Power8. Even on Power9, there is a small subset of
instructions that are actually supported for the type.

This patch disables HW loops whenever any operations on the type are found
(other than the handfull of supported ones when compiling for Power9). Fixes a
few PR's opened by Julia:

https://bugs.llvm.org/show_bug.cgi?id=48785
https://bugs.llvm.org/show_bug.cgi?id=48786
https://bugs.llvm.org/show_bug.cgi?id=48519

Differential revision: https://reviews.llvm.org/D94980
2021-01-25 20:55:56 -06:00
Nemanja Ivanovic 1150bfa6bb [PowerPC] Add missing negate for VPERMXOR on little endian subtargets
This intrinsic is supposed to have the permute control vector complemented on
little endian systems (as the ABI specifies and GCC implements). With the
current code gen, the result vector is byte-reversed.

Differential revision: https://reviews.llvm.org/D95004
2021-01-25 12:23:33 -06:00
QingShan Zhang ffc3e800c6 [NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero
For now, we correct the result for sqrt if iteration > 0. This doesn't make
sense as they are not strict relative.

Reviewed By: dmgreen, spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D94480
2021-01-25 04:02:44 +00:00