Commit Graph

6482 Commits

Author SHA1 Message Date
Tomas Matheson a9968c0a33 [NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions
Currently needsStackRealignment returns false if canRealignStack returns false.
This means that the behavior of needsStackRealignment does not correspond to
it's name and description; a function might need stack realignment, but if it
is not possible then this function returns false. Furthermore,
needsStackRealignment is not virtual and therefore some backends have made use
of canRealignStack to indicate whether a function needs stack realignment.

This patch attempts to clarify the situation by separating them and introducing
new names:

 - shouldRealignStack - true if there is any reason the stack should be
   realigned

 - canRealignStack - true if we are still able to realign the stack (e.g. we
   can still reserve/have reserved a frame pointer)

 - hasStackRealignment = shouldRealignStack && canRealignStack (not target
   customisable)

Targets can now override shouldRealignStack to indicate that stack realignment
is required.

This change will make it easier in a future change to handle the case where we
need to realign the stack but can't do so (for example when the register
allocator creates an aligned spill after the frame pointer has been
eliminated).

Differential Revision: https://reviews.llvm.org/D98716

Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87
2021-03-30 17:31:39 +01:00
Albion Fung e29bb074c6 [PowerPC] Exploit xxsplti32dx (constant materialization) for scalars
This patch exploits the xxsplti32dx instruction available on Power10
in place of constant pool loads where xxspltidp would not be able to,
usually because the immediate cannot fit into 32 bits.

Differential Revision: https://reviews.llvm.org/D95458
2021-03-24 15:59:59 -04:00
Sander de Smalen 55d18b3cc2 [TTI] Return a TypeSize from getRegisterBitWidth.
This patch changes the interface to take a RegisterKind, to indicate
whether the register bitwidth of a scalar register, fixed-width vector
register, or scalable vector register must be returned.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D98874
2021-03-24 14:45:13 +00:00
Stefan Pintilie 91f4c11133 [PowerPC] Add mprivileged option
Add an option to tell the compiler that it can use privileged instructions.

This patch only adds the option. Backend implementation will be added in a
future patch.

Reviewed By: lei, amyk

Differential Revision: https://reviews.llvm.org/D99193
2021-03-24 08:33:22 -05:00
Stefan Pintilie 0e4f5f3ea6 [PowerPC] Change option to mrop-protect
In order to have the same option on power PC LLVM and power PC gcc
the option will be changed from -mrop-protection to -mrop-protect.

The feature will be off by default and turned on when the option is used.

Reviewed By: lei, amyk

Differential Revision: https://reviews.llvm.org/D99185
2021-03-24 05:51:35 -05:00
Stefan Pintilie b8f3c6d011 [PowerPC][NFC] Do not enter prefix selection if it cannot do better.
Do not try to materialize a constant using prefix instructions if the selection
using non prefix instructions was able to do it using a single non prefix
instruction.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D98791
2021-03-22 09:17:52 -05:00
Qiu Chaofan 52f33f7953 [PowerPC] Enable redundant TOC save removal on AIX
Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D97039
2021-03-22 14:29:22 +08:00
Nemanja Ivanovic ea48bf8649 [PowerPC][NFC] Do not produce i64 constants in 32-bit mode
There are some instances where we produce constants of type MVT::i64
unconditionally in the target DAG combines. This is not actually
valid in 32-bit mode.
2021-03-19 22:54:47 -05:00
Anshil Gandhi 697f90ebfa [NFC] [PowerPC] Determine Endianness in PPCTargetMachine
The TargetMachine uses the triple to determine endianness. Just
use that logic rather than replicating it in PPCSubtarget.

Differential revision: https://reviews.llvm.org/D98674
2021-03-19 20:22:16 -05:00
Nemanja Ivanovic a8697c57fa [PowerPC] Fix the check for 16-bit signed field in peephole
When a D-Form instruction is fed by an add-immediate, we attempt
to merge the two immediates to form a single displacement so we
can remove the add-immediate.

However, we don't check whether the new displacement fits into
a 16-bit signed immediate field early enough. Namely, we do a
sign-extend from 16 bits first which will discard high bits and
then we check whether the result is a 16-bit signed immediate.
It of course will always be.

Move the check prior to the sign extend to ensure we are checking
the correct value.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49640
2021-03-19 07:15:53 -05:00
David Green e2935dcfc4 [TTI] Add a Mask to getShuffleCost
This adds an Mask ArrayRef to getShuffleCost, so that if an exact mask
can be provided a more accurate cost can be provided by the backend.
For example VREV costs could be returned by the ARM backend. This should
be an NFC until then, laying the groundwork for that to be added.

Differential Revision: https://reviews.llvm.org/D98206
2021-03-17 17:46:26 +00:00
Fangrui Song 5d44c92bf8 Change void getNoop(MCInst &NopInst) to MCInst getNop()
Prefer (self-documenting) return values to output parameters (which are
liable to be used).
While here, rename Noop to Nop which is more widely used and improves
consistency with hasEmitNops/setEmitNops/emitNop/etc.
2021-03-15 12:05:34 -07:00
Simon Pilgrim f6524b4ada [PPC] Fix UBSAN warning about out of range shift. NFCI. 2021-03-12 12:03:48 +00:00
Simon Pilgrim 400952980f [PPC] Fix static analyzer / UBSAN warnings about out of range shifts. NFCI. 2021-03-12 10:34:35 +00:00
Stefan Pintilie e021de0aab [PowerPC] Exploit paddi instruction on Power 10 for constant materialization
Starting with Power 10 the instruction paddi is available to use.
The instruction allows for immediates that are 34 bits.

This patch adds exploitation of the paddi instruction to allow us
to materialize constants.

Reviewed By: lei, amyk

Differential Revision: https://reviews.llvm.org/D93300
2021-03-11 08:37:49 -06:00
Qiu Chaofan 72c4cbd60e [PowerPC] Fix multi-use case for swap reduction
4c973ae implemented reduction of vector swap for lane-insensitive
operations. This commit fixes it for checking number of uses of the
vector operation.
2021-03-11 21:58:33 +08:00
Nikita Popov 2489cbaa80 [PowerPC] Fix infinite loop in peephole CR optimization (PR49509)
If we encounter a degenerate select node where both operands are
the same, then we can continue negating the condition while swapping
operands, resulting in an infinite loop. Avoid this by bailing out
if both operands are the same.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49509.

Differential Revision: https://reviews.llvm.org/D98340
2021-03-11 14:25:22 +01:00
Amy Kwan 8b540c542c [PowerPC] Implement patterns for PC-Rel zextload/extload byte loads
This patch adds patterns to select the PC-Relative extloadi1 and zextloadi1 byte loads.

Differential Revision: https://reviews.llvm.org/D98042
2021-03-10 12:18:13 -06:00
Qiu Chaofan 4c973ae51b [PowerPC] Reduce symmetrical swaps for lane-insensitive vector ops
This patch simplifies pattern (xxswap (vec-op (xxswap a) (xxswap b)))
into (vec-op a b) if vec-op is lane-insensitive. The motivating case
is ScalarToVector-VecOp-ExtractElement sequence on LE, but the
peephole itself is not related to endianness, so BE may also benefit
from this.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97658
2021-03-10 15:21:32 +08:00
Albion Fung 9b6ac9e999 [P10] [Power PC] Exploiting new load rightmost vector element instructions.
This pull request implements patterns to exploit the load rightmost vector
element instructions for loading element 0 on little endian PowerPC subtargets
into v8i16 and v16i8 vector registers for i16 and i8 data types.

Differential Revision: https://reviews.llvm.org/D94816#inline-921403
2021-03-09 16:08:17 -05:00
Lei Huang 535a4192a9 [AIX][TLS] Generate 64-bit general-dynamic access code sequence
Add support for the TLS general dynamic access model to assembly
files on AIX 64-bit.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D98078
2021-03-08 16:41:25 -06:00
Masoud Ataei 820f508b08 [PowerPC] Removing _massv place holder
Since P8 is the oldest machine supported by MASSV pass,
_massv place holder is removed and the oldest version of
MASSV functions is assumed. If the P9 vector specific is
detected in the compilation process, the P8 prefix will
be updated to P9.

Differential Revision: https://reviews.llvm.org/D98064
2021-03-08 21:43:24 +00:00
Nemanja Ivanovic b0f0115308 [AIX][TLS] Generate 32-bit general-dynamic access code sequence
Adds support for the TLS general dynamic access model to
assembly files on AIX 32-bit.

To generate the correct code sequence when accessing a TLS variable
`v`, we first create two TOC entry nodes, one for the variable offset, one
for the region handle. These nodes are followed by a `PPCISD::TLSGD_AIX`
node (new node introduced by this patch).
The `PPCISD::TLSGD_AIX` node (`TLSGDAIX` pseudo instruction) is
expanded to 2 copies (to put the variable offset and region handle in
the right registers) and a call to `__tls_get_addr`.

This patch also changes the way TC entries are generated in asm files.
If the generated TC entry is for the region handle of a TLS variable,
we add the `@m` relocation and the `.` prefix to the entry name.
For example:

```
L..C0:
  .tc .v[TC],v[TL]@m -> region handle
L..C1:
  .tc v[TC],v[TL] -> variable offset
```

Reviewed By: nemanjai, sfertile

Differential Revision: https://reviews.llvm.org/D97948
2021-03-08 09:30:19 -06:00
Ahsan Saghir acce401068 [PowerPC] Change target data layout for 16-byte stack alignment
This changes the target data layout to make stack align to 16 bytes
on Power10. Before this change, stack was being aligned to 32 bytes.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D96265
2021-03-08 08:13:08 -06:00
Sean Fertile f0904a6208 [PowePC][AIX] Handle variadic vector call operands.
Patch adds support for passing vector call operands to variadic
functions. Arguments which are fixed shadow GPRs and stack space even
when they are passed in vector registers, while arguments passed through
ellipses are passed in properly aligned GPRs if available and on the
stack once all GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97956
2021-03-06 13:49:55 -05:00
Fangrui Song 3110187f1f [MC][PowerPC] Support .reloc *, BFD_RELOC_{NONE,16,32,64}, *
BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.
2021-03-05 21:31:45 -08:00
Zarko Todorovski 2b50ce1524 [PowerPC][AIX] Enable the default AltiVec ABI on AIX
This patch adds support for the default AltiVec ABI for AIX.

Vector registers 20 through 31 are marked as reserved and cannot
be used in the default ABI. This patch adds handling for this case
and also remove the default AltiVec ABI errors.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D96351
2021-03-05 12:46:27 -05:00
Jinsong Ji cc21de6789 [PowerPC] Update Copy/Paste encodings according to ISA3.1
Copy-paste P9 insns were added back in 2016,
however, looks like the opcodes has changed in ISA3.1.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D97416
2021-03-05 17:05:50 +00:00
Chen Zheng 87bbf3d1f8 [XCOFF][DebugInfo] support DWARF for XCOFF for assembly output.
Reviewed By: jasonliu

Differential Revision: https://reviews.llvm.org/D95518
2021-03-04 21:07:52 -05:00
Jinsong Ji 7967221a72 [PowerPC] Disable more extended mne on AIX
To avoid assembler errors.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D97418
2021-03-04 21:13:37 +00:00
Benjamin Kramer e897feeb8a [PPC] Silence unused variable warning in release builds. NFC. 2021-03-04 21:43:19 +01:00
Sean Fertile aaeffbe007 [PowerPC][AIX] Handle variadic vector formal arguments.
Patch adds support for passing vector arguments to variadic functions.
Arguments which are fixed shadow GPRs and stack space even when they are
passed in vector registers, while arguments passed through ellipses are
passed in(properly aligned GPRs if available and on the stack once all
GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97485
2021-03-04 10:56:53 -05:00
Qiu Chaofan 72d4a41ba6 [PowerPC] Allow spilling GPR to VSR on AIX
This patch enables spilling GPR to VSRs instead of stack under AIX ABI.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97367
2021-03-03 13:32:39 +08:00
Victor Huang 1756b2adc9 [AIX][TLS] Generate TLS variables in assembly files
This patch allows generating TLS variables in assembly files on AIX.
Initialized and external uninitialized variables are generated with the
.csect pseudo-op and local uninitialized variables are generated with
the .comm/.lcomm pseudo-ops. The patch also adds a check to
explicitly say that TLS is not yet supported on AIX.

Reviewed by: daltenty, jasonliu, lei, nemanjai, sfertile
Originally patched by: bsaleil
Commandeered by: NeHuang

Differential Revision: https://reviews.llvm.org/D96184
2021-03-02 18:22:48 -06:00
Amara Emerson 8a316045ed [AArch64][GlobalISel] Enable use of the optsize predicate in the selector.
To do this while supporting the existing functionality in SelectionDAG of using
PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis
dependencies to the instruction selector pass.

Then, use the predicate to generate constant pool loads for f32 materialization,
if we're targeting optsize/minsize.

Differential Revision: https://reviews.llvm.org/D97732
2021-03-02 12:55:51 -08:00
Kazu Hirata 4ed47858ab [llvm] Use llvm::drop_begin (NFC) 2021-02-22 20:17:16 -08:00
Jacques Pienaar 3bec7ed59e Different fix for gcc bug
Was still running into

from definition of 'template<class T> struct llvm::DenseMapInfo'
[-fpermissive]
 template <typename T> struct DenseMapInfo;
                               ^
2021-02-19 16:41:00 -08:00
Leonard Chan c77659e549 [llvm][IR] Do not place constants with static relocations in a mergeable section
This patch provides two major changes:

1. Add getRelocationInfo to check if a constant will have static, dynamic, or
   no relocations. (Also rename the original needsRelocation to needsDynamicRelocation.)
2. Only allow a constant with no relocations (static or dynamic) to be placed
   in a mergeable section.

This will allow unused symbols that contain static relocations and happen to
fit in mergeable constant sections (.rodata.cstN) to instead be placed in
unique-named sections if -fdata-sections is used and subsequently garbage collected
by --gc-sections.

See https://lists.llvm.org/pipermail/llvm-dev/2021-February/148281.html.

Differential Revision: https://reviews.llvm.org/D95960
2021-02-18 15:39:00 -08:00
Sean Fertile bb260b1ca7 [PowerPC][AIX] Add support for vector arg passing on the stack.
Enable passing more vector arguments then available vector
argument passing registers.

Differential Revision: https://reviews.llvm.org/D96415
2021-02-18 13:32:40 -05:00
Baptiste Saleil 34dc1ccb96 [PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10
This patch generates the vinsw, vinsd, vinsblx, vinshlx, vinswlx, vinsdlx,
vinsbrx, vinshrx, vinswrx and vinsdrx instructions for vector insertion on P10.

Differential Revision: https://reviews.llvm.org/D94454
2021-02-18 14:17:47 +00:00
Stefan Pintilie b80357d46e [PowerPC] Add option for ROP Protection
Added -mrop-protection for Power PC to turn on codegen that provides some
protection from ROP attacks.

The option is off by default and can be turned on for Power 8, Power 9 and
Power 10.

This patch is for the option only. The feature will be implemented by a later
patch.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D96512
2021-02-18 12:15:50 +00:00
Chen Zheng 5517923b1c [XCOFF][NFC] make csect properties optional for getXCOFFSection
We are going to support debug sections for XCOFF. So the csect
properties are not necessary. This patch makes these properties
optional.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D95931
2021-02-17 20:51:42 -05:00
Sidharth Baveja cb2876800c [PowerPC][AIX] Enable Shrinkwrapping on 32 and 64 bit AIX.
Summary:
Currently Shrinkwrap is not enabled on AIX.
This patch enables shrink wrap on 32 and 64 bit AIX, and 64 bit ELF.

Reviewed By: sfertile, nemanjai

Differential Revision: https://reviews.llvm.org/D95094
2021-02-17 14:54:57 +00:00
Sean Fertile 4e127bce2d [PowerPC] Handle FP physical register in inline asm constraint.
Do not defer to the base class when the register constraint is a
physical fpr. The base class will select SPILLTOVSRRC as the register
class and register allocation will fail on subtargets without VSX
registers.

Differential Revision: https://reviews.llvm.org/D91629
2021-02-17 09:27:03 -05:00
Douglas Yung 0e3d7e6186 Fix gcc build after de3a485d9 due to a gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92598
This should fix gcc based builders such as http://lab.llvm.org:8011/#/builders/76/builds/1683
2021-02-16 21:57:12 -08:00
Victor Huang de3a485d9c [NFC][PPC] Refactor TOC representation to allow several entries for the same symbol
We currently represent TOC entries by an MCSymbol. This is not enough in some situations.
For example, when accessing an initialized TLS variable v on AIX using the general dynamic
model, we need to generate the two following entries for v:

.tc .v[TC],v@m
.tc v[TC],v

One is for the region handle (with the @m relocation), the other is for the variable offset.
This refactoring allows storing several entries for the same symbol with different VariantKind
in the TOC. If the VariantKind is not specified, we default to VK_None.

The AIX TLS implementation using this refactoring to generate the two entries will be posted
in a subsequent patch.

Patched By: bsaleil
Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D96346
2021-02-16 21:32:16 +00:00
Nemanja Ivanovic a5222aa085 [DAGCombine] Do not remove masking argument to FP16_TO_FP for some targets
As of commit 284f2bffc9, the DAG Combiner gets rid of the masking of the
input to this node if the mask only keeps the bottom 16 bits. This is because
the underlying library function does not use the high order bits. However, on
PowerPC's ELFv2 ABI, it is the caller that is responsible for clearing the bits
from the register. Therefore, the library implementation of __gnu_h2f_ieee will
return an incorrect result if the bits aren't cleared.

This combine is desired for ARM (and possibly other targets) so this patch adds
a query to Target Lowering to check if this zeroing needs to be kept.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=49092

Differential revision: https://reviews.llvm.org/D96283
2021-02-09 06:33:48 -06:00
Simon Pilgrim 518af8df44 [PowerPC] Fix multiclass template parameter types. NFC.
Fixes TableGen parser errors reported by D95874.
2021-02-06 15:39:26 +00:00
Craig Topper 11ef356d9e [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Chen Zheng b0869a7d72 [PowerPC] [NFC] fix wording typos
Post commit comments address for D92071.
2021-02-02 21:03:17 -05:00