Commit Graph

1649 Commits

Author SHA1 Message Date
Nemanja Ivanovic 4e22c7265d [PowerPC] Disable combine 64-bit bswap(load) without LDBRX
This causes failures on the big endian bootstrap bot.
Disabling this combine temporarily until I can get a proper fix.
2021-06-25 15:11:22 -05:00
Nemanja Ivanovic dcccb2f594 [PowerPC] Fix bswap combine for big endian systems
Commit 0464586ac5 added a combine
for a 64-bit load feeding a bswap but the implementation is only
correct for little endian systems.
This fixes it for big endian systems.
2021-06-24 18:04:50 -05:00
Martin Storsjö 42f74e8249 [llvm] Rename StringRef _lower() method calls to _insensitive()
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
2021-06-25 00:22:01 +03:00
Nemanja Ivanovic 0464586ac5 [PowerPC] Combine 64-bit bswap(load) without LDBRX
When targeting CPUs that don't have LDBRX, we end up producing code that is
very inefficient and large for this common idiom. This patch just
optimizes it two 32-bit LWBRX instructions along with a merge.

This fixes https://bugs.llvm.org/show_bug.cgi?id=49610

Differential revision: https://reviews.llvm.org/D104836
2021-06-24 15:11:47 -05:00
zhijian 7ed515d168 [AIX][XCOFF] emit vector info of traceback table.
Summary:

emit vector info of traceback table.

Reviewers: Jason Liu,Hubert Tong
Differential Revision: https://reviews.llvm.org/D93659
2021-06-14 11:15:22 -04:00
Zarko Todorovski c1bb75febe [PowerPC] Allow wa inline asm to also accept floating point arguments
GCC documentation for the `wa` constraint states that:
```
wa

    A VSX register (VSR), vs0…vs63. This is either an FPR (vs0…vs31 are f0…f31)
    or a VR (vs32…vs63 are v0…v31).
```
This technically means that we could accept floating point parameters. In fact,
gcc itself does. The following testcase compiles and runs on all PPC platforms with GCC,
whereas clang/llc will assert:
```
#include <stdio.h>
double foo ( vector double a ) {
  double b, c;
  asm("xvabsdp  %x0, %x2        \n"
             "xxsldwi  %x1, %x0, %x0, 2 \n"
      :  "+wa"    (b),
         "=wa"    (c)
      :  "wa"    (a)
      );
  return b+c;
}
int main(void) {
  vector double a = {-3., -4.};
  double t = foo( a );
  printf("%g\n", t);
}
```
This patch allows clang/llc to build and run this testcase.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D103409
2021-06-11 07:19:10 -04:00
Simon Pilgrim 01b77159e3 PPCISelLowering.cpp - don't dereference a dyn_cast<>.
dyn_cast<> can return nullptr which we would then dereference - use cast<> which will assert that the type is correct.
2021-06-08 17:59:05 +01:00
Nikita Popov 1ffa6499ea [TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC)
Don't require a specific kind of IRBuilder for TargetLowering hooks.
This allows us to drop the IRBuilder.h include from TargetLowering.h.

Differential Revision: https://reviews.llvm.org/D103759
2021-06-06 16:29:50 +02:00
Anshil Gandhi 1c5ff0b03f [PowerPC] [GlobalISel] Implementation of formal arguments lowering in the IRTranslator for the PPC backend
Differential Revision: https://reviews.llvm.org/D99812
2021-06-02 16:46:39 -06:00
Anshil Gandhi 3e5ddb83e3 Revert "Differential Revision: https://reviews.llvm.org/D99812"
This reverts commit c729f2a48a.
2021-06-02 16:36:00 -06:00
Anshil Gandhi c729f2a48a Differential Revision: https://reviews.llvm.org/D99812 2021-06-02 14:09:52 -06:00
Jinsong Ji b2581196eb [AIX] Enable stackprotect feature
AIX use `__ssp_canary_word` instead of `__stack_chk_guard`.
This patch update the target hook to use correct symbol,
so that the basic stackprotect feature can work.

The traceback will be handled in follow up patch.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D103100
2021-05-28 02:18:15 +00:00
Stefan Pintilie 45ad207e45 [PowerPC] Add fix to partword atomic operations
Partword atomic binaries are not zero extended as they should be.
This patch fixes them to ensure that they are zero extended.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D102819
2021-05-20 12:36:37 -05:00
Chen Zheng 15d4ed6d8c [PowerPC] only check the load instruction result number 0.
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D102596
2021-05-18 00:49:37 -04:00
Stefan Pintilie 15051f0b4a [PowerPC] Handle inline assembly clobber of link regsiter
This patch adds the handling of clobbers of the link register LR for inline
assembly.

This patch is to fix:
https://bugs.llvm.org/show_bug.cgi?id=50147

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101657
2021-05-13 07:43:37 -05:00
Nemanja Ivanovic 39e4676ca7 [PowerPC] Provide doubleword vector predicate form comparisons on Power7
There are two reasons this shouldn't be restricted to Power8 and up:
1. For XL compatibility
2. Because clang will expand comparison operators to these intrinsics*

*Without this patch, the following causes a selection error:

int test(vector signed long a, vector signed long b) {
  return a < b;
}

This patch provides the handling for the intrinsics in the back
end and removes the Power8 guards from the predicate functions
(vec_{all|any}_{eq|ne|gt|ge|lt|le}).
2021-05-13 04:56:56 -05:00
Zarko Todorovski 0c41f77857 [PowerPC] Enable safe for 32bit vins* P10 instructions
Correctly emit `vins`instructions that are safe in 32bit mode.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101383
2021-05-10 10:13:13 -04:00
Amy Kwan 1998a08655 [PowerPC][NFC] Update atomic patterns to use the refactored load/store implementation
This patch updates the scalar atomic patterns to use the refactored load/store
implementation introduced in D93370.
All existing test cases pass with when the refactored patterns are utilized.

Differential Revision: https://reviews.llvm.org/D94498
2021-05-04 10:46:45 -05:00
Amy Kwan 64d951be61 [PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns.
This patch introduces a new infrastructure that is used to select the load and
store instructions in the PPC backend.

The primary motivation is that the current implementation of selecting load/stores
is dependent on the ordering of patterns in TableGen. Given this limitation, we
are not able to easily and reliably generate the P10 prefixed load and stores
instructions (such as when the immediates that fit within 34-bits). This
refactoring is meant to provide us with more control over the patterns/different
forms to exploit, as well as eliminating dependency of pattern declaration in TableGen.

The idea of this refactoring is that it introduces a set of addressing modes that
correspond to different instruction formats of a particular load and store
instruction, along with a set of common flags that describes a load/store.
Whenever a load/store instruction is being selected, we analyze the instruction
and compute a set of flags for it. The computed flags are then used to
select the most optimal load/store addressing mode.

This patch is the first of a series of patches to be committed - it contains the
initial implementation of the refactored load/store selection infrastructure and
also updates P8/P9 patterns to adopt this infrastructure. The idea is that
incremental patches will add more implementation and support, and eventually
the old implementation will be removed.

Differential Revision: https://reviews.llvm.org/D93370
2021-04-30 09:53:19 -05:00
Victor Huang ae3377c553 [AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects
- Add new variantKinds for the symbol's variable offset and region handle
- Print the proper relocation specifier @gd in the asm streamer when emitting
  the TC Entry for the variable offset for the symbol
- Fix the switch section failure between the TC Entry of variable offset and
  region handle
- Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property

Reviewed by: sfertile

Differential Revision: https://reviews.llvm.org/D100956
2021-04-29 13:18:59 -05:00
Qiu Chaofan 56d923efdb [SPE] Support constrained float operations on SPE
This patch enables support on SPE for constrained arithmetic and
comparison operations. This fixes bugzilla 50070.

One thing not covered is fcmp vs. fcmps on SPE. Some condition code
generates singaling comparison while some not. In this patch, all are
considered as singaling. So there might be still some issue when
compiling from C code.

Reviewed By: jhibbits

Differential Revision: https://reviews.llvm.org/D101282
2021-04-29 16:34:10 +08:00
Qiu Chaofan d5c2492455 [PowerPC] Fix SELECT_CC with i64 operand on PPC32
This patch fixes the infinite loop in legalization of PPC32 SELECT_CC
with 64-bit operand.
2021-04-28 17:48:33 +08:00
Zarko Todorovski f818ec9dd1 [AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches
In https://reviews.llvm.org/D92789 PPC64 checks were added that disallowed most
VSX pattern matching.  We enable some safe ones for 32bit in this patch.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97503
2021-04-27 07:27:43 -04:00
Nemanja Ivanovic 03e7fefff8 [PowerPC] Canonicalize shuffles on big endian targets as well
Extend shuffle canonicalization and conversion of shuffles fed by vectorized
scalars to big endian subtargets. For big endian subtargets, loads and direct
moves of scalars into vector registers put the data in the correct element for
SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is
narrower, the value still ends up in the wrong place - althouth a different
wrong place than on little endian targets.

This patch extends the combine that keeps values where they are if they feed a
shuffle to big endian targets.

Differential revision: https://reviews.llvm.org/D100478
2021-04-20 07:29:47 -05:00
Qiu Chaofan b820339752 [PowerPC] Support f128 under VSX
This patch is the last one in backend to support fp128 type in
pre-POWER9 subtargets with VSX, removing temporary option and updating
remaining tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92374
2021-04-20 15:49:52 +08:00
Nemanja Ivanovic ff769dd111 [PowerPC] Minor improvement for insert_vector_elt codegen
For v2f64, all VSX subtargets can insert an element with a single
XXPERMDI.
2021-04-16 18:52:37 -05:00
Chen Zheng 80aa9b0f7b [PowerPC] stop reverse mem op generation for some cases.
We should consider the feeder user number when we do reverse memory
operation transformation. Otherwise, we may get negative impact.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D100166
2021-04-12 22:41:28 -04:00
Qiu Chaofan ece7345859 [PowerPC] Lower f128 SETCC/SELECT_CC as libcall if p9vector disabled
XSCMPUQP is not available for pre-P9 subtargets. This patch will lower
them into libcall for correct behavior on power7/power8.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92083
2021-04-12 10:33:32 +08:00
Albion Fung e29bb074c6 [PowerPC] Exploit xxsplti32dx (constant materialization) for scalars
This patch exploits the xxsplti32dx instruction available on Power10
in place of constant pool loads where xxspltidp would not be able to,
usually because the immediate cannot fit into 32 bits.

Differential Revision: https://reviews.llvm.org/D95458
2021-03-24 15:59:59 -04:00
Nemanja Ivanovic ea48bf8649 [PowerPC][NFC] Do not produce i64 constants in 32-bit mode
There are some instances where we produce constants of type MVT::i64
unconditionally in the target DAG combines. This is not actually
valid in 32-bit mode.
2021-03-19 22:54:47 -05:00
Lei Huang 535a4192a9 [AIX][TLS] Generate 64-bit general-dynamic access code sequence
Add support for the TLS general dynamic access model to assembly
files on AIX 64-bit.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D98078
2021-03-08 16:41:25 -06:00
Nemanja Ivanovic b0f0115308 [AIX][TLS] Generate 32-bit general-dynamic access code sequence
Adds support for the TLS general dynamic access model to
assembly files on AIX 32-bit.

To generate the correct code sequence when accessing a TLS variable
`v`, we first create two TOC entry nodes, one for the variable offset, one
for the region handle. These nodes are followed by a `PPCISD::TLSGD_AIX`
node (new node introduced by this patch).
The `PPCISD::TLSGD_AIX` node (`TLSGDAIX` pseudo instruction) is
expanded to 2 copies (to put the variable offset and region handle in
the right registers) and a call to `__tls_get_addr`.

This patch also changes the way TC entries are generated in asm files.
If the generated TC entry is for the region handle of a TLS variable,
we add the `@m` relocation and the `.` prefix to the entry name.
For example:

```
L..C0:
  .tc .v[TC],v[TL]@m -> region handle
L..C1:
  .tc v[TC],v[TL] -> variable offset
```

Reviewed By: nemanjai, sfertile

Differential Revision: https://reviews.llvm.org/D97948
2021-03-08 09:30:19 -06:00
Sean Fertile f0904a6208 [PowePC][AIX] Handle variadic vector call operands.
Patch adds support for passing vector call operands to variadic
functions. Arguments which are fixed shadow GPRs and stack space even
when they are passed in vector registers, while arguments passed through
ellipses are passed in properly aligned GPRs if available and on the
stack once all GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97956
2021-03-06 13:49:55 -05:00
Zarko Todorovski 2b50ce1524 [PowerPC][AIX] Enable the default AltiVec ABI on AIX
This patch adds support for the default AltiVec ABI for AIX.

Vector registers 20 through 31 are marked as reserved and cannot
be used in the default ABI. This patch adds handling for this case
and also remove the default AltiVec ABI errors.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D96351
2021-03-05 12:46:27 -05:00
Benjamin Kramer e897feeb8a [PPC] Silence unused variable warning in release builds. NFC. 2021-03-04 21:43:19 +01:00
Sean Fertile aaeffbe007 [PowerPC][AIX] Handle variadic vector formal arguments.
Patch adds support for passing vector arguments to variadic functions.
Arguments which are fixed shadow GPRs and stack space even when they are
passed in vector registers, while arguments passed through ellipses are
passed in(properly aligned GPRs if available and on the stack once all
GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97485
2021-03-04 10:56:53 -05:00
Victor Huang 1756b2adc9 [AIX][TLS] Generate TLS variables in assembly files
This patch allows generating TLS variables in assembly files on AIX.
Initialized and external uninitialized variables are generated with the
.csect pseudo-op and local uninitialized variables are generated with
the .comm/.lcomm pseudo-ops. The patch also adds a check to
explicitly say that TLS is not yet supported on AIX.

Reviewed by: daltenty, jasonliu, lei, nemanjai, sfertile
Originally patched by: bsaleil
Commandeered by: NeHuang

Differential Revision: https://reviews.llvm.org/D96184
2021-03-02 18:22:48 -06:00
Sean Fertile bb260b1ca7 [PowerPC][AIX] Add support for vector arg passing on the stack.
Enable passing more vector arguments then available vector
argument passing registers.

Differential Revision: https://reviews.llvm.org/D96415
2021-02-18 13:32:40 -05:00
Baptiste Saleil 34dc1ccb96 [PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10
This patch generates the vinsw, vinsd, vinsblx, vinshlx, vinswlx, vinsdlx,
vinsbrx, vinshrx, vinswrx and vinsdrx instructions for vector insertion on P10.

Differential Revision: https://reviews.llvm.org/D94454
2021-02-18 14:17:47 +00:00
Chen Zheng 5517923b1c [XCOFF][NFC] make csect properties optional for getXCOFFSection
We are going to support debug sections for XCOFF. So the csect
properties are not necessary. This patch makes these properties
optional.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D95931
2021-02-17 20:51:42 -05:00
Sean Fertile 4e127bce2d [PowerPC] Handle FP physical register in inline asm constraint.
Do not defer to the base class when the register constraint is a
physical fpr. The base class will select SPILLTOVSRRC as the register
class and register allocation will fail on subtargets without VSX
registers.

Differential Revision: https://reviews.llvm.org/D91629
2021-02-17 09:27:03 -05:00
Craig Topper 11ef356d9e [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Kazu Hirata 8ed1636184 [llvm] Use isa instead of dyn_cast (NFC) 2021-01-29 23:23:37 -08:00
Albion Fung 2e470e03b4 [PowerPC][Power10] Fix XXSPLI32DX not correctly exploiting specific cases
Some cases may be transformed into 32 bit splats before hitting the boolean statement, which may cause incorrect behaviour and provide XXSPLTI32DX with the incorrect values of splat. The condition was reversed so that the shortcut prevents this problem.

Differential Revision: https://reviews.llvm.org/D95634
2021-01-28 15:17:32 -05:00
Nemanja Ivanovic 54e570d94a [PowerPC] Do not emit XXSPLTI32DX for sub 64-bit constants
If the APInt returned by BuildVectorSDNode::isConstantSplat() is narrower than
64 bits, the result produced by XXSPLTI32DX is incorrect. The result returned
by the function appears to be incorrect and we'll investigate/fix it in a
follow-up commit. However, since this causes miscompiles, we must
temporarily disable emitting this instruction for such values.
2021-01-28 04:16:48 -06:00
QingShan Zhang ffc3e800c6 [NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero
For now, we correct the result for sqrt if iteration > 0. This doesn't make
sense as they are not strict relative.

Reviewed By: dmgreen, spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D94480
2021-01-25 04:02:44 +00:00
Kazu Hirata 16baad8f4e [llvm] Use pop_back_val (NFC) 2021-01-24 12:18:57 -08:00
Albion Fung 719b563ecf [PowerPC][Power10] Exploit splat instruction xxsplti32dx in Power10
Exploits the instruction xxsplti32dx.

It can be used to materialize any 64 bit scalar/vector splat by using two instances, one for the upper 32 bits and the other for the lower 32 bits. It should not materialize the cases which can be materialized by using the instruction xxspltidp.

Differential Revision: https://https://reviews.llvm.org/D90173
2021-01-20 12:55:52 -05:00
Nemanja Ivanovic 61f69153e8 [PowerPC] Sign extend comparison operand for signed atomic comparisons
As of 8dacca943a, we sign extend the atomic loaded
operand for signed subword comparisons. However, the assumption that the other
operand is correctly sign extended doesn't always hold. This patch sign extends
the other operand if it needs to be sign extended.

This is a second fix for https://bugs.llvm.org/show_bug.cgi?id=30451

Differential revision: https://reviews.llvm.org/D94058
2021-01-18 21:19:25 -06:00
Kazu Hirata 7dc3575ef2 [llvm] Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-01-14 20:30:34 -08:00