Commit Graph

1735 Commits

Author SHA1 Message Date
Nemanja Ivanovic b7bf937bbe [PowerPC] Provide XL-compatible vec_round implementation
The XL implementation of vec_round for vector double uses
"round-to-nearest, ties to even" just as the vector float
`version does. However clang and gcc use "round-to-nearest-away"
for vector double and "round-to-nearest, ties to even"
for vector float.

The XL behaviour is implemented under the __XL_COMPAT_ALTIVEC__
macro similarly to other instances of incompatibility.

Differential revision: https://reviews.llvm.org/D113642
2021-11-24 06:43:56 -06:00
Nemanja Ivanovic c9cb8edc51 [PowerPC] Allow scalars for asm constraint "v" with VSX
Similarly to what GCC does, we should allow scalars with
the "v" constraint rather than introducing unnecessary
new constraints for scalars in Altivec registers.

Differential revision: https://reviews.llvm.org/D113635
2021-11-23 17:03:04 -06:00
Lei Huang f50c6c1718 [PowerPC] Fix 32bit vector insert instructions for ISA3.1
The platform independent ISD::INSERT_VECTOR_ELT take a element index,
but vins* instructions take a byte index. Update 32bit td patterns for
vector insert to handle the element index accordingly.

Since vector insert for non constant index are supported in
ISA3.1, there is no need to use platform specific ISD node,
PPCISD::VECINSERT.  Update td pattern to directly use
ISD::INSERT_VECTOR_ELT instead.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D113802
2021-11-15 14:36:39 -06:00
Kazu Hirata d243cbf8ea [llvm] Use isa instead of dyn_cast (NFC) 2021-11-14 19:40:46 -08:00
Kazu Hirata 609ccbb240 [PowerPC] Use SDNode::uses (NFC) 2021-11-13 08:34:22 -08:00
Nemanja Ivanovic 5840f7197d [PowerPC] Respect rounding mode in the back end
Currently, the floating point instructions that depend on
rounding mode are correctly marked in the PPC back end with
an implicit use of the RM register. Similarly, instructions
that explicitly define the register are marked with an
implicit def of the same register. So for the most part,
RM-using code won't be moved across RM-setting instructions.

However, calls are not marked as RM-setting instructions so
code can be moved across calls. This is generally desired,
but so is the ability to turn off this behaviour with an
appropriate option - and -frounding-math really should be
that option.

This patch provides a set of call instructions (for direct
and indirect calls) that are marked with an implicit def of
the RM register. These will be used for calls that are marked
with the strictfp attribute.

Differential revision: https://reviews.llvm.org/D111433
2021-11-10 08:19:58 -06:00
Qiu Chaofan 5fd406e254 [PowerPC] Add intrinsic to convert between ppc_fp128 and fp128
ppc_fp128 and fp128 are both 128-bit floating point types. However, we
can't do conversion between them now, since trunc/ext are not allowed
for same-size fp types.

This patch adds two new intrinsics: llvm.ppc.convert.f128.to.ppcf128 and
llvm.convert.ppcf128.to.f128, to support such conversion.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D109421
2021-11-05 16:58:38 +08:00
Chen Zheng 9695027066 [PowerPC] address post-commit comments for D106555; NFC
Address namanjai post commit comments.
2021-11-05 05:30:53 +00:00
Qiu Chaofan 741aeda97d [PowerPC] Implement longdouble pack/unpack builtins
Implement two builtins to pack/unpack IBM extended long double float,
according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D112055
2021-11-03 17:57:25 +08:00
Chen Zheng 5a8b196340 [PowerPC] handle more splat loads without stack operation
This mostly improves splat loads code generation on Power7

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D106555
2021-11-03 05:17:41 +00:00
Simon Pilgrim 71e39e3f18 [ADT] Add APInt::isNegatedPowerOf2() helper
Inspired by D111968, provide a isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos.....

Differential Revision: https://reviews.llvm.org/D111998
2021-10-19 14:38:21 +01:00
Arthur Eubanks a0a4935182 Make more places that use alignment use uint64_t
Followup to D110451.
2021-10-08 16:35:19 -07:00
Itay Bookstein 40ec1c0f16 [IR][NFC] Rename getBaseObject to getAliaseeObject
To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
2021-10-06 19:33:10 -07:00
Stefan Pintilie 740086596c [PowerPC] Fix issue with lowering byval parameters.
Lowering of byval parameters with sizes that are not represented by a single
store require multiple stores to properly address the correct size of the
parameter.

Sizes that cannot be done with a single store are 3 bytes, 5 bytes, 6 bytes,
7 bytes. It is not correct to simply perform an 8 byte store and for these
elements because then the store would be larger than the element and alias
analysis would assume that this is undefined behaivour and return NoAlias
for them.

This patch adds the correct stores so that the size of the store is not larger
than the size of the element.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D108795
2021-10-06 13:19:15 -05:00
Stefan Pintilie 4fc2f4979c [PowerPC] Fix __builtin_ppc_load2r to return short instead of int.
This patch fixes the return value of the builtin __builtin_ppc_load2r to
correctly return short instead of int.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D110771
2021-10-04 06:17:02 -05:00
Quinn Pham 70391b3468 [PowerPC] FP compare and test XL compat builtins.
This patch is in a series of patches to provide builtins for
compatability with the XL compiler. This patch adds builtins for compare
exponent and test data class operations on floating point values.

Reviewed By: #powerpc, lei

Differential Revision: https://reviews.llvm.org/D109437
2021-09-28 11:01:51 -05:00
Chen Zheng 80584f0056 Revert "[PowerPC][ELF] make sure local variable space does not overlap with parameter save area"
This causes mix-compile issues on PowerPC Linux.

This reverts commit 324bd467a2.
2021-09-17 08:07:18 +00:00
Amy Kwan 5041a485b9 [PowerPC] Exploit Prefixed Load/Stores using the refactored Load/Store Implementation
This patch exploits the prefixed load and store instructions utilizing the
refactored load/store implementation introduced in D93370.

Prefixed load and store instructions are emitted whenever we are loading or
storing a value with an offset that fits into a 34-bit signed immediate.
Patterns for the prefixed load and stores are added in this patch, as well as
the implementation that detects when we are loading and storing a value with an
offset that fits in 34-bits.

Differential Revision: https://reviews.llvm.org/D96075
2021-09-14 08:39:49 -05:00
Arthur Eubanks f94a118a6e [NFC] Avoid using pointee types in PPCISelLowering
A cmpxchg's new value type is the same as the pointer operand's pointee type.
2021-09-12 17:37:35 -07:00
Amy Kwan 351a0d8a90 [PowerPC] Update PC-Relative Load/Store Patterns to use the refactored Load/Store Implementation
This patch updates the PC-Relative load and store patterns to utilize the
refactored load/store implementation introduced in D93370.

PC-Relative implementation has been added to PPCISelLowering.cpp, and also the
patterns in PPCInstrPrefix.td have been updated and no longer require AddedComplexity.
All existing test cases pass with this update.

Differential Revision: https://reviews.llvm.org/D95116
2021-09-09 15:38:42 -05:00
Craig Topper 9af8f1b18e [SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode.
Soft deprecrate isNullValue/isAllOnesValue and update in tree
callers. This matches the changes to the APInt interface from
D109483.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D109535
2021-09-09 13:28:30 -07:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Alexander Kornienko 893ac53afc Fix -Wunused-variable 2021-09-01 11:29:30 +02:00
Kai Luo 5eaebd5d64 [PowerPC] Implement quadword atomic load/store
Add support to load/store i128 atomically.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D105612
2021-09-01 06:55:40 +00:00
Nick Desaulniers d8b6ae072d [PPCISelLowering] avoid emitting libcalls to __mulodi4()
Similar to D108842, D108844, and D108926.

__has_builtin(builtin_mul_overflow) returns true for 32b PPC targets,
but Clang is deferring to compiler RT when encountering long long types.
This breaks ppc44x_defconfig + CONFIG_BLK_DEV_NBD=y builds of the Linux
kernel that are using builtin_mul_overflow with these types for these
targets.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

This will still need to be worked around in the Linux kernel in order to
continue to support these builds of the Linux kernel for this
target with older releases of clang.

Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D108936
2021-08-31 11:09:58 -07:00
Chen Zheng 324bd467a2 [PowerPC][ELF] make sure local variable space does not overlap with parameter save area
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D105271
2021-08-27 01:58:41 +00:00
Victor Huang 99e00663d4 [PowerPC] Fix return address computation for "__builtin_return_address"
When depth > 0, callee frame address is used to compute the return address of
callee producing improper return address. This patch adds the fix to use caller
frame address to compute the return address of callee.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D107646
2021-08-12 09:44:49 -05:00
Kai Luo e2ee27b20b [PowerPC] Fallback to base's implementation of shouldExpandAtomicCmpXchgInIR and shouldExpandAtomicCmpXchgInIR
If we can't decide `shouldExpandAtomicCmpXchgInIR` or `shouldExpandAtomicCmpXchgInIR` in PPC's implementation after https://reviews.llvm.org/rGb9c3941cd61de1e1b9e4f3311ddfa92394475f4b, resort to base's implementation.

This fixes internal build of OpenMP which uses atomic operations on float.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D106234
2021-07-20 06:14:24 +00:00
Kai Luo b9c3941cd6 [PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand
This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D103614
2021-07-15 01:12:09 +00:00
Amy Kwan b5f4ac4c11 [PowerPC] Add FI alignment check if the addressing mode is DS/DQ-Form, emit X-Form if necessary.
This patch adds a function that checks whether or not the frame index
is aligned when the computed addressing mode is an aligned D-Form (DS, or DQ-Form).
If the frame index appears to be unaligned, within these two modes, reset
the mode to X-Form in order to fall back to selection X-Form loads.

A test case is added to ensure that the test emits X-Form loads and not DQ-Form
loads since the frame index is not aligned within the test case.

Differential Revision: https://reviews.llvm.org/D105661
2021-07-13 12:31:52 -05:00
Qiu Chaofan 6fd9c1901f [PowerPC] Fix typo in vector shuffle combining
a22ecb4 fixed a crash on big endian subtargets. This commit fixes a typo
in that commit which may cause miscompile.
2021-07-13 14:35:47 +08:00
Jinsong Ji 2377eca93c [PowerPC] Custom Lowering BUILD_VECTOR for v2i64 for P7 as well
The lowering for v2i64 is now guarded with hasDirectMove,
however, the current lowering can handle the pattern correctly,
only lowering it when there is efficient patterns and corresponding
instructions.

The original guard was added in D21135, and was for Legal action.
The code has evloved now, this guard is not necessary anymore.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D105596
2021-07-12 17:56:10 +00:00
Qiu Chaofan a22ecb4508 [PowerPC] Fix i64 to vector lowering on big endian
Lowering for scalar to vector would skip if current subtarget is big
endian and the scalar is larger or equal than 64 bits. However there's
some issue in implementation that SToVRHS may refer to SToVLHS's scalar
size if SToVLHS is present, which leads to some crash.o

Reviewed By: nemanjai, shchenz

Differential Revision: https://reviews.llvm.org/D105094
2021-07-08 11:05:09 +08:00
Nemanja Ivanovic 6a06dbafa1 [PowerPC] Disable permuted SCALAR_TO_VECTOR on LE without direct moves
There are some patterns involving the permuted scalar to vector node
for which we don't have patterns without direct moves on little endian
subtargets. This causes selection errors. While we can of course add
the missing patterns, any additional effort to make this work is not
useful since there is no support for any CPU that can run in
little endian mode and does not support direct moves.
2021-07-07 13:50:49 -05:00
Zarko Todorovski ee6ca9c7df [AIX] Use VSSRC/VSFRC Register classes for f32/f64 callee arguments on P8 and above
Adding usage of VSSRC and VSFRC when adding the live in registers on AIX.
This matches the behaviour of the rest of PPC Subtargets.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D104396
2021-07-07 09:18:20 -04:00
Nemanja Ivanovic 3553698de7 [PowerPC] Re-enable combine for i64 BSWAP on targets without LDBRX
The combine was disabled in 4e22c7265d as it caused failures in
the ppc64be-multistage (bootstrap) bot.
It turns out that the combine did not correctly update the MMO for
the high load which caused aliased stores to be reported as unaliased.
This patch fixes that problem and re-enables the combine.
2021-07-06 20:42:01 -05:00
Nemanja Ivanovic 4e22c7265d [PowerPC] Disable combine 64-bit bswap(load) without LDBRX
This causes failures on the big endian bootstrap bot.
Disabling this combine temporarily until I can get a proper fix.
2021-06-25 15:11:22 -05:00
Nemanja Ivanovic dcccb2f594 [PowerPC] Fix bswap combine for big endian systems
Commit 0464586ac5 added a combine
for a 64-bit load feeding a bswap but the implementation is only
correct for little endian systems.
This fixes it for big endian systems.
2021-06-24 18:04:50 -05:00
Martin Storsjö 42f74e8249 [llvm] Rename StringRef _lower() method calls to _insensitive()
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
2021-06-25 00:22:01 +03:00
Nemanja Ivanovic 0464586ac5 [PowerPC] Combine 64-bit bswap(load) without LDBRX
When targeting CPUs that don't have LDBRX, we end up producing code that is
very inefficient and large for this common idiom. This patch just
optimizes it two 32-bit LWBRX instructions along with a merge.

This fixes https://bugs.llvm.org/show_bug.cgi?id=49610

Differential revision: https://reviews.llvm.org/D104836
2021-06-24 15:11:47 -05:00
zhijian 7ed515d168 [AIX][XCOFF] emit vector info of traceback table.
Summary:

emit vector info of traceback table.

Reviewers: Jason Liu,Hubert Tong
Differential Revision: https://reviews.llvm.org/D93659
2021-06-14 11:15:22 -04:00
Zarko Todorovski c1bb75febe [PowerPC] Allow wa inline asm to also accept floating point arguments
GCC documentation for the `wa` constraint states that:
```
wa

    A VSX register (VSR), vs0…vs63. This is either an FPR (vs0…vs31 are f0…f31)
    or a VR (vs32…vs63 are v0…v31).
```
This technically means that we could accept floating point parameters. In fact,
gcc itself does. The following testcase compiles and runs on all PPC platforms with GCC,
whereas clang/llc will assert:
```
#include <stdio.h>
double foo ( vector double a ) {
  double b, c;
  asm("xvabsdp  %x0, %x2        \n"
             "xxsldwi  %x1, %x0, %x0, 2 \n"
      :  "+wa"    (b),
         "=wa"    (c)
      :  "wa"    (a)
      );
  return b+c;
}
int main(void) {
  vector double a = {-3., -4.};
  double t = foo( a );
  printf("%g\n", t);
}
```
This patch allows clang/llc to build and run this testcase.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D103409
2021-06-11 07:19:10 -04:00
Simon Pilgrim 01b77159e3 PPCISelLowering.cpp - don't dereference a dyn_cast<>.
dyn_cast<> can return nullptr which we would then dereference - use cast<> which will assert that the type is correct.
2021-06-08 17:59:05 +01:00
Nikita Popov 1ffa6499ea [TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC)
Don't require a specific kind of IRBuilder for TargetLowering hooks.
This allows us to drop the IRBuilder.h include from TargetLowering.h.

Differential Revision: https://reviews.llvm.org/D103759
2021-06-06 16:29:50 +02:00
Anshil Gandhi 1c5ff0b03f [PowerPC] [GlobalISel] Implementation of formal arguments lowering in the IRTranslator for the PPC backend
Differential Revision: https://reviews.llvm.org/D99812
2021-06-02 16:46:39 -06:00
Anshil Gandhi 3e5ddb83e3 Revert "Differential Revision: https://reviews.llvm.org/D99812"
This reverts commit c729f2a48a.
2021-06-02 16:36:00 -06:00
Anshil Gandhi c729f2a48a Differential Revision: https://reviews.llvm.org/D99812 2021-06-02 14:09:52 -06:00
Jinsong Ji b2581196eb [AIX] Enable stackprotect feature
AIX use `__ssp_canary_word` instead of `__stack_chk_guard`.
This patch update the target hook to use correct symbol,
so that the basic stackprotect feature can work.

The traceback will be handled in follow up patch.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D103100
2021-05-28 02:18:15 +00:00
Stefan Pintilie 45ad207e45 [PowerPC] Add fix to partword atomic operations
Partword atomic binaries are not zero extended as they should be.
This patch fixes them to ensure that they are zero extended.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D102819
2021-05-20 12:36:37 -05:00
Chen Zheng 15d4ed6d8c [PowerPC] only check the load instruction result number 0.
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D102596
2021-05-18 00:49:37 -04:00
Stefan Pintilie 15051f0b4a [PowerPC] Handle inline assembly clobber of link regsiter
This patch adds the handling of clobbers of the link register LR for inline
assembly.

This patch is to fix:
https://bugs.llvm.org/show_bug.cgi?id=50147

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101657
2021-05-13 07:43:37 -05:00
Nemanja Ivanovic 39e4676ca7 [PowerPC] Provide doubleword vector predicate form comparisons on Power7
There are two reasons this shouldn't be restricted to Power8 and up:
1. For XL compatibility
2. Because clang will expand comparison operators to these intrinsics*

*Without this patch, the following causes a selection error:

int test(vector signed long a, vector signed long b) {
  return a < b;
}

This patch provides the handling for the intrinsics in the back
end and removes the Power8 guards from the predicate functions
(vec_{all|any}_{eq|ne|gt|ge|lt|le}).
2021-05-13 04:56:56 -05:00
Zarko Todorovski 0c41f77857 [PowerPC] Enable safe for 32bit vins* P10 instructions
Correctly emit `vins`instructions that are safe in 32bit mode.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101383
2021-05-10 10:13:13 -04:00
Amy Kwan 1998a08655 [PowerPC][NFC] Update atomic patterns to use the refactored load/store implementation
This patch updates the scalar atomic patterns to use the refactored load/store
implementation introduced in D93370.
All existing test cases pass with when the refactored patterns are utilized.

Differential Revision: https://reviews.llvm.org/D94498
2021-05-04 10:46:45 -05:00
Amy Kwan 64d951be61 [PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns.
This patch introduces a new infrastructure that is used to select the load and
store instructions in the PPC backend.

The primary motivation is that the current implementation of selecting load/stores
is dependent on the ordering of patterns in TableGen. Given this limitation, we
are not able to easily and reliably generate the P10 prefixed load and stores
instructions (such as when the immediates that fit within 34-bits). This
refactoring is meant to provide us with more control over the patterns/different
forms to exploit, as well as eliminating dependency of pattern declaration in TableGen.

The idea of this refactoring is that it introduces a set of addressing modes that
correspond to different instruction formats of a particular load and store
instruction, along with a set of common flags that describes a load/store.
Whenever a load/store instruction is being selected, we analyze the instruction
and compute a set of flags for it. The computed flags are then used to
select the most optimal load/store addressing mode.

This patch is the first of a series of patches to be committed - it contains the
initial implementation of the refactored load/store selection infrastructure and
also updates P8/P9 patterns to adopt this infrastructure. The idea is that
incremental patches will add more implementation and support, and eventually
the old implementation will be removed.

Differential Revision: https://reviews.llvm.org/D93370
2021-04-30 09:53:19 -05:00
Victor Huang ae3377c553 [AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects
- Add new variantKinds for the symbol's variable offset and region handle
- Print the proper relocation specifier @gd in the asm streamer when emitting
  the TC Entry for the variable offset for the symbol
- Fix the switch section failure between the TC Entry of variable offset and
  region handle
- Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property

Reviewed by: sfertile

Differential Revision: https://reviews.llvm.org/D100956
2021-04-29 13:18:59 -05:00
Qiu Chaofan 56d923efdb [SPE] Support constrained float operations on SPE
This patch enables support on SPE for constrained arithmetic and
comparison operations. This fixes bugzilla 50070.

One thing not covered is fcmp vs. fcmps on SPE. Some condition code
generates singaling comparison while some not. In this patch, all are
considered as singaling. So there might be still some issue when
compiling from C code.

Reviewed By: jhibbits

Differential Revision: https://reviews.llvm.org/D101282
2021-04-29 16:34:10 +08:00
Qiu Chaofan d5c2492455 [PowerPC] Fix SELECT_CC with i64 operand on PPC32
This patch fixes the infinite loop in legalization of PPC32 SELECT_CC
with 64-bit operand.
2021-04-28 17:48:33 +08:00
Zarko Todorovski f818ec9dd1 [AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches
In https://reviews.llvm.org/D92789 PPC64 checks were added that disallowed most
VSX pattern matching.  We enable some safe ones for 32bit in this patch.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D97503
2021-04-27 07:27:43 -04:00
Nemanja Ivanovic 03e7fefff8 [PowerPC] Canonicalize shuffles on big endian targets as well
Extend shuffle canonicalization and conversion of shuffles fed by vectorized
scalars to big endian subtargets. For big endian subtargets, loads and direct
moves of scalars into vector registers put the data in the correct element for
SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is
narrower, the value still ends up in the wrong place - althouth a different
wrong place than on little endian targets.

This patch extends the combine that keeps values where they are if they feed a
shuffle to big endian targets.

Differential revision: https://reviews.llvm.org/D100478
2021-04-20 07:29:47 -05:00
Qiu Chaofan b820339752 [PowerPC] Support f128 under VSX
This patch is the last one in backend to support fp128 type in
pre-POWER9 subtargets with VSX, removing temporary option and updating
remaining tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92374
2021-04-20 15:49:52 +08:00
Nemanja Ivanovic ff769dd111 [PowerPC] Minor improvement for insert_vector_elt codegen
For v2f64, all VSX subtargets can insert an element with a single
XXPERMDI.
2021-04-16 18:52:37 -05:00
Chen Zheng 80aa9b0f7b [PowerPC] stop reverse mem op generation for some cases.
We should consider the feeder user number when we do reverse memory
operation transformation. Otherwise, we may get negative impact.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D100166
2021-04-12 22:41:28 -04:00
Qiu Chaofan ece7345859 [PowerPC] Lower f128 SETCC/SELECT_CC as libcall if p9vector disabled
XSCMPUQP is not available for pre-P9 subtargets. This patch will lower
them into libcall for correct behavior on power7/power8.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92083
2021-04-12 10:33:32 +08:00
Albion Fung e29bb074c6 [PowerPC] Exploit xxsplti32dx (constant materialization) for scalars
This patch exploits the xxsplti32dx instruction available on Power10
in place of constant pool loads where xxspltidp would not be able to,
usually because the immediate cannot fit into 32 bits.

Differential Revision: https://reviews.llvm.org/D95458
2021-03-24 15:59:59 -04:00
Nemanja Ivanovic ea48bf8649 [PowerPC][NFC] Do not produce i64 constants in 32-bit mode
There are some instances where we produce constants of type MVT::i64
unconditionally in the target DAG combines. This is not actually
valid in 32-bit mode.
2021-03-19 22:54:47 -05:00
Lei Huang 535a4192a9 [AIX][TLS] Generate 64-bit general-dynamic access code sequence
Add support for the TLS general dynamic access model to assembly
files on AIX 64-bit.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D98078
2021-03-08 16:41:25 -06:00
Nemanja Ivanovic b0f0115308 [AIX][TLS] Generate 32-bit general-dynamic access code sequence
Adds support for the TLS general dynamic access model to
assembly files on AIX 32-bit.

To generate the correct code sequence when accessing a TLS variable
`v`, we first create two TOC entry nodes, one for the variable offset, one
for the region handle. These nodes are followed by a `PPCISD::TLSGD_AIX`
node (new node introduced by this patch).
The `PPCISD::TLSGD_AIX` node (`TLSGDAIX` pseudo instruction) is
expanded to 2 copies (to put the variable offset and region handle in
the right registers) and a call to `__tls_get_addr`.

This patch also changes the way TC entries are generated in asm files.
If the generated TC entry is for the region handle of a TLS variable,
we add the `@m` relocation and the `.` prefix to the entry name.
For example:

```
L..C0:
  .tc .v[TC],v[TL]@m -> region handle
L..C1:
  .tc v[TC],v[TL] -> variable offset
```

Reviewed By: nemanjai, sfertile

Differential Revision: https://reviews.llvm.org/D97948
2021-03-08 09:30:19 -06:00
Sean Fertile f0904a6208 [PowePC][AIX] Handle variadic vector call operands.
Patch adds support for passing vector call operands to variadic
functions. Arguments which are fixed shadow GPRs and stack space even
when they are passed in vector registers, while arguments passed through
ellipses are passed in properly aligned GPRs if available and on the
stack once all GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97956
2021-03-06 13:49:55 -05:00
Zarko Todorovski 2b50ce1524 [PowerPC][AIX] Enable the default AltiVec ABI on AIX
This patch adds support for the default AltiVec ABI for AIX.

Vector registers 20 through 31 are marked as reserved and cannot
be used in the default ABI. This patch adds handling for this case
and also remove the default AltiVec ABI errors.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D96351
2021-03-05 12:46:27 -05:00
Benjamin Kramer e897feeb8a [PPC] Silence unused variable warning in release builds. NFC. 2021-03-04 21:43:19 +01:00
Sean Fertile aaeffbe007 [PowerPC][AIX] Handle variadic vector formal arguments.
Patch adds support for passing vector arguments to variadic functions.
Arguments which are fixed shadow GPRs and stack space even when they are
passed in vector registers, while arguments passed through ellipses are
passed in(properly aligned GPRs if available and on the stack once all
GPR arguments registers are consumed.

Differential Revision: https://reviews.llvm.org/D97485
2021-03-04 10:56:53 -05:00
Victor Huang 1756b2adc9 [AIX][TLS] Generate TLS variables in assembly files
This patch allows generating TLS variables in assembly files on AIX.
Initialized and external uninitialized variables are generated with the
.csect pseudo-op and local uninitialized variables are generated with
the .comm/.lcomm pseudo-ops. The patch also adds a check to
explicitly say that TLS is not yet supported on AIX.

Reviewed by: daltenty, jasonliu, lei, nemanjai, sfertile
Originally patched by: bsaleil
Commandeered by: NeHuang

Differential Revision: https://reviews.llvm.org/D96184
2021-03-02 18:22:48 -06:00
Sean Fertile bb260b1ca7 [PowerPC][AIX] Add support for vector arg passing on the stack.
Enable passing more vector arguments then available vector
argument passing registers.

Differential Revision: https://reviews.llvm.org/D96415
2021-02-18 13:32:40 -05:00
Baptiste Saleil 34dc1ccb96 [PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10
This patch generates the vinsw, vinsd, vinsblx, vinshlx, vinswlx, vinsdlx,
vinsbrx, vinshrx, vinswrx and vinsdrx instructions for vector insertion on P10.

Differential Revision: https://reviews.llvm.org/D94454
2021-02-18 14:17:47 +00:00
Chen Zheng 5517923b1c [XCOFF][NFC] make csect properties optional for getXCOFFSection
We are going to support debug sections for XCOFF. So the csect
properties are not necessary. This patch makes these properties
optional.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D95931
2021-02-17 20:51:42 -05:00
Sean Fertile 4e127bce2d [PowerPC] Handle FP physical register in inline asm constraint.
Do not defer to the base class when the register constraint is a
physical fpr. The base class will select SPILLTOVSRRC as the register
class and register allocation will fail on subtargets without VSX
registers.

Differential Revision: https://reviews.llvm.org/D91629
2021-02-17 09:27:03 -05:00
Craig Topper 11ef356d9e [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Kazu Hirata 8ed1636184 [llvm] Use isa instead of dyn_cast (NFC) 2021-01-29 23:23:37 -08:00
Albion Fung 2e470e03b4 [PowerPC][Power10] Fix XXSPLI32DX not correctly exploiting specific cases
Some cases may be transformed into 32 bit splats before hitting the boolean statement, which may cause incorrect behaviour and provide XXSPLTI32DX with the incorrect values of splat. The condition was reversed so that the shortcut prevents this problem.

Differential Revision: https://reviews.llvm.org/D95634
2021-01-28 15:17:32 -05:00
Nemanja Ivanovic 54e570d94a [PowerPC] Do not emit XXSPLTI32DX for sub 64-bit constants
If the APInt returned by BuildVectorSDNode::isConstantSplat() is narrower than
64 bits, the result produced by XXSPLTI32DX is incorrect. The result returned
by the function appears to be incorrect and we'll investigate/fix it in a
follow-up commit. However, since this causes miscompiles, we must
temporarily disable emitting this instruction for such values.
2021-01-28 04:16:48 -06:00
QingShan Zhang ffc3e800c6 [NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero
For now, we correct the result for sqrt if iteration > 0. This doesn't make
sense as they are not strict relative.

Reviewed By: dmgreen, spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D94480
2021-01-25 04:02:44 +00:00
Kazu Hirata 16baad8f4e [llvm] Use pop_back_val (NFC) 2021-01-24 12:18:57 -08:00
Albion Fung 719b563ecf [PowerPC][Power10] Exploit splat instruction xxsplti32dx in Power10
Exploits the instruction xxsplti32dx.

It can be used to materialize any 64 bit scalar/vector splat by using two instances, one for the upper 32 bits and the other for the lower 32 bits. It should not materialize the cases which can be materialized by using the instruction xxspltidp.

Differential Revision: https://https://reviews.llvm.org/D90173
2021-01-20 12:55:52 -05:00
Nemanja Ivanovic 61f69153e8 [PowerPC] Sign extend comparison operand for signed atomic comparisons
As of 8dacca943a, we sign extend the atomic loaded
operand for signed subword comparisons. However, the assumption that the other
operand is correctly sign extended doesn't always hold. This patch sign extends
the other operand if it needs to be sign extended.

This is a second fix for https://bugs.llvm.org/show_bug.cgi?id=30451

Differential revision: https://reviews.llvm.org/D94058
2021-01-18 21:19:25 -06:00
Kazu Hirata 7dc3575ef2 [llvm] Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-01-14 20:30:34 -08:00
Nemanja Ivanovic 3f7b4ce960 [PowerPC] Add support for embedded devices with EFPU2
PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision
hardware floating point instructions. The single precision instructions efs*
and evfs* are identical to the spe float instructions while efd* and evfd*
instructions trigger a not implemented exception.

This patch introduces a new command line option -mefpu2 which leads to
single-hardware / double-software code generation.

[1] Core reference:
  https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf

Differential revision: https://reviews.llvm.org/D92935
2021-01-12 09:47:00 -06:00
Fangrui Song 022cc6e343 [PowerPC] Delete dead Lower* 2021-01-06 21:58:40 -08:00
Fangrui Song bfa6ca07a8 [PowerPC] Delete remnant Darwin ISelLowering code 2021-01-06 21:40:40 -08:00
Qiu Chaofan b6c8feb29f [NFC] [PowerPC] Remove dead code in BUILD_VECTOR peephole
The piece of code tries to use splat+shift to lower build_vector with
repeating bit pattern. And immediate field of vector splat is only 5
bits (-16~15). It iterates over them one by one to find which
shifts/rotates to number in build_vector.

This patch removes code to try matching constant with algebraic
right-shift because that's meaningless - any negative number's algebraic
right-shift won't produce result smaller than itself. Besides, code
(int)((unsigned)i >> j) means logical shift-right in C.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D93937
2021-01-05 11:35:00 +08:00
Kai Luo f904d50c29 [PowerPC] Remaining KnownBits should be constant when performing non-sign comparison
In `PPCTargetLowering::DAGCombineTruncBoolExt`, when checking if it's correct to perform the transformation for non-sign comparison, as the comment says
```
      // This is neither a signed nor an unsigned comparison, just make sure
      // that the high bits are equal.
```
Origin check
```
      if (Op1Known.Zero != Op2Known.Zero || Op1Known.One != Op2Known.One)
        return SDValue();
```
is not strong enough. For example,
```
Op1Known = 111x000x;
Op2Known = 111x000x;
```
Bit 4, besides bit 0, is still unknown and affects the final result.

This patch fixes https://bugs.llvm.org/show_bug.cgi?id=48388.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D93092
2020-12-30 02:00:47 +00:00
Reid Kleckner 0985a8bfea Fix left shift overflow UB in PPC backend on LLP64 platforms 2020-12-19 17:46:09 -08:00
Baptiste Saleil c2892978e9 [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_
On PPC, the vector pair instructions are independent from MMA.
This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names.
We also move the vector pair type/intrinsic/builtin tests to their own files.

Differential Revision: https://reviews.llvm.org/D91974
2020-12-17 13:19:27 -05:00
QingShan Zhang ebdd20f430 Expand the fp_to_int/int_to_fp/fp_round/fp_extend as libcall for fp128
X86 and AArch64 expand it as libcall inside the target. And PowerPC also
want to expand them as libcall for P8. So, propose an implement in the
legalizer to common the logic and remove the code for X86/AArch64 to
avoid the duplicate code.

Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D91331
2020-12-17 07:59:30 +00:00
QingShan Zhang 08e287aaf3 [PowerPC][FP128] Fix the incorrect signature for math library call
The runtime library has two family library implementation for ppc_fp128 and fp128.
For IBM Long double(ppc_fp128), it is suffixed with 'l', i.e(sqrtl). For
IEEE Long double(fp128), it is suffixed with "ieee128" or "f128".
We miss to map several libcall for IEEE Long double.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D91675
2020-12-14 07:52:56 +00:00
Zarko Todorovski ce4040a43d [PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern matching built vectors
Some of the pattern matching in PPCInstrVSX.td and node lowering involving vectors assumes 64bit mode.  This patch disables some of the unsafe pattern matching and lowering of BUILD_VECTOR in 32bit mode.

Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D92789
2020-12-12 15:28:28 -05:00
diggerlin 997d286f2d [AIX][XCOFF] emit traceback table for function in aix
SUMMARY:
 1. added a new option -xcoff-traceback-table to control whether generate traceback table for function.
 2. implement the functionality of emit traceback table of a function.

Reviewers: hubert.reinterpretcast, Jason Liu
Differential Revision: https://reviews.llvm.org/D92398
2020-12-11 17:50:25 -05:00
Stefan Pintilie 2812c15156 [PowerPC] Fix missing nop after call to weak callee.
Weak functions can be replaced by other functions at link time. Previously it
was assumed that no matter what the weak callee function was replaced with it
would still share the same TOC as the caller. This is no longer true as a weak
callee with a TOC setup can be replaced by another function that was compiled
with PC Relative and does not have a TOC at all.

This patch makes sure that all calls to functions defined as weak from a caller
that has a valid TOC have a nop after the call to allow a place for the linker
to restore the TOC.

Reviewed By: NeHuang

Differential Revision: https://reviews.llvm.org/D91983
2020-12-08 09:38:44 -06:00
Qiu Chaofan efdd463050 [PowerPC] Fix chain for i1-to-fp operation
A simple SELECT is used for converting i1 to floating types on ppc32,
but in constrained cases, the chain is not handled properly. This patch
will fix that.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92365
2020-12-07 10:38:56 +08:00
QingShan Zhang c25b039e21 [PowerPC] Fix the regression caused by commit 9c588f53fc
Add a TypeLegal check for MVT::i1 and add the test.
2020-12-04 10:22:13 +00:00