Commit Graph

6898 Commits

Author SHA1 Message Date
Stefan Pintilie 1324bb29f7 [PowerPC] Fix issue with strict float to int conversion.
When doing the float to int conversion the strict conversion also needs to
retun a chain. This patch fixes that.

Reviewed By: nemanjai, #powerpc, qiucf

Differential Revision: https://reviews.llvm.org/D117464
2022-01-19 10:57:22 -06:00
Jim Lin d6b0734837 [NFC] Use Register instead of unsigned 2022-01-19 20:17:04 +08:00
Sean Fertile 10d3bf9518 [PowerPC][AIX] Fallback to DAG-ISEL if global has toc-data attribute.
FAST-ISEL should fall back to DAG-ISEL when a global variable has the
toc-data attribute. A number of the checks were duplicated in the lit
test becuase of
1) Slightly different output between -O0 and -O2 due to FAST-ISEL vs
   DAG-ISEL codegen.
2) In preperation of a peephole optimization that will run when
   optimizations are enabled.

Differential Revision: https://reviews.llvm.org/D115373
2022-01-17 16:21:38 -05:00
Fangrui Song 1ae1dd16cf [MC][PowerPC] Replace MCContext::reportFatalError calls with reportError
User errors should use reportError. reportError allows us to continue parsing
the file and collect more diagnostics.

While here, make the diagnostic follow convention, merge tests, and test
line/column numbers.
2022-01-15 00:01:36 -08:00
Nick Desaulniers 9c4b49db19 [ShrinkWrap] check for PPC's non-callee-saved LR
As pointed out in https://reviews.llvm.org/D115688#inline-1108193, we
don't want to sink the save point past an INLINEASM_BR, otherwise
prologepilog may incorrectly sink a prolog past the MBB containing an
INLINEASM_BR and into the wrong MBB.

ShrinkWrap is getting this wrong because LR is not in the list of callee
saved registers. Specifically, ShrinkWrap::useOrDefCSROrFI calls
RegisterClassInfo::getLastCalleeSavedAlias which reads
CalleeSavedAliases which was populated by
RegisterClassInfo::runOnMachineFunction by iterating the list of
MCPhysReg returned from MachineRegisterInfo::getCalleeSavedRegs.

Because PPC's LR is non-allocatable, it's NOT considered callee saved.
Add an interface to TargetRegisterInfo for such a case and use it in
Shrinkwrap to ensure we don't sink a prolog past an INLINEASM or
INLINEASM_BR that clobbers LR.

Reviewed By: jyknight, efriedma, nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D116424
2022-01-11 10:01:34 -08:00
Chen Zheng 2c46ca96e2 [PowerPC] fast isel can lower intrinsics call on AIX.
Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D114778
2022-01-10 02:30:05 +00:00
Kazu Hirata 2aed08131d [llvm] Use true/false instead of 1/0 (NFC)
Identified with modernize-use-bool-literals.
2022-01-07 00:39:14 -08:00
Qiu Chaofan c2cc70e4f5 [NFC] Fix endif comments to match with include guard 2022-01-07 15:52:59 +08:00
Kazu Hirata f3a344d212 [Target] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-06 22:01:44 -08:00
Stefan Pintilie 04496201e0 [PowerPC] Add support for ROP protection for 32 bit.
Add support for Return Oriented Programming (ROP) protection for 32 bit.
This patch also adds a testing for AIX on both 64 and 32 bit.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D111362
2022-01-05 15:15:53 -06:00
Kazu Hirata e5947760c2 Revert "[llvm] Remove redundant member initialization (NFC)"
This reverts commit fd4808887e.

This patch causes gcc to issue a lot of warnings like:

  warning: base class ‘class llvm::MCParsedAsmOperand’ should be
  explicitly initialized in the copy constructor [-Wextra]
2022-01-03 11:28:47 -08:00
Kazu Hirata 7e163afd9e Remove redundant void arguments (NFC)
Identified by modernize-redundant-void-arg.
2022-01-02 10:20:19 -08:00
Kazu Hirata fd4808887e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-01 16:18:18 -08:00
Kazu Hirata 69ccc96162 [llvm] Use the default constructor for SDValue (NFC) 2022-01-01 10:36:59 -08:00
Kazu Hirata 5a667c0e74 [llvm] Use nullptr instead of 0 (NFC)
Identified with modernize-use-nullptr.
2021-12-28 08:52:25 -08:00
Nikita Popov f5ac23b5ae [ArgPromotion][TTI] Pass types to ABI compatibility hook
The areFunctionArgsABICompatible() hook currently accepts a list of
pointer arguments, though what we're actually interested in is the
ABI compatibility after these pointer arguments have been converted
into value arguments.

This means that a) the current API is incompatible with opaque
pointers (because it requires inspection of pointee types) and
b) it can only be used in the specific context of ArgPromotion.
I would like to reuse the API when inspecting calls during inlining.

This patch converts it into an areTypesABICompatible() hook, which
accepts a list of types. This makes the method more generally usable,
and compatible with opaque pointers from an API perspective (the
actual usage in ArgPromotion/Attributor is still incompatible,
I'll follow up on that in separate patches).

Differential Revision: https://reviews.llvm.org/D116031
2021-12-22 09:37:51 +01:00
Kazu Hirata 9db0e21660 [llvm] Use depth_first (NFC) 2021-12-21 22:28:48 -08:00
Nemanja Ivanovic 1674d9b6b2 [PowerPC] Fix vector equality comparison for v2i64 pre-Power8
The current code makes the assumption that equality
comparison can be performed with a word comparison
instruction. While this is true if the entire 64-bit
results are used, it does not generally work. It is
possible that the low order words and high order
words produce different results and a user of only
one will get the wrong result.

This patch adds an and of the result words so that
each word has the result of the comparison of the
entire doubleword that contains it.

Differential revision: https://reviews.llvm.org/D115678
2021-12-21 14:28:41 -06:00
Nemanja Ivanovic a3ea9052d6 [PowerPC] Do not increase cost for getUserCost with MMA types
Commit 150681f increases
cost of producing MMA types (vector pair and quad).
However, it increases the cost for getUserCost() which is
used in unrolling. As a result, loops that contain these
types already (from the user code) cannot be unrolled
(even with the user's unroll pragma). This was an unintended
sideeffect. Reverting that portion of the commit to allow
unrolling such loops.

Differential revision: https://reviews.llvm.org/D115424
2021-12-21 13:36:08 -06:00
Esme-Yi b66328701a [PowerPC][llvm-objdump] enable --symbolize-operands for PowerPC ELF/XCOFF.
Summary: When disassembling, symbolize a branch target operand
to print a label instead of a real address.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D114492
2021-12-21 04:17:57 +00:00
Nemanja Ivanovic 2fb9029f26 [PowerPC] Support hwsync extended mnemonic
This mnemonic has been supported by GAS for years and
it was added to the PowerPC ISA as of ISA 3.1. We will
support the mnemonic to be compatible with GAS.
2021-12-20 10:08:31 -06:00
Zaara Syeda 3f066ac648 Test commit 2021-12-14 15:37:28 +00:00
Craig Topper b3db7dde79 [TargetInstrInfo][PowerPC] Remove virtual function that is only called from PPC specific code.
There are two signatures of setSpecialOperandAttr in TargetInstrInfo.
One of them is only called from PPCInstrInfo which has an override
of it.

Remove it from TargetInstrInfo and make it a non-virtual method in
PPCInstrInfo.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D115404
2021-12-09 12:51:27 -08:00
Mikael Holmen cb413f208a [PowerPC] Fix gcc warning about unused variable [NFC]
gcc warned about
../lib/Target/PowerPC/PPCTargetTransformInfo.cpp:1401:13: warning: unused variable 'VecTy' [-Wunused-variable]
 1401 |   if (auto *VecTy = dyn_cast<FixedVectorType>(DataType)) {
      |             ^~~~~
2021-12-09 10:31:56 +01:00
Chen Zheng d0022a7250 [PowerPC] copy byval parameter to caller's stack when needed
Now we won't copy the byval parameter (bigger than 8 bytes) to
caller's parameter save area. Instead, we will only copy the
byval parameter when it can not be passed entirely in registers
which means we have to use parameter save area according to the
64 bit SVR4 ABI.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D111485
2021-12-09 01:00:47 +00:00
Chen Zheng 63cd1842a7 [PowerPC] use lvx + splat directly for aligned splat load
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D114062
2021-12-08 02:02:18 +00:00
Bardia Mahjour 8aee783366 [VP] Cost model for VPMemory operations on PowerPC.
PPC Implementation of getVPMemoryOpCost and hasActiveVectorLength.

Reviewed By: Roland Froese

Differential Revision: https://reviews.llvm.org/D109417
2021-12-07 14:19:09 -05:00
Qiu Chaofan e3c2694da9 [PowerPC] Implement general back2back fusion
Implement 'back-to-back' FX fusion according to Power10 User Manual
'19.1.5.4 Fusion', not enabled by default.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D114345
2021-12-06 10:15:05 +08:00
Nemanja Ivanovic d6c0ef7887 [PowerPC] Handle base load with reservation mnemonic
The Power ISA defined l[bhwdq]arx as both base and
extended mnemonics. The base mnemonic takes the EH
bit as an operand and the extended mnemonic omits
it, making it implicitly zero. The existing
implementation only handles the base mnemonic when
EH is 1 and internally produces a different
instruction. There are historical reasons for this.
This patch simply removes the limitation introduced
by this implementation that disallows the base
mnemonic with EH = 0 in the ASM parser.

This resolves an issue that prevented some files
in the Linux kernel from being built with
-fintegrated-as.

Also fix a crash if the value is not an integer immediate.
2021-12-03 09:13:02 -06:00
Amy Kwan c27734c183 [PowerPC] Fix load/store selection infrastructure when load/store intrinsics are used on P10.
The load/store infrastructure previously made an incorrect assumption that
whenever it is used with a load/store intrinsic on Power10 - those intrinsics
would automatically be the lxvp/stxvp intrinsics introduced in Power10.

However, this is obviously not the case as there are multiple instances of
pre-P10 intrinsics that use the refactored load/store implementation.
This patch corrects this assumption, and produces the expected intrinsic on pre-P10.

Differential Revision: https://reviews.llvm.org/D114978
2021-12-02 15:59:29 -06:00
Yousuf Ali 415e821a50 [PowerPC][AIX] Add toc-data support for 64-bit AIX small code model.
The patch expands the existing 32-bit toc-data attribute support to 64-bit.
In both 32-bit and 64-bit it is supported for small code model only.

Differential Revision: https://reviews.llvm.org/D114654
2021-12-01 10:56:21 -05:00
Tarique Islam 0850655da6 Big-endian version of vpermxor
A big-endian version of vpermxor, named vpermxor_be, is added to LLVM
and Clang. vpermxor_be can be called directly on both the little-endian
and the big-endian platforms.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D114540
2021-11-30 22:49:55 +00:00
Kazu Hirata ff649e0802 [Target] Use range-based for loops (NFC) 2021-11-27 11:16:19 -08:00
Amy Kwan 150681f2f3 [PowerPC] Prevent the optimizer from producing wide vector types in IR.
This patch prevents the optimizer from producing wide vectors in the IR,
specifically the MMA types (v256i1, v512i1). The idea is that on Power, we only
want to be producing these types only if the vector_pair and vector_quad types
are used specifically.

To prevent the optimizer from producing these types in the IR,
vectorCostAdjustmentFactor() is updated to return an instruction cost factor or
an invalid instruction cost if the current type is that of an MMA type. An
invalid instruction cost returned by this function signifies to other cost
computing functions to return the maximum instruction cost to inform the
optimizer that producing these types within the IR is expensive, and should not
be produced in the first place.

This issue was first seen in the test case included within this patch.

Differential Revision: https://reviews.llvm.org/D113900
2021-11-25 12:35:26 -06:00
Nemanja Ivanovic b7bf937bbe [PowerPC] Provide XL-compatible vec_round implementation
The XL implementation of vec_round for vector double uses
"round-to-nearest, ties to even" just as the vector float
`version does. However clang and gcc use "round-to-nearest-away"
for vector double and "round-to-nearest, ties to even"
for vector float.

The XL behaviour is implemented under the __XL_COMPAT_ALTIVEC__
macro similarly to other instances of incompatibility.

Differential revision: https://reviews.llvm.org/D113642
2021-11-24 06:43:56 -06:00
Nemanja Ivanovic c9cb8edc51 [PowerPC] Allow scalars for asm constraint "v" with VSX
Similarly to what GCC does, we should allow scalars with
the "v" constraint rather than introducing unnecessary
new constraints for scalars in Altivec registers.

Differential revision: https://reviews.llvm.org/D113635
2021-11-23 17:03:04 -06:00
Nemanja Ivanovic c933c2eb33 [PowerPC] Add BCD add/sub/cmp builtins
Support for builtins that use bcdadd./bcdsub. to add/subtract
Binary Coded Decimal values as well as to determine validity
and compare BCD values.

Differential revision: https://reviews.llvm.org/D114088
2021-11-23 11:42:36 -06:00
Qiu Chaofan 59f4b3d308 [PowerPC] Implement more fusion types for Power10
This implements the rest of Power10 instruction fusion pairs, according
to user manual, including 'wide immediate', 'load compare', 'zero move'
and 'SHA3 assist'.

Only 'SHA3 assist' is enabled by default.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D112912
2021-11-23 17:21:17 +08:00
Kazu Hirata d5b73a70a0 [llvm] Use range-based for loops (NFC) 2021-11-22 20:33:28 -08:00
Kazu Hirata ea5421bd0d [llvm] Use range-based for loops (NFC) 2021-11-21 19:24:15 -08:00
Victor Huang 86e77cdb08 [PowerPC] Add a flag for conditional trap optimization
This patch adds a flag to enable/disable conditional trap optimization.
Optimization disabled by default.

Peer reviewed by: nemanjai
2021-11-19 10:24:54 -06:00
Victor Huang 40c65655af [PowerPC] Remove the redundant terminator instruction when optimizing conditional trap
This patch is a follow up patch for ae27ca9a67 to
the remove redundant terminator when optimizing conditional trap.

Peer reviewed by: nemanjai
2021-11-18 17:52:26 -06:00
Chen Zheng 9bda9a3980 [PowerPC] fix typos in comments, NFC 2021-11-18 08:55:23 +00:00
Nico Weber 103cc914d6 [x86/asm] Make variants work when converting at&t inline asm input to intel asm output
`asm` always has AT&T-style input (`asm inteldialect` has Intel-style asm
input), so EmitGCCInlineAsmStr() always has to pick the same variant since it
cares about the input asm string, not the output asm string.

For PowerPC, that default variant is 1. For other targets, it's 0.

Without this, the included test case errors out with

    error: unknown use of instruction mnemonic without a size suffix
             mov rax, rbx

since it picks the intel branch and then tries to interpret it as AT&T
when selecting intel-style output with `-x86-asm-syntax=intel`.

Differential Revision: https://reviews.llvm.org/D113894
2021-11-17 13:23:18 -05:00
Benjamin Kramer 8b8e8704ce [PowerPC] Fix a nullptr dereference
LiMI1/LiMI2 can be null, so don't call a method on them before checking.
Found by ubsan.
2021-11-16 23:52:42 +01:00
Victor Huang ae27ca9a67 [PowerPC] PPC backend optimization on conditional trap intrustions
This patch adds PPC back end optimization to analyze the arguments of a
conditional trap instruction to execute one of the following:
1. Delete it if never trap
2. Replace it if always trap
3. Otherwise keep it

Reviewed By: nemanjai, amyk, PowerPC

Differential revision: https://reviews.llvm.org/D111434
2021-11-16 13:11:57 -06:00
Lei Huang f50c6c1718 [PowerPC] Fix 32bit vector insert instructions for ISA3.1
The platform independent ISD::INSERT_VECTOR_ELT take a element index,
but vins* instructions take a byte index. Update 32bit td patterns for
vector insert to handle the element index accordingly.

Since vector insert for non constant index are supported in
ISA3.1, there is no need to use platform specific ISD node,
PPCISD::VECINSERT.  Update td pattern to directly use
ISD::INSERT_VECTOR_ELT instead.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D113802
2021-11-15 14:36:39 -06:00
Kazu Hirata d243cbf8ea [llvm] Use isa instead of dyn_cast (NFC) 2021-11-14 19:40:46 -08:00
Chen Zheng eec9ca622c [PowerPC] guard update form prepare with non-const increment with option
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D113471
2021-11-15 02:16:46 +00:00
Kazu Hirata 609ccbb240 [PowerPC] Use SDNode::uses (NFC) 2021-11-13 08:34:22 -08:00
Jordan Rupprecht da4822f6c8 [PowerPC][NFC] Ignore unused var in release builds.
Note we can't inline this call into assert because `isIntS16Immediate` has a side effect. But we only look at the return value in asserts builds.
2021-11-11 08:57:40 -08:00
Victor Huang 18fe0a0d9e [PowerPC] PPC backend optimization to lower int_ppc_tdw/int_ppc_tw intrinsics to TDI/TWI machine instructions
This patch adds the backend optimization to match XL behavior for the two
builtins __tdw and __tw that when the second input argument is an immediate,
emitting tdi/twi instructions instead of td/tw.

Reviewed By: nemanjai, amyk, PowerPC

Differential revision: https://reviews.llvm.org/D112285
2021-11-11 09:52:00 -06:00
Nemanja Ivanovic 5840f7197d [PowerPC] Respect rounding mode in the back end
Currently, the floating point instructions that depend on
rounding mode are correctly marked in the PPC back end with
an implicit use of the RM register. Similarly, instructions
that explicitly define the register are marked with an
implicit def of the same register. So for the most part,
RM-using code won't be moved across RM-setting instructions.

However, calls are not marked as RM-setting instructions so
code can be moved across calls. This is generally desired,
but so is the ability to turn off this behaviour with an
appropriate option - and -frounding-math really should be
that option.

This patch provides a set of call instructions (for direct
and indirect calls) that are marked with an implicit def of
the RM register. These will be used for calls that are marked
with the strictfp attribute.

Differential revision: https://reviews.llvm.org/D111433
2021-11-10 08:19:58 -06:00
Kazu Hirata ef2d0e0f20 [llvm] Use MachineBasicBlock::{successors,predecessors} (NFC) 2021-11-09 23:05:15 -08:00
Qiu Chaofan 9b5e2b5261 [PowerPC] Implement basic macro fusion in Power10
Including basic fusion types around arithmetic and logical instructions.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D111693
2021-11-08 17:23:56 +08:00
Chen Zheng 7c6f5950f0 [PowerPC] comment for different input register classes; nfc
Add comments to explain why XXPERMDIs and XXPERMDI have different input register
classes, vsfrc for XXPERMDIs and vsrc for XXPERMDI.

This addresses the comments in abandoned patch D113178, we keep using `f0` instead
of using `vs0` for XXPERMDIs on purpose.
2021-11-08 02:21:30 +00:00
Kazu Hirata 14d656b3d8 [Target] Use llvm::reverse (NFC) 2021-11-06 13:08:21 -07:00
Kazu Hirata 2c4ba3e9d3 [Target] Use make_early_inc_range (NFC) 2021-11-05 09:14:32 -07:00
Chen Zheng fed2889f07 [PowerPC] use correct selection for v16i8/v8i16 splat load
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D113236
2021-11-05 10:04:03 +00:00
Qiu Chaofan 5fd406e254 [PowerPC] Add intrinsic to convert between ppc_fp128 and fp128
ppc_fp128 and fp128 are both 128-bit floating point types. However, we
can't do conversion between them now, since trunc/ext are not allowed
for same-size fp types.

This patch adds two new intrinsics: llvm.ppc.convert.f128.to.ppcf128 and
llvm.convert.ppcf128.to.f128, to support such conversion.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D109421
2021-11-05 16:58:38 +08:00
Chen Zheng 9695027066 [PowerPC] address post-commit comments for D106555; NFC
Address namanjai post commit comments.
2021-11-05 05:30:53 +00:00
Chen Zheng f6db18fd4a [PowerPC][NFC] make option ppc-formprep-max-vars can be set more than one time. 2021-11-04 13:44:58 +00:00
Qiu Chaofan a84118756c [PowerPC] Enforce side effects to FPSCR read/set intrinsics
Currently, FPSCR is not modeled, so in some early passes (such as
early-cse), the read/set intrinsics to FPSCR may get incorrect
simplification.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D112380
2021-11-04 11:45:32 +08:00
Qiu Chaofan 741aeda97d [PowerPC] Implement longdouble pack/unpack builtins
Implement two builtins to pack/unpack IBM extended long double float,
according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D112055
2021-11-03 17:57:25 +08:00
Chen Zheng 5a8b196340 [PowerPC] handle more splat loads without stack operation
This mostly improves splat loads code generation on Power7

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D106555
2021-11-03 05:17:41 +00:00
Chen Zheng eeed1545b2 [PowerPC] turn off chain commoning by default. 2021-11-01 04:11:10 +00:00
Chen Zheng 7591d21032 [PowerPC] fix a miscompile for Solaris build 2021-10-29 12:06:25 +00:00
Chen Zheng 631f44f338 [PowerPC] use right extend type for SCEV
Fix an issue caused by D108750

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D112502
2021-10-26 13:32:03 +00:00
Zarko Todorovski e9163660b1 [PPC][LLVM] Inclusive terms: remove references to sanity check in lib/Target/PowerPC
Removed references to `sanity check` in `PPCBranchCoalescing.cpp` code comments.
No word substitution made in this case, as the comments and code following illustrated are
sufficient IMO.

Reviewed By: quinnp

Differential Revision: https://reviews.llvm.org/D112452
2021-10-25 18:13:54 -04:00
Chen Zheng 80e6aff6bb [PowerPC] common chains to reuse offsets to reduce register pressure.
Add a new preparation pattern in PPCLoopInstFormPrep pass to reduce register
pressure.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D108750
2021-10-25 03:27:16 +00:00
Chen Zheng 86a5c32616 [PowerPC] iterate on the SmallSet directly; NFC 2021-10-22 06:18:07 +00:00
Chen Zheng 13755436bb [PowerPC] return early if there is no preparing candidate in the loop; NFC
This is to improve compiling time.

Differential Revision: https://reviews.llvm.org/D112196

Reviewed By: jsji
2021-10-22 05:39:51 +00:00
Simon Pilgrim 71e39e3f18 [ADT] Add APInt::isNegatedPowerOf2() helper
Inspired by D111968, provide a isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos.....

Differential Revision: https://reviews.llvm.org/D111998
2021-10-19 14:38:21 +01:00
Qiu Chaofan 67c64d8337 [PowerPC] Implement scheduling model for Power10
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D110855
2021-10-18 15:27:49 +08:00
Qiu Chaofan 9e9b0f4621 [PowerPC] Support ppc-asm-full-reg-names for AIX
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D94282
2021-10-15 12:22:44 +08:00
Albion Fung b4b9f9b4b3 [PowerPC] Emit dcbt and dcbtst in place of their extended mnemonics on AIX
On AIX, the system assembler does not support the extended mnemonics
dcbtt and dcbtstt. This patch stops them from being emitted on
AIX and emits the base mnemonics instead, dcbt X, X, 16 and
dcbtstt X, X, 16 respectively.

Differential revision: https://reviews.llvm.org/D111258
2021-10-12 15:47:57 -05:00
Arthur Eubanks a0a4935182 Make more places that use alignment use uint64_t
Followup to D110451.
2021-10-08 16:35:19 -07:00
Reid Kleckner b3a6d096d7 Fix shlib builds for all lib/Target/*/TargetInfo libs
They all must depend on MC now that the target registry is in MC.
Also fix llvm-cxxdump
2021-10-08 15:21:13 -07:00
Reid Kleckner 89b57061f7 Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.

This allows us to ensure that Support doesn't have includes from MC/*.

Differential Revision: https://reviews.llvm.org/D111454
2021-10-08 14:51:48 -07:00
Chen Zheng 1bf05fbc98 [PowerPC] refactor rewriteLoadStores for reusing; nfc
This is split from https://reviews.llvm.org/D108750.
Refactor rewriteLoadStores() so that we can reuse the outlined
functions.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D110314
2021-10-07 12:59:20 +00:00
Itay Bookstein 40ec1c0f16 [IR][NFC] Rename getBaseObject to getAliaseeObject
To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
2021-10-06 19:33:10 -07:00
Stefan Pintilie 740086596c [PowerPC] Fix issue with lowering byval parameters.
Lowering of byval parameters with sizes that are not represented by a single
store require multiple stores to properly address the correct size of the
parameter.

Sizes that cannot be done with a single store are 3 bytes, 5 bytes, 6 bytes,
7 bytes. It is not correct to simply perform an 8 byte store and for these
elements because then the store would be larger than the element and alias
analysis would assume that this is undefined behaivour and return NoAlias
for them.

This patch adds the correct stores so that the size of the store is not larger
than the size of the element.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D108795
2021-10-06 13:19:15 -05:00
Amara Emerson 8bde5e58c0 Delay outgoing register assignments to last.
The delayed stack protector feature which is currently used for SDAG (and thus
allows for more commonly generating tail calls) depends on being able to extract
the tail call into a separate return block. To do this it also has to extract
the vreg->physreg copies that set up the call's arguments, since if it doesn't
then the call inst ends up using undefined physregs in it's new spliced block.

SelectionDAG implementations can do this because they delay emitting register
copies until  *after* the stack arguments are set up. GISel however just
processes and emits the arguments in IR order, so stack arguments always end up
last, and thus this breaks the code that looks for any register arg copies that
precede the call instruction.

This patch adds a thunk argument to the assignValueToReg() and custom assignment
hooks. For outgoing arguments, register assignments use this return param to
return a thunk that does the actual generating of the copies. We collect these
until all the outgoing stack assignments have been done and then execute them,
so that the copies (and perhaps some artifacts like G_SEXTs) are placed after
any stores.

Differential Revision: https://reviews.llvm.org/D110610
2021-10-04 12:33:20 -07:00
Stefan Pintilie 4fc2f4979c [PowerPC] Fix __builtin_ppc_load2r to return short instead of int.
This patch fixes the return value of the builtin __builtin_ppc_load2r to
correctly return short instead of int.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D110771
2021-10-04 06:17:02 -05:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Albion Fung 4195ed9959 [PowerPC] Improved codegen related to xscvdpsxws/xscvdpuxws
This patch removes the uneccessary mf/mtvsr generated in conjunction
with xscvdpsxws/xscvdpuxws.

Differential revision: https://reviews.llvm.org/D109902
2021-09-30 14:31:00 -05:00
Stefan Pintilie fb4e44c4e7 [PowerPC] The builtins load8r and store8r are Power 7 plus.
This patch makes sure that the builtins __builtin_ppc_load8r and
__ builtin_ppc_store8r are only available for Power 7 and up.
Currently the builtins seem to produce incorrect code if used for
Power 6 or before.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D110653
2021-09-29 14:34:40 -05:00
Nemanja Ivanovic 09b67aa1c3 [PowerPC] Implement builtin for vbpermd
The instruction has similar semantics to vbpermq but for doublewords.
It was added in Power9 and the ABI documents the builtin.

Differential revision: https://reviews.llvm.org/D107899
2021-09-29 06:34:31 -05:00
Quinn Pham 70391b3468 [PowerPC] FP compare and test XL compat builtins.
This patch is in a series of patches to provide builtins for
compatability with the XL compiler. This patch adds builtins for compare
exponent and test data class operations on floating point values.

Reviewed By: #powerpc, lei

Differential Revision: https://reviews.llvm.org/D109437
2021-09-28 11:01:51 -05:00
Quinn Pham 682e15f371 [PowerPC] Fix td pattern for P10 VSLDBI and VSRDBI
This patch fixes the pattern for the P10 instructions Vector Shift Left
Double by Bit Immediate VN-form and Vector Shift Right Double by Bit
Immediate VN-form. The third argument should be a target constant (`timm`)
instead of an `i32` because an immediate is expected.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D109920
2021-09-27 12:36:18 -05:00
Victor Huang 6e1aaf18af [PowerPC] Mark splat immediate instructions as rematerializable
This patch marks splat immediate instructions XXSPLTIW and XXSPLTIDP as
rematerializable to prevent MachineLICM from moving them out of loops.

Reviewed By: lei, amy

Differential revision: https://reviews.llvm.org/D108823
2021-09-24 12:03:34 -05:00
Simon Pilgrim b1f38a27f0 [Target][CodeGen] Remove default CostKind arguments on inner/impl TTI overrides
Based off a discussion on D110100, we should be avoiding default CostKinds whenever possible.

This initial patch removes them from the 'inner' target implementation callbacks - these should only be used by the main TTI calls, so this should guarantee that we don't cause changes in CostKind by missing it in an inner call. This exposed a few missing arguments in getGEPCost and reduction cost calls that I've cleaned up.

Differential Revision: https://reviews.llvm.org/D110242
2021-09-22 15:28:08 +01:00
Chen Zheng ffa9fa9ed2 [PowerPC] prepare for udpate form with non-const increment.
This is a follow-up of D105872. Now we are able to prepare for update
form with non-const increment.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D106032
2021-09-22 02:54:28 +00:00
Amy Kwan 2af57b6099 [PowerPC] Add prefix load pattern for fpext to v2f64
This patch adds a prefixed load pattern involving v2f32 fpext v2f64, where we
are dealing with a value with an offset that fits into a 34-bit signed immediate.
A reduced test case is also added to patch that tests the pattern, in which the
pattern is tested in the big endian CHECKs of the newly added test.

Differential Revision: https://reviews.llvm.org/D109887
2021-09-21 12:45:24 -05:00
Cullen Rhodes b23d22f7d5 [PowerPC] NFC: Remove unused tblgen template args
Identified in D109359.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D109715
2021-09-21 08:24:16 +00:00
Chen Zheng 80584f0056 Revert "[PowerPC][ELF] make sure local variable space does not overlap with parameter save area"
This causes mix-compile issues on PowerPC Linux.

This reverts commit 324bd467a2.
2021-09-17 08:07:18 +00:00
Amy Kwan 5041a485b9 [PowerPC] Exploit Prefixed Load/Stores using the refactored Load/Store Implementation
This patch exploits the prefixed load and store instructions utilizing the
refactored load/store implementation introduced in D93370.

Prefixed load and store instructions are emitted whenever we are loading or
storing a value with an offset that fits into a 34-bit signed immediate.
Patterns for the prefixed load and stores are added in this patch, as well as
the implementation that detects when we are loading and storing a value with an
offset that fits in 34-bits.

Differential Revision: https://reviews.llvm.org/D96075
2021-09-14 08:39:49 -05:00
Chen Zheng 946e69d253 [PowerPC] prepare more loop load/store instructions
PPCLoopInstrFormPrep pass now can prepare for load store instructions
in a loop whose increment is not a constant integer.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D105872
2021-09-14 05:00:48 +00:00
Arthur Eubanks f94a118a6e [NFC] Avoid using pointee types in PPCISelLowering
A cmpxchg's new value type is the same as the pointer operand's pointee type.
2021-09-12 17:37:35 -07:00
Amy Kwan 351a0d8a90 [PowerPC] Update PC-Relative Load/Store Patterns to use the refactored Load/Store Implementation
This patch updates the PC-Relative load and store patterns to utilize the
refactored load/store implementation introduced in D93370.

PC-Relative implementation has been added to PPCISelLowering.cpp, and also the
patterns in PPCInstrPrefix.td have been updated and no longer require AddedComplexity.
All existing test cases pass with this update.

Differential Revision: https://reviews.llvm.org/D95116
2021-09-09 15:38:42 -05:00
Craig Topper 9af8f1b18e [SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode.
Soft deprecrate isNullValue/isAllOnesValue and update in tree
callers. This matches the changes to the APInt interface from
D109483.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D109535
2021-09-09 13:28:30 -07:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Victor Huang 4a226529e2 [PowerPC] Fixed the crash due to early if conversion with fixed CR fields
This patch adds a fix to do early if conversion to select when
conditional branch not using physical register to prevent the crash when
expanding ISEL instruction.

Reviewed By: lei, kamaub, PowerPC

Differential revision: https://reviews.llvm.org/D108302
2021-09-07 10:51:03 -05:00
Jinsong Ji 042a6564d3 [PowerPC] Guard XSRSP in P8 for FastISel
This is exposed by enabling FastIsel on 64bit AIX.
We are generating XSRSP regardless of the arch,
which may be wrong when -mcpu=pwr7.

The fix is to guard the generation in P8 only.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D109365
2021-09-07 15:17:51 +00:00
Peter Smith e63455d5e0 [MC] Use local MCSubtargetInfo in writeNops
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.

On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.

For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.

This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.

I've attempted to take into account the in tree experimental backends.

Differential Revision: https://reviews.llvm.org/D45962
2021-09-07 15:46:19 +01:00
Peter Smith 5e71839f77 [MC] Add MCSubtargetInfo to MCAlignFragment
In preparation for passing the MCSubtargetInfo (STI) through to writeNops
so that it can use the STI in operation at the time, we need to record the
STI in operation when a MCAlignFragment may write nops as padding. The
STI is currently unused, a further patch will pass it through to
writeNops.

There are many places that can create an MCAlignFragment, in most cases
we can find out the STI in operation at the time. In a few places this
isn't possible as we are in initialisation or finalisation, or are
emitting constant pools. When possible I've tried to find the most
appropriate existing fragment to obtain the STI from, when none is
available use the per module STI.

For constant pools we don't actually need to use EmitCodeAlign as the
constant pools are data anyway so falling through into it via an
executable NOP is no better than falling through into data padding.

This is a prerequisite for D45962 which uses the STI to emit the
appropriate NOP for the STI. Which can differ per fragment.

Note that involves an interface change to InitSections. It is now
called initSections and requires a SubtargetInfo as a parameter.

Differential Revision: https://reviews.llvm.org/D45961
2021-09-07 15:46:19 +01:00
Qiu Chaofan d0f9553ef5 [PowerPC] Enable fast-isel on AIX 64 subtarget
This patch basically enables fast-isel for AIX 64-bit subtarget
(previously enabled only for ELF 64). The initial motivation is to
introduce branch folding to AIX generated code for correct debug
behavior. I also saw some compiling time improvement in a few LLVM
test-suite benchmarks. (toast, dbms, cjpeg, burg, etc.)

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D98844
2021-09-03 11:33:45 +08:00
Jinsong Ji 8671191d26 [NFC][PowerPC] Small code refactor in LoopInstrFormPrep
Avoid some duplicate code.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D109083
2021-09-02 03:16:01 +00:00
Chen Zheng 2596120199 [PowerPC] small code format refactor ; NFC
address the code review comments in patch https://reviews.llvm.org/D105872
2021-09-02 01:39:32 +00:00
Alexander Kornienko 893ac53afc Fix -Wunused-variable 2021-09-01 11:29:30 +02:00
Kai Luo 5eaebd5d64 [PowerPC] Implement quadword atomic load/store
Add support to load/store i128 atomically.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D105612
2021-09-01 06:55:40 +00:00
Nick Desaulniers d8b6ae072d [PPCISelLowering] avoid emitting libcalls to __mulodi4()
Similar to D108842, D108844, and D108926.

__has_builtin(builtin_mul_overflow) returns true for 32b PPC targets,
but Clang is deferring to compiler RT when encountering long long types.
This breaks ppc44x_defconfig + CONFIG_BLK_DEV_NBD=y builds of the Linux
kernel that are using builtin_mul_overflow with these types for these
targets.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

This will still need to be worked around in the Linux kernel in order to
continue to support these builds of the Linux kernel for this
target with older releases of clang.

Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D108936
2021-08-31 11:09:58 -07:00
Alexander Pivovarov eb946cc5b6 Fix typo in comments
Reviewed By: MaskRay, jsji

Differential Revision: https://reviews.llvm.org/D108857
2021-08-31 11:55:40 +05:30
Nikita Popov 0529e2e018 [InstrInfo] Use 64-bit immediates for analyzeCompare() (NFCI)
The backend generally uses 64-bit immediates (e.g. what
MachineOperand::getImm() returns), so use that for analyzeCompare()
and optimizeCompareInst() as well. This avoids truncation for
targets that support immediates larger 32-bit. In particular, we
can avoid the bugprone value normalization hack in the AArch64
target.

This is a followup to D108076.

Differential Revision: https://reviews.llvm.org/D108875
2021-08-30 19:46:04 +02:00
Qiu Chaofan 3bdd850d0c [PowerPC] Set branch/call instructions as no hasSideEffects
PowerPC can model these instructions, so we don't need this flag set.

Reviewed By: shchenz, jsji

Differential Revision: https://reviews.llvm.org/D71983
2021-08-30 12:23:35 +08:00
Chen Zheng 324bd467a2 [PowerPC][ELF] make sure local variable space does not overlap with parameter save area
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D105271
2021-08-27 01:58:41 +00:00
Zarko Todorovski b575bbd0c7 [PowerPC][AIX] Set the HasAlloca flag in the AIX Traceback Table only if R31 is used as a frame pointer
After c063946476 usage of R31 doesn't necessarily mean
that alloca is used. The `TracebackTable::IsAllocaUsedMask` flag should be set only
when R31 is used as a frame pointer.

On AIX the `function calls alloca' bit seems to be set whenever R31 is
set up as a frame pointer, even when there is no alloca call.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D108141
2021-08-23 15:20:41 -04:00
Kai Luo 7165e6713f [PowerPC] Use int64_t to represent stack object offset and frame size
This is the first step to enable PPC64 support huge frame size(>2G). Also fix an assertion error for frame size, i.e.,`int x; !isInt<32>(x);` should be always evaluated false, so the guard code for frame size is impossible to hit.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D107435
2021-08-23 02:13:21 +00:00
Qiu Chaofan 5ca250a03d [RegAlloc] Remove addAllocPriorityToGlobalRanges hook
It was introduced in 1a6dc92 and only enabled on PowerPC/AMDGPU. That
should be enabled for all targets.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D108010
2021-08-18 10:21:27 +08:00
Arthur Eubanks 80ea2bb574 [NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes()
This is more consistent with similar methods.
2021-08-13 11:16:52 -07:00
Amy Kwan 581a80304c [PowerPC] Disable CTR Loop generate for fma with the PPC double double type.
It is possible to generate the llvm.fmuladd.ppcf128 intrinsic, and there is no actual
FMA instruction that corresponds to this intrinsic call for ppcf128. Thus, this
intrinsic needs to remain as a call as it cannot be lowered to any instruction, which
also means we need to disable CTR loop generation for fma involving the ppcf128 type.
This patch accomplishes this behaviour.

Differential Revision: https://reviews.llvm.org/D107914
2021-08-13 12:27:24 -05:00
Lei Huang 8930af45c3 [PowerPC] Implement XL compatibility builtin __addex
Add builtin and intrinsic for `__addex`.

This patch is part of a series of patches to provide builtins for
compatibility with the XL compiler.

Reviewed By: stefanp, nemanjai, NeHuang

Differential Revision: https://reviews.llvm.org/D107002
2021-08-12 16:38:21 -05:00
Victor Huang 99e00663d4 [PowerPC] Fix return address computation for "__builtin_return_address"
When depth > 0, callee frame address is used to compute the return address of
callee producing improper return address. This patch adds the fix to use caller
frame address to compute the return address of callee.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D107646
2021-08-12 09:44:49 -05:00
Christopher Di Bella c874dd5362 [llvm][clang][NFC] updates inline licence info
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.

Differential Revision: https://reviews.llvm.org/D107528
2021-08-11 02:48:53 +00:00
Kai Luo 666ee849f0 [PowerPC] Fix shift amount of xxsldwi when performing vector int_to_double
POC
```
// main.c
#include <stdio.h>
#include <altivec.h>
extern vector double foo(vector int s);
int main() {
  vector int s = {0, 1, 0, 4};
  vector double vd;
  vd = foo(s);
  printf("%lf %lf\n", vd[0], vd[1]);
  return 0;
}
// poc.c
vector double foo(vector int s) {
  int x1 = s[1];
  int x3 = s[3];
  double d1 = x1;
  double d3 = x3;
  vector double x = { d1, d3 };
  return x;
}
```
Compiled with `poc.c main.c -mcpu=pwr8 -O3` on BE machine.
Current clang gives
```
4.000000 1.000000
```
while xlc gives
```
1.000000 4.000000
```
Xlc's output should be correct.

Reviewed By: shchenz, #powerpc

Differential Revision: https://reviews.llvm.org/D107428
2021-08-06 06:01:29 +00:00
Jinsong Ji 6f84d94b9c [PowerPC] Fix copy/paste error in scalar_to_vector patterns
https://reviews.llvm.org/D100478 refactoring added a copy/paste error
for v8i16 patterns.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D107609
2021-08-06 02:59:01 +00:00
Roman Lebedev 6f6e9a867f
[BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop
I'm not sure this is the best way to approach this,
but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107271
2021-08-03 00:57:26 +03:00
Stefan Pintilie 754520a2bf [PowerPC] Fix issue where hint was providing the incorrect regsiter class.
Regsier hints when copying to a UACC register do not always produce VSRp
registers. This patch makes sure that we do not produce hints in cases
where the subregsiter of the UACC is not a VSRp.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D107101
2021-07-29 21:10:45 -05:00
Nemanja Ivanovic 778932c673 [PowerPC] Turn deprecated altivec prefetch instrs to nops on AIX
The dst/dstt/dstst/dststt instructions are nop's on all PowerPC
cores that AIX supports. The AIX assembler also does not accept
these mnemonics. Turn them into nop's on AIX (similar to dstall).
2021-07-27 15:50:02 -05:00
Nemanja Ivanovic 9654cfd5bb [PowerPC] Fix materialization of SP float values on Power10
All floating point values in registers are in double precision
representation. In order to materialize the correct single precision
value, we need to convert the APFloat that represents the value
to double precision first.

Reviewed By: amyk, NeHuang

Differential Revision: https://reviews.llvm.org/D106812
2021-07-26 19:43:10 -05:00
Masoud Ataei 45951ad323 [PowerPC] Add pwr7 and pwr10 support to IBM MASSV pass on AIX
Before MASSV only supported P8 and P9 on AIX ans Linux . This patch proposes
MASSV to add support of P7 and P10 only on AIX too.

Differential: https://reviews.llvm.org/D106678
2021-07-26 23:21:38 +00:00
Lei Huang 64a15817a0 [PowerPC]Add addex instruction definition and MC tests
Add td definitions and asm/disasm tests for the addex instruction introduced in
ISA 3.0.

Reviewed By: nemanjai, amyk, NeHuang

Differential Revision: https://reviews.llvm.org/D106666
2021-07-26 14:55:38 -05:00
Lei Huang 2d788959ed [PowerPC] Add implicit-def RM to instructions mtfsb[01]
This is a followup patch for D105930 to add implicit-def of RM for
mtfsb[01] instructions as per review comments.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D106603
2021-07-26 14:07:08 -05:00
Victor Huang 26ea4a4432 [PowerPC] Add PowerPC "__stbcx" builtin and intrinsic for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtin and intrinsic for "__stbcx".

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D106484
2021-07-22 10:48:46 -05:00
Quinn Pham e002d251dd [PowerPC] Floating Point Builtins for XL Compat.
This patch is in a series of patches to provide
builtins for compatibility with the XL compiler.
This patch adds builtins related to floating point
operations

Reviewed By: #powerpc, nemanjai, amyk, NeHuang

Differential Revision: https://reviews.llvm.org/D103986
2021-07-21 08:33:39 -05:00
Albion Fung 2fd1520247 [PowerPC] Implemented mtmsr, mfspr, mtspr Builtins
Implemented builtins for mtmsr, mfspr, mtspr on PowerPC;
the patch is intended for XL Compatibility.

Differential revision: https://reviews.llvm.org/D106130
2021-07-20 17:51:00 -05:00
Albion Fung 3434ac9e39 [PowerPC] Store, load, move from and to registers related builtins
This patch implements store, load, move from and to registers related
builtins, as well as the builtin for stfiw. The patch aims to provide
feature parady with xlC on AIX.

Differential revision: https://reviews.llvm.org/D105946
2021-07-20 15:46:14 -05:00
Victor Huang 1a762f93f8 [PowerPC] Add PowerPC cmpb builtin and emit target indepedent code for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch add the builtin and emit target independent
code for __cmpb.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D105194
2021-07-20 13:06:22 -05:00
Stefan Pintilie 1a6dc92be7 [PowerPC] Inefficient register allocation of ACC registers results in many copies.
ACC registers are a combination of four consecutive vector registers.
If the vector registers are assigned first this often forces a number
of copies to appear just before the ACC register is created. If the ACC
register is assigned first then fewer copies are generated when the vector
registers are assigned.

This patch tries to force the register allocator to assign the ACC registers first
and then the UACC registers and then the vector pair registers. It does this
by changing the priority of the register classes.

This patch also adds hints to help the register allocator assign UACC registers from
known ACC registers and vector pair registers from known UACC registers.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D105854
2021-07-20 10:53:40 -05:00
Kai Luo e2ee27b20b [PowerPC] Fallback to base's implementation of shouldExpandAtomicCmpXchgInIR and shouldExpandAtomicCmpXchgInIR
If we can't decide `shouldExpandAtomicCmpXchgInIR` or `shouldExpandAtomicCmpXchgInIR` in PPC's implementation after https://reviews.llvm.org/rGb9c3941cd61de1e1b9e4f3311ddfa92394475f4b, resort to base's implementation.

This fixes internal build of OpenMP which uses atomic operations on float.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D106234
2021-07-20 06:14:24 +00:00
Nemanja Ivanovic 35a18a981f [PowerPC] Implement intrinsics for mtfsf[i]
This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`).

The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands.

Reviewed By: #powerpc, qiucf

Differential Revision: https://reviews.llvm.org/D105957
2021-07-16 16:26:11 -05:00
Lei Huang c8937b6cb9 [PowerPC] Implement XL compact math builtins
Implement a subset of builtins required for compatiblilty with AIX XL compiler.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D105930
2021-07-16 13:21:13 -05:00
Masoud Ataei ee2068b30e [PowerPC] Updated the error message of MASSV pass to mention vectorization
is needed be enable on P8 and later targets.

 Differential Revision: https://reviews.llvm.org/D106091
2021-07-16 14:45:09 +00:00
Amy Kwan ba627a32e1 [PowerPC] Update Refactored Load/Store Implementation, XForm VSX Patterns, and Tests
This patch includes the following updates to the load/store refactoring effort introduced in D93370:
 - Update various VSX patterns that use to "force" an XForm, to instead just XForm.
   This allows the ability for the patterns to compute the most optimal addressing
   mode (and to produce a DForm instruction when possible)
- Update pattern and test case for the LXVD2X/STXVD2X intrinsics
- Update LIT test cases that use to use the XForm instruction to use the DForm instruction

Differential Revision: https://reviews.llvm.org/D95115
2021-07-16 09:28:48 -05:00
Victor Huang 4eb107ccba [PowerPC] Add PowerPC population count, reversed load and store related builtins and instrinsics for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for population
count, reversed load and store related operations.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D106021
2021-07-15 17:23:56 -05:00
Quinn Pham de3956605a [PowerPC] Fix popcntb XL Compat Builtin for 32bit
This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins.

Reviewed By: #powerpc, nemanjai, amyk

Differential Revision: https://reviews.llvm.org/D105360
2021-07-15 13:19:47 -05:00
Bogdan Graur 442123cada Fixes memory sanitizer 'use-of-uninitialized-value' diagnostic.
Differential Revision: https://reviews.llvm.org/D106047
2021-07-15 11:17:04 +02:00
Kai Luo b9c3941cd6 [PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand
This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D103614
2021-07-15 01:12:09 +00:00
Jinsong Ji fe52296a34 [AIX] Enable dollar sign as PC in inlineasm
$ is used as PC for PowerPC inlineasm, ELF use it,
enable it for AIX XCOFF as well.

Reviewed By: #powerpc, amyk, nemanjai

Differential Revision: https://reviews.llvm.org/D105956
2021-07-14 13:37:52 +00:00
Victor Huang 18c19414eb [PowerPC] Add PowerPC compare and multiply related builtins and instrinsics for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for compare
and multiply related operations.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D102875
2021-07-13 16:55:09 -05:00
Victor Huang 781929b423 [PowerPC][NFC] Power ISA features for Semachecking
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.

Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>

Differential revision: https://reviews.llvm.org/D105501
2021-07-13 13:13:34 -05:00
Victor Huang e4585d3f4e Revert "[PowerPC][NFC] Power ISA features for Semachecking"
This reverts commit 10e0cdfc65.
2021-07-13 13:13:34 -05:00
Amy Kwan b5f4ac4c11 [PowerPC] Add FI alignment check if the addressing mode is DS/DQ-Form, emit X-Form if necessary.
This patch adds a function that checks whether or not the frame index
is aligned when the computed addressing mode is an aligned D-Form (DS, or DQ-Form).
If the frame index appears to be unaligned, within these two modes, reset
the mode to X-Form in order to fall back to selection X-Form loads.

A test case is added to ensure that the test emits X-Form loads and not DQ-Form
loads since the frame index is not aligned within the test case.

Differential Revision: https://reviews.llvm.org/D105661
2021-07-13 12:31:52 -05:00
Arthur Eubanks 693bc04bf6 [OpaquePtr] Use GlobalValue::getValueType() more 2021-07-13 09:34:34 -07:00
Albion Fung f1aca5ac96 [PowerPC] Fix L[D|W]ARX Implementation
LDARX and LWARX sometimes gets optimized out by the compiler
when it is critical to the correctness of the code. This inline asm generation
ensures that it preserved.

Differential Revision: https://reviews.llvm.org/D105754
2021-07-13 11:02:07 -05:00
Victor Huang 10e0cdfc65 [PowerPC][NFC] Power ISA features for Semachecking
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.

Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>

Differential revision: https://reviews.llvm.org/D105501
2021-07-13 10:51:25 -05:00
Qiu Chaofan 6fd9c1901f [PowerPC] Fix typo in vector shuffle combining
a22ecb4 fixed a crash on big endian subtargets. This commit fixes a typo
in that commit which may cause miscompile.
2021-07-13 14:35:47 +08:00
Amy Kwan 35909ff6cf [PowerPC] Fix the splat immediate in PPCMIPeephole depending on if we have an Altivec and VSX splat instruction.
An assertion of the following can occur because Altivec and VSX splats use a different operand number for the immediate:
```
int64_t llvm::MachineOperand::getImm() const: Assertion `isImm() && "Wrong MachineOperand accessor"' failed.
```
This patch updates PPCMIPeephole.cpp assign the correct splat immediate.

Differential Revision: https://reviews.llvm.org/D105790
2021-07-12 16:20:11 -05:00
Jinsong Ji 2377eca93c [PowerPC] Custom Lowering BUILD_VECTOR for v2i64 for P7 as well
The lowering for v2i64 is now guarded with hasDirectMove,
however, the current lowering can handle the pattern correctly,
only lowering it when there is efficient patterns and corresponding
instructions.

The original guard was added in D21135, and was for Legal action.
The code has evloved now, this guard is not necessary anymore.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D105596
2021-07-12 17:56:10 +00:00
Albion Fung ef49d925e2 [PowerPC] Implement trap and conversion builtins for XL compatibility
This patch implements trap and FP to and from double conversions. The builtins
generate code that mirror what is generated from the XL compiler. Intrinsics
are named conventionally with builtin_ppc, but are aliased to provide the same
builtin names as the XL compiler.

Differential Revision: https://reviews.llvm.org/D103668
2021-07-12 11:04:17 -05:00
zhijian 841077a7e9 [AIX][XCOFF] Use bit order of has_vec and longtbtable bits as defined in AIX header debug.h
Summary:

  The bit order of the has_vec and longtbtable bits in the traceback table generated by the XL compiler flipped at some point after v12.1. This is different from the definition is the AIX header debug.h. The change in the XL compiler that caused the deviation from the OS header definition was unintentional. Since both orderings are extant and the XL compiler runtime also expects the ordering defined by the OS, we will correct the output from LLVM to match the defined ordering given by the OS (which is also consistent with the Assembler Language Reference). Mitigation for traceback tables encoded with the wrong ordering is required for either ordering.

Reviewers: XingXue, HubertTong
Differential Revision: https://reviews.llvm.org/D105487
2021-07-09 11:06:46 -04:00
Kai Luo 55bd12d4b7 [PowerPC] Remove implicit use register after transformToImmFormFedByLI()
When the instruction has imm form and fed by LI, we can remove the redundat LI instruction.
Below is an example:
```
    renamable $x5 = LI8 2
    renamable $x4 = exact SRD killed renamable $x4, killed renamable $r5, implicit $x5
```

will be converted to:
```
   renamable $x5 = LI8 2
   renamable $x4 = exact RLDICL killed renamable $x4, 62, 2,  implicit killed $x5
```

But when we do this optimization, we forget to remove implicit killed $x5
This bug has caused a lnt case error. This patch is to fix above bug.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D85288
2021-07-09 04:42:54 +00:00
Matt Arsenault 9b057f647d GlobalISel: Track original argument index in ArgInfo
SelectionDAG's equivalents in ISD::InputArg/OutputArg track the
original argument index. Mips relies on this, and its currently
reinventing its own parallel CallLowering infrastructure which tracks
these indexes on the side. Add this to help move towards deleting the
custom mips handling.
2021-07-08 13:39:02 -04:00
Qiu Chaofan a22ecb4508 [PowerPC] Fix i64 to vector lowering on big endian
Lowering for scalar to vector would skip if current subtarget is big
endian and the scalar is larger or equal than 64 bits. However there's
some issue in implementation that SToVRHS may refer to SToVLHS's scalar
size if SToVLHS is present, which leads to some crash.o

Reviewed By: nemanjai, shchenz

Differential Revision: https://reviews.llvm.org/D105094
2021-07-08 11:05:09 +08:00
Nemanja Ivanovic 6a06dbafa1 [PowerPC] Disable permuted SCALAR_TO_VECTOR on LE without direct moves
There are some patterns involving the permuted scalar to vector node
for which we don't have patterns without direct moves on little endian
subtargets. This causes selection errors. While we can of course add
the missing patterns, any additional effort to make this work is not
useful since there is no support for any CPU that can run in
little endian mode and does not support direct moves.
2021-07-07 13:50:49 -05:00
Zarko Todorovski ee6ca9c7df [AIX] Use VSSRC/VSFRC Register classes for f32/f64 callee arguments on P8 and above
Adding usage of VSSRC and VSFRC when adding the live in registers on AIX.
This matches the behaviour of the rest of PPC Subtargets.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D104396
2021-07-07 09:18:20 -04:00
Nemanja Ivanovic 3553698de7 [PowerPC] Re-enable combine for i64 BSWAP on targets without LDBRX
The combine was disabled in 4e22c7265d as it caused failures in
the ppc64be-multistage (bootstrap) bot.
It turns out that the combine did not correctly update the MMO for
the high load which caused aliased stores to be reported as unaliased.
This patch fixes that problem and re-enables the combine.
2021-07-06 20:42:01 -05:00
Albion Fung 7d10dd60ce [PowerPC] Implament Load and Reserve and Store Conditional Builtins
This patch implaments the load and reserve and store conditional
builtins for the PowerPC target, in order to have feature parody with
xlC on AIX.

Differential revision: https://reviews.llvm.org/D105236
2021-07-05 21:35:41 -05:00
Kai Luo c063946476 [AIX] Adjust CSR order to avoid breaking ABI regarding traceback
Allocate non-volatile registers in order to be compatible with ABI, regarding gpr_save.

Quoted from https://www.ibm.com/docs/en/ssw_aix_72/assembler/assembler_pdf.pdf page55,
> The preferred method of using GPRs is to use the volatile registers first. Next, use the nonvolatile registers
> in descending order, starting with GPR31.

This patch is based on @jsji 's initial draft.

Tested on test-suite and SPEC, found no degradation.

Reviewed By: jsji, ZarkoCA, xingxue

Differential Revision: https://reviews.llvm.org/D100167
2021-07-03 04:45:26 +00:00
Matt Arsenault 99c7e918b5 GlobalISel: Use LLT in call lowering callbacks
This preserves the memory type so the lowerings can rely on them.
2021-07-01 12:15:54 -04:00
Qiu Chaofan 07f0faed11 [NFC][Scheduler] Refactor tryCandidate to return boolean
This patch changes return type of tryCandidate from void to bool:

1. Methods in some targets already follow this convention.
2. This would help if some target wants to re-use generic code.
3. It looks more intuitive if these try-method returns the same type.

We may need to change return type of them from bool to some enum
further, to make it less confusing.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D103951
2021-07-01 14:31:47 +08:00
zhijian 9a9e6189d7 [AIX][XCOFF][BUG-Fixed] need to switch back to text section after emit a dumy eh structure
Summary:

in the patch https://reviews.llvm.org/D103651 [AIX][XCOFF] generate eh_info when vector registers are saved according to the traceback table.

when generate eh_info, it switch to other section, when it done, it need to switch back to text section again.

Reviewers: Jason Liu
Differential Revision: https://reviews.llvm.org/105195
2021-06-30 13:56:37 -04:00
Nemanja Ivanovic 4e22c7265d [PowerPC] Disable combine 64-bit bswap(load) without LDBRX
This causes failures on the big endian bootstrap bot.
Disabling this combine temporarily until I can get a proper fix.
2021-06-25 15:11:22 -05:00
Qiu Chaofan a08fc1361a [PowerPC] Change VSRpRC allocation order
On PowerPC, VSRpRC represents the pairs of even and odd VSX register,
and VRRC corresponds to higher 32 VSX registers. In some cases, extra
copies are produced when handling incoming VRRC arguments with VSRpRC.

This patch changes allocation order of VSRpRC to eliminate this kind of
copy.

Stack frame sizes may increase if allocating non-volatile registers, and
some other vector copies happen. They need fix in future changes.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D104855
2021-06-25 16:04:41 +08:00
Kai Luo b904574b3d [PowerPC] Move PPCBranchSelector as close to asm printer as possible
Currently, PPCBranchSelector is not immediately preceding asm printer pass. `-debug-pass=Structure` gives
```
      PowerPC Branch Selector
      Contiguously Lay Out Funclets
      StackMap Liveness Analysis
      Live DEBUG_VALUE analysis
      Lazy Machine Block Frequency Analysis
      Machine Optimization Remark Emitter
      Linux PPC Assembly Printer
```
After the patch
```
      Contiguously Lay Out Funclets
      StackMap Liveness Analysis
      Live DEBUG_VALUE analysis
      PowerPC Branch Selector
      Lazy Machine Block Frequency Analysis
      Machine Optimization Remark Emitter
      Linux PPC Assembly Printer
```

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D104762
2021-06-25 02:05:19 +00:00
Nemanja Ivanovic dcccb2f594 [PowerPC] Fix bswap combine for big endian systems
Commit 0464586ac5 added a combine
for a 64-bit load feeding a bswap but the implementation is only
correct for little endian systems.
This fixes it for big endian systems.
2021-06-24 18:04:50 -05:00
Martin Storsjö 42f74e8249 [llvm] Rename StringRef _lower() method calls to _insensitive()
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
2021-06-25 00:22:01 +03:00
Nemanja Ivanovic 0464586ac5 [PowerPC] Combine 64-bit bswap(load) without LDBRX
When targeting CPUs that don't have LDBRX, we end up producing code that is
very inefficient and large for this common idiom. This patch just
optimizes it two 32-bit LWBRX instructions along with a merge.

This fixes https://bugs.llvm.org/show_bug.cgi?id=49610

Differential revision: https://reviews.llvm.org/D104836
2021-06-24 15:11:47 -05:00
zhijian bd240b3d77 [AIX][XCOFF] generate eh_info when vector registers are saved according to the traceback table.
Summary:

generate eh_info when vector registers are saved according to the traceback table.

struct eh_info_t {

unsigned version;       /* EH info version 0 */
#if defined(64BIT)

char _pad[4];           /* padding */
#endif

unsigned long lsda;     /* Pointer to Language Specific Data Area */
unsigned long personality; /* Pointer to the personality routine */
};

the value of lsda and personality is zero when the number of vector registers saved is large zero and there is not personality of the function

Reviewers: Jason Liu
Differential Revision: https://reviews.llvm.org/D103651
2021-06-22 13:01:31 -04:00
Fangrui Song 59d90fe817 Simplify some typedef struct 2021-06-19 11:36:44 -07:00
David Spickett e4ecd83fe9 [llvm][AArch64] Handle arrays of struct properly (from IR)
This only applies to FastIsel. GlobalIsel seems to sidestep
the issue.

This fixes https://bugs.llvm.org/show_bug.cgi?id=46996

One of the things we do in llvm is decide if a type needs
consecutive registers. Previously, we just checked if it
was an array or not.
(plus an SVE specific check that is not changing here)

This causes some confusion when you arbitrary IR like:
```
%T1 = type { double, i1 };
define [ 1 x %T1 ] @foo() {
entry:
  ret [ 1 x %T1 ] zeroinitializer
}
```

We see it is an array so we call CC_AArch64_Custom_Block
which bails out when it sees the i1, a type we don't want
to put into a block.

This leaves the location of the double in some kind of
intermediate state and leads to odd codegen. Which then crashes
the backend because it doesn't know how to implement
what it's been asked for.

You get this:
```
  renamable $d0 = FMOVD0
  $w0 = COPY killed renamable $d0
```

Rather than this:
```
  $d0 = FMOVD0
  $w0 = COPY $wzr
```

The backend knows how to copy 64 bit to 64 bit registers,
but not 64 to 32. It can certainly be taught how but the real
issue seems to be us even trying to assign a register block
in the first place.

This change makes the logic of
AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters
a bit more in depth. If we find an array, also check that all the
nested aggregates in that array have a single member type.

Then CC_AArch64_Custom_Block's assumption of a type that looks
like [ N x type ] will be valid and we get the expected codegen.

New tests have been added to exercise these situations. Note that
some of the output is not ABI compliant. The aim of this change is
to simply handle these situations and not to make our processing
of arbitrary IR ABI compliant.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D104123
2021-06-16 13:56:01 +00:00
Nemanja Ivanovic 821a8f680e [PowerPC] Fix spilling of paired VSX registers
We have added STXVP/LXVP for spilling and restoring the registers
but we neglected to add FI elimination code for these. The result
is that we end up producing impossible MachineInstr's that have
register operands in place of immediates.
2021-06-15 14:13:17 -05:00
Arthur Eubanks be5d454f3f [NFC][OpaquePtr] Avoid calling getPointerElementType()
Pointee types are going away soon.

For this, we mostly just care about store/load types, which are already
available without the pointee types. The other intrinsics always use
i8*.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103719
2021-06-15 09:53:12 -07:00
Arthur Eubanks 25b2126b9e [NFC] Remove redundant variable
Differential Revision: https://reviews.llvm.org/D103706
2021-06-15 09:53:11 -07:00
Kai Luo 1c450c3d7e [PowerPC] Export 16 byte load-store instructions
Export `lq`, `stq`, `lqarx` and `stqcx.` in preparation for implementing 16-byte lock free atomic operations on AIX.
Add a new register class `g8prc` for these instructions, since these instructions require even-odd register pair.

Reviewed By: nemanjai, jsji, #powerpc

Differential Revision: https://reviews.llvm.org/D103010
2021-06-15 01:56:10 +00:00
zhijian 7ed515d168 [AIX][XCOFF] emit vector info of traceback table.
Summary:

emit vector info of traceback table.

Reviewers: Jason Liu,Hubert Tong
Differential Revision: https://reviews.llvm.org/D93659
2021-06-14 11:15:22 -04:00
Zarko Todorovski c1bb75febe [PowerPC] Allow wa inline asm to also accept floating point arguments
GCC documentation for the `wa` constraint states that:
```
wa

    A VSX register (VSR), vs0…vs63. This is either an FPR (vs0…vs31 are f0…f31)
    or a VR (vs32…vs63 are v0…v31).
```
This technically means that we could accept floating point parameters. In fact,
gcc itself does. The following testcase compiles and runs on all PPC platforms with GCC,
whereas clang/llc will assert:
```
#include <stdio.h>
double foo ( vector double a ) {
  double b, c;
  asm("xvabsdp  %x0, %x2        \n"
             "xxsldwi  %x1, %x0, %x0, 2 \n"
      :  "+wa"    (b),
         "=wa"    (c)
      :  "wa"    (a)
      );
  return b+c;
}
int main(void) {
  vector double a = {-3., -4.};
  double t = foo( a );
  printf("%g\n", t);
}
```
This patch allows clang/llc to build and run this testcase.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D103409
2021-06-11 07:19:10 -04:00
Qiu Chaofan bc104fdcec [PowerPC] Relax register superclasses for paired memops
Relaxing superclass constraint for VSX register classes helps reducing
32-byte spills and copies when register pressure is high.

In test case affected, some of them introduces more copies due to new
allocation order. However, this patch should not be the root cause, and
we may be able to fix it in other places of register allocation.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D104006
2021-06-11 14:54:03 +08:00
Timm Bäder a9e4f91adf [llvm][PPC] Add missing case for 'I' asm memory operands
From https://llvm.org/docs/LangRef.html#asm-template-argument-modifiers:

I: Print the letter ‘i’ if the operand is an integer constant,
otherwise nothing. Used to print ‘addi’ vs ‘add’ instructions.

Differential Revision: https://reviews.llvm.org/D103968
2021-06-10 12:52:50 +02:00
Jinsong Ji 4a89ed373c [AIX] Add traceback ssp canary bit support
We will need to set the ssp canary bit in traceback table to communicate
with unwinder about the canary.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D103202
2021-06-10 02:40:02 +00:00
Kai Luo bf58600bad [PowerPC] Make sure the first probe is full size or is the last probe when stack is realigned
When `-fstack-clash-protection` is enabled and stack has to be realigned, some parts of redzone is written prior the probe, so probe might overwrite content already written in redzone. To avoid it, we have to make sure the first probe is at full probe size or is the last probe so that we can skip redzone.

It also fixes violation of ABI under PPC where `r1` isn't updated atomically.

This fixes https://bugs.llvm.org/show_bug.cgi?id=49903.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D100290
2021-06-09 06:35:35 +00:00
Kai Luo c87c294397 [PowerPC][Dwarf] Assign MMA register's dwarf register number to negative value
According to ELF V2 ABI, `0` should be the dwarf number of `r0`. Currently MMA's register also uses `0` as its dwarf number, this confuses `RegisterInfoEmitter` and generates wrong dwarf -> llvm mapping.
```
extern const MCRegisterInfo::DwarfLLVMRegPair PPCDwarfFlavour1Dwarf2L[] = {
  { 0U, PPC::VSRp31 },
```
This leads to wrong cfi output in https://reviews.llvm.org/D100290.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D103761
2021-06-09 02:24:01 +00:00
Simon Pilgrim 01b77159e3 PPCISelLowering.cpp - don't dereference a dyn_cast<>.
dyn_cast<> can return nullptr which we would then dereference - use cast<> which will assert that the type is correct.
2021-06-08 17:59:05 +01:00
Nikita Popov 1ffa6499ea [TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC)
Don't require a specific kind of IRBuilder for TargetLowering hooks.
This allows us to drop the IRBuilder.h include from TargetLowering.h.

Differential Revision: https://reviews.llvm.org/D103759
2021-06-06 16:29:50 +02:00
Anshil Gandhi 1c5ff0b03f [PowerPC] [GlobalISel] Implementation of formal arguments lowering in the IRTranslator for the PPC backend
Differential Revision: https://reviews.llvm.org/D99812
2021-06-02 16:46:39 -06:00
Anshil Gandhi 3e5ddb83e3 Revert "Differential Revision: https://reviews.llvm.org/D99812"
This reverts commit c729f2a48a.
2021-06-02 16:36:00 -06:00
Anshil Gandhi c729f2a48a Differential Revision: https://reviews.llvm.org/D99812 2021-06-02 14:09:52 -06:00
Michael Benfield 00d19c6704 [various] Remove or use variables which are unused but set.
This is in preparation for the -Wunused-but-set-variable warning.

Differential Revision: https://reviews.llvm.org/D102942
2021-06-01 15:38:48 -07:00
Daniel Sanders aaac268285 [globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one
It's still in use in a few places so we can't delete it yet but there's not
many at this point.

Differential Revision: https://reviews.llvm.org/D103352
2021-06-01 13:23:48 -07:00
Albion Fung db26cd30b6 [PowerPC] Improve f32 to i32 bitcast code gen
The code gen for f32 to i32 bitcast is not currently the most efficient;
this patch removes some unneccessary instructions gerneated.

Differential revision: https://reviews.llvm.org/D100782
2021-05-31 16:00:58 -05:00