6876 Commits

Author SHA1 Message Date
Amy Kwan af430944b3 [PowerPC][AIX] Allow VSX patterns to be 32-bit and 64-bit safe on P8+.
This patch updates two patterns involving `scalar_to_vector` and
`SCALAR_TO_VECTOR_PERMUTED` nodes to be safe for both 64-bit and 32-bit by
pulling the patterns out of the 64-bit specific guard. These patterns are
matched on POWER8 and above.

Differential Revision: https://reviews.llvm.org/D125389
2022-05-27 10:34:17 -05:00
Zongwei Lan ad73ce318e [Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
2022-05-26 11:22:41 -07:00
Stefan Pintilie 610eb39c68 [PowerPC][Future] Add an ISA Future to go with mcpu=future.
On Power PC we have ISA3.0 for Power 9, ISA3.1 for Power 10.
This patch adds an ISA for mcpu=future. The idea is to have a placeholder ISA
for work that is experimental and may not be supported by existing ISAs.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D126075
2022-05-26 09:19:58 -05:00
Amy Kwan 0bf3c38b0b Fix build failure revealed by c35ca3a1c7
This commit resolves a Linux kernel build failure that was revealed by
c35ca3a1c7. The patch introduces two new
intrinsics, which ultimately changes the intrinsic numbering of other PPC
intrinsics. This causes an issue introduced by
ff40fb07ad, as the patch checks for intrinsics
with particular values, but the addition of the fnabs/fnabss intrinsics updates
the original sqrt/sdiv intrinsic values.
2022-05-24 16:32:04 -05:00
Amy Kwan c35ca3a1c7 [PowerPC] Implement XL compat __fnabs and __fnabss builtins.
This patch implements the following floating point negative absolute value
builtins that are required for compatibility with the XL compiler:
```
double __fnabs(double);
float __fnabss(float);
```

These builtins will emit:
- fnabs on PWR6 and below, or if VSX is disabled.
- xsnabsdp on PWR7 and above, if VSX is enabled.
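
As a rough model of what these builtins compute (not of how the backend selects instructions), a negative absolute value simply forces the IEEE sign bit to one. The sketch below mimics the double-precision case in Python; the function name `fnabs` is reused only for illustration and is not the builtin itself.

```python
import struct

def fnabs(x):
    """Model of a negative absolute value: force the sign bit of a double to 1."""
    # Reinterpret the double as a 64-bit unsigned integer.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    # Set the sign bit (the most significant bit of the 64-bit pattern).
    bits |= 1 << 63
    # Reinterpret the modified bit pattern as a double again.
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

Under this model `fnabs(3.5)` and `fnabs(-3.5)` both give `-3.5`, and a positive zero becomes a negative zero.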

Differential Revision: https://reviews.llvm.org/D125506
2022-05-19 11:28:40 -05:00
Qiu Chaofan d9d15af787 [PowerPC] Treat llvm.fmuladd intrinsic as using CTR
This fixes bug 55463, similar to D78668. This is a temporary fix since
we will switch to post-isel CTR loop determination in the future.

Reviewed By: dim, shchenz

Differential Revision: https://reviews.llvm.org/D125746
2022-05-18 15:57:55 +08:00
esmeyi 8d6e2c3e3d [XCOFF] support writing sections, relocations and symbols for XCOFF64.
This is the second patch to enable the XCOFF64 object writer.

Reviewed By: jhenderson, shchenz

Differential Revision: https://reviews.llvm.org/D122287
2022-05-17 04:27:47 -04:00
Sheng c644488a8b Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h`
The name `MCFixedLenDisassembler.h` is out of date after D120958.

Rename it as `MCDecoderOps.h` to reflect the change.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D124987
2022-05-15 08:44:58 +08:00
Ting Wang 289236d597 [PowerPC] Fix PPCISD::STBRX selection issue on A2
Enable FeatureISA2_06 on Power A2 target

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D125203
2022-05-10 20:47:51 -04:00
David Green 115c188807 [DAG][PowerPC] Combine shuffle(bitcast(X), Mask) to bitcast(shuffle(X, Mask'))
If the mask is made up of elements that form a mask in the higher type
we can convert shuffle(bitcast into the bitcast type, simplifying the
instruction sequence. A v4i32 2,3,0,1 for example can be treated as a
1,0 v2i64 shuffle. This helps clean up some of the AArch64 concat load
combines, along with helping simplify a number of other tests.
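
The mask condition can be illustrated with a small sketch. This is not the DAG combine itself, only a Python model of the check it describes: a mask can be widened by a factor `scale` when every group of `scale` indices is consecutive and starts on a multiple of `scale`. The helper name `widen_shuffle_mask` is hypothetical, and undef mask elements are not handled.

```python
def widen_shuffle_mask(mask, scale):
    """Try to express `mask` over elements `scale` times wider.

    Returns the widened mask, or None if the narrow mask does not
    correspond to a shuffle of the wider elements.
    """
    if len(mask) % scale != 0:
        return None
    widened = []
    for i in range(0, len(mask), scale):
        group = mask[i:i + scale]
        first = group[0]
        # The group must start at the beginning of a wide element.
        if first % scale != 0:
            return None
        # The remaining indices must be consecutive within that element.
        if any(group[j] != first + j for j in range(scale)):
            return None
        widened.append(first // scale)
    return widened
```

With this model, the v4i32 mask `[2, 3, 0, 1]` widens to the v2i64 mask `[1, 0]`, while `[1, 2, 3, 0]` cannot be widened.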

The PowerPC combine for v16i8 splat vector loads needed some fixes to
keep it working for v16i8 vectors. This improves the handling of v2i64
shuffles to match too, hopefully improving them in general.

Differential Revision: https://reviews.llvm.org/D123801
2022-05-06 10:50:31 +01:00
Amy Kwan 2534dc120a [PowerPC] Enable CR bits support for Power8 and above.
This patch turns on support for CR bit accesses for Power8 and above. CR bits
are turned on by default for Power8 and above because later architectures make
use of builtins and instructions that require CR bit accesses (such as the use
of setbc in the vector string isolate predicate and bcd builtins on Power10).

This patch also adds the clang portion to allow turning on CR bits in the
front end if the user so desires.

Differential Revision: https://reviews.llvm.org/D124060
2022-05-02 12:06:15 -05:00
Stefan Pintilie f685bce808 [PowerPC][NFC] Add a function to determine if a call needs to be NOTOC.
Add the isNoTOCCallInstr function to PPCInstrInfo to determine if a call opcode
does not need a TOC restore after the call. All call opcodes should be listed in
this function. A default unreachable in this function should force future call
opcodes to also be added.

This is a follow up patch to D122012

Reviewed By: jsji, shchenz

Differential Revision: https://reviews.llvm.org/D124415
2022-04-29 08:36:07 -05:00
David Tenty 8042699a30 [LLVM] Add exported visibility style for XCOFF
For the AIX linker, under default options, global or weak symbols whose
visibility bits are set to zero (i.e. no visibility, similar to ELF
default) are only exported if specified on an export list provided to
the linker. So AIX has an additional visibility style called
"exported" which indicates to the linker that the symbol should
be explicitly globally exported.

This change maps "dllexport" in the LLVM IR to correspond to XCOFF
exported as we feel this best models the intended semantic (discussion
on the discourse RFC thread: https://discourse.llvm.org/t/rfc-adding-exported-visibility-style-to-the-ir-to-model-xcoff-exported-visibility/61853)
and allows us to enable writing this visibility for the AIX target
in the assembly path.

Reviewed By: DiggerLin

Differential Revision: https://reviews.llvm.org/D123951
2022-04-28 14:56:00 -04:00
Vasileios Porpodas fa8a9fea47 Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 6a9bbd9f20.

Code review: https://reviews.llvm.org/D124202
2022-04-26 14:02:40 -07:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Fangrui Song fb193db2c7 [PowerPC] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-04-19 22:35:05 -07:00
Stefan Pintilie ef34442232 [NFC][PowerPC] Move the Register Operands for PowerPC into PPCRegisterInfo.td
Currently the register operand definitions are found in three separate files.
This patch moves all of the definitions into PPCRegisterInfo.td.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D123543
2022-04-18 14:50:24 -05:00
Stefan Pintilie bc9916fff2 [NFC][PowerPC] Style and ordering changes for PPCInstrP10.td
Renamed the two classes 8LS_DForm_R_SI34_RTA5 and 8LS_DForm_R_SI34_XT6_RA5 to
8LS_DForm_R_SI34_RTA5_MEM and 8LS_DForm_R_SI34_XT6_RA5_MEM because the
instructions that use the classes use memory reads/writes.

Moved the instruction defs up closer to the classes.
Removed unnecessary whitespace.
2022-04-18 13:28:48 -05:00
Qiu Chaofan 1e23175df6 [PowerPC] Mark side effects of Power9 darn instruction
This fixes CVE-2019-15847, preventing random number generation from
being merged.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D122783
2022-04-18 13:21:40 +08:00
Kai Luo 18679ac0d7 [PowerPC] Adjust `MaxAtomicSizeInBitsSupported` on PPC64
AtomicExpandPass uses this variable to decide whether to emit libcalls. The default value is 1024, and if we don't specify it explicitly for PPC64, AtomicExpandPass won't emit `__atomic_*` libcalls for targets that are unable to inline atomic ops, so the backend ends up emitting `__sync_*` libcalls. Thanks @efriedma for pointing it out.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122868
2022-04-09 00:03:09 +00:00
Kai Luo 549e118e93 [PowerPC] Support 16-byte lock free atomics on pwr8 and up
Align 16-byte atomic types to 16 bytes on PPC64, consistent with GCC. Also enable inlining of 16-byte atomics on non-AIX PPC64 targets.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D122377
2022-04-08 23:25:56 +00:00
Ting Wang b389354b28 [Clang][PowerPC] Add max/min intrinsics to Clang and PPC backend
Add support for builtin_[max|min], which have the following prototype:
A builtin_max (A1, A2, A3, ...)
All arguments must have the same type; they must all be float, double, or long double.
Internally, SelectCC is used to get the result.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D122478
2022-04-05 22:43:48 -04:00
Stefan Pintilie 585c85abe5 [PowerPC] Fix lowering of byval parameters for sizes greater than 8 bytes.
To store a byval parameter the existing code would store as many 8 byte elements
as were required to cover the full size of the byval parameter.
For example, a parameter of size 16 would store two elements of 8 bytes.
A parameter of size 12 would also store two elements of 8 bytes.
This would sometimes store too many bytes as the size of the parameter is not
always a multiple of 8.

This patch fixes that issue and now byval parameters are stored with the correct
number of bytes.
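
The difference can be sketched as follows. This is only an illustration of the arithmetic, not the lowering code from the patch: the old scheme rounds the size up to whole 8-byte stores, while one plausible exact scheme (an assumption here) covers the size with progressively narrower stores. Both helper names are hypothetical.

```python
def old_bytes_stored(size):
    """Bytes written when every store is a full 8-byte element."""
    return -(-size // 8) * 8  # ceiling division, then scale back up

def exact_store_sizes(size):
    """Cover `size` bytes exactly using 8-, 4-, 2- and 1-byte stores."""
    sizes = []
    remaining = size
    for width in (8, 4, 2, 1):
        while remaining >= width:
            sizes.append(width)
            remaining -= width
    return sizes
```

For a 12-byte parameter, `old_bytes_stored(12)` is 16 (four bytes too many), whereas `exact_store_sizes(12)` yields one 8-byte and one 4-byte store.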

Reviewed By: nemanjai, #powerpc, quinnp, amyk

Differential Revision: https://reviews.llvm.org/D121430
2022-03-31 15:12:46 -05:00
Stefan Pintilie 2e55bc9f3c [PowerPC] Set the special DSCR with a compiler option.
Add a compiler option and the instructions required to set the
special Data Stream Control Register (DSCR). The special register will
not be set by default.

Original patch by: Muhammad Usman

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D117013
2022-03-31 14:06:30 -05:00
Shao-Ce SUN 662b9fa02c [NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122557
2022-03-29 09:53:24 +08:00
Kazu Hirata 6212871968 [Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC) 2022-03-27 22:22:37 -07:00
Maksim Panchenko 4ae9745af1 [Disassembler][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D122245
2022-03-25 18:53:59 -07:00
Vasileios Porpodas 39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b77.
2022-03-23 18:32:17 -07:00
Stefan Pintilie 2c25c65cdc [PowerPC] The BL8_NOTOC_RM instruction needs to produce a notoc relocation.
The BL8_NOTOC_RM instruction was incorrectly producing a relocation that required
a TOC restore after the call. This patch fixes that issue and the notoc
relocation is now used.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D122012
2022-03-23 19:01:05 -05:00
Arthur Eubanks e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f9492.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
Vasileios Porpodas 27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d.
2022-03-22 16:41:55 -07:00
Arthur Eubanks f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d3.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Vasileios Porpodas 79613185d3 Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354

The original commit 9136145eb0 broke the build on several targets.

Differential Revision: https://reviews.llvm.org/D121973
2022-03-21 15:57:32 -07:00
Chen Zheng 9ada761be3 [PowerPC][NFC] rename file for PPCCTRLoopsVerify pass.
Rename file for PPCCTRLoopsVerify pass from PPCCTRLoops.cpp
to PPCCTRLoopsVerify.cpp.

There will be a new file PPCCTRLoops.cpp for PPC CTR loops
generation later.
2022-03-21 03:42:14 -04:00
Aaron Puchert c1a31ee65b [PPCISelLowering] Avoid emitting calls to __multi3, __muloti4
After D108936, @llvm.smul.with.overflow.i64 was lowered to __multi3
instead of __mulodi4, which also doesn't exist on PowerPC 32-bit, not
even with compiler-rt. Block it as well so that we get inline code.

Because libgcc doesn't have __muloti4, we block that as well.

Fixes #54460.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122090
2022-03-20 20:59:30 +01:00
Shengchen Kan 37b378386e [NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments 2022-03-16 20:25:42 +08:00
serge-sans-paille 989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Stefan Pintilie 78406ac898 [PowerPC][P10] Add Vector pair calling convention
Add the calling convention for the vector pair registers.
These registers overlap with the vector registers.

Part of an original patch by: Lei Huang

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D117225
2022-03-15 14:08:42 -05:00
Qiu Chaofan 300e1293de [PowerPC] Disable perfect shuffle by default
We are going to remove the old 'perfect shuffle' optimization since it
brings a performance penalty in hot loops around vectors. For example, in
the following loop sharing the same mask:

  %v.1 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
  %v.2 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>

The generated instructions will be `vmrglw-vmrghw-vmrglw-vmrghw` instead
of `vperm-vperm`. In some large loop cases, this causes a 20%+ performance
penalty.

The original attempt to resolve this was to pre-record the masks of every
shufflevector operation in the DAG, but that is somewhat complex and adds
unnecessary computation (scanning all nodes) during optimization. Here we
disable it by default. Some cases do become worse after this; they will be
fixed more carefully in future patches.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D121082
2022-03-15 15:52:24 +08:00
Nemanja Ivanovic 766ca2c59e [PowerPC] Add missed VSX shuffles instead of Altivec ones
VSX introduced some permute instructions that are direct
replacements for Altivec ones except they can target all
the VSX registers. We have added code generation for most
of these but somehow missed the low/hi word merges (XXMRG[LH]W).
This caused some additional spills on some large
computationally intensive code.

This patch simply adds the missed patterns.
2022-03-14 10:11:54 -05:00
serge-sans-paille ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Masoud Ataei 30f30e1c12 [PowerPC] Fix the non-tail call in scalar MASS conversion
This patch proposes a fix for patch https://reviews.llvm.org/D101759
for the conversion of non-tail-call math functions to MASS calls.

Differential: https://reviews.llvm.org/D121016

reviewer: @nemanjai
2022-03-08 08:59:17 -08:00
Qiu Chaofan b2497e5435 [PowerPC] Add generic fnmsub intrinsic
Currently in Clang, we have two types of builtins for fnmsub operation:
one for float/double vector, they'll be transformed into IR operations;
one for float/double scalar, they'll generate corresponding intrinsics.

But for the vector version of builtin, the 3 op chain may be recognized
as expensive by some passes (like early cse). We need some way to keep
the fnmsub form until code generation.

This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub
intrinsics.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D116015
2022-03-07 13:00:06 +08:00
Mircea Trofin cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Stefan Pintilie a84a8c937b [PowerPC] Remove redundant MMA patterns.
There are two MMA patterns that have been added twice. This patch just removes
one set of patterns. It should not change the way MMA behaves.

Reviewed By: lei, #powerpc

Differential Revision: https://reviews.llvm.org/D120680
2022-03-01 09:13:21 -06:00
Jameson Nash c4b1a63a1b mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D120518
2022-02-25 14:30:44 -05:00
Stefan Pintilie 0625aed2fc [PowerPC][NFC] Split out the MMA instructions from the P10 instructions.
Currently all of the MMA instructions as well as the MMA related register info
is bundled with the Power 10 instructions. This patch just splits them out.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D120515
2022-02-25 11:41:09 -06:00
Stefan Pintilie 4fbe60fd13 [PowerPC][NFC] Add file info and license that was missing from this file.
Added the license info as well as description about how classes should be named
based on existing documentation.

Reviewed By: lei, #powerpc

Differential Revision: https://reviews.llvm.org/D120530
2022-02-25 10:28:02 -06:00
Stefan Pintilie eb1c5a9862 [PowerPC] Add the Power10 LXVKQ instruction.
Add the Power 10 instruction LXVKQ.

This patch was taken from an original patch by: Yi-Hong Lyu

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D117507
2022-02-23 08:48:59 -06:00
Nemanja Ivanovic 2aaba44b5c [PowerPC] Allow absolute expressions in relocations
The Linux kernel build uses absolute expressions suffixed with @lo/@ha
relocations. This currently doesn't work for DS/DQ form instructions and
there is no reason for it not to. It also works with GAS.
This patch allows this as long as the value is a multiple of 4/16
for DS/DQ form.
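
The constraint can be sketched in a few lines. DS-form displacements have their low two bits implied as zero and DQ-form displacements their low four bits, which is where the multiple-of-4/16 requirement comes from. The helpers below are hypothetical names, not LLVM APIs; `lo16` and `ha16` follow the conventional meaning of @lo (signed low 16 bits) and @ha (high 16 bits adjusted for the sign of the low half), and applying the check to the @lo value is an assumption of this sketch.

```python
def lo16(value):
    """Low 16 bits of `value`, read as a signed 16-bit immediate (@lo)."""
    low = value & 0xFFFF
    return low - 0x10000 if low & 0x8000 else low

def ha16(value):
    """High-adjusted 16 bits (@ha), compensating for the sign of @lo."""
    return ((value + 0x8000) >> 16) & 0xFFFF

def fits_form(offset, form):
    """Check whether `offset` is encodable as a DS- or DQ-form displacement."""
    required_multiple = {"DS": 4, "DQ": 16}[form]
    return offset % required_multiple == 0
```

For example, an absolute value of `0x1238` has an @lo part that is a multiple of 4, so it fits a DS-form instruction, but it is not a multiple of 16 and so would be rejected for DQ form.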

Differential revision: https://reviews.llvm.org/D115419
2022-02-22 09:53:08 -06:00
Qiu Chaofan 43d48ed220 [PowerPC] Add option to disable perfect shuffle
Perfect shuffle was introduced into the PowerPC backend years ago and is only
available on big-endian subtargets. This optimization works well in simple
cases, but has a serious negative impact on large programs with many shuffle
instructions sharing the same mask.

This patch introduces a temporary hidden backend option to control it until we
implement a better way to fix the gap in vector shuffle decomposition.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D120072
2022-02-21 01:39:35 +08:00
Shao-Ce SUN 21ac474392 [NFC] Correct typo `interger` to `integer` 2022-02-17 21:17:47 +08:00
Lei Huang 5abe6c312b [PowerPC] Rename PPCInstrPrefix.td to PPCInstrP10.td 2022-02-16 10:22:41 -06:00
Shao-Ce SUN 2aed07e96c [NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 13:10:09 +08:00
Amy Kwan ac5a5a9cfe [PowerPC] Add default handling for single element vectors, and split/promote vNi1 vectors.
This patch updates the handling of vectors in getPreferredVectorAction():

For single-element and scalable vectors, fall back to default vector legalization
handling. For vNi1 vectors, add handling to either split or promote them in
order to prevent the production of wide v256i1/v512i1 types.

The following assertion is fixed by this patch, as we ended up producing the
wide vector types (that are used for MMA) in the backend prior to this fix.

```
Assertion failed: VT.getSizeInBits() == Operand.getValueSizeInBits() &&
"Cannot BITCAST between types of different sizes!"
```

Differential Revision: https://reviews.llvm.org/D119521
2022-02-15 08:44:08 -06:00
Stefan Pintilie a601db30c6 [PowerPC] Remove the LDMX instruction.
The LDMX instruction was to be potentially added in P9 but it was never added
in either ISA 3.0 or ISA 3.1. This patch removes that instruction as it is
currently still an invalid instruction.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D118074
2022-02-14 17:03:48 -06:00
Ting Wang 097a95f2df [PowerPC] Add custom lowering for SELECT_CC fp128 using xsmaxcqp
Power ISA 3.1 adds xsmaxcqp/xsmincqp for quad-precision type-c max/min selection,
and this opens the opportunity to improve instruction selection on: llvm.maxnum.f128,
llvm.minnum.f128, and select_cc ordered gt/lt and (don't care) gt/lt.

Reviewed By: nemanjai, shchenz, amyk

Differential Revision: https://reviews.llvm.org/D117006
2022-02-09 21:48:28 -05:00
serge-sans-paille ef736a1c39 Cleanup LLVMMC headers
There are a few relevant forward declarations in there, so downstream users
may need to add explicit includes:

llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer includes llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h

Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after:  1049293745

Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
2022-02-09 11:09:17 +01:00
Nikita Popov 149195f576 [PPCISelLowering] Avoid use of getPointerElementType()
Use the value type instead.
2022-02-07 14:30:15 +01:00
Wael Yehia addd073325 [AIX][PowerPC][PGO] Generate .ref for some PGO sections
For PGO on AIX, when we switch to the linux-style PGO variable access
(via _start and _stop labels), we need the compiler to generate a .ref
assembly for each of the three csects:

 -   __llvm_prf_data[RW]
 -   __llvm_prf_names[RO]
 -   __llvm_prf_vnds[RW]

We insert the .ref inside the __llvm_prf_cnts[RW] csect so that if it's
live then the 3 csects are live.

For example, for a testcase with at least one function definition, when
compiled with -fprofile-generate we should generate:

        .csect __llvm_prf_cnts[RW],3
        .ref __llvm_prf_data[RW]   <<============ needs to be inserted
        .ref __llvm_prf_names[RO]  <<===========

the __llvm_prf_vnds is not always present, so we reference it only when
it's present.

Reviewed By: sfertile, daltenty

Differential Revision: https://reviews.llvm.org/D116607
2022-02-05 06:34:20 -05:00
Masoud Ataei 8ce13bc93b [PowerPC] Option controlling scalar MASS conversion
differential: https://reviews.llvm.org/D119035

reviewer: bmahjour
2022-02-04 13:24:22 -08:00
Benjamin Kramer 85243124cf Tweak some uses of std::iota to skip initializing the underlying storage. NFCI. 2022-02-04 17:00:50 +01:00
Masoud Ataei 70066dd0e8 [PowerPC] Fixing buildbot failure ppc64le-lld-multistage-test 2022-02-02 10:29:22 -08:00
Masoud Ataei 256d253332 [PowerPC] Scalar IBM MASS library conversion pass
This patch introduces the conversions from math function calls
to MASS library calls. To resolve calls generated by these conversions, one
needs to link the libxlopt.a library. This patch is tested on PowerPC Linux and AIX.

Differential: https://reviews.llvm.org/D101759

Reviewer: bmahjour
2022-02-02 07:54:19 -08:00
Amy Kwan 0d6e64755a [PowerPC] Update P10 vector insert patterns to use refactored load/stores, and update handling of v4f32 vector insert.
This patch updates the P10 patterns with a load feeding into an insertelt to
utilize the refactored load and store infrastructure, as well as updating any
tests that exhibit any codegen changes.

Furthermore, custom legalization is added for v4f32 on Power9 and above to not
only assist with adjusting the refactored load/stores for P10 vector insert,
but also it enables the utilization of direct moves.

Differential Revision: https://reviews.llvm.org/D115691
2022-02-01 08:48:37 -06:00
Amy Kwan 9cc5b064f1 [PowerPC] Update handling of splat loads for v4i32/v4f32/v2i64 to require non-extending loads.
This patch updates how splat loads are handled and is an extension of D106555.

Particularly, for v2i64/v4f32/v4i32 types, they are updated to handle only
non-extending loads. For v8i16/v16i8 types, they are updated to handle extending
loads only if the memory VT is the same vector element VT type.

A test case has been added to illustrate a scenario where a PPCISD::LD_SPLAT
node should not be produced. The test depicts the following f64 extending load
used in a v2f64 build vector, where the extending load is also used in places
other than the build vector (such as in t12 and t16).
```
Type-legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 20 nodes:
  t0: ch = EntryToken
  t4: i64,ch = CopyFromReg t0, Register:i64 %1
  t6: i64,ch = CopyFromReg t0, Register:i64 %2
  t11: f64,ch = load<(load (s64) from %ir.b, !tbaa !7)> t0, t4, undef:i64
        t16: f64 = fadd t31, t37
      t34: ch = store<(store (s64) into %ir.c, !tbaa !7)> t31:1, t16, t6, undef:i64
    t36: ch = TokenFactor t34, t37:1
    t27: v2f64 = BUILD_VECTOR t37, t37
  t22: ch,glue = CopyToReg t36, Register:v2f64 $v2, t27
      t12: f64 = fadd t11, t37
    t28: ch = store<(store (s64) into %ir.b, !tbaa !7)> t11:1, t12, t4, undef:i64
  t31: f64,ch = load<(load (s64) from %ir.c, !tbaa !7)> t28, t6, undef:i64
    t2: i64,ch = CopyFromReg t0, Register:i64 %0
  t37: f64,ch = load<(load (s32) from %ir.a, !tbaa !3), anyext from f32> t0, t2, undef:i64
  t23: ch = PPCISD::RET_FLAG t22, Register:v2f64 $v2, t22:1
```

Differential Revision: https://reviews.llvm.org/D117803
2022-01-28 08:23:01 -06:00
Ting Wang 6f25cb8685 [PowerPC] Add the Power10 XS[MAX|MIN]CQP instruction
Add the Power 10 instruction XS[MAX|MIN]CQP.

Reviewed By: shchenz, amyk

Differential Revision: https://reviews.llvm.org/D118036
2022-01-26 23:00:43 -05:00
Benjamin Kramer f15014ff54 Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef82063207.

- It conflicts with the existing llvm::size in STLExtras, which will now
  never be called.
- Calling it without llvm:: breaks C++17 compat
2022-01-26 16:55:53 +01:00
serge-sans-paille ef82063207 Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a consequence, move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
2022-01-26 16:17:45 +01:00
Nemanja Ivanovic 0c56bc92e4 [PowerPC] Fix eq/ne comparison of v2i64 pre-Power8
In commit 1674d9b6b2, I fixed the bug where we didn't consider
both words of the result of the comparison. However, the logic
needs to be different for eq and ne.
Namely for eq, we need both words of the doubleword to be equal so it
is an AND. OTOH for ne, we need either word to be unequal so it
is an OR.
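
The AND/OR distinction can be modeled with a short sketch. It is not the lowering code, only a Python illustration of building a 64-bit lane comparison out of two 32-bit word comparisons, as is needed when no doubleword compare is available. The function name `v2i64_compare` and the `"eq"`/`"ne"` predicate strings are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def v2i64_compare(a, b, pred):
    """Compare two 2-lane vectors of 64-bit values using 32-bit word compares.

    `pred` is "eq" or "ne"; the result is one boolean per 64-bit lane.
    """
    result = []
    for x, y in zip(a, b):
        hi_eq = ((x >> 32) & MASK32) == ((y >> 32) & MASK32)
        lo_eq = (x & MASK32) == (y & MASK32)
        if pred == "eq":
            # Equal only if both words are equal: AND of the word results.
            result.append(hi_eq and lo_eq)
        else:
            # Unequal if either word differs: OR of the word results.
            result.append((not hi_eq) or (not lo_eq))
    return result
```

Using AND for both predicates would wrongly report lanes that differ in only one word as "not unequal", which is the case the lane `0x100000000` versus `0x200000000` exercises.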
2022-01-26 08:59:08 -06:00
Qiu Chaofan ad0345aed1 [PowerPC] Emit gnu_attribute according to float-abi metadata
According to GNU as documentation, PowerPC supports some .gnu_attribute
tags to represent the vector and float ABI type in the object file.
Some linkers like GNU ld respect the attribute and will prevent objects
with conflicting ABIs being linked.

This patch emits gnu_attribute value in assembly printer according to
the float-abi metadata. More attributes for soft-fp, hard single/double
and even vector ABI need to be supported in the future.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D117193
2022-01-26 13:28:50 +08:00
Nikita Popov aa97bc116d [NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().

This is part of D117885, in preparation for deprecating the API.
2022-01-25 09:44:52 +01:00
Quinn Pham 6a028296fe [PowerPC] Emit warning when SP is clobbered by asm
This patch emits a warning when the stack pointer register (`R1`) is found in
the clobber list of an inline asm statement. Clobbering the stack pointer is
not supported.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D112073
2022-01-24 15:12:23 -06:00
Kazu Hirata bf039a8620 [Target] Use range-based for loops (NFC) 2022-01-23 22:53:15 -08:00
Qiu Chaofan 00d68c3824 [PowerPC] Support parsing GNU attributes in MC
This patch is the first step towards supporting GNU attributes in LLVM
PowerPC. It enables parsing them for PowerPC targets; otherwise llvm-mc raises
an error when it sees the attribute section.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D115854
2022-01-22 23:29:34 +08:00
Qiu Chaofan 8dedf9b58b [PowerPC] Change CTR clobber estimation for 128-bit floating types
Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D117459
2022-01-22 23:20:14 +08:00
Stefan Pintilie 1324bb29f7 [PowerPC] Fix issue with strict float to int conversion.
When doing the float to int conversion the strict conversion also needs to
return a chain. This patch fixes that.

Reviewed By: nemanjai, #powerpc, qiucf

Differential Revision: https://reviews.llvm.org/D117464
2022-01-19 10:57:22 -06:00
Jim Lin d6b0734837 [NFC] Use Register instead of unsigned 2022-01-19 20:17:04 +08:00
Sean Fertile 10d3bf9518 [PowerPC][AIX] Fallback to DAG-ISEL if global has toc-data attribute.
FAST-ISEL should fall back to DAG-ISEL when a global variable has the
toc-data attribute. A number of the checks were duplicated in the lit
test because of
1) Slightly different output between -O0 and -O2 due to FAST-ISEL vs
   DAG-ISEL codegen.
2) In preparation for a peephole optimization that will run when
   optimizations are enabled.

Differential Revision: https://reviews.llvm.org/D115373
2022-01-17 16:21:38 -05:00
Fangrui Song 1ae1dd16cf [MC][PowerPC] Replace MCContext::reportFatalError calls with reportError
User errors should use reportError. reportError allows us to continue parsing
the file and collect more diagnostics.

While here, make the diagnostic follow convention, merge tests, and test
line/column numbers.
2022-01-15 00:01:36 -08:00
Nick Desaulniers 9c4b49db19 [ShrinkWrap] check for PPC's non-callee-saved LR
As pointed out in https://reviews.llvm.org/D115688#inline-1108193, we
don't want to sink the save point past an INLINEASM_BR, otherwise
prologepilog may incorrectly sink a prolog past the MBB containing an
INLINEASM_BR and into the wrong MBB.

ShrinkWrap is getting this wrong because LR is not in the list of callee
saved registers. Specifically, ShrinkWrap::useOrDefCSROrFI calls
RegisterClassInfo::getLastCalleeSavedAlias which reads
CalleeSavedAliases which was populated by
RegisterClassInfo::runOnMachineFunction by iterating the list of
MCPhysReg returned from MachineRegisterInfo::getCalleeSavedRegs.

Because PPC's LR is non-allocatable, it's NOT considered callee saved.
Add an interface to TargetRegisterInfo for such a case and use it in
Shrinkwrap to ensure we don't sink a prolog past an INLINEASM or
INLINEASM_BR that clobbers LR.

Reviewed By: jyknight, efriedma, nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D116424
2022-01-11 10:01:34 -08:00
Chen Zheng 2c46ca96e2 [PowerPC] fast isel can lower intrinsics call on AIX.
Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D114778
2022-01-10 02:30:05 +00:00
Kazu Hirata 2aed08131d [llvm] Use true/false instead of 1/0 (NFC)
Identified with modernize-use-bool-literals.
2022-01-07 00:39:14 -08:00
Qiu Chaofan c2cc70e4f5 [NFC] Fix endif comments to match with include guard 2022-01-07 15:52:59 +08:00
Kazu Hirata f3a344d212 [Target] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-06 22:01:44 -08:00
Stefan Pintilie 04496201e0 [PowerPC] Add support for ROP protection for 32 bit.
Add support for Return Oriented Programming (ROP) protection for 32 bit.
This patch also adds testing for AIX on both 64-bit and 32-bit.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D111362
2022-01-05 15:15:53 -06:00
Kazu Hirata e5947760c2 Revert "[llvm] Remove redundant member initialization (NFC)"
This reverts commit fd4808887e.

This patch causes gcc to issue a lot of warnings like:

  warning: base class ‘class llvm::MCParsedAsmOperand’ should be
  explicitly initialized in the copy constructor [-Wextra]
2022-01-03 11:28:47 -08:00
Kazu Hirata 7e163afd9e Remove redundant void arguments (NFC)
Identified by modernize-redundant-void-arg.
2022-01-02 10:20:19 -08:00
Kazu Hirata fd4808887e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-01 16:18:18 -08:00
Kazu Hirata 69ccc96162 [llvm] Use the default constructor for SDValue (NFC) 2022-01-01 10:36:59 -08:00
Kazu Hirata 5a667c0e74 [llvm] Use nullptr instead of 0 (NFC)
Identified with modernize-use-nullptr.
2021-12-28 08:52:25 -08:00
Nikita Popov f5ac23b5ae [ArgPromotion][TTI] Pass types to ABI compatibility hook
The areFunctionArgsABICompatible() hook currently accepts a list of
pointer arguments, though what we're actually interested in is the
ABI compatibility after these pointer arguments have been converted
into value arguments.

This means that a) the current API is incompatible with opaque
pointers (because it requires inspection of pointee types) and
b) it can only be used in the specific context of ArgPromotion.
I would like to reuse the API when inspecting calls during inlining.

This patch converts it into an areTypesABICompatible() hook, which
accepts a list of types. This makes the method more generally usable,
and compatible with opaque pointers from an API perspective (the
actual usage in ArgPromotion/Attributor is still incompatible,
I'll follow up on that in separate patches).

Differential Revision: https://reviews.llvm.org/D116031
2021-12-22 09:37:51 +01:00
Kazu Hirata 9db0e21660 [llvm] Use depth_first (NFC) 2021-12-21 22:28:48 -08:00
Nemanja Ivanovic 1674d9b6b2 [PowerPC] Fix vector equality comparison for v2i64 pre-Power8
The current code makes the assumption that equality
comparison can be performed with a word comparison
instruction. While this is true if the entire 64-bit
results are used, it does not generally work. It is
possible that the low order words and high order
words produce different results and a user of only
one will get the wrong result.

This patch adds an AND of the result words so that
each word has the result of the comparison of the
entire doubleword that contains it.
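
The idea can be modelled in scalar C++ as a minimal sketch. It assumes a 128-bit vector is viewed as four 32-bit words, with words {0,1} and {2,3} forming the two doublewords; the helper names are made up for this example and do not correspond to the actual lowering code.

```cpp
#include <array>
#include <cstdint>

// A 128-bit vector viewed as four 32-bit words. Words {0,1} form one
// doubleword and words {2,3} form the other (layout assumed for the sketch).
using V4i32 = std::array<uint32_t, 4>;

// Word-wise equality: each lane is all-ones if equal, zero otherwise.
V4i32 compareWordsEqual(const V4i32 &A, const V4i32 &B) {
  V4i32 R{};
  for (int I = 0; I < 4; ++I)
    R[I] = (A[I] == B[I]) ? 0xFFFFFFFFu : 0u;
  return R;
}

// Swap the two words inside each doubleword.
V4i32 swapWordsInDoubleword(const V4i32 &V) {
  return {V[1], V[0], V[3], V[2]};
}

// Doubleword equality built from the word compare: AND each word's result
// with its partner's, so both words of a doubleword carry the same answer.
V4i32 compareDoublewordsEqual(const V4i32 &A, const V4i32 &B) {
  V4i32 W = compareWordsEqual(A, B);
  V4i32 S = swapWordsInDoubleword(W);
  V4i32 R{};
  for (int I = 0; I < 4; ++I)
    R[I] = W[I] & S[I];
  return R;
}
```

For inputs that differ only in word 1, the plain word compare leaves word 0 set to all-ones, which is the wrong answer for a user that reads only that word; after the AND, both words of that doubleword are zero.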

Differential revision: https://reviews.llvm.org/D115678
2021-12-21 14:28:41 -06:00
Nemanja Ivanovic a3ea9052d6 [PowerPC] Do not increase cost for getUserCost with MMA types
Commit 150681f increases
cost of producing MMA types (vector pair and quad).
However, it increases the cost for getUserCost() which is
used in unrolling. As a result, loops that contain these
types already (from the user code) cannot be unrolled
(even with the user's unroll pragma). This was an unintended
side effect. Reverting that portion of the commit to allow
unrolling such loops.

Differential revision: https://reviews.llvm.org/D115424
2021-12-21 13:36:08 -06:00
Esme-Yi b66328701a [PowerPC][llvm-objdump] enable --symbolize-operands for PowerPC ELF/XCOFF.
Summary: When disassembling, symbolize a branch target operand
to print a label instead of a real address.
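
A minimal sketch of the general idea follows, assuming labels are numbered `L0`, `L1`, ... in address order and printed in angle brackets. The function names are hypothetical and this is not the llvm-objdump implementation; unlabeled addresses are printed in decimal only to keep the example short.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Assign labels L0, L1, ... to the distinct branch targets, ordered by
// address. (Numbering scheme is an assumption made for this sketch.)
std::map<uint64_t, std::string>
labelTargets(const std::vector<uint64_t> &Targets) {
  std::map<uint64_t, std::string> Labels;
  for (uint64_t T : Targets)
    Labels.emplace(T, ""); // std::map deduplicates and sorts by address
  unsigned N = 0;
  for (auto &Entry : Labels)
    Entry.second = "L" + std::to_string(N++);
  return Labels;
}

// Render a branch operand: the label if the target has one, otherwise the
// raw address.
std::string renderTarget(uint64_t Addr,
                         const std::map<uint64_t, std::string> &Labels) {
  auto It = Labels.find(Addr);
  if (It != Labels.end())
    return "<" + It->second + ">";
  return std::to_string(Addr);
}
```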

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D114492
2021-12-21 04:17:57 +00:00
Nemanja Ivanovic 2fb9029f26 [PowerPC] Support hwsync extended mnemonic
This mnemonic has been supported by GAS for years and
it was added to the PowerPC ISA as of ISA 3.1. We will
support the mnemonic to be compatible with GAS.
2021-12-20 10:08:31 -06:00
Zaara Syeda 3f066ac648 Test commit 2021-12-14 15:37:28 +00:00