Commit Graph

448 Commits

Author SHA1 Message Date
Benjamin Kramer f15014ff54 Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef82063207.

- It conflicts with the existing llvm::size in STLExtras, which will now
  never be called.
- Calling it without llvm:: breaks C++17 compat
2022-01-26 16:55:53 +01:00
serge-sans-paille ef82063207 Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a conquence move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
2022-01-26 16:17:45 +01:00
Kazu Hirata bf039a8620 [Target] Use range-based for loops (NFC) 2022-01-23 22:53:15 -08:00
Jim Lin d6b0734837 [NFC] Use Register instead of unsigned 2022-01-19 20:17:04 +08:00
Kazu Hirata 5a667c0e74 [llvm] Use nullptr instead of 0 (NFC)
Identified with modernize-use-nullptr.
2021-12-28 08:52:25 -08:00
Kazu Hirata ff649e0802 [Target] Use range-based for loops (NFC) 2021-11-27 11:16:19 -08:00
Kazu Hirata d5b73a70a0 [llvm] Use range-based for loops (NFC) 2021-11-22 20:33:28 -08:00
Nemanja Ivanovic 5840f7197d [PowerPC] Respect rounding mode in the back end
Currently, the floating point instructions that depend on
rounding mode are correctly marked in the PPC back end with
an implicit use of the RM register. Similarly, instructions
that explicitly define the register are marked with an
implicit def of the same register. So for the most part,
RM-using code won't be moved across RM-setting instructions.

However, calls are not marked as RM-setting instructions so
code can be moved across calls. This is generally desired,
but so is the ability to turn off this behaviour with an
appropriate option - and -frounding-math really should be
that option.

This patch provides a set of call instructions (for direct
and indirect calls) that are marked with an implicit def of
the RM register. These will be used for calls that are marked
with the strictfp attribute.

Differential revision: https://reviews.llvm.org/D111433
2021-11-10 08:19:58 -06:00
Reid Kleckner 89b57061f7 Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.

This allows us to ensure that Support doesn't have includes from MC/*.

Differential Revision: https://reviews.llvm.org/D111454
2021-10-08 14:51:48 -07:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Victor Huang 6e1aaf18af [PowerPC] Mark splat immediate instructions as rematerializable
This patch marks splat immediate instructions XXSPLTIW and XXSPLTIDP as
rematerializable to prevent MachineLICM from moving them out of loops.

Reviewed By: lei, amy

Differential revision: https://reviews.llvm.org/D108823
2021-09-24 12:03:34 -05:00
Victor Huang 4a226529e2 [PowerPC] Fixed the crash due to early if conversion with fixed CR fields
This patch adds a fix to do early if conversion to select when
conditional branch not using physical register to prevent the crash when
expanding ISEL instruction.

Reviewed By: lei, kamaub, PowerPC

Differential revision: https://reviews.llvm.org/D108302
2021-09-07 10:51:03 -05:00
Kai Luo 5eaebd5d64 [PowerPC] Implement quadword atomic load/store
Add support to load/store i128 atomically.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D105612
2021-09-01 06:55:40 +00:00
Alexander Pivovarov eb946cc5b6 Fix typo in comments
Reviewed By: MaskRay, jsji

Differential Revision: https://reviews.llvm.org/D108857
2021-08-31 11:55:40 +05:30
Nikita Popov 0529e2e018 [InstrInfo] Use 64-bit immediates for analyzeCompare() (NFCI)
The backend generally uses 64-bit immediates (e.g. what
MachineOperand::getImm() returns), so use that for analyzeCompare()
and optimizeCompareInst() as well. This avoids truncation for
targets that support immediates larger 32-bit. In particular, we
can avoid the bugprone value normalization hack in the AArch64
target.

This is a followup to D108076.

Differential Revision: https://reviews.llvm.org/D108875
2021-08-30 19:46:04 +02:00
Arthur Eubanks 80ea2bb574 [NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes()
This is more consistent with similar methods.
2021-08-13 11:16:52 -07:00
Kai Luo 55bd12d4b7 [PowerPC] Remove implicit use register after transformToImmFormFedByLI()
When the instruction has imm form and fed by LI, we can remove the redundat LI instruction.
Below is an example:
```
    renamable $x5 = LI8 2
    renamable $x4 = exact SRD killed renamable $x4, killed renamable $r5, implicit $x5
```

will be converted to:
```
   renamable $x5 = LI8 2
   renamable $x4 = exact RLDICL killed renamable $x4, 62, 2,  implicit killed $x5
```

But when we do this optimization, we forget to remove implicit killed $x5
This bug has caused a lnt case error. This patch is to fix above bug.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D85288
2021-07-09 04:42:54 +00:00
Kai Luo 1c450c3d7e [PowerPC] Export 16 byte load-store instructions
Export `lq`, `stq`, `lqarx` and `stqcx.` in preparation for implementing 16-byte lock free atomic operations on AIX.
Add a new register class `g8prc` for these instructions, since these instructions require even-odd register pair.

Reviewed By: nemanjai, jsji, #powerpc

Differential Revision: https://reviews.llvm.org/D103010
2021-06-15 01:56:10 +00:00
Victor Huang ae3377c553 [AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects
- Add new variantKinds for the symbol's variable offset and region handle
- Print the proper relocation specifier @gd in the asm streamer when emitting
  the TC Entry for the variable offset for the symbol
- Fix the switch section failure between the TC Entry of variable offset and
  region handle
- Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property

Reviewed by: sfertile

Differential Revision: https://reviews.llvm.org/D100956
2021-04-29 13:18:59 -05:00
Chen Zheng 74e77295e7 [PowerPC] fixup killed flags for ri + addi to ri transformation
Fixup killed flags if DefMI and MI are not in the same basic blocks.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D100023
2021-04-07 22:04:08 -04:00
Albion Fung e29bb074c6 [PowerPC] Exploit xxsplti32dx (constant materialization) for scalars
This patch exploits the xxsplti32dx instruction available on Power10
in place of constant pool loads where xxspltidp would not be able to,
usually because the immediate cannot fit into 32 bits.

Differential Revision: https://reviews.llvm.org/D95458
2021-03-24 15:59:59 -04:00
Qiu Chaofan 52f33f7953 [PowerPC] Enable redundant TOC save removal on AIX
Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D97039
2021-03-22 14:29:22 +08:00
Nemanja Ivanovic a8697c57fa [PowerPC] Fix the check for 16-bit signed field in peephole
When a D-Form instruction is fed by an add-immediate, we attempt
to merge the two immediates to form a single displacement so we
can remove the add-immediate.

However, we don't check whether the new displacement fits into
a 16-bit signed immediate field early enough. Namely, we do a
sign-extend from 16 bits first which will discard high bits and
then we check whether the result is a 16-bit signed immediate.
It of course will always be.

Move the check prior to the sign extend to ensure we are checking
the correct value.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49640
2021-03-19 07:15:53 -05:00
Fangrui Song 5d44c92bf8 Change void getNoop(MCInst &NopInst) to MCInst getNop()
Prefer (self-documenting) return values to output parameters (which are
liable to be used).
While here, rename Noop to Nop which is more widely used and improves
consistency with hasEmitNops/setEmitNops/emitNop/etc.
2021-03-15 12:05:34 -07:00
Chen Zheng b0869a7d72 [PowerPC] [NFC] fix wording typos
Post commit comments address for D92071.
2021-02-02 21:03:17 -05:00
Stefan Pintilie 288f762b6f [PowerPC] Materialize 34 bit constants with pli on Power 10.
NOTE: This patch was originally written by Anil Mahmud. His code has been
rebased but otherwise left mostly unchanged.

A new instructon on Power 10 allows for the materialization of 34 bit
immediate values. This patch allows the compiler to take advantage of
the new instruction in this situation.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D92879
2021-02-02 09:49:22 -06:00
Chen Zheng 0ed4cf4bf3 [PowerPC] support register pressure reduction in machine combiner.
Reassociating some patterns to generate more fma instructions to
reduce register pressure.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D92071
2021-01-24 21:28:21 -05:00
Tres Popp 3bd24574c7 Revert "[PowerPC] support register pressure reduction in machine combiner."
This reverts commit 26a396c4ef.

See https://reviews.llvm.org/D92071 for a description of the issue.
2021-01-18 12:01:57 +01:00
Chen Zheng 26a396c4ef [PowerPC] support register pressure reduction in machine combiner.
Reassociating some patterns to generate more fma instructions to
reduce register pressure.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D92071
2021-01-17 23:56:13 -05:00
Kai Luo f6515b0520 [PowerPC] Do not fold `cmp(d|w)` and `subf` instruction to `subf.` if `nsw` is not present
In `PPCInstrInfo::optimizeCompareInstr` we seek opportunities to fold `cmp(d|w)` and `subf` as an `subf.`. However, if `subf.` gets overflow, `cr0` can't reflect the correct order, violating the semantics of `cmp(d|w)`.

Fixed https://bugs.llvm.org/show_bug.cgi?id=47830.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D90156
2021-01-04 07:54:15 +00:00
Esme-Yi 2ea7210e39 Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA."
This reverts commit 1c0941e152.
2020-12-16 17:12:24 +00:00
Chen Zheng 4830d458dd [MachineCombiner][NFC] Add MustReduceRegisterPressure goal
add a new goal MustReduceRegisterPressure for machine combiner pass.

PowerPC will use this new goal to do some register pressure related optimization.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D92068
2020-12-14 00:02:42 -05:00
Chen Zheng 95d6042dd4 [NFC][PowerPC] code refactor: split IsReassociable to fma and add.
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D92070
2020-12-01 21:18:57 -05:00
Esme-Yi 1c0941e152 [PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Summary: We have the patterns to fold 2 RLWINMs before RA, while some RLWINM will be generated after RA, for example rGc4690b007743. If the RLWINM generated after RA followed by another RLWINM, we expect to perform the optimization too.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D89855
2020-11-22 07:37:24 +00:00
Baptiste Saleil 37c4ac8545 [PowerPC] Accumulator/Unprimed Accumulator register copy, spill and restore
This patch adds support for accumulator/unprimed accumulator
register copy, spill and restore for MMA.

Authored By: Baptiste Saleil

Reviewed By: #powerpc, bsaleil, amyk

Differential Revision: https://reviews.llvm.org/D90616
2020-11-11 16:23:45 -06:00
Esme-Yi 5053eab890 Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA."
This reverts commit 119ab2181e.
2020-11-03 16:34:02 +00:00
Esme-Yi 119ab2181e [PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Summary: This patch depends on D89846. We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINM will be generated after RA, for example rGc4690b007743. If the RLWINM generated after RA followed by another RLWINM, we expect to perform the optimization after RA, too.

Reviewed By: shchenz, steven.zhang

Differential Revision: https://reviews.llvm.org/D89855
2020-11-03 07:44:11 +00:00
Esme-Yi b969dfe26f [NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo.
Summary: We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINM will be generated after RA, for example D88274. If the RLWINM generated after RA followed by another RLWINM, we expect to perform the optimization after RA, too.
This is a NFC patch to move the folding patterns to PPCInstrInfo, and the follow-up works will be calling it in pre-emit-peephole and expand the patterns to handle more cases.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D89846
2020-11-03 06:28:56 +00:00
Nicholas Guy 9a2d2bedb7 Add "SkipDead" parameter to TargetInstrInfo::DefinesPredicate
Some instructions may be removable through processes such as IfConversion,
however DefinesPredicate can not be made aware of when this should be considered.
This parameter allows DefinesPredicate to distinguish these removable instructions
on a per-call basis, allowing for more fine-grained control from processes like
ifConversion.

Renames DefinesPredicate to ClobbersPredicate, to better reflect it's purpose

Differential Revision: https://reviews.llvm.org/D88494
2020-10-21 11:52:47 +01:00
Ahsan Saghir f3202b30b8 [PowerPC] Add assemble disassemble intrinsics for MMA
This patch adds support for assemble disassemble intrinsics
for MMA.

Reviewed By: bsaleil, #powerpc

Differential Revision: https://reviews.llvm.org/D88739
2020-10-13 13:21:58 -05:00
Esme-Yi c4690b0077 [PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC.
Summary: How we copying the CRRC to GRC is using a single MFOCRF to copy the contents of CR field n (CR bits 4×n+32:4×n+35) into bits 4×n+32:4×n+35 of register GRC. That’s not correct because we expect the value of destination register equals to source so we have to put the the contents of CR field in the lowest 4 bits. This patch adds a RLWINM after MFOCRF to achieve that.
The problem came up when adding builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp, as posted in D88278. We need to move the outputs (in CR register) to GRC. However outputs of these instructions may not in a fixed CR# register, so we can’t directly add a rotation instruction in the .td patterns, but need to wait until the CR register is determined. Then we confirmed this should be a bug in POST-RA PSEUDO PASS.

Reviewed By: nemanjai, shchenz

Differential Revision: https://reviews.llvm.org/D88274
2020-10-02 01:26:18 +00:00
Sean Fertile dfb717da1f [PowerPC] Remove support for VRSAVE save/restore/update.
After removal of Darwin as a PowerPC subtarget, the VRSAVE
save/restore/spill/update code is no longer needed by any supported
subtarget, so remove it while keeping support for vrsave and related instruction
aliases for inline asm. I've pre-commited tests to document the existing vrsave
handling in relation to @llvm.eh.unwind.init and inline asm usage, as
well as a test which shows a beahviour change on AIX related to
returning vector type as we were wrongly emiting VRSAVE_UPDATE on AIX.
2020-09-30 10:05:53 -04:00
Baptiste Saleil 0156914275 [PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types
This patch legalizes the v256i1 and v512i1 types that will be used for MMA.

It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two
underlying registers. v512i1 is used for MMA accumulators. So in addition to
loading and storing the 4 associated VSX registers, we generate instructions to
prime (copy the VSX registers to the accumulator) after loading and unprime
(copy the accumulator back to the VSX registers) before storing.

This patch also adds the UACC register class that is necessary to implement the
loads and stores. This class represents accumulator in their unprimed form and
allow the distinction between primed and unprimed accumulators to avoid invalid
copies of the VSX registers associated with primed accumulators.

Differential Revision: https://reviews.llvm.org/D84968
2020-09-28 14:39:37 -05:00
Victor Huang 652a8f150d [PowerPC][PCRelative] Thread Local Storage Support for Local Dynamic
This patch is the initial support for the Local Dynamic Thread Local Storage
model to produce code sequence and relocation correct to the ABI for the model
when using PC relative memory operations.

Differential Revision: https://reviews.llvm.org/D87721
2020-09-23 13:48:06 -05:00
Qiu Chaofan bec81dc67d Reland "[PowerPC] Implement instruction clustering for stores"
Commit 3c0b3250 introduced store fusion for PowerPC target, but it
brought failure under UB sanitizer and was reverted. This patch fixes
them.
2020-09-13 19:51:01 +08:00
Qiu Chaofan 8d9c13f37d Revert "[PowerPC] Implement instruction clustering for stores"
This reverts commit 3c0b325023, (along
with ea795304 and bb39eb9e) since it breaks test with UB sanitizer.
2020-09-08 17:24:08 +08:00
Qiu Chaofan bb39eb9e7f [PowerPC] Fix getMemOperandWithOffsetWidth
Commit 3c0b3250 introduced memory cluster under pwr10 target, but a
check for operands was unexpectedly removed. This adds it back to avoid
regression.
2020-09-08 15:35:25 +08:00
Mikael Holmen ea795304ec [PowerPC] Add parentheses to silence gcc warning
Without gcc 7.4 warns with

../lib/Target/PowerPC/PPCInstrInfo.cpp:2284:25: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
          BaseOp1.isFI() &&
          ~~~~~~~~~~~~~~~^~
              "Only base registers and frame indices are supported.");
              ~
2020-09-08 08:39:57 +02:00
Qiu Chaofan 3c0b325023 [PowerPC] Implement instruction clustering for stores
On Power10, it's profitable to schedule some stores with adjacent target
address together. This patch implements this feature.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D86754
2020-09-08 11:03:09 +08:00
Nemanja Ivanovic c485343c83 [PowerPC] Handle SUBFIC in reg+reg -> reg+imm transformation
We initially missed the subtract-immediate in this transformation.
This patch just adds that.

Differential revision: https://reviews.llvm.org/D84659
2020-08-24 16:22:59 -05:00