2351 Commits

Author SHA1 Message Date
Fangrui Song a36ddf0aa9 Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351 2019-12-24 16:27:51 -08:00
Fangrui Song eb16435b5e Migrate function attribute "no-frame-pointer-elim-non-leaf" to "frame-pointer"="non-leaf" as cleanups after D56351 2019-12-24 16:05:15 -08:00
Fangrui Song 502a77f125 Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351 2019-12-24 15:57:33 -08:00
czhengsz 79b3325be0 [PowerPC] NFC - fix the testcase bug of folding rlwinm 2019-12-23 10:28:22 -05:00
QingShan Zhang 6d5e35e89d [Power9] Remove the PPCISD::XXREVERSE as it has completely the same semantics of ISD::BSWAP
The custom node PPCISD::XXREVERSE has exactly the same semantics as the generic node ISD::BSWAP.
We need to clean it up because the base class has combine rules for bswap but none for xxreverse.

Differential Revision: https://reviews.llvm.org/D70657
2019-12-23 07:44:33 +00:00
QingShan Zhang 9d1071eac4 [NFC][Test][PowerPC] Add more tests for 'and mask' 2019-12-23 06:59:14 +00:00
Kai Luo 9681dc9627 [PowerPC] Exploit `vrl(b|h|w|d)` to perform vector rotation
Summary:
Currently, we set legalization action of `ISD::ROTL` vectors as
`Expand` in `PPCISelLowering`. However, we can exploit `vrl(b|h|w|d)`
to lower `ISD::ROTL` directly.

Differential Revision: https://reviews.llvm.org/D71324
2019-12-23 03:04:43 +00:00
Fangrui Song e8054f0933 [PPC32] Emit R_PPC_PLTREL24 for calls to dso_local ifunc
static void *ifunc(void) __attribute__((ifunc("resolver")));
  void foo() { ifunc(); }

The relocation produced by the ifunc() call:

1. gcc -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
2. gcc -msecure-plt -fPIE => R_PPC_PLTREL24 r_addend=0x8000
3. clang -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
4. clang -msecure-plt -fPIE => R_PPC_REL24

4 is incorrect. The R_PPC_REL24 needs a call stub due to ifunc. If this
relocation is mixed with other R_PPC_PLTREL24(r_addend=0x8000) in a
function, both GNU ld and lld (after D71621 fix) may produce a wrong
result.

This patch fixes 4 to use R_PPC_PLTREL24, which matches GCC.
Both GNU ld and lld (after D71621) will be happy.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D71649
2019-12-20 11:32:02 -08:00
jasonliu ac741f98c1 [XCOFF][AIX] Fix for missing of undefined symbols from symbol table
Summary:
When we use an undefined symbol with its qualname, we are not able
to generate that symbol because an early "continue" in the logic
skips the qualname symbol. This patch fixes it.

Differential revision: https://reviews.llvm.org/D71667
2019-12-19 21:20:33 +00:00
Justin Hibbits d3aeac8e20 [PowerPC] Only use PLT annotations if using PIC relocation model
Summary:
The default static (non-PIC, non-PIE) model for 32-bit powerpc does not
use @PLT annotations and relocations in GCC.  LLVM shouldn't use @PLT
annotations either, because it breaks secure-PLT linking with (some
versions of?) GNU LD.

Update the available-externally.ll test to reflect that default mode should be
the same as the static relocation, by using the same check prefix.

Reviewed by:    sfertile
Differential Revision: https://reviews.llvm.org/D70570
2019-12-19 09:27:13 -06:00
czhengsz f5440ec41d [PowerPC] make lwa as a valid ds candidate in ppcloopinstrformprep pass
Fix a FIXME in ppcloopinstrformprep pass.

Reviewed by: nemanjai

Differential Revision: https://reviews.llvm.org/D71346
2019-12-18 21:06:57 -05:00
Nemanja Ivanovic a5da8d90da [PowerPC] Add missing legalization for vector BSWAP
We somehow missed doing this when we were working on Power9 exploitation.
This just adds the missing legalization and cost for producing the vector
intrinsics.

Differential revision: https://reviews.llvm.org/D70436
2019-12-17 19:07:34 -06:00
David Tenty 84161f18cc [AIX] Avoid unset csect assert for functions defined after their use in TOC
Summary:
If a function is defined after it appears in a TOC expression, we may
try to access an unset containing csect when returning a symbol for the
expression.

Reviewers: Xiangling_L, DiggerLin, jasonliu, hubert.reinterpretcast

Reviewed By: hubert.reinterpretcast

Subscribers: hubert.reinterpretcast, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71125
2019-12-17 16:59:22 -05:00
Ulrich Weigand 1e89188d35 [FPEnv] Remove unnecessary rounding mode argument for constrained intrinsics
The following intrinsics currently carry a rounding mode metadata argument:

    llvm.experimental.constrained.minnum
    llvm.experimental.constrained.maxnum
    llvm.experimental.constrained.ceil
    llvm.experimental.constrained.floor
    llvm.experimental.constrained.round
    llvm.experimental.constrained.trunc

This is not useful since the semantics of those intrinsics do not in any way
depend on the rounding mode. In similar cases, other constrained intrinsics
do not have the rounding mode argument. Remove it here as well.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D71218
2019-12-17 21:10:36 +01:00
QingShan Zhang 0d8929ce76 [NFC][Test][PowerPC] Add the test to verify the mask with constant 2019-12-17 07:04:19 +00:00
Jim Lin 7e0fd77645 [PowerPC] Fix %llvm.ppc.altivec.vc* lowering
Summary:
r372285 changed LLVM to use a `TargetConstant` for parameters of intrinsics that are required to be immediates.

Since that commit, use of `%llvm.ppc.altivec.vc{fsx,fux,tsxs,tuxs}` intrinsics has not worked, and resulted in a `LLVM ERROR: Cannot select: intrinsic %llvm.ppc.altivec.vc*` error. The intrinsics' TableGen definitions matched on `imm` instead of `timm`.

This commit updates those definitions to use `timm`.

Fixes: https://llvm.org/PR44239

Reviewers: hfinkel, nemanjai, #powerpc, Jim

Reviewed By: Jim

Subscribers: qiucf, wuzish, Jim, hiraditya, kbarton, jsji, shchenz, llvm-commits

Tags: #llvm

Patched by vddvss (Colin Samples).

Differential Revision: https://reviews.llvm.org/D71138
2019-12-16 10:21:55 +08:00
Sean Fertile 93faa237da [PowerPC] Add Support for indirect calls on AIX.
Extends the descriptor-based indirect call support for 32-bit codegen,
and enables indirect calls for AIX.

In-depth Description:
In a function descriptor based ABI, a function pointer points at a
descriptor structure as opposed to the function's entry point. The
descriptor takes the form of 3 pointers: 1 for the function's entry
point, 1 for the TOC anchor of the module containing the function
definition, and 1 for the environment pointer:

struct FunctionDescriptor {
  void *EntryPoint;
  void *TOCAnchor;
  void *EnvironmentPointer;
};

An indirect call has several steps of loading the information from
the descriptor into the proper registers for setting up the call. Namely
it has to:

1) Save the caller's TOC pointer into the TOC save slot in the linkage
   area, and then load the callee's TOC pointer into the TOC register
   (GPR 2 on AIX).

2) Load the function descriptor's entry point into the count register.

3) Load the environment pointer into the environment pointer register
   (GPR 11 on AIX).

4) Perform the call by branching on count register.

5) Restore the caller's TOC pointer after returning from the indirect call.

A couple of important caveats to the above:

- There is no way to directly load a value from memory into the count register.
  Instead we populate the count register by loading the entry point address into
  a gpr and then moving the gpr to the count register.

- The TOC restore has to come immediately after the branch on count register
  instruction (i.e., the 1st instruction executed after we return from the
  call). This is an implementation limitation. We could, in theory, schedule
  the restore elsewhere as long as no uses of the TOC pointer fall in between
  the call and the restore; however, to keep it simple, we insert a pseudo
  instruction that represents both the indirect branch instruction and the
  load instruction that restores the caller's TOC from the linkage area. As
  they flow through the compiler as a single pseudo instruction, nothing can be
  inserted between them and the caller's TOC is then valid at any use.

Differential Revision: https://reviews.llvm.org/D70724
2019-12-13 20:07:00 -05:00
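
The five call steps described in 93faa237da can be sketched as a small Python simulation. This is an illustration only, not the backend's implementation: the `Machine` class, its field names, and `indirect_call` are hypothetical, and a Python callable stands in for a code address. Only the descriptor layout and the order of the steps come from the commit message.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FunctionDescriptor:
    entry_point: Callable[["Machine"], None]  # stands in for a code address
    toc_anchor: int
    environment_pointer: int


@dataclass
class Machine:  # hypothetical register model, for illustration only
    r2: int = 0                      # TOC register (GPR 2 on AIX)
    r11: int = 0                     # environment pointer register (GPR 11 on AIX)
    ctr: Optional[Callable] = None   # count register
    toc_save_slot: int = 0           # TOC save slot in the linkage area


def indirect_call(m: Machine, desc: FunctionDescriptor) -> None:
    m.toc_save_slot = m.r2              # 1) save the caller's TOC pointer...
    m.r2 = desc.toc_anchor              #    ...and load the callee's
    gpr = desc.entry_point              # 2) no direct memory-to-CTR load,
    m.ctr = gpr                         #    so go through a GPR first
    m.r11 = desc.environment_pointer    # 3) load the environment pointer
    m.ctr(m)                            # 4) branch on the count register
    m.r2 = m.toc_save_slot              # 5) restore the caller's TOC right after


seen = {}


def callee(m: Machine) -> None:
    # Record what the callee observes in the TOC and environment registers.
    seen["toc"], seen["env"] = m.r2, m.r11


machine = Machine(r2=0x1000)
indirect_call(machine, FunctionDescriptor(callee, toc_anchor=0x2000, environment_pointer=0x30))
```

Steps 4 and 5 sit on adjacent lines with nothing between them, which mirrors the single pseudo instruction that the commit uses to keep the branch and the TOC restore together.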
Sanjay Patel 2f0c7fd2db [DAGCombiner] fold shift-trunc-shift to shift-mask-trunc (2nd try)
The initial attempt (rG89633320) botched the logic by reversing
the source/dest types. Added x86 tests for additional coverage.
The vector tests show a potential improvement (fold vector load
instead of broadcasting), but that's a known/existing problem.

This fold is done in IR by instcombine, and we have a special
form of it already here in DAGCombiner, but we want the more
general transform too:
https://rise4fun.com/Alive/3jZm

Name: general
Pre: (C1 + zext(C2) < 64)
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%a = and i64 %s2, zext((1 << (16 - C2)) - 1)
%r = trunc %a to i16

Name: special
Pre: C1 == 48
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%r = trunc %s2 to i16

...because D58017 exposes a regression without this fold.
2019-12-13 14:03:54 -05:00
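
The two Alive rules quoted in 2f0c7fd2db can be checked numerically with a short Python sketch. It models i64 and i16 values as masked Python integers; the function names are made up for this illustration and do not correspond to anything in DAGCombiner.

```python
M64 = (1 << 64) - 1
M16 = (1 << 16) - 1


def original(x, c1, c2):
    """lshr i64 by C1, trunc to i16, then lshr i16 by C2."""
    s = (x & M64) >> c1
    t = s & M16
    return t >> c2


def general_fold(x, c1, c2):
    """lshr i64 by C1 + C2, mask to the surviving 16 - C2 bits, then trunc."""
    s2 = (x & M64) >> (c1 + c2)
    a = s2 & ((1 << (16 - c2)) - 1)
    return a & M16


def special_fold(x, c2):
    """With C1 == 48 only 16 bits remain after the first shift, so no mask is needed."""
    s2 = (x & M64) >> (48 + c2)
    return s2 & M16
```

The mask in `general_fold` clears the bits that the truncation would have discarded before the second shift. When C1 is 48 those bits are already zero, which is why the special form can drop the `and`.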
Sanjay Patel 9432937190 Revert "[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc"
This reverts commit 8963332c33.
There was a logic bug typo in this code, but it wasn't visible in the asm for the tests.
2019-12-12 16:24:40 -05:00
Sanjay Patel 8963332c33 [DAGCombiner] fold shift-trunc-shift to shift-mask-trunc
This fold is done in IR by instcombine, and we have a special
form of it already here in DAGCombiner, but we want the more
general transform too:
https://rise4fun.com/Alive/3jZm

Name: general
Pre: (C1 + zext(C2) < 64)
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%a = and i64 %s2, zext((1 << (16 - C2)) - 1)
%r = trunc %a to i16

Name: special
Pre: C1 == 48
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%r = trunc %s2 to i16

...because D58017 exposes a regression without this fold.
2019-12-12 15:44:13 -05:00
Sanjay Patel 927a6614bc [AArch64][PowerPC] add tests for shift sandwich; NFC 2019-12-12 12:37:02 -05:00
Danila Kutenin 19e83a9b4c [ValueTracking] Pointer is known nonnull after load/store
If the pointer was loaded/stored before the null check, the check
is redundant and can be removed. For now the optimizers do not
remove the nullptr check, see https://gcc.godbolt.org/z/H2r5GG.
This patch lets the optimizers use more nonnull constraints. It also
exposed one more optimization in a PowerPC test. This is my first LLVM
review, and I am open to any comments.

Differential Revision: https://reviews.llvm.org/D71177
2019-12-11 20:32:29 +01:00
czhengsz bf4580b7e7 [PowerPC][NFC] add test case for lwa - loop ds form prep 2019-12-11 06:10:11 -05:00
QingShan Zhang f99297176c [PowerPC] Exploit the Vector Integer Average Instructions
PowerPC has an instruction that implements the semantics of this piece of code:

vector int foo(vector int m, vector int n) {
  return (m + n + 1) >> 1;
}
This patch adds the match rule to select it.

Differential Revision: https://reviews.llvm.org/D71002
2019-12-11 07:25:57 +00:00
diggerlin 98f5f022f0 [BUG-FIX][XCOFF] fixed a bug of XCOFFObjectFile.cpp when there is padding at the last csect of a section
SUMMARY:
  Fixed a bug in XCOFFObjectFile.cpp that occurs when there is padding at the last csect of a section.
When a section has tail padding, the value of CurrentAddressLocation is not increased by the padding size, so the following assert is hit: assert(CurrentAddressLocation == Section->Address && "We should have no padding between sections.");

Reviewers: daltenty,hubert.reinterpretcast,

Differential Revision: https://reviews.llvm.org/D70859
2019-12-10 11:14:49 -05:00
Jinsong Ji 3d41a58eac [PowerPC][NFC] Rename ANDI(S)o8 to ANDI(S)8o
Summary:
This was found during https://reviews.llvm.org/D70758
All the other record forms have the suffix o at the end.
ANDIo8 and ANDISo8 are the only two that put o before 8.

This patch renames them to be consistent with the others.

Reviewers: #powerpc, hfinkel, nemanjai, lei, steven.zhang, echristo, jhibbits, joerg

Reviewed By: jhibbits

Subscribers: wuzish, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70928
2019-12-09 19:21:34 +00:00
Amaury Séchet d7aded3937 [PowerPC] Automatically generate store-constant.ll . NFC 2019-12-09 01:08:18 +01:00
Kai Luo 884351547d [PowerPC] Fix MI peephole optimization for splats
Summary:
This patch fixes an issue where the PPC MI peephole optimization pass incorrectly removes a vector swap.

Specifically, the pass can combine a splat/swap to a splat/copy. It uses `TargetRegisterInfo::lookThruCopyLike` to determine that the operands to the splat are the same. However, the current logic only compares the operands based on register numbers. In the case where the splat operands are ultimately fed from the same physical register, the pass can incorrectly remove a swap if the feed register for one of the operands has been clobbered.

This patch adds a check to ensure that the feeding registers are both virtual registers, or that the operands to the splat or swap are both the same register.

Here is an example in pseudo-MIR of what happens in the test case added in this patch:

Before PPC MI peephole optimization:
```
%arg = XVADDDP %0, %1

$f1 = COPY %arg.sub_64
call double rint(double)
%res.first = COPY $f1
%vec.res.first = SUBREG_TO_REG 1, %res.first, %subreg.sub_64

%arg.swapped = XXPERMDI %arg, %arg, 2
$f1 = COPY %arg.swapped.sub_64
call double rint(double)
%res.second = COPY $f1

%vec.res.second = SUBREG_TO_REG 1, %res.second, %subreg.sub_64
%vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
%vec.res = XXPERMDI %vec.res.splat, %vec.res.splat, 2
; %vec.res == [ %vec.res.second[0], %vec.res.first[0] ]
```

After optimization:
```
; ...
%vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
; lookThruCopyLike(%vec.res.first) == lookThruCopyLike(%vec.res.second) == $f1
; so the pass replaces the swap with a copy:
%vec.res = COPY %vec.res.splat
; %vec.res == [ %vec.res.first[0], %vec.res.second[0] ]
```

As best as I can tell, this has occurred since r288152, which added support for lowering certain vector operations to direct moves in the form of a splat.

Committed for vddvss (Colin Samples). Thanks Colin for the patch!
Differential Revision: https://reviews.llvm.org/D69497
2019-12-07 14:51:20 +08:00
Guozhi Wei 72942459d0 [MBP] Avoid tail duplication if it can't bring benefit
The current tail duplication integrated in bb layout is designed to increase the fallthrough from a BB's predecessor to its successor, but we have observed cases where duplication doesn't increase fallthrough, or where it brings too much size overhead.

To overcome these two issues, I add two checks in function canTailDuplicateUnplacedPreds:

  1. Make sure there is at least one duplication in the current work set.
  2. The number of duplications should not exceed the number of successors.

The modification in hasBetterLayoutPredecessor fixes a bug that potential predecessor must be at the bottom of a chain.

Differential Revision: https://reviews.llvm.org/D64376
2019-12-06 09:53:53 -08:00
diggerlin 4a7e00df34 [AIX][XCOFF] created a test case to verify the raw text section of xcoffobject file
SUMMARY:
In the patch https://reviews.llvm.org/D66969 we needed a test case to verify that the raw text section of the XCOFF object file is correct.

At that time we did not have LLVM disassembly tools that could dump an XCOFF object file. Since the patch https://reviews.llvm.org/D70255 was committed, we have tools for it, so this patch creates the test case.

Reviewers: daltenty,hubert.reinterpretcast,

Differential Revision: https://reviews.llvm.org/D70719
2019-12-06 10:12:09 -05:00
David Tenty 1ea1e053f6 [AIX] Make sure to use QualNames for external global objects
Summary: Previously we only handled the case where the csect hadn't been set up yet, so we'd hit an assert later on.

Reviewers: jasonliu, DiggerLin, stevewan

Reviewed By: jasonliu

Subscribers: hubert.reinterpretcast, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71032
2019-12-05 15:22:53 -05:00
Kai Luo b200c5180e Reland [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation.
Fix assertion error
```
bool llvm::MachineOperand::isRenamable() const: Assertion `Register::isPhysicalRegister(getReg()) && "isRenamable should only be checked on physical registers"' failed.
```
by checking if the register is 0 before invoking `isRenamable`.
2019-12-05 14:32:11 +08:00
Kai Luo 3882edbe19 Revert "[MachineCopyPropagation] Extend MCP to do trivial copy backward propagation"
This reverts commit 75b3a1c318, since it
breaks bootstrap build.
2019-12-05 12:48:37 +08:00
Kai Luo 75b3a1c318 [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation
Summary:
This patch mainly performs the following transformation:
```
$R0 = OP ...
... // No read/clobber of $R0 and $R1
$R1 = COPY $R0 // $R0 is killed
```
After replacing $R0 with $R1 and removing the COPY, we have
```
$R1 = OP ...
```
This transformation can also expose more opportunities for existing
copy elimination in MCP.

Differential Revision: https://reviews.llvm.org/D67794
2019-12-05 10:59:07 +08:00
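
The transformation from 75b3a1c318 can be sketched on a toy instruction list in Python. This is a simplified model under stated assumptions, not the MachineCopyPropagation code: the `Instr` record, the `kills` set, and `backward_propagate` are hypothetical names, and real machine instructions carry far more state (subregisters, implicit operands, renamable flags) than is modelled here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Instr:  # hypothetical toy instruction, not an LLVM MachineInstr
    opcode: str
    dest: str
    srcs: tuple = ()
    kills: frozenset = frozenset()  # source registers whose last use is here


def touches(instr, reg):
    """True if the instruction reads or writes `reg`."""
    return instr.dest == reg or reg in instr.srcs


def backward_propagate(block):
    out = list(block)
    i = 0
    while i < len(out):
        copy = out[i]
        if copy.opcode == "COPY" and copy.srcs[0] in copy.kills:
            src, dst = copy.srcs[0], copy.dest
            # Walk backward looking for the instruction that defines `src`.
            for j in range(i - 1, -1, -1):
                cand = out[j]
                if cand.dest == src:
                    out[j] = replace(cand, dest=dst)  # $R1 = OP ...
                    del out[i]                        # drop the COPY
                    i -= 1
                    break
                if touches(cand, src) or touches(cand, dst):
                    break  # an intervening read or clobber blocks the rewrite
        i += 1
    return out


copy = Instr("COPY", "$R1", ("$R0",), frozenset({"$R0"}))
ok = backward_propagate(
    [Instr("OP", "$R0", ("$R5",)), Instr("ADD", "$R2", ("$R3", "$R4")), copy]
)
blocked = backward_propagate(
    [Instr("OP", "$R0", ("$R5",)), Instr("USE", "$R2", ("$R0",)), copy]
)
```

In `ok` the COPY disappears and the defining instruction writes `$R1` directly. In `blocked` the intervening read of `$R0` prevents the rewrite, matching the "No read/clobber of $R0 and $R1" condition in the commit message.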
jasonliu 5422e81a89 [XCOFF][AIX] Emit TOC entries for object file generation
Summary:
Implement emitTCEntry for PPCTargetXCOFFStreamer.
Add TC csects to TOCCsects for object file writing.

Note:

1. I did not include any raw data testing for this object file generation
because TC entries raw data will all be 0 without relocation implemented.
I will add raw data testing as part of relocation testing later.
2. I removed "Symbol->setFragment(F);" for common symbols because we
 don't need it, and if we have it then we would hit assertions below:
Assertion `(SymbolContents == SymContentsUnset ||
            SymbolContents == SymContentsOffset) &&
            "Cannot get offset for a common/variable symbol"' failed.
3. Fixed incorrect TOC-base alignment.

Differential Revision: https://reviews.llvm.org/D70798
2019-12-04 16:44:44 +00:00
czhengsz f0ba1aec35 [PowerPC] folding rlwinm + rlwinm to rlwinm
For example:
    x3 = rlwinm x3, 27, 5, 31
    x3 = rlwinm x3, 19, 0, 12
  can be combined to
    x3 = rlwinm x3, 14, 0, 12

Reviewed by: steven.zhang, lkail

Differential Revision: https://reviews.llvm.org/D70374
2019-12-03 21:51:19 -05:00
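
The example in f0ba1aec35 can be checked with a small Python model of `rlwinm` (rotate left, then AND with a mask whose bits are numbered from the most significant bit). This is a sketch for the one example in the commit message; it says nothing about which other operand combinations the patch accepts. The helper names are chosen for this illustration.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit value left by `shift` bits."""
    shift %= 32
    value &= MASK32
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def ppc_mask(mb, me):
    """Ones from bit mb through bit me, where bit 0 is the most significant bit."""
    head = MASK32 >> mb                      # ones in bits mb..31
    tail = (MASK32 << (31 - me)) & MASK32    # ones in bits 0..me
    if mb <= me:
        return head & tail
    return head | tail                       # wrap-around mask


def rlwinm(value, sh, mb, me):
    return rotl32(value, sh) & ppc_mask(mb, me)


def two_step(x):
    return rlwinm(rlwinm(x, 27, 5, 31), 19, 0, 12)


def folded(x):
    return rlwinm(x, 14, 0, 12)
```

The rotate amounts add up modulo 32 (27 + 19 = 46, which is 14). The first mask, rotated left by 19, still covers bits 0 through 12, so ANDing with the second mask leaves exactly the second mask.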
Craig Topper f586fd44e4 [FPEnv] [PowerPC] Lowering ppc_fp128 StrictFP Nodes to libcalls
This is an alternative to D64662 that shares more code between
strict and non-strict nodes. It's modeled after the implementation
that I did for softening.

Differential Revision: https://reviews.llvm.org/D70867
2019-12-03 14:11:21 -08:00
Taewook Oh 2da205d43e Reland "b19ec1eb3d0c [BPI] Improve unreachable/ColdCall heuristics to handle loops."
Summary: b19ec1eb3d has been reverted because of the test failures
with PowerPC targets. This patch addresses the issues from the previous
commit.

Test Plan: ninja check-all. Confirmed that CodeGen/PowerPC/pr36292.ll
and CodeGen/PowerPC/sms-cpy-1.ll pass

Subscribers: llvm-commits
2019-12-02 10:28:40 -08:00
Nemanja Ivanovic 241cbf201a [PowerPC] Fix crash in peephole optimization
When converting reg+reg shifts to reg+imm rotates, we neglect to consider the
CodeGenOnly versions of the 32-bit shift mnemonics. This means we produce a
rotate with missing operands which causes a crash.

Committing this fix without review since it is non-controversial that the list
of mnemonics to consider should include the 64-bit aliases for the exact
mnemonics.

Fixes PR44183.
2019-12-02 08:56:04 -06:00
Sean Fertile 26ab827c24 [PowerPC][AIX] Add support for lowering int/float/double formal arguments.
This patch adds LowerFormalArguments_AIX, which supports lowering
int, float, and double formal arguments into general purpose and
floating point registers only.

The AIX calling convention test cases have been redone to test for caller
and callee functionality in the same lit test.

Patch by Zarko Todorovski!

Differential Revision: https://reviews.llvm.org/D69578
2019-11-29 12:46:53 -05:00
David Tenty 98740643f7 [AIX] Emit TOC entries for ASM printing
Summary:
Emit the correct .toc pseudo op when we change to the TOC and emit
TC entries. Make sure TOC pseudos get the right symbols via overriding
getMCSymbolForTOCPseudoMO on AIX. Add a test for TOC assembly writing
and update tests to include TOC entries.

Also make sure external globals have a csect set and handle external function descriptor (originally authored by Jason Liu) so we can emit TOC entries for them.

Reviewers: DiggerLin, sfertile, Xiangling_L, jasonliu, hubert.reinterpretcast

Reviewed By: jasonliu

Subscribers: arphaman, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70461
2019-11-27 17:20:55 -05:00
Stefan Pintilie dcceab1a0a [PowerPC] Add new Future CPU for PowerPC in LLVM
This is a continuation of D70262
The previous patch as listed above added the future CPU in clang. This patch
adds the future CPU in the PowerPC backend. At this point the patch simply
assumes that a future CPU will have the same characteristics as pwr9. Those
characteristics may change with later patches.

Differential Revision: https://reviews.llvm.org/D70333
2019-11-27 14:30:06 -06:00
czhengsz 98189755cd [PowerPC] [NFC] change PPCLoopPreIncPrep class name after D67088.
After https://reviews.llvm.org/D67088, the PPCLoopPreIncPrep pass can prepare more instruction forms besides the pre-increment form, such as DS/DQ forms.

This patch is a follow-up to https://reviews.llvm.org/D67088 that renames the pass.

Reviewed by: jsji

Differential Revision: https://reviews.llvm.org/D70371
2019-11-26 23:58:00 -05:00
jasonliu 7707d8aa9d [XCOFF][AIX] Check linkage on the function, and two fixes for comments
This is a follow up commit to address post-commit comment in D70443

Differential revision: https://reviews.llvm.org/D70443
2019-11-26 16:09:31 +00:00
Nemanja Ivanovic 7fbaa8097e [PowerPC] Fix VSX clobbers of CSR registers
If an inline asm statement clobbers a VSX register that overlaps with a
callee-saved Altivec register or FPR, we will not record the clobber and will
therefore violate the ABI. This is clearly a bug so this patch fixes it.

Differential revision: https://reviews.llvm.org/D68576
2019-11-25 11:41:34 -06:00
jasonliu 906ecae2ed [AIX][XCOFF] Generate undefined symbol in symbol table for external function call
Summary:
This patch sets up the infrastructure for

 1. Associating an MCSymbolXCOFF with an MCSectionXCOFF when it could not
    get implicitly associated.
 2. Generating undefined symbols. The patch itself generates undefined symbols
    for external function calls only. Generating undefined symbols for external
    global variables and external function descriptors will be handled in
    separate patch(es) after this lands.

Differential Revision: https://reviews.llvm.org/D70443
2019-11-25 15:02:01 +00:00
QingShan Zhang bae5aac1ff [NFC][Test] Adding the test for bswap + logic op for PowerPC 2019-11-25 08:21:12 +00:00
czhengsz d1c16598b7 Revert "[PowerPC] combine rlwinm+rlwinm to rlwinm"
This reverts commit 29f6f9b2b2.
2019-11-24 22:46:26 -05:00
Amy Kwan d1dded28da [PowerPC] Spill CR LT bits on P9 using setb
This patch aims to spill CR[0-7]LT bits on POWER9 using the setb instruction.
The sequence on P9 to spill these bits will be:

setb %reg, %CRREG
stw %reg, $FI

Instead of the typical sequence:

mfocrf %reg, %CRREG
rlwinm %reg1, %reg, $SH, 0, 0
stw %reg1, $FI

Differential Revision: https://reviews.llvm.org/D68443
2019-11-24 00:27:40 -06:00
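
The two spill sequences in d1dded28da can be compared with a small Python model. It is a sketch under assumptions: the shift amount `$SH` is taken to be 4 times the CR field number (consistent with the `rlwinm r3,r3,20,0,0` used for cr5 elsewhere in this log), `mfocrf` is modelled as zeroing the bits it does not copy, and the function names are invented for this illustration. The commit message does not describe how the spilled word is reloaded, so the sketch only compares what each sequence stores.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit value left by `shift` bits."""
    shift %= 32
    value &= MASK32
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def old_spill_word(cr, field):
    """mfocrf + rlwinm: isolate one CR field, rotate its LT bit into the top bit."""
    moved = cr & (0xF << (28 - 4 * field))        # mfocrf, other bits modelled as zero
    return rotl32(moved, 4 * field) & 0x80000000  # rlwinm reg1, reg, SH, 0, 0


def setb_spill_word(cr, field):
    """setb: -1 if the field's LT bit is set, else 1 if GT is set, else 0."""
    bits = (cr >> (28 - 4 * field)) & 0xF         # LT, GT, EQ, SO from high to low
    if bits & 0b1000:
        return MASK32
    if bits & 0b0100:
        return 1
    return 0


def msb(word):
    return (word >> 31) & 1
```

Under this model the stored words differ, but the most significant bit of each equals the LT bit of the chosen CR field, and the setb form needs one instruction fewer before the store.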
jasonliu af8576ff9d [XCOFF][AIX] Read-only data section object file generation
Summary:
This patch is a follow up on read-only assembly patch D70182.
It intends to enable object file generation for the read-only data section on AIX.

Reviewers: DiggerLin, daltenty

Differential Revision: https://reviews.llvm.org/D70455
2019-11-22 15:49:37 +00:00
QingShan Zhang a4cc895aee [PowerPC] Implement the vector extend sign instruction pattern match
Power9 has instructions to implement the semantics of SIGN_EXTEND_INREG for vector type.
Mark it as legal and add the match pattern.

Differential Revision: https://reviews.llvm.org/D69601
2019-11-22 08:58:27 +00:00
czhengsz 29f6f9b2b2 [PowerPC] combine rlwinm+rlwinm to rlwinm
combine
x3 = rlwinm x3, 27, 5, 31
x3 = rlwinm x3, 19, 0, 12

to
x3 = rlwinm x3, 14, 0, 12

Reviewed by: steven.zhang

Differential Revision: https://reviews.llvm.org/D70374
2019-11-22 00:00:33 -05:00
Xing Xue 5665fc91fe [AIX][XCOFF] Add support for generating assembly code for one-byte mergeable strings
This patch adds support for generating assembly code for one-byte mergeable strings.

Generating assembly code for multi-byte mergeable strings and the `XCOFF` object code for mergeable strings will be supported later.

Reviewers: hubert.reinterpretcast, jasonliu, daltenty, sfertile, DiggerLin, Xiangling_L

Reviewed by: daltenty

Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70310
2019-11-20 11:26:49 -05:00
Xiangling Liao ca33727abe [AIX] Lowering jump table, constant pool and block address in asm
This patch lowers jump tables, constant pools, and block addresses in assembly.
1. On AIX, jump table index is always relative;
2. Put CPI and JTI into ReadOnlySection until we support unique data sections;
3. Create the temp symbol for block address symbol;
4. Update MIR testcases and add related assembly part;

Differential Revision: https://reviews.llvm.org/D70243
2019-11-20 10:27:15 -05:00
jasonliu c9edaa828e [AIX][XCOFF] Write Function descriptors and TOC base to data section
This patch implements writing function descriptors and the TOC base into
the data section, and also adds function descriptors (both csect and label)
and TOC base symbols to the symbol table.
2019-11-19 16:11:00 +00:00
Simon Pilgrim c7f85f3a84 [PowerPC] Regenerate vsx_insert_extract_le.ll tests 2019-11-19 13:18:44 +00:00
diggerlin 5e0a4eddac Adding a test case for read-only data assembly writing for aix
SUMMARY:

Adding a test case for read-only data assembly writing for AIX

Reviewers: daltenty,Xiangling_Liao
Subscribers: rupprecht, seiyai,hiraditya

Differential Revision: https://reviews.llvm.org/D70182
2019-11-18 17:07:13 -05:00
Stefan Pintilie 6512473cee [PowerPC] Improve float vector gather codegen
This patch aims to improve the code generation for float vector gather on POWER9.
Patterns have been implemented to utilize instructions that deliver improved
performance.

Patch by: Kamau Bridgeman

Differential Revision: https://reviews.llvm.org/D62908
2019-11-18 15:53:32 -06:00
Stefan Pintilie 9d93893914 [PowerPC] Test case for vector float gather on ppc64le and ppc64
Test case to verify that the expected code is generated for a
vector float gather based on the patterns in tablegen for big
and little endian cases.

Patch by: Kamau Bridgeman

Differential Revision: https://reviews.llvm.org/D69443
2019-11-18 13:17:07 -06:00
czhengsz 1ce5fcda17 [PowerPC] [NFC] add IR testcases for folding rlwinma. 2019-11-18 07:43:30 -05:00
QingShan Zhang 03e7fb2e07 [NFC][Test] Add the vavg test for PowerPC 2019-11-18 10:41:47 +00:00
czhengsz a0337d269b [PowerPC] extend PPCPreIncPrep Pass for ds/dq form
Currently, the PPCPreIncPrep pass changes a loop to update form and updates all
loads/stores with the same base accordingly. We can do more for loads/stores with
the same base, for example, convert them to ds/dq form.

Reviewed by: jsji

Differential Revision: https://reviews.llvm.org/D67088
2019-11-17 21:38:43 -05:00
QingShan Zhang bcb6829ee6 [NFC] Add one test for PowerPC to verify the sext_inreg for vector type. 2019-11-14 10:57:05 +00:00
Jinsong Ji 228dd96c6f [PowerPC] Remove allow-deprecated-dag-overlap and fix broken tests
Summary:
This was found during review of https://reviews.llvm.org/D67088.

CHECK-DAG is non-overlapping after https://reviews.llvm.org/D47106.
-allow-deprecated-dag-overlap was introduced to temporarily accept the old
behavior.

But it actually hides some broken tests, e.g. `test/CodeGen/PowerPC/swaps-le-1.ll`.
The codegen has changed, but the CHECK-DAG still passes due to allowing `overlap`.

This patch removes the deprecated option and fixes the broken tests.

Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang, shchenz

Reviewed By: shchenz

Subscribers: shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69733
2019-11-12 15:18:54 +00:00
Nemanja Ivanovic 70193b21d1 [NFC] Fix test case after edab7dd426
The author of the patch forgot to add -verify-machineinstrs to the RUN
lines which would have made the issue appear on all bots. Added that
as well as a fix for the undefined register issue (after the hoisting).
2019-11-11 20:40:40 -06:00
Sean Fertile e5e2e0a66b [PowerPC][XCOFF] Add support for zero initialized global values.
For XCOFF, globals mapped into the .bss section are linked as COMMON
definitions. This behaviour is incorrect for zero initialized data, so
emit those to the .data section instead.

Differential Revision: https://reviews.llvm.org/D69528
2019-11-11 18:52:10 -05:00
Victor Huang 6b0af41ad7 Fixing PowerPC llc test cases for Disable hoisting MI to hotter basic blocks by adding powerpc triple 2019-11-11 23:47:47 +00:00
Victor Huang edab7dd426 Disable hoisting MI to hotter basic blocks
The current Hoist() function of the machine LICM pass does not check the frequencies of the source and destination basic blocks that an instruction is hoisted from and to.
There is a chance that an instruction is hoisted from a cold basic block to a hot one.

In this patch, we add options to disable machine instruction hoisting if the destination block is hotter.

Differential Revision: https://reviews.llvm.org/D63676
2019-11-11 21:32:56 +00:00
Yi-Hong Lyu 6bbfafd037 [CGP] Make ICMP_EQ use CR result of ICMP_S(L|G)T dominators
For example:

long long test(long long a, long long b) {
  if (a << b > 0)
    return b;
  if (a << b < 0)
    return a;
  return a*b;
}

Produces:

        sld. 5, 3, 4
        ble 0, .LBB0_2
        mr 3, 4
        blr
.LBB0_2:                                # %if.end
        cmpldi  5, 0
        li 5, 1
        isel 4, 4, 5, 2
        mulld 3, 4, 3
        blr

But the compare (cmpldi 5, 0) is redundant and can be removed (CR0 already
contains the result of that comparison).

The root cause of this is that LLVM converts signed comparisons into equality
comparison based on dominance. Equality comparisons are unsigned by default, so
we get either a record-form or cmp (without the l for logical) feeding a cmpl.
That is the situation we want to avoid here.

Differential Revision: https://reviews.llvm.org/D60506
2019-11-11 17:28:50 +00:00
Yi-Hong Lyu a3db9c08eb [PowerPC] Remove redundant CRSET/CRUNSET in custom lowering of known CR bit spills
We lower known CR bit spills (CRSET/CRUNSET) to load and spill the known value
but forgot to remove the redundant spills.

e.g., This sequence was used to spill a CRUNSET:
    crclr   4*cr5+lt
    mfocrf  r3,4
    rlwinm  r3,r3,20,0,0
    stw     r3,132(r1)

Custom lowering of known CR bit spills lowers it to:
    crxor 4*cr5+lt, 4*cr5+lt, 4*cr5+lt
    li  r3,0
    stw r3,132(r1)

The crxor is redundant if there is no use of 4*cr5+lt, so we should remove it.

Differential revision: https://reviews.llvm.org/D67722
2019-11-08 15:32:31 +00:00
Jason Liu 0dc0572b48 [XCOFF][AIX] Differentiate usage of label symbol and csect symbol
Summary:
 Previously we used symbols to represent labels and csects interchangeably, and that could be a problem.
There are cases where we need to add a storage mapping class to the symbol if that symbol is actually the name of a csect, but it is hard for us to figure out whether that symbol is a label or a csect.

This patch intends to do the following:
    1. Construct a QualName (a name that includes the storage mapping class)
       MCSymbolXCOFF for every MCSectionXCOFF.
    2. Keep a pointer to that QualName inside of MCSectionXCOFF.
    3. Use that QualName whenever we need a symbol that refers to that
       MCSectionXCOFF.
    4. Adapt to the snowball effect of the above changes in
       XCOFFObjectWriter.cpp.

Reviewers: xingxue, DiggerLin, sfertile, daltenty, hubert.reinterpretcast

Reviewed By: DiggerLin, daltenty

Subscribers: wuzish, nemanjai, mgorny, hiraditya, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69633
2019-11-08 09:30:10 -05:00
Nemanja Ivanovic 9af28400d6 [PowerPC] Option for enabling absolute jumptables with command line
This option allows the user to specify the use of absolute jumptables instead
of relative which is the default on most PPC subtargets.

Patch by Kamauu Bridgeman

Differential revision: https://reviews.llvm.org/D69108
2019-11-07 19:33:15 -06:00
QingShan Zhang 529bb8a980 [PowerPC] Fix the incorrect 'RM' flag set on load/store instr
The 'RM' flag models the "Rounding Mode" and it has nothing to do with the load/store instructions.

Differential Revision: https://reviews.llvm.org/D69551
2019-11-06 02:46:37 +00:00
David Green f01b9aa89e [MachineScheduler] Enable AA in PostRA Machine scheduler
This adds AA to Post-RA Machine Scheduling, allowing the pass more
freedom when handling memory operations.

My understanding is that this was just never done, not that it is
inherently incorrect to do so. The older PostRA List scheduler already
makes use of AA, it's just that the MI PostRA Scheduler was never taught
to use it.

Differential Revision: https://reviews.llvm.org/D69814
2019-11-05 11:58:50 +00:00
Jinsong Ji 40d0d4e233 Lower generic MASSV entries to PowerPC subtarget-specific entries
This patch (second of two patches) lowers the generic PowerPC vector
entries to PowerPC subtarget-specific entries.
For instance, the PowerPC generic entry 'cbrtd2_massv' is lowered to
'cbrtd2_P9' for the Power9 subtarget.

The first patch enables the vectorizer to recognize the IBM MASS vector
library routines. This patch specifically adds support for recognizing
the '-vector-library=MASSV' option, and defines mappings from IEEE
standard scalar math functions to generic PowerPC MASS vector
counterparts.
For instance, the generic PowerPC MASS vector entry for double-precision
'cbrt' function is '__cbrtd2_massv'

The overall support for MASS vector library is presented as such in two
patches for ease of review.

Patch by pjeeva01 (Jeeva P.)
Differential Revision: https://reviews.llvm.org/D59883
2019-11-04 17:17:24 +00:00
Bjorn Pettersson 56c22931bd [LDV][RAGreedy] Inform LiveDebugVariables about new VRegs added by InlineSpiller
Summary:
Make sure RAGreedy informs LiveDebugVariables about new VRegs
that are introduced at spill by InlineSpiller.

Consider this example

 LDV: !"var"	 [48r;128r):0 Loc0=%2

 48B   %2 = ...
 ...
 128B  %7 = ADD %2, ...

If %2 is spilled, the InlineSpiller will insert spill/reload
instructions and introduce some new vregs. So we get

 48B   %4 = ...
 56B   spill %4
 ...
 120B  reload %5
 128B  %3 = ADD %5, ...

In the past we did not inform LDV about this, and when reintroducing
DBG_VALUE instruction LDV still got information that "var" had the
location of the spilled register %2 for the interval [48r;128r).
The result was bad, since we mapped "var" to the spill slot even
before the spill happened:

 %4 = ...
 DBG_VALUE %spill.0, !"var"
 spill %4 to %spill.0
 ...
 reload %5
 %3 = ADD %5, ...

This patch will inform LDV about the interval split introduced
due to spilling. So the location map in LDV will become

 !"var"	[48r;56r):1 [56r;120r):0 [120r;128r):2 Loc0=%2 Loc1=%4 Loc2=%5

And when inserting DBG_VALUE instructions we get

 %4 = ...
 DBG_VALUE %4, !"var"
 spill %4 to %spill.0
 DBG_VALUE %spill.0, !"var"
 ...
 reload %5
 DBG_VALUE %5, !"var"
 %3 = ADD %5, ...

Fixes: https://bugs.llvm.org/show_bug.cgi?id=38899

Reviewers: jmorse, vsk, aprantl

Reviewed By: jmorse

Subscribers: dstenb, wuzish, MatzeB, qcolombet, nemanjai, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69584
2019-11-01 16:25:32 +01:00
jasonliu 8bd0c97810 [PowerPC][AIX] Adds support for writing the data section in object files
Adds support for generating the XCOFF data section in object files for global variables with initialization.

Merged aix-xcoff-common.ll into aix-xcoff-data.ll.

Changed variable name charr to chrarray in the test case to test if readobj works with 8-character names.

Authored by: xingxue

Reviewers: hubert.reinterpretcast, sfertile, jasonliu, daltenty, Xiangling_L.

Reviewed by: hubert.reinterpretcast, sfertile, daltenty.

Subscribers: DiggerLin, Wuzish, nemanjai, hiraditya, MaskRay, jsji, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67125
2019-10-30 18:44:35 +00:00
Xiangling Liao 5c9bdc79e1 [AIX] Lowering CPI/JTI/BA to MIR
Enable lowering of constant pool index, jump table index, and block address to MIR on AIX.

Differential Revision: https://reviews.llvm.org/D69264
2019-10-30 11:21:37 -04:00
QingShan Zhang f15cf93899 [PowerPC] Clear the sideeffect bit for those instructions that didn't have the match pattern
If the instruction has a match pattern, llvm-tblgen will infer the sideeffect bit from the match pattern and it works well.
If not, tblgen will set it to true, which hurts scheduling.

PowerPC has some instructions that don't specify a match pattern (e.g. LXSD), which are manually selected post-RA according
to the register pressure. We need to clear the sideeffect flag for these instructions.

Differential Revision: https://reviews.llvm.org/D69232
2019-10-30 07:59:32 +00:00
David Zarzycki f68925d450 [X86] Make memcmp vector lowering handle arbitrary expansions
Teach combineVectorSizedSetCCEquality() to handle arbitrary memcmp
expansions but do not change any default policy for now.

This also fixes a bug in the memcmp expansion itself when large
displacements are needed.

https://reviews.llvm.org/D69507
2019-10-30 09:12:57 +02:00
Nemanja Ivanovic 25a41ad242 [PowerPC] Emit scalar fp min/max instructions
VSX provides floating point minimum and maximum instructions that conform
to IEEE semantics. This legalizes the respective nodes and emits VSX code
for them. Furthermore, on Power9 cores we have xsmaxcdp and xsmincdp
instructions that conform to language semantics for the conditional operator
even in the presence of NaNs.
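
To make the difference between the two semantics concrete, here is a small Python sketch. The function names are made up for illustration, and signed zeros and signaling NaNs are deliberately not modeled. An IEEE maxNum-style maximum ignores a single NaN operand, while the conditional operator `a > b ? a : b` yields `b` whenever the compare is false, including when either operand is NaN:

```python
import math

def ieee_max_num(a, b):
    """IEEE-754 maxNum-style maximum: a single quiet NaN operand is ignored."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a > b else b

def conditional_max(a, b):
    """Language-level `a > b ? a : b`; any NaN makes the compare false."""
    return a if a > b else b
```

The two only disagree when the second operand is NaN and the first is not, which is why a compiler cannot freely substitute one form for the other.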

Differential revision: https://reviews.llvm.org/D62993
2019-10-28 19:13:33 -05:00
Nemanja Ivanovic 97e3626070 [PowerPC] Do not emit HW loop if the body contains calls to lrint/lround
These two intrinsics are lowered to calls so should prevent the formation of
CTR loops. In a subsequent patch, we will handle all currently known intrinsics
and prevent the formation of HW loops if any unknown intrinsics are encountered.

Differential revision: https://reviews.llvm.org/D68841
2019-10-28 17:23:08 -05:00
Sean Fertile 582e3c09d4 [AIX] Refactor AIX Call Lowering to use CCState. NFCI.
This patch reworks the AIX call lowering to use CCState. Some defensive errors
are added in this patch to protect from emitting bad code for calling convention
logic that has not been implemented by design. The use of CCState follows the
precedent of other targets and enables the reuse of calling convention logic in
LowerFormalArguments, which will be rewritten to also use CCState in a later
patch.

Patch by Chris Bowler.

Differential Revision: https://reviews.llvm.org/D69101
2019-10-28 12:44:22 -04:00
Sanjay Patel 1ebd4a2e3a [DAGCombiner] widen any_ext of popcount based on target support
This enhances D69127 (rGe6c145e0548e3b3de6eab27e44e1504387cf6b53)
to handle the looser "any_extend" cast in addition to zext.

This is a prerequisite step for canonicalizing in the other direction
(narrow the popcount) in IR - PR43688:
https://bugs.llvm.org/show_bug.cgi?id=43688
2019-10-28 10:07:12 -04:00
Sanjay Patel e6c145e054 [DAGCombiner] widen zext of popcount based on target support
zext (ctpop X) --> ctpop (zext X)
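
The fold is value-preserving because zero extension adds only zero bits, which cannot change the population count. A minimal Python check of that identity for the i8 to i64 case follows; the helpers are illustrative models, not LLVM APIs:

```python
def ctpop(value, width):
    """Population count of the low `width` bits of value."""
    return bin(value & ((1 << width) - 1)).count("1")

def zext(value, from_width, to_width):
    """Zero-extend: keep the low from_width bits, the new upper bits are zero."""
    assert to_width >= from_width
    return value & ((1 << from_width) - 1)

def fold_holds(x, narrow=8, wide=64):
    """True when zext(ctpop X) equals ctpop(zext X) for this input."""
    return zext(ctpop(x, narrow), narrow, wide) == ctpop(zext(x, narrow, wide), wide)
```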

This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688:
https://bugs.llvm.org/show_bug.cgi?id=43688

I'm not sure if any other targets are affected, but I found a missing fold for PPC, so added tests based on that.
The reason we widen all the way to 64-bit in these tests is because the initial DAG looks something like this:

  t5: i8 = ctpop t4
  t6: i32 = zero_extend t5  <-- created based on IR, but unused node?
    t7: i64 = zero_extend t5

Differential Revision: https://reviews.llvm.org/D69127
2019-10-25 14:10:51 -04:00
Sanjay Patel b74d7e5ccc [PowerPC] add test for popcnt with any_extend; NFC
A zext-specific variation of this case is proposed in D69127.
2019-10-25 12:43:44 -04:00
czhengsz 822059147b [PowerPC] [Peephole] fold frame offset by using index form to save add.
renamable $x6 = ADDI8 $x1, -80      ;;; 0 is replaced with -80
renamable $x6 = ADD8 killed renamable $x6, renamable $x5
STW killed renamable $r3, 4, killed renamable $x6 :: (store 4 into %ir.14, !tbaa !2)

After PEI there is a peephole opt opportunity to combine above -80 in ADDI8 with 4 in the STW to eliminate unnecessary ADD8.

Expected result:
renamable $x6 = ADDI8 $x1, -76
STWX killed renamable $r3, renamable $x5, killed renamable $x6 :: (store 4 into %ir.6, !tbaa !2)
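
The arithmetic behind the fold can be sketched as follows. This is a Python model of the address computation only; the function names are hypothetical, and it ignores 64-bit wraparound and the range limit on the ADDI8 immediate:

```python
def addr_before(x1, x5, frame_off=-80, disp=4):
    """Address stored to by the original ADDI8 + ADD8 + STW sequence."""
    x6 = x1 + frame_off   # ADDI8 $x1, -80
    x6 = x6 + x5          # ADD8 $x6, $x5
    return x6 + disp      # STW ..., 4, $x6

def addr_after(x1, x5, frame_off=-80, disp=4):
    """Address stored to by the folded ADDI8 + STWX sequence."""
    x6 = x1 + (frame_off + disp)  # ADDI8 $x1, -76
    return x5 + x6                # STWX ..., $x5, $x6
```

Both sequences store to the same address; the folded form just moves the displacement into the ADDI8 immediate so the index-form store can absorb the add.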

Reviewed by: stefanp

Differential Revision: https://reviews.llvm.org/D66329
2019-10-25 04:13:30 -04:00
Kai Luo 81c2a5bb39 Test commit via git. 2019-10-25 01:36:55 +00:00
Jinsong Ji 31d3c1d8b7 [PowerPC][NFC] Remove deprecated Function Attrs comments #2 2019-10-22 21:50:50 +00:00
Jinsong Ji cf57be9d34 [PowerPC][NFC] Remove deprecated Function Attrs comments 2019-10-22 21:38:31 +00:00
Nemanja Ivanovic f2c8f3b181 [PowerPC] Turn on CR-Logical reducer pass
This re-commits r375152 which was pulled in r375233 because it broke
the EXPENSIVE_CHECKS bot on Windows.

The reason for the failure was a bug in the pass that the commit turned
on by default. This patch fixes that bug and turns the pass back on.
This patch has been verified on the buildbot that originally failed
thanks to Simon Pilgrim.

Differential revision: https://reviews.llvm.org/D52431

llvm-svn: 375497
2019-10-22 12:20:38 +00:00
Simon Pilgrim 0a803dd822 [PowerPC] Regenerate test for D52431
llvm-svn: 375435
2019-10-21 17:45:51 +00:00
Nemanja Ivanovic dd7021d466 Revert r375152 as it is causing failures on EXPENSIVE_CHECKS bot
llvm-svn: 375233
2019-10-18 13:38:46 +00:00
Nemanja Ivanovic 8a3d7c9cbd [PowerPC] Turn on CR-Logical reducer pass
Quite a while ago, we implemented a pass that will reduce the number of
CR-logical operations we emit. It does so by converting a CR-logical operation
into a branch. We have kept this off by default because it seemed to cause a
significant regression with one benchmark.
However, that regression turned out to be due to a completely unrelated
reason - AADB introducing a self-copy that is a priority-setting nop and it was
just exacerbated by this pass.

Now that we understand the reason for the only degradation, we can turn this
pass on by default. We have long since fixed the cause for the degradation.

Differential revision: https://reviews.llvm.org/D52431

llvm-svn: 375152
2019-10-17 18:24:28 +00:00
Sanjay Patel 990c43380b [PowerPC] add tests for popcount with zext; NFC
llvm-svn: 375142
2019-10-17 17:44:04 +00:00
Xiangling Liao ffe2ec5170 [AIX] TOC pseudo expansion for 64bit large + 64bit small + 32bit large models
This patch provides support for pseudo ops including ADDIStocHA8, ADDIStocHA, LWZtocL,
LDtoc, LDtocL for AIX, lowering them from MIR to assembly.

Differential Revision: https://reviews.llvm.org/D68341

llvm-svn: 375113
2019-10-17 13:20:25 +00:00
Digger Lin fdfd6ab12e [XCOFF] Output object text section header and symbol entry for program code.
This is the remaining part of rG41ca91f2995b: [AIX][XCOFF] Output XCOFF
object text section header and symbol entry for program code.

SUMMARY:
Original form of this patch is provided by Stefan Pintillie.

1. The patch outputs the program code section header, the symbol entry for
 program code (PR), and the instructions into the raw text section.
2. The patch includes how to align and lay out the CSection in the text
 section.
3. The patch also reorganizes the code, moving some of it into a function
 (XCOFFObjectWriter::writeSymbolTableEntryForControlSection).

Additional: We cannot add a test for the raw data of the text section in this
 patch; outputting raw text section data needs a function description patch first.

Reviewers: hubert.reinterpretcast, sfertile, jasonliu, xingxue.
Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsjji.

Differential Revision: https://reviews.llvm.org/D66969

llvm-svn: 374923
2019-10-15 17:40:41 +00:00
Digger Lin 41ca91f299 [AIX][XCOFF] Output XCOFF object text section header and symbol entry for program code.
SUMMARY
Original form of this patch is provided by Stefan Pintillie.

The patch outputs the program code section header, the symbol entry for program code (PR), and the instructions into the raw text section.
The patch includes how to align and lay out the CSection in the text section.
The patch also reorganizes the code, moving some of it into a function (XCOFFObjectWriter::writeSymbolTableEntryForControlSection).
Additional: We cannot add a test for the raw data of the text section in this patch; outputting raw text section data needs a function description patch first.

Reviewers: hubert.reinterpretcast, sfertile, jasonliu, xingxue.
Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsjji.

Differential Revision: https://reviews.llvm.org/D66969

llvm-svn: 374914
2019-10-15 17:09:54 +00:00
Jeremy Morse ed29dbaafa [DebugInfo] Remove some users of DBG_VALUEs IsIndirect field
This patch kills off a significant user of the "IsIndirect" field of
DBG_VALUE machine insts. Brought up in PR41675, IsIndirect is
technically redundant as it can be expressed by the DIExpression of a
DBG_VALUE inst, and it isn't helpful to have two ways of expressing
things.

Rather than setting IsIndirect, have DBG_VALUE creators add an extra deref
to the inst's DIExpression. There should now be no appearances of
IsIndirect=True from isel down to LiveDebugVariables / VirtRegRewriter,
which is ensured by an assertion in LDVImpl::handleDebugValue. This means
we also get to delete the IsIndirect handling in LiveDebugVariables. Tests
can be upgraded by for example swapping the following IsIndirect=True
DBG_VALUE:

  DBG_VALUE $somereg, 0, !123, !DIExpression(DW_OP_foo)

With one where the indirection is in the DIExpression, by _appending_
a deref:

  DBG_VALUE $somereg, $noreg, !123, !DIExpression(DW_OP_foo, DW_OP_deref)

Which both mean the same thing. 

Most of the test changes in this patch are updates of that form; also some
changes in how the textual assembly printer handles these insts.

Differential Revision: https://reviews.llvm.org/D68945

llvm-svn: 374877
2019-10-15 10:46:24 +00:00
David Tenty 033d16cedc [AIX] Use .space instead of .zero in assembly
Summary:
The AIX system assembler does not understand .zero, so we should prefer
emitting .space.

Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68815

llvm-svn: 374564
2019-10-11 15:07:28 +00:00
Yi-Hong Lyu 2fbfb04ffe [PowerPC] Remove assertion "Shouldn't overwrite a register before it is killed"
The assertion is overzealous and fails tests like:

  renamable $x3 = LI8 0
  STD renamable $x3, 16, $x1
  renamable $x3 = LI8 0

Remove the assertion since the killed flag of $x3 is not mandatory.

Differential Revision: https://reviews.llvm.org/D68344

llvm-svn: 374515
2019-10-11 05:32:29 +00:00
Chen Zheng 92e00293fd [PowerPC] add testcase for ppc loop instr form prep - NFC
llvm-svn: 374273
2019-10-10 03:00:15 +00:00
Yi-Hong Lyu 6088f84398 [NFC][CGP] Tests for making ICMP_EQ use CR result of ICMP_S(L|G)T dominators
llvm-svn: 373876
2019-10-07 05:29:11 +00:00
David Bolvansky 41c934acaf [SelectionDAG] Add tests for LKK algorithm
Added some tests testing urem and srem operations with a constant divisor.

Patch by TG908 (Tim Gymnich)

Differential Revision: https://reviews.llvm.org/D68421

llvm-svn: 373830
2019-10-05 14:29:25 +00:00
Reid Kleckner 67cfa79c01 Revert [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
This reverts r371177 (git commit f879c68755)

It caused PR43566 by removing empty, address-taken MachineBasicBlocks.
Such blocks may have references from blockaddress or other operands, and
need more consideration to be removed.

See the PR for a test case to use when relanding.

llvm-svn: 373805
2019-10-04 22:24:21 +00:00
Kevin P. Neal 68b8052121 [FPEnv] Strict FP tests should use the requisite function attributes.
A set of function attributes is required in any function that uses constrained
floating point intrinsics. None of our tests use these attributes.

This patch fixes this.

These tests have been tested against the IR verifier changes in D68233.

Reviewed by:	andrew.w.kaylor, cameron.mcinally, uweigand
Approved by:	andrew.w.kaylor
Differential Revision:	https://reviews.llvm.org/D67925

llvm-svn: 373761
2019-10-04 17:03:46 +00:00
Jinsong Ji 4a6881eabc [PowerPC] Adjust the naming and operand order of fnmsub patterns
Summary:
This is follow up patch of https://reviews.llvm.org/D67595.
Adjust naming and the Commutable operands for additional patterns
to make it easier to read.

The testcase update also shows that we can save some unnecessary fmr as
well.

Reviewers: #powerpc, steven.zhang, hfinkel, nemanjai

Reviewed By: #powerpc, nemanjai

Subscribers: wuzish, hiraditya, kbarton, MaskRay, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68112

llvm-svn: 373652
2019-10-03 19:36:42 +00:00
Yi-Hong Lyu c7be067974 [PowerPC] Fix SH field overflow issue
Store rlwinm Rx, Ry, 32, 0, 31 as rlwinm Rx, Ry, 0, 0, 31 and store
rldicl Rx, Ry, 64, 0 as rldicl Rx, Ry, 0, 0. Otherwise the SH field overflows and
fails an assertion in the assembly printing stage.
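
The rewrite is safe because rotating by the full register width is the same as rotating by zero. A minimal Python sketch of that reasoning is below; `normalize_sh` and `rotl` are hypothetical helper names for illustration, not functions from the PowerPC backend:

```python
def normalize_sh(sh, width):
    """Reduce a rotate amount modulo the register width (32 or 64)."""
    return sh % width

def rotl(value, sh, width):
    """Rotate the low `width` bits of value left by sh positions."""
    mask = (1 << width) - 1
    value &= mask
    sh %= width
    return ((value << sh) | (value >> (width - sh))) & mask
```

With this model, a rotate amount of 32 on a word or 64 on a doubleword normalizes to 0, which fits in the SH field and produces the same value.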

Differential Revision: https://reviews.llvm.org/D66991

llvm-svn: 373519
2019-10-02 20:25:16 +00:00
Sanjay Patel 520876d83f [PowerPC] make tests immune to improved undef handling
The fma mutate test will not exercise what it was intended to test
once we simplify those ops immediately, but the test will still
pass with the existing CHECKs, so I'm leaving it in case that
still has minimal value.

llvm-svn: 373149
2019-09-28 13:34:53 +00:00
Xiangling Liao 3b808fb330 [AIX]Emit function descriptor csect in assembly
This patch emits the function descriptor csect for functions with definitions
under both 32-bit/64-bit mode on AIX.

Differential Revision: https://reviews.llvm.org/D66724

llvm-svn: 373009
2019-09-26 19:38:32 +00:00
Jinsong Ji eaf6746db0 [PowerPC] Add missing pattern for VSX Scalar Negative Multiply-Subtract Single Precision
Summary:
This was found during review of https://reviews.llvm.org/D66050.
In the simple test of fdiv, we fail to fold
```
        fneg 2, 2
        xsmaddasp 3, 2, 0
```
to
```
        xsnmsubasp 3, 2, 0
```
We have the patterns for Double Precision and vectors; only Single Precision
is missing, and this patch adds it.

Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang

Reviewed By: #powerpc, steven.zhang

Subscribers: wuzish, hiraditya, kbarton, MaskRay, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67595

llvm-svn: 372985
2019-09-26 15:11:33 +00:00
Sean Fertile b3a9320c08 Extends the expansion of the LWZtoc pseudo op for AIX.
Differential Revision: https://reviews.llvm.org/D67853

llvm-svn: 372772
2019-09-24 18:04:51 +00:00
Jinsong Ji 216be996d6 [NFC][PowerPC] Consolidate testing of common linkage symbols
Add a new file to test the code gen for common linkage symbol.
Remove common linkage in some other testcases to avoid distraction.

llvm-svn: 372426
2019-09-20 20:31:37 +00:00
Jinsong Ji ca4c5deae5 [NFC][PowerPC] Fast-isel VSX support test
We have fixed most of the VSX limitations in Fast-isel,
so we can remove the -mattr=-vsx for most testcases now.

llvm-svn: 372345
2019-09-19 18:18:18 +00:00
Nemanja Ivanovic 1461fb6e78 [PowerPC] Exploit single instruction load-and-splat for word and doubleword
We currently produce a load, followed by (possibly a move for integers and) a
splat as separate instructions. VSX has always had a splatting load for
doublewords, but as of Power9, we have it for words as well. This patch just
exploits these instructions.

Differential revision: https://reviews.llvm.org/D63624

llvm-svn: 372139
2019-09-17 16:45:20 +00:00
Lei Huang bfb197d7a3 [PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
This is a follow up patch from https://reviews.llvm.org/D57857 to handle
extract_subvector v4f32.  For cases where we fpext of v2f32 to v2f64 from
extract_subvector we currently generate on P9 the following:

  lxv 0, 0(3)
  xxsldwi 1, 0, 0, 1
  xscvspdpn 2, 0
  xxsldwi 3, 0, 0, 3
  xxswapd 0, 0
  xscvspdpn 1, 1
  xscvspdpn 3, 3
  xscvspdpn 0, 0
  xxmrghd 0, 0, 3
  xxmrghd 1, 2, 1
  stxv 0, 0(4)
  stxv 1, 0(5)

This patch custom lowers it to the following sequence:

  lxv 0, 0(3)       # load the v4f32 <w0, w1, w2, w3>
  xxmrghw 2, 0, 0   # Produce the following vector <w0, w0, w1, w1>
  xxmrglw 3, 0, 0   # Produce the following vector <w2, w2, w3, w3>
  xvcvspdp 2, 2     # FP-extend to <d0, d1>
  xvcvspdp 3, 3     # FP-extend to <d2, d3>
  stxv 2, 0(5)      # Store <d0, d1> (%vecinit11)
  stxv 3, 0(4)      # Store <d2, d3> (%vecinit4)

Differential Revision: https://reviews.llvm.org/D61961

llvm-svn: 372029
2019-09-16 20:04:15 +00:00
Jinsong Ji 07d824a7c3 [PowerPC][NFC] Add a testcase for fdiv expansion.
Pre-commit for following patch.

llvm-svn: 371938
2019-09-15 20:02:25 +00:00
Jinsong Ji 455a0db01a [PowerPC][NFC] Move codegen tests to PowerPC from MIR/PowerPC
All tests with -run-pass != none should not be in MIR/. See MIR/README.

```
Tests for codegen passes should NOT be here but in
test/CodeGen/sometarget. As
a rule of thumb this directory should only contain tests using
'llc -run-pass none'.
```

llvm-svn: 371857
2019-09-13 14:18:36 +00:00
Craig Topper 36e04d14e9 [PowerPC] Remove the SPE4RC register class and instead add f32 to the GPRC register class.
Summary:
Since the SPE4RC register class contains an identical set of registers
and an identical spill size to the GPRC class, it's slightly confusing
the tablegen emitter. It's preventing the GPRC_and_GPRC_NOR0 synthesized
register class from inheriting VTs and AltOrders from GPRC or GPRC_NOR0.
This is because SPE4RC is found first in the super register class list
when inheriting these properties and it doesn't set the VTs or
AltOrders the same way as GPRC or GPRC_NOR0.

This patch replaces all uses of SPE4RC with GPRC and allows GPRC and
GPRC_NOR0 to contain f32.

The test changes here are because the AltOrders are being inherited
to GPRC_NOR0 now.

Found while trying to determine if getCommonSubClass needs to take
a VT argument. It was originally added to support fp128 on x86-64, but
I've changed some things about that so that it might not be needed
anymore. But a PowerPC test crashed without it and I think it's
due to this subclass issue.

Reviewers: jhibbits, nemanjai, kbarton, hfinkel

Subscribers: wuzish, nemanjai, mehdi_amini, hiraditya, kbarton, MaskRay, dexonsmith, jsji, shchenz, steven.zhang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67513

llvm-svn: 371779
2019-09-12 22:07:35 +00:00
Qiu Chaofan b7fb5d0f6f [DAGCombiner] Improve division estimation of floating points.
Current implementation of estimating divisions loses precision since it
estimates reciprocal first and does multiplication.  This patch is to re-order
arithmetic operations in the last iteration in DAGCombiner to improve the
accuracy.
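
As a rough numeric sketch of the idea, written in Python rather than as DAG nodes: the first function refines the reciprocal and multiplies by the numerator at the end, while the second folds the numerator into the last Newton-Raphson step and corrects the quotient with the residual. The function names and the initial estimate are assumptions for illustration, and this is not claimed to be the exact sequence DAGCombiner emits:

```python
def refine_reciprocal(b, est, iterations):
    """Plain Newton-Raphson steps for 1/b: est = est + est * (1 - b * est)."""
    for _ in range(iterations):
        est = est + est * (1.0 - b * est)
    return est

def div_reciprocal_then_multiply(a, b, est, iterations):
    """Refine 1/b fully, then multiply by the numerator."""
    return a * refine_reciprocal(b, est, iterations)

def div_fused_last_step(a, b, est, iterations):
    """Fold the numerator into the final refinement step (iterations >= 1)."""
    est = refine_reciprocal(b, est, iterations - 1)
    q = a * est                    # approximate quotient
    return q + est * (a - b * q)   # correct q using the residual a - b*q
```

In the second form the final correction is computed from the residual of the quotient itself rather than of the reciprocal, so the last rounding step is applied to a value that is already close to a/b.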

Reviewed By: Sanjay Patel, Jinsong Ji

Differential Revision: https://reviews.llvm.org/D66050

llvm-svn: 371713
2019-09-12 07:51:24 +00:00
Guillaume Chatelet 48904e9452 [Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing
Summary:
This catches malformed mir files which specify alignment as log2 instead of pow2.
See https://reviews.llvm.org/D65945 for reference.
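
The two encodings are easy to confuse because small values are valid in both. A minimal Python sketch of the conversion and of the kind of check that rejects a non-power-of-two value follows; the helper names are hypothetical and are not the llvm::Align API:

```python
def log2_to_alignment(log2_align):
    """Convert a log2 encoding (e.g. 4) to a byte alignment (e.g. 16)."""
    return 1 << log2_align

def alignment_to_log2(alignment):
    """Convert a byte alignment to its log2; reject non-powers of two."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError(f"alignment {alignment} is not a power of two")
    return alignment.bit_length() - 1
```

For example, a file that writes an alignment of 3 meaning "2^3 bytes" fails the power-of-two check, which is how a log2-style value gets caught.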

This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67433

llvm-svn: 371608
2019-09-11 11:16:48 +00:00
Dmitri Gribenko 2bf8d77453 Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.""
This reverts commit r371502, it broke tests
(clang/test/CodeGenCXX/auto-var-init.cpp).

llvm-svn: 371507
2019-09-10 10:39:09 +00:00
Clement Courbet 612c260ec3 Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
With a fix for sanitizer breakage (see explanation in D60318).

llvm-svn: 371502
2019-09-10 09:18:00 +00:00
Kai Luo 73da43aeb3 [PowerPC][NFC] Update test assertions using update_llc_test_checks.py
Summary:
This patch is made due to https://reviews.llvm.org/rL371289 where typo
fixes failed.

Differential Revision: https://reviews.llvm.org/D67317

llvm-svn: 371483
2019-09-10 02:28:24 +00:00
Dmitri Gribenko d9c4060bd5 Revert "[MachineCopyPropagation] Remove redundant copies after TailDup via machine-cp"
This reverts commit 371359. I'm suspecting a miscompile, I posted a
reproducer to https://reviews.llvm.org/D65267.

llvm-svn: 371421
2019-09-09 16:46:45 +00:00
Kai Luo 9115c477bb [MachineCopyPropagation] Remove redundant copies after TailDup via machine-cp
Summary:
After tailduplication, we have redundant copies. We can remove these
copies in machine-cp if it's safe to, i.e.
```
$reg0 = OP ...
... <<< No read or clobber of $reg0 and $reg1
$reg1 = COPY $reg0 <<< $reg0 is killed
...
<RET>
```
will be transformed to
```
$reg1 = OP ...
...
<RET>
```
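
A toy model of the transformation, in Python over a list of instruction records. The record layout (`op`, `def`, `uses`, `kills`) and the function name are made up for illustration; the real pass tracks far more state (register aliases, regmasks, implicit operands):

```python
def fold_trailing_copy(block):
    """Rewrite `$a = OP ...; $b = COPY killed $a` to `$b = OP ...` when safe."""
    out = [dict(inst) for inst in block]
    for idx, inst in enumerate(out):
        if inst["op"] != "COPY" or inst["uses"][0] not in inst.get("kills", ()):
            continue
        src, dst = inst["uses"][0], inst["def"]
        # Walk backwards looking for the instruction that defines src.
        for j in range(idx - 1, -1, -1):
            prev = out[j]
            if prev["def"] == src:
                prev["def"] = dst   # define the copy's destination directly
                del out[idx]        # the COPY is now redundant
                return out
            touched = set(prev["uses"]) | {prev["def"]}
            if src in touched or dst in touched:
                break               # intervening read or clobber: not safe
    return out
```

The sketch folds at most one copy per call and bails out on any intervening read or clobber of either register, mirroring the safety condition in the example above.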

Differential Revision: https://reviews.llvm.org/D65267

llvm-svn: 371359
2019-09-09 02:32:42 +00:00
Bjorn Pettersson d065c81164 [CodeGen] Handle SMULFIXSAT with scale zero in TargetLowering::expandFixedPointMul
Summary:
Normally TargetLowering::expandFixedPointMul would handle
SMULFIXSAT with scale zero by using an SMULO to compute the
product and determine if saturation is needed (if overflow
happened). But if SMULO isn't custom/legal it falls through
and uses the same technique, using MULHS/SMUL_LOHI, as used
for non-zero scales.

Problem was that when checking for overflow (handling saturation)
when not using MULO we did not expect to find a zero scale. So
we ended up in an assertion when doing
  APInt::getLowBitsSet(VTSize, Scale - 1)

This patch fixes the problem by adding a new special case for
how saturation is computed when scale is zero.
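
For reference, the expected result of a signed saturating fixed point multiply can be modeled in Python with arbitrary-precision integers. This models only the value being computed, not the MULHS/SMUL_LOHI expansion itself; the function name is hypothetical, and rounding toward negative infinity is assumed for non-zero scales:

```python
def smul_fix_sat(a, b, scale, width):
    """Signed fixed point multiply with saturation to a `width`-bit result."""
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    product = (a * b) >> scale   # arithmetic shift; exact when scale == 0
    return max(lo, min(hi, product))
```

With scale zero there are no fractional bits to discard, so the result is simply the full product clamped to the signed range, which is the special case this patch adds.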

Reviewers: RKSimon, bevinh, leonardchan, spatel

Reviewed By: RKSimon

Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67071

llvm-svn: 371309
2019-09-07 12:16:23 +00:00
Bjorn Pettersson 5e331e4ce8 [Intrinsic] Add the llvm.umul.fix.sat intrinsic
Summary:
Add an intrinsic that takes 2 unsigned integers with
the scale of them provided as the third argument and
performs fixed point multiplication on them. The
result is saturated and clamped between the largest and
smallest representable values of the first 2 operands.
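
A minimal Python model of the value this intrinsic is described as computing is shown below. The function name is illustrative, and the truncating shift is an assumption about rounding. Since both operands are unsigned, only the upper bound can be exceeded:

```python
def umul_fix_sat(a, b, scale, width):
    """Unsigned fixed point multiply with saturation to a `width`-bit result."""
    hi = (1 << width) - 1
    product = (a * b) >> scale   # drop `scale` fractional bits of the product
    return min(hi, product)
```

For example, with width 8 and scale 7 the value 0x80 represents 1.0, so multiplying it by itself gives 0x80 again, while 0xFF times 0xFF saturates to 0xFF.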

This is a part of implementing fixed point arithmetic
in clang where some of the more complex operations
will be implemented as intrinsics.

Patch by: leonardchan, bjope

Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel

Reviewed By: leonardchan

Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57836

llvm-svn: 371308
2019-09-07 12:16:14 +00:00
Xing GUO ed20dcb88b Revert [CodeGen] Fix typos to run tests. NFC.
This reverts r371286 (git commit b38105bbd0)

r371286 caused build bots' failure. I'll check it.

llvm-svn: 371289
2019-09-07 05:14:47 +00:00
Xing GUO b38105bbd0 [CodeGen] Fix typos to run tests. NFC.
llvm-svn: 371286
2019-09-07 04:57:53 +00:00
Sean Fertile eaf34a983c [PowerPC][XCOFF] Remove basic test. [NFC]
Test verified that we could compile an empty module and produce an XCOFF
object file. Newer tests supersede this coverage, so it's safe to remove.

llvm-svn: 371247
2019-09-06 19:55:44 +00:00
Sean Fertile 74966aca35 [PowerPC][XCOFF] Verify symbol table in xcoff object files. [NFC]
Extend the common/local-common testing for object files to also verify the
symbol table now that the needed functionality has landed in llvm-readobj.

Differential Revision: https://reviews.llvm.org/D66944

llvm-svn: 371237
2019-09-06 18:56:14 +00:00
Kang Zhang f879c68755 [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
Summary:

Fix a bug where the jump table was not updated, and recommit.

The `block-placement` pass creates some patterns of unconditional branches for which we can do the simple early return.
But the `early-ret` pass runs before `block-placement`, and we don't want to run it again.
This patch does the simple early return to optimize the blocks at the end of `block-placement`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D63972

llvm-svn: 371177
2019-09-06 08:16:18 +00:00
Guillaume Chatelet aff45e4b23 [LLVM][Alignment] Make functions using log of alignment explicit
Summary:
This patch renames functions that take or return alignment as log2; it will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:

 - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were dealing with log2(align). This patch fixes it and updates the documentation.
 - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments; internally these values are interpreted as log2(align). This patch updates the documentation.
 - `MachineFunction` exposes `align-all-functions`, also supposedly interpreted as a power of two alignment; internally this value is interpreted as log2(align). This patch updates the documentation.

Reviewers: lattner, thegameg, courbet

Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65945

llvm-svn: 371045
2019-09-05 10:00:22 +00:00
Alina Sbirlea 6da79ce1fe [MemorySSA] Re-enable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370957
2019-09-04 19:16:04 +00:00
Alina Sbirlea ccb1862bc9 [MemorySSA] Disable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370821
2019-09-03 21:20:46 +00:00
Alina Sbirlea e331d50534 [MemorySSA] Re-enable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370811
2019-09-03 19:28:37 +00:00
Jinsong Ji fb4b86af92 [PowerPC][NFC] Avoid checking non-relevant .cfi instructions
Summary:
This is brought up in
https://reviews.llvm.org/D64662?id=209923#inline-599490

CFI information is irrelevant to quite a few testcases;
we should stop checking it when it is unnecessary.

This patch avoids generating cfi info in testcases that are not
testing prolog/epilog or exception handling.

Reviewers: kbarton, hfinkel, nemanjai, #powerpc

Reviewed By: hfinkel

Subscribers: MaskRay, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67016

llvm-svn: 370505
2019-08-30 19:24:25 +00:00
Jinsong Ji 54a1ad5bd7 [PowerPC][NFC] Use -mtriple in RUN line, remove target triple in tls.ll
To avoid confusion, especially when -mtriple is also added for PPC32.

llvm-svn: 370427
2019-08-30 02:57:33 +00:00
Fangrui Song 7704b54389 [PPC32] Emit R_PPC_GOT_TPREL16 instead of R_PPC_GOT_TPREL16_LO
Unlike ppc64, which has ADDISgotTprelHA+LDgotTprelL pairs,
ppc32 just uses LDgotTprelL32, so it does not make much sense to use
_LO without a paired _HA.
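
As background on why `_LO` is normally paired with `_HA`, the following is a minimal sketch of the usual PowerPC `@ha`/`@l` arithmetic. The function names are hypothetical and this is not code from LLVM or any linker. The `@l` half is consumed as a signed 16-bit displacement, so `@ha` is the high half adjusted by 0x8000 to compensate; an `addis` adds `ha << 16` and the following load adds the sign-extended low half.

```python
def lo16(value: int) -> int:
    """Low 16 bits of a 32-bit value, as encoded by an @l relocation."""
    return value & 0xFFFF


def ha16(value: int) -> int:
    """High-adjusted 16 bits, as encoded by an @ha relocation.

    Adding 0x8000 before shifting compensates for the low half being
    sign-extended when it is used as a displacement.
    """
    return ((value + 0x8000) >> 16) & 0xFFFF


def sign_extend16(half: int) -> int:
    """Interpret a 16-bit field as a signed displacement."""
    return half - 0x10000 if half & 0x8000 else half


def reconstruct(value: int) -> int:
    """Recombine the halves the way an addis + load pair would (mod 2**32)."""
    return ((ha16(value) << 16) + sign_extend16(lo16(value))) & 0xFFFFFFFF
```

With a single load and no preceding `addis`, only the 16-bit displacement is available, which is why a lone `_LO` relocation has no `_HA` partner to complete the address.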

Emit R_PPC_GOT_TPREL16 instead of R_PPC_GOT_TPREL16_LO to match GCC, and
get better linker relocation checks. Note, R_PPC_GOT_TPREL16_{HA,LO}
don't have good linker support:

(a) lld does not support R_PPC_GOT_TPREL16_{HA,LO}.
(b) Top of tree ld.bfd does not support R_PPC_GOT_TPREL16_HA Initial-Exec -> Local-Exec relaxation:

  // a.o
  addis 3, 3, tsd_tls@got@tprel@ha
  lwz 3, tsd_tls@got@tprel@l(3)
  add 3, 3, tsd_tls@tls
  // b.o
  .section .tdata,"awT"; .globl tsd_tls; tsd_tls:

  // ld/ld-new a.o b.o
  internal error, aborting at ../../bfd/elf32-ppc.c:7952 in ppc_elf_relocate_section

Reviewed By: adalava

Differential Revision: https://reviews.llvm.org/D66925

llvm-svn: 370426
2019-08-30 02:20:49 +00:00
Jinsong Ji 1ed7d2119e [PowerPC] Support extended mnemonics mffprwz etc.
Summary:
Reported in https://github.com/opencv/opencv/issues/15413.

We have several extended mnemonics for Move To/From Vector-Scalar Register Instructions,
e.g. mffprd, mtfprd, etc.

We only support one of them; this patch adds the others.

Reviewers: nemanjai, steven.zhang, hfinkel, #powerpc

Reviewed By: hfinkel

Subscribers: wuzish, qcolombet, hiraditya, kbarton, MaskRay, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66963

llvm-svn: 370411
2019-08-29 21:53:59 +00:00
Jordan Rupprecht f9f81289e6 Revert [MBP] Disable aggressive loop rotate in plain mode
This reverts r369664 (git commit 51f48295cb)

It causes many benchmark regressions, internally and in llvm's benchmark suite.

llvm-svn: 370398
2019-08-29 19:03:58 +00:00
Alina Sbirlea 4b87023bae Revert enabling MemorySSA.
Breaks sanitizers bots.

Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370397
2019-08-29 19:01:23 +00:00
Alina Sbirlea 6289ee941d [MemorySSA & LoopPassManager] Enable MemorySSA as loop dependency. Update tests.
Summary:
I'm not planning to check this in at the moment, but feedback is very welcome, in particular how this affects performance.
The feedback obtained here will guide the next steps towards enabling this.

This patch enables the use of MemorySSA in the loop pass manager.

Passes that currently use MemorySSA:
 - EarlyCSE
Passes that use MemorySSA after this patch:
 - EarlyCSE
 - LICM
 - SimpleLoopUnswitch
Loop passes that update MemorySSA (and do not use it yet, but could use it after this patch):
 - LoopInstSimplify
 - LoopSimplifyCFG
 - LoopUnswitch
 - LoopRotate
 - LoopSimplify
 - LCSSA
Loop passes that do *not* update MemorySSA:
 - IndVarSimplify
 - LoopDelete
 - LoopIdiom
 - LoopSink
 - LoopUnroll
 - LoopInterchange
 - LoopUnrollAndJam
 - LoopVectorize
 - LoopReroll
 - IRCE

Reviewers: chandlerc, george.burgess.iv, davide, sanjoy, gberry

Subscribers: jlebar, Prazek, dmgreen, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370384
2019-08-29 17:08:13 +00:00
Jinsong Ji 8b0317ad7d [PowerPC][NFC] Update fp-int-conversions-direct-moves.ll using script
Also add -ppc-asm-full-reg-names,-ppc-vsr-nums-as-vr.

llvm-svn: 370375
2019-08-29 15:38:02 +00:00
Kevin P. Neal ddf13c00ed [FPEnv] Add fptosi and fptoui constrained intrinsics.
This implements constrained floating point intrinsics for FP to signed and
unsigned integers.

Quoting from D32319:
The purpose of the constrained intrinsics is to force the optimizer to
respect the restrictions that will be necessary to support things like the
STDC FENV_ACCESS ON pragma without interfering with optimizations when
these restrictions are not needed.

Reviewed by:	Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton
Approved by:	Craig Topper
Differential Revision:	http://reviews.llvm.org/D63782

llvm-svn: 370228
2019-08-28 16:33:36 +00:00
Sanjay Patel b516f1afdd [DAGCombiner] cancel fnegs from multiplied operands of FMA
(-X) * (-Y) + Z --> X * Y + Z

This is a missing optimization that shows up as a potential regression in D66050,
so we should solve it first. We appear to be partly missing this fold in IR as well.

We do handle the simpler case already:
(-X) * (-Y) --> X * Y

And it might be beneficial to make the constraint less conservative (eg, if both
operands are cheap, but not necessarily cheaper), but that causes infinite looping
for the existing fmul transform.
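
The fold described above can be sketched on a toy expression tree. This is only an illustration using hypothetical tuple-based nodes, not the DAGCombiner implementation, and it deliberately ignores the cost constraint mentioned in the message (the real combine only fires when removing the negations is considered cheaper).

```python
def is_fneg(node) -> bool:
    """True if node is a toy negation node of the form ("fneg", operand)."""
    return isinstance(node, tuple) and len(node) == 2 and node[0] == "fneg"


def fold_fma(node):
    """Rewrite ("fma", ("fneg", X), ("fneg", Y), Z) to ("fma", X, Y, Z).

    Any other node is returned unchanged. The two negations cancel in the
    product, so the addend Z is left untouched.
    """
    if isinstance(node, tuple) and len(node) == 4 and node[0] == "fma":
        _, a, b, c = node
        if is_fneg(a) and is_fneg(b):
            return ("fma", a[1], b[1], c)
    return node


# Both multiplied operands are negated, so the negations are dropped.
folded = fold_fma(("fma", ("fneg", "x"), ("fneg", "y"), "z"))
print(folded)
```

A node with only one negated multiplicand is left as-is, since a single negation does not cancel.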

Differential Revision: https://reviews.llvm.org/D66755

llvm-svn: 370071
2019-08-27 15:17:46 +00:00
Jason Liu fc056950aa Handle local commons for XCOFF object file writing
Summary:
Adds support for emitting common local global symbols to an XCOFF object file.
Local commons are emitted into the .bss section with a storage class of
C_HIDEXT.

Patch by: daltenty

Reviewers: sfertile, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D66097

llvm-svn: 370070
2019-08-27 15:14:45 +00:00
Jinsong Ji 7f536bcf22 Revert "[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks"
This reverts commit b3d258fc44.

@skatkov reported a crash in D63972#1646303.
Contacted @ZhangKang and reverted the commit on his behalf.

llvm-svn: 370069
2019-08-27 14:59:08 +00:00
Sanjay Patel 442a5765ce [PowerPC] add tests for fma with negated ops; NFC
llvm-svn: 369923
2019-08-26 16:20:09 +00:00