Commit Graph

2136 Commits

Author SHA1 Message Date
Lei Huang bfb197d7a3 [PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
This is a follow up patch from https://reviews.llvm.org/D57857 to handle
extract_subvector v4f32.  For cases where we fpext of v2f32 to v2f64 from
extract_subvector we currently generate on P9 the following:

  lxv 0, 0(3)
  xxsldwi 1, 0, 0, 1
  xscvspdpn 2, 0
  xxsldwi 3, 0, 0, 3
  xxswapd 0, 0
  xscvspdpn 1, 1
  xscvspdpn 3, 3
  xscvspdpn 0, 0
  xxmrghd 0, 0, 3
  xxmrghd 1, 2, 1
  stxv 0, 0(4)
  stxv 1, 0(5)

This patch custom lower it to the following sequence:

  lxv 0, 0(3)       # load the v4f32 <w0, w1, w2, w3>
  xxmrghw 2, 0, 0   # Produce the following vector <w0, w0, w1, w1>
  xxmrglw 3, 0, 0   # Produce the following vector <w2, w2, w3, w3>
  xvcvspdp 2, 2     # FP-extend to <d0, d1>
  xvcvspdp 3, 3     # FP-extend to <d2, d3>
  stxv 2, 0(5)      # Store <d0, d1> (%vecinit11)
  stxv 3, 0(4)      # Store <d2, d3> (%vecinit4)

Differential Revision: https://reviews.llvm.org/D61961

llvm-svn: 372029
2019-09-16 20:04:15 +00:00
Jinsong Ji 07d824a7c3 [PowerPC][NFC] Add a testcase for fdiv expansion.
Pre-commit for following patch.

llvm-svn: 371938
2019-09-15 20:02:25 +00:00
Jinsong Ji 455a0db01a [PowerPC][NFC] Move codegen tests to PowerPC from MIR/PowerPC
All tests with -run-pass !=none should not in MIR/, See MIR/README.

```
Tests for codegen passes should NOT be here but in
test/CodeGen/sometarget. As
a rule of thumb this directory should only contain tests using
'llc -run-pass none'.
```

llvm-svn: 371857
2019-09-13 14:18:36 +00:00
Craig Topper 36e04d14e9 [PowerPC] Remove the SPE4RC register class and instead add f32 to the GPRC register class.
Summary:
Since the SPE4RC register class contains an identical set of registers
and an identical spill size to the GPRC class its slightly confusing
the tablegen emitter. It's preventing the GPRC_and_GPRC_NOR0 synthesized
register class from inheriting VTs and AltOrders from GPRC or GPRC_NOR0.
This is because SPE4C is found first in the super register class list
when inheriting these properties and it doesn't set the VTs or
AltOrders the same way as GPRC or GPRC_NOR0.

This patch replaces all uses of GPE4RC with GPRC and allows GPRC and
GPRC_NOR0 to contain f32.

The test changes here are because the AltOrders are being inherited
to GPRC_NOR0 now.

Found while trying to determine if getCommonSubClass needs to take
a VT argument. It was originally added to support fp128 on x86-64,
I've changed some things about that so that it might be needed
anymore. But a PowerPC test crashed without it and I think its
due to this subclass issue.

Reviewers: jhibbits, nemanjai, kbarton, hfinkel

Subscribers: wuzish, nemanjai, mehdi_amini, hiraditya, kbarton, MaskRay, dexonsmith, jsji, shchenz, steven.zhang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67513

llvm-svn: 371779
2019-09-12 22:07:35 +00:00
Qiu Chaofan b7fb5d0f6f [DAGCombiner] Improve division estimation of floating points.
Current implementation of estimating divisions loses precision since it
estimates reciprocal first and does multiplication.  This patch is to re-order
arithmetic operations in the last iteration in DAGCombiner to improve the
accuracy.

Reviewed By: Sanjay Patel, Jinsong Ji

Differential Revision: https://reviews.llvm.org/D66050

llvm-svn: 371713
2019-09-12 07:51:24 +00:00
Guillaume Chatelet 48904e9452 [Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing
Summary:
This catches malformed mir files which specify alignment as log2 instead of pow2.
See https://reviews.llvm.org/D65945 for reference,

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67433

llvm-svn: 371608
2019-09-11 11:16:48 +00:00
Dmitri Gribenko 2bf8d77453 Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.""
This reverts commit r371502, it broke tests
(clang/test/CodeGenCXX/auto-var-init.cpp).

llvm-svn: 371507
2019-09-10 10:39:09 +00:00
Clement Courbet 612c260ec3 Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
With a fix for sanitizer breakage (see explanation in D60318).

llvm-svn: 371502
2019-09-10 09:18:00 +00:00
Kai Luo 73da43aeb3 [PowerPC][NFC] Update test assertions using update_llc_test_checks.py
Summary:
This patch is made due to https://reviews.llvm.org/rL371289 where typo
fixes failed.

Differential Revision: https://reviews.llvm.org/D67317

llvm-svn: 371483
2019-09-10 02:28:24 +00:00
Dmitri Gribenko d9c4060bd5 Revert "[MachineCopyPropagation] Remove redundant copies after TailDup via machine-cp"
This reverts commit 371359. I'm suspecting a miscompile, I posted a
reproducer to https://reviews.llvm.org/D65267.

llvm-svn: 371421
2019-09-09 16:46:45 +00:00
Kai Luo 9115c477bb [MachineCopyPropagation] Remove redundant copies after TailDup via machine-cp
Summary:
After tailduplication, we have redundant copies. We can remove these
copies in machine-cp if it's safe to, i.e.
```
$reg0 = OP ...
... <<< No read or clobber of $reg0 and $reg1
$reg1 = COPY $reg0 <<< $reg0 is killed
...
<RET>
```
will be transformed to
```
$reg1 = OP ...
...
<RET>
```

Differential Revision: https://reviews.llvm.org/D65267

llvm-svn: 371359
2019-09-09 02:32:42 +00:00
Bjorn Pettersson d065c81164 [CodeGen] Handle SMULFIXSAT with scale zero in TargetLowering::expandFixedPointMul
Summary:
Normally TargetLowering::expandFixedPointMul would handle
SMULFIXSAT with scale zero by using an SMULO to compute the
product and determine if saturation is needed (if overflow
happened). But if SMULO isn't custom/legal it falls through
and uses the same technique, using MULHS/SMUL_LOHI, as used
for non-zero scales.

Problem was that when checking for overflow (handling saturation)
when not using MULO we did not expect to find a zero scale. So
we ended up in an assertion when doing
  APInt::getLowBitsSet(VTSize, Scale - 1)

This patch fixes the problem by adding a new special case for
how saturation is computed when scale is zero.

Reviewers: RKSimon, bevinh, leonardchan, spatel

Reviewed By: RKSimon

Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67071

llvm-svn: 371309
2019-09-07 12:16:23 +00:00
Bjorn Pettersson 5e331e4ce8 [Intrinsic] Add the llvm.umul.fix.sat intrinsic
Summary:
Add an intrinsic that takes 2 unsigned integers with
the scale of them provided as the third argument and
performs fixed point multiplication on them. The
result is saturated and clamped between the largest and
smallest representable values of the first 2 operands.

This is a part of implementing fixed point arithmetic
in clang where some of the more complex operations
will be implemented as intrinsics.

Patch by: leonardchan, bjope

Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel

Reviewed By: leonardchan

Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57836

llvm-svn: 371308
2019-09-07 12:16:14 +00:00
Xing GUO ed20dcb88b Revert [CodeGen] Fix typos to run tests. NFC.
This reverts r371286 (git commit b38105bbd0)

r371286 caused build bots' failure. I'll check it.

llvm-svn: 371289
2019-09-07 05:14:47 +00:00
Xing GUO b38105bbd0 [CodeGen] Fix typos to run tests. NFC.
llvm-svn: 371286
2019-09-07 04:57:53 +00:00
Sean Fertile eaf34a983c [PowerPC][XCOFF] Remove basic test. [NFC]
Test verified that we could compile an empty module and produce an XCOFF
object file. Newer tests superssed this coverage, its safe to remove.

llvm-svn: 371247
2019-09-06 19:55:44 +00:00
Sean Fertile 74966aca35 [PowerPC][XCOFF] Verify symbol table in xcoff object files. [NFC]
Extend the common/local-common testing for object files to also verify the
symbol table now that the needed functionality has landed in llvm-readobj.

Differential Revision: https://reviews.llvm.org/D66944

llvm-svn: 371237
2019-09-06 18:56:14 +00:00
Kang Zhang f879c68755 [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
Summary:

Fix a bug of not update the jump table and recommit it again.

In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
But the `early-ret` pass is before `block-placement`, we don't want to run it again.
This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D63972

llvm-svn: 371177
2019-09-06 08:16:18 +00:00
Guillaume Chatelet aff45e4b23 [LLVM][Alignment] Make functions using log of alignment explicit
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:

 - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
 - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
 - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,

Reviewers: lattner, thegameg, courbet

Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65945

llvm-svn: 371045
2019-09-05 10:00:22 +00:00
Alina Sbirlea 6da79ce1fe [MemorySSA] Re-enable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370957
2019-09-04 19:16:04 +00:00
Alina Sbirlea ccb1862bc9 [MemorySSA] Disable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370821
2019-09-03 21:20:46 +00:00
Alina Sbirlea e331d50534 [MemorySSA] Re-enable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370811
2019-09-03 19:28:37 +00:00
Jinsong Ji fb4b86af92 [PowerPC][NFC] Avoid checking non-relevant .cfi instructions
Summary:
This is brought up in
https://reviews.llvm.org/D64662?id=209923#inline-599490

CFI information are non-relevant to quite some testcases,
we should get rid of checking them when its unecessary.

This patch avoid generating cfi info in testcases that are not
testing prolog/epilog or exception handling.

Reviewers: kbarton, hfinkel, nemanjai, #powerpc

Reviewed By: hfinkel

Subscribers: MaskRay, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67016

llvm-svn: 370505
2019-08-30 19:24:25 +00:00
Jinsong Ji 54a1ad5bd7 [PowerPC][NFC] Use -mtriple in RUN line, remove target triple in tls.ll
To avoid confusion, especially when -mtriple are also added for PPC32.

llvm-svn: 370427
2019-08-30 02:57:33 +00:00
Fangrui Song 7704b54389 [PPC32] Emit R_PPC_GOT_TPREL16 instead R_PPC_GOT_TPREL16_LO
Unlike ppc64, which has ADDISgotTprelHA+LDgotTprelL pairs,
ppc32 just uses LDgotTprelL32, so it does not make lots of sense to use
_LO without a paired _HA.

Emit R_PPC_GOT_TPREL16 instead R_PPC_GOT_TPREL16_LO to match GCC, and
get better linker relocation check. Note, R_PPC_GOT_TPREL16_{HA,LO}
don't have good linker support:

(a) lld does not support R_PPC_GOT_TPREL16_{HA,LO}.
(b) Top of tree ld.bfd does not support R_PPC_GOT_REL16_HA Initial-Exec -> Local-Exec relaxation:

  // a.o
  addis 3, 3, tsd_tls@got@tprel@ha
  lwz 3, tsd_tls@got@tprel@l(3)
  add 3, 3, tsd_tls@tls
  // b.o
  .section .tdata,"awT"; .globl tsd_tls; tsd_tls:

  // ld/ld-new a.o b.o
  internal error, aborting at ../../bfd/elf32-ppc.c:7952 in ppc_elf_relocate_section

Reviewed By: adalava

Differential Revision: https://reviews.llvm.org/D66925

llvm-svn: 370426
2019-08-30 02:20:49 +00:00
Jinsong Ji 1ed7d2119e [PowerPC] Support extended mnemonics mffprwz etc.
Summary:
Reported in https://github.com/opencv/opencv/issues/15413.

We have serveral extended mnemonics for Move To/From Vector-Scalar Register Instructions
eg: mffprd,mtfprd etc.

We only support one of them, this patch add the others.

Reviewers: nemanjai, steven.zhang, hfinkel, #powerpc

Reviewed By: hfinkel

Subscribers: wuzish, qcolombet, hiraditya, kbarton, MaskRay, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66963

llvm-svn: 370411
2019-08-29 21:53:59 +00:00
Jordan Rupprecht f9f81289e6 Revert [MBP] Disable aggressive loop rotate in plain mode
This reverts r369664 (git commit 51f48295cb)

It causes many benchmark regressions, internally and in llvm's benchmark suite.

llvm-svn: 370398
2019-08-29 19:03:58 +00:00
Alina Sbirlea 4b87023bae Revert enabling MemorySSA.
Breaks sanitizers bots.

Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370397
2019-08-29 19:01:23 +00:00
Alina Sbirlea 6289ee941d [MemorySSA & LoopPassManager] Enable MemorySSA as loop dependency. Update tests.
Summary:
I'm not planning to check this in at the moment, but feedback is very welcome, in particular how this affects performance.
The feedback obtains here will guide the next steps towards enabling this.

This patch enables the use of MemorySSA in the loop pass manager.

Passes that currently use MemorySSA:
 - EarlyCSE
Passes that use MemorySSA after this patch:
 - EarlyCSE
 - LICM
 - SimpleLoopUnswitch
Loop passes that update MemorySSA (and do not use it yet, but could use it after this patch):
 - LoopInstSimplify
 - LoopSimplifyCFG
 - LoopUnswitch
 - LoopRotate
 - LoopSimplify
 - LCSSA
Loop passes that do *not* update MemorySSA:
 - IndVarSimplify
 - LoopDelete
 - LoopIdiom
 - LoopSink
 - LoopUnroll
 - LoopInterchange
 - LoopUnrollAndJam
 - LoopVectorize
 - LoopReroll
 - IRCE

Reviewers: chandlerc, george.burgess.iv, davide, sanjoy, gberry

Subscribers: jlebar, Prazek, dmgreen, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370384
2019-08-29 17:08:13 +00:00
Jinsong Ji 8b0317ad7d [PowerPC][NFC] Update fp-int-conversions-direct-moves.ll using script
Also add -ppc-asm-full-reg-names,-ppc-vsr-nums-as-vr.

llvm-svn: 370375
2019-08-29 15:38:02 +00:00
Kevin P. Neal ddf13c00ed [FPEnv] Add fptosi and fptoui constrained intrinsics.
This implements constrained floating point intrinsics for FP to signed and
unsigned integers.

Quoting from D32319:
The purpose of the constrained intrinsics is to force the optimizer to
respect the restrictions that will be necessary to support things like the
STDC FENV_ACCESS ON pragma without interfering with optimizations when
these restrictions are not needed.

Reviewed by:	Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton
Approved by:	Craig Topper
Differential Revision:	http://reviews.llvm.org/D63782

llvm-svn: 370228
2019-08-28 16:33:36 +00:00
Sanjay Patel b516f1afdd [DAGCombiner] cancel fnegs from multiplied operands of FMA
(-X) * (-Y) + Z --> X * Y + Z

This is a missing optimization that shows up as a potential regression in D66050,
so we should solve it first. We appear to be partly missing this fold in IR as well.

We do handle the simpler case already:
(-X) * (-Y) --> X * Y

And it might be beneficial to make the constraint less conservative (eg, if both
operands are cheap, but not necessarily cheaper), but that causes infinite looping
for the existing fmul transform.

Differential Revision: https://reviews.llvm.org/D66755

llvm-svn: 370071
2019-08-27 15:17:46 +00:00
Jason Liu fc056950aa Handle local commons for XCOFF object file writing
Summary:
Adds support for emitting common local global symbols to an XCOFF object file.
Local commons are emitted into the .bss section with a storage class of
C_HIDEXT.

Patch by: daltenty

Reviewers: sfertile, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D66097

llvm-svn: 370070
2019-08-27 15:14:45 +00:00
Jinsong Ji 7f536bcf22 Revert "[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks"
This reverts commit b3d258fc44.

@skatkov is reporting crash in D63972#1646303
Contacted @ZhangKang, and revert the commit on behalf of him.

llvm-svn: 370069
2019-08-27 14:59:08 +00:00
Sanjay Patel 442a5765ce [PowerPC] add tests for fma with negated ops; NFC
llvm-svn: 369923
2019-08-26 16:20:09 +00:00
Zi Xuan Wu e18aa1e0a2 [NFC][Regalloc] Add testcases for D66576
llvm-svn: 369877
2019-08-26 05:06:30 +00:00
Xing Xue ef039a3ccd [PowerPC][AIX] Adds support for writing the .data section in assembly files
Summary:
Adds support for generating the .data section in assembly files for global variables with a non-zero initialization. The support for writing the .data section in XCOFF object files will be added in a follow-on patch. Any relocations are not included in this patch.

Reviewers: hubert.reinterpretcast, sfertile, jasonliu, daltenty, Xiangling_L

Reviewed by: hubert.reinterpretcast

Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, wuzish, shchenz, DiggerLin, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66154

llvm-svn: 369869
2019-08-25 15:17:25 +00:00
Roland Froese b4051e57b1 [PowerPC] Expand v1i128 smin
The smin opcode and friends for v1i128 are incorrectly marked as legal for PPC.
Change them to expand.

Differential Revision: https://reviews.llvm.org/D64960

llvm-svn: 369797
2019-08-23 19:04:47 +00:00
Amaury Sechet 57ae79d7a2 [PowerPC] Automatically generate various tests. NFC
llvm-svn: 369754
2019-08-23 13:30:45 +00:00
Amaury Sechet 0ddb0e9fcb [PowerPC] Automatically generate vec_buildvector_loadstore.ll . NFC
llvm-svn: 369703
2019-08-22 20:42:50 +00:00
Amaury Sechet cac5274b20 [PowerPC] Automatically generate various tests. NFC
llvm-svn: 369700
2019-08-22 20:26:56 +00:00
Guozhi Wei 51f48295cb [MBP] Disable aggressive loop rotate in plain mode
Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse.

To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true.

Differential Revision: https://reviews.llvm.org/D65673

llvm-svn: 369664
2019-08-22 16:21:32 +00:00
Amaury Sechet 95cf66de7c [DAGCombiner] Remove explicit call to AddToWorklist in sqrt and reciprocal computations
Summary: These nodes end up being processed regardless due to DAGCombiner ensuring arguments are processed. This changes the order in which nodes are processed, which fixes an issue on PowerPC.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri, mcberg2017, stefanp, hfinkel

Subscribers: nemanjai, MaskRay, jsji, steven.zhang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66548

llvm-svn: 369662
2019-08-22 15:35:45 +00:00
Simon Pilgrim ab2f68d5ad [PowerPC] Regenerate reciprocal tests, as discussed on D66548
llvm-svn: 369659
2019-08-22 15:14:52 +00:00
Sean Fertile 1e46d4cec5 Adds support for writing the .bss section for XCOFF object files.
Adds Wrapper classes for MCSymbol and MCSection into the XCOFF target
object writer. Also adds a class to represent the top-level sections, which we
materialize in the ObjectWriter.

executePostLayoutBinding will map all csects into the appropriate
container depending on its storage mapping class, and map all symbols
into their containing csect. Once all symbols have been processed we
- Assign addresses and symbol table indices.
- Calaculte section sizes.
- Build the section header table.
- Assign the sections raw-pointer value for non-virtual sections.

Since the .bss section is virtual, writing the header table is enough to
add support. Writing of a sections raw data, or of any relocations is
not included in this patch.

Testing is done by dumping the section header table, but it needs to be
extended to include dumping the symbol table once readobj support for
dumping auxiallary entries lands.

Differential Revision: https://reviews.llvm.org/D65159

llvm-svn: 369454
2019-08-20 22:03:18 +00:00
Jinsong Ji 0776da5236 [PeepholeOptimizer] Don't assume bitcast def always has input
Summary:
If we have a MI marked with bitcast bits, but without input operands,
PeepholeOptimizer might crash with assert.

eg:
If we apply the changes in PPCInstrVSX.td as in this patch:

[(set v4i32:$XT, (bitconvert (v16i8 immAllOnesV)))]>;

We will get assert in PeepholeOptimizer.

```
llvm-lit llvm-project/llvm/test/CodeGen/PowerPC/build-vector-tests.ll -v

llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:417: const
llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int)
const: Assertion `i < getNumOperands() && "getOperand() out of range!"'
failed.
```

The fix is to abort if we found out of bound access.

Reviewers: qcolombet, MatzeB, hfinkel, arsenm

Reviewed By: qcolombet

Subscribers: wdng, arsenm, steven.zhang, wuzish, nemanjai, hiraditya, kbarton, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65542

llvm-svn: 369261
2019-08-19 14:19:04 +00:00
Kang Zhang b3d258fc44 [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
Summary:

Fix a bug of preducessors.

In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
But the `early-ret` pass is before `block-placement`, we don't want to run it again.
This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D63972

llvm-svn: 369191
2019-08-17 14:37:05 +00:00
Florian Hahn 403e85cbc5 Revert [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
This reverts r368997 (git commit 2a903c0b67)

It looks like this commit adds invalid predecessors to MBBs. The example
below fails the verifier after MachineBlockPlacement (run llc
-verify-machineinstrs):

@global.4 = external constant i8*

declare i32 @zot(...)

define i16* @snork.67() personality i8* bitcast (i32 (...)* @zot to i8*) {
bb:
  invoke void undef()
          to label %bb5 unwind label %bb4

bb4:                                              ; preds = %bb
  %tmp = landingpad { i8*, i32 }
          catch i8* null
  unreachable

bb5:                                              ; preds = %bb
  %tmp6 = load i32, i32* null, align 4
  %tmp7 = icmp eq i32 %tmp6, 0
  br i1 %tmp7, label %bb14, label %bb8

bb8:                                              ; preds = %bb11, %bb5
  invoke void undef()
          to label %bb9 unwind label %bb11

bb9:                                              ; preds = %bb8
  %tmp10 = invoke i16* undef()
          to label %bb14 unwind label %bb11

bb11:                                             ; preds = %bb9, %bb8
  %tmp12 = landingpad { i8*, i32 }
          cleanup
          catch i8* bitcast (i8** @global.4 to i8*)
  %tmp13 = icmp ult i64 undef, undef
  br i1 %tmp13, label %bb8, label %bb14

bb14:                                             ; preds = %bb11, %bb9, %bb5
  %tmp15 = phi i16* [ null, %bb5 ], [ null, %bb11 ], [ %tmp10, %bb9 ]
  ret i16* %tmp15
}

llvm-svn: 369104
2019-08-16 13:19:29 +00:00
Chen Zheng 02cbdbdabf [PowerPC] add testcases for folding frame offset - NFC
llvm-svn: 369077
2019-08-16 01:52:50 +00:00
Jinsong Ji 9fd81dc139 [PowerPC] Use xxleqv to set all one vector IMM(-1).
Summary:
xxspltib/vspltisb are 3 cycle PM instructions,
xxleqv is 2 cycle ALU instruction.

We should use xxleqv to set all one vectors.

Reviewers: hfinkel, nemanjai, steven.zhang

Subscribers: hiraditya, kbarton, MaskRay, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65529

llvm-svn: 369006
2019-08-15 14:32:51 +00:00