Commit Graph

8129 Commits

Author SHA1 Message Date
Ayman Musa 0c2da88f82 Remove MVT:i1 xor instruction before SELECT. (Performance improvement).
Differential Revision: https://reviews.llvm.org/D23764

llvm-svn: 281308
2016-09-13 09:12:45 +00:00
Elena Demikhovsky b906df9fe5 AVX-512: Fix for PR28175 - Scalar code optimization.
Optimized (truncate (assertzext x) to i1) and anyext i1 to i8/16/32.
Optimization of this patterns is a one more step towards i1 optimization on AVX-512.

Differential Revision: https://reviews.llvm.org/D24456

llvm-svn: 281302
2016-09-13 07:57:00 +00:00
Craig Topper 4619c9e6a8 [X86] Remove masked shufpd/shufps intrinsics and autoupgrade to native vector shuffles. They were removed from clang previously but accidentally left in the backend.
llvm-svn: 281300
2016-09-13 07:40:53 +00:00
Peter Collingbourne d4135bbc30 DebugInfo: New metadata representation for global variables.
This patch reverses the edge from DIGlobalVariable to GlobalVariable.
This will allow us to more easily preserve debug info metadata when
manipulating global variables.

Fixes PR30362. A program for upgrading test cases is attached to that
bug.

Differential Revision: http://reviews.llvm.org/D20147

llvm-svn: 281284
2016-09-13 01:12:59 +00:00
Hans Wennborg 8a42d4b9cc X86: Conditional tail calls should not have isBarrier = 1
That confuses e.g. machine basic block placement, which then doesn't
realize that control can fall through a block that ends with a conditional
tail call. Instead, isBranch=1 should be set.

Also, mark EFLAGS as used by these instructions.

llvm-svn: 281281
2016-09-13 00:21:32 +00:00
Dehao Chen 9bbb941acf Lower consecutive select instructions correctly.
Summary: If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition check and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same to to avoid inefficent code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095

Reviewers: davidxl

Subscribers: vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D24147

llvm-svn: 281252
2016-09-12 20:23:28 +00:00
Elena Demikhovsky 57c602aad3 AVX-512: Added a test for -O0 mode. NFC.
llvm-svn: 281246
2016-09-12 19:03:21 +00:00
Elena Demikhovsky 5730bf0429 AVX-512: Simplified masked_gather_scatter test. NFC.
llvm-svn: 281244
2016-09-12 18:50:47 +00:00
Ahmed Bougacha b678219aa6 [BranchFolding] Unique added live-ins after hoisting code.
We're not supposed to have duplicate live-ins.

llvm-svn: 281224
2016-09-12 16:05:31 +00:00
Ahmed Bougacha 45bfa8772f [X86] Copy imp-uses when folding tailcall into conditional branch.
r280832 added 32-bit support for emitting conditional tail-calls, but
dropped imp-used parameter registers.  This went unnoticed until
r281113, which added 64-bit support, as this is only exposed with
parameter passing via registers.

Don't drop the imp-used parameters.

llvm-svn: 281223
2016-09-12 16:05:27 +00:00
Igor Breger a3e36da6f2 add select i1 test, reproduser pr30249.
llvm-svn: 281218
2016-09-12 15:27:02 +00:00
Elena Demikhovsky de1b494555 AVX-512: Added a test case that should be optimized in the future. NFC.
llvm-svn: 281196
2016-09-12 06:26:03 +00:00
Craig Topper 3639cda748 [AVX-512] Add test cases to demonstrate opportunities for commuting vpternlog. Commuting will be added in a future commit.
llvm-svn: 281157
2016-09-11 05:33:43 +00:00
Craig Topper fb4564cf21 [AVX-512] Add VPTERNLOG to load folding tables.
llvm-svn: 281156
2016-09-11 05:33:40 +00:00
Craig Topper 2c86705755 [X86] Side effecting asm in AVX512 integer stack folding test should return 2 x i64 not 8 x i64.
llvm-svn: 281155
2016-09-11 05:33:38 +00:00
Arnold Schwaighofer 6c57f4f56d It should also be legal to pass a swifterror parameter to a call as a swifterror
argument.

rdar://28233388

llvm-svn: 281147
2016-09-10 19:42:53 +00:00
Arnold Schwaighofer 112ff66505 We also need to pass swifterror in R12 under swiftcc not only under ccc
rdar://28190687

llvm-svn: 281138
2016-09-10 14:16:55 +00:00
Hans Wennborg 6ecf619be9 X86: Fold tail calls into conditional branches also for 64-bit (PR26302)
This extends the optimization in r280832 to also work for 64-bit. The only
quirk is that we can't do this for 64-bit Windows (yet).

Differential Revision: https://reviews.llvm.org/D24423

llvm-svn: 281113
2016-09-09 22:37:27 +00:00
Simon Pilgrim a3d1e03cd7 [X86][XOP] Fix VPERMIL2PD mask creation on 32-bit targets
Use getConstVector helper to correctly create v2i64/v4i64 constants on 32-bit targets

llvm-svn: 281105
2016-09-09 21:47:21 +00:00
Michael Kuperstein b1153848ff [X86] Regenerate test. NFC.
llvm-svn: 281099
2016-09-09 21:36:17 +00:00
Arnold Schwaighofer 7d7b4b4014 Create phi nodes for swifterror values at the end of the phi instructions list
ISel makes assumption about the order of phi nodes.

rdar://28190150

llvm-svn: 281095
2016-09-09 21:18:47 +00:00
Simon Pilgrim 153b408433 [SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type
Fixes issue with rL280927 identified by Mikael Holmén

llvm-svn: 281042
2016-09-09 13:31:52 +00:00
Craig Topper 149e6bdc16 [AVX-512] Add VPCMP instructions to the load folding tables and make them commutable.
llvm-svn: 281013
2016-09-09 01:36:10 +00:00
Craig Topper 1a1ac11625 [AVX-512] Add more integer vector comparison tests with loads. Some of these show opportunities where we can commute to fold loads.
Commutes will be added in a followup commit.

llvm-svn: 281012
2016-09-09 01:36:04 +00:00
Michael Kuperstein ceff022800 [X86] Add more baseline tests for "irregular" shuffles. NFC.
This adds more tests for shuffles where the output width does not match
the input width and/or the output is generated from more than two inputs.

llvm-svn: 281005
2016-09-09 00:49:29 +00:00
Hans Wennborg c39ef776fc Win64: Don't use REX prefix for direct tail calls
The REX prefix should be used on indirect jmps, but not direct ones.
For direct jumps, the unwinder looks at the offset to determine if
it's inside the current function.

Differential Revision: https://reviews.llvm.org/D24359

llvm-svn: 281003
2016-09-08 23:35:10 +00:00
Simon Pilgrim cc7b4b511b [SelectionDAG] Add BUILD_VECTOR support to computeKnownBits and SimplifyDemandedBits
Add the ability to computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from BUILD_VECTOR, returning the known bits that are shared by every vector element.

This is an initial step towards determining the sign bits of a vector (PR29079).

Differential Revision: https://reviews.llvm.org/D24253

llvm-svn: 280927
2016-09-08 12:57:51 +00:00
Simon Pilgrim a01ee07a19 [DAGCombiner] Enable AND combines of splatted constant vectors
Allow AND combines to use a vector splatted constant as well as a constant scalar.

Preliminary part of D24253.

llvm-svn: 280926
2016-09-08 12:36:39 +00:00
Michael Kuperstein f79af6f8c4 [CGP] Be less conservative about tail-duplicating a ret to allow tail calls
CGP tail-duplicates rets into blocks that end with a call that feed the ret.
This puts the call in tail position, potentially allowing the DAG builder to
lower it as a tail call. To avoid tail duplication in cases where we won't
form the tail call, CGP tried to predict whether this is going to be possible,
and avoids doing it when lowering as a tail call will definitely fail.
However, it was being too conservative by always throwing away calls to
functions with a signext/zeroext attribute on the return type.

Instead, we can use the same logic the builder uses to determine whether the
attributes work out.

Differential Revision: https://reviews.llvm.org/D24315

llvm-svn: 280894
2016-09-08 00:48:37 +00:00
Elena Demikhovsky dcc86d5bb6 Shift-left (ISD::SHL) operation crashes on "DAG Legalization" phase.
https://llvm.org/bugs/show_bug.cgi?id=29058.

While node legalization we tried to legalize its operands.
If an operand node is replaced during legalization the user node may be destroyed.

Differential Revision: https://reviews.llvm.org/D24244

llvm-svn: 280862
2016-09-07 20:54:33 +00:00
Wei Mi b96ebe6a1a Rename test pr30298.ll to shrink_vmul_sse.ll, to make the name more meaningful, NFC.
Add PR number and comment in pr30298.ll to explain what is testing.

llvm-svn: 280843
2016-09-07 18:46:15 +00:00
Wei Mi f100d4e93d Don't reduce the width of vector mul if the target doesn't support SSE2.
The patch is to fix PR30298, which is caused by rL272694. The solution is to
bail out if the target has no SSE2.

Differential Revision: https://reviews.llvm.org/D24288

llvm-svn: 280837
2016-09-07 18:22:17 +00:00
Hans Wennborg 6c45f9233c Add more triple to conditional-tailcall.ll test
llvm-svn: 280835
2016-09-07 18:19:31 +00:00
Hans Wennborg 75e25f6812 X86: Fold tail calls into conditional branches where possible (PR26302)
When branching to a block that immediately tail calls, it is possible to fold
the call directly into the branch if the call is direct and there is no stack
adjustment, saving one byte.

Example:

  define void @f(i32 %x, i32 %y) {
  entry:
    %p = icmp eq i32 %x, %y
    br i1 %p, label %bb1, label %bb2
  bb1:
    tail call void @foo()
    ret void
  bb2:
    tail call void @bar()
    ret void
  }

before:

  f:
          movl    4(%esp), %eax
          cmpl    8(%esp), %eax
          jne     .LBB0_2
          jmp     foo
  .LBB0_2:
          jmp     bar

after:

  f:
          movl    4(%esp), %eax
          cmpl    8(%esp), %eax
          jne     bar
  .LBB0_1:
          jmp     foo

I don't expect any significant size savings from this (on a Clang bootstrap I
saw 288 bytes), but it does make the code a little tighter.

This patch only does 32-bit, but 64-bit would work similarly.

Differential Revision: https://reviews.llvm.org/D24108

llvm-svn: 280832
2016-09-07 17:52:14 +00:00
Simon Pilgrim d311311beb Fix typo in test - it should be masking bits0-15 not bit16
llvm-svn: 280816
2016-09-07 15:19:07 +00:00
Simon Pilgrim 32cfa5ba83 [X86][SSE] Added or combine tests for known bits of vectors
Part of the yak shaving for D24253

llvm-svn: 280813
2016-09-07 14:49:50 +00:00
Simon Pilgrim 65cdc058b6 [X86][SSE] Added and+or+zext combine tests for known bits of vectors
Part of the yak shaving for D24253

llvm-svn: 280810
2016-09-07 14:00:52 +00:00
Simon Pilgrim 2415144425 [X86][SSE] Added and+or combine tests currently failing with vectors
(and (or x, C), D) -> D if (C & D) == D

Part of the yak shaving for D24253

llvm-svn: 280809
2016-09-07 13:40:03 +00:00
Elena Demikhovsky f0ddd1b8b5 AVX512F: FMA intrinsic + FNEG - sequence optimization
The previous commit (r280368 - https://reviews.llvm.org/D23313) does not cover AVX-512F, KNL set.
FNEG(x) operation is lowered to (bitcast (vpxor (bitcast x), (bitcast constfp(0x80000000))).
It happens because FP XOR is not supported for 512-bit data types on KNL and we use integer XOR instead.
I added pattern match for integer XOR.

Differential Revision: https://reviews.llvm.org/D24221

llvm-svn: 280785
2016-09-07 06:54:28 +00:00
Craig Topper b880ad3a71 [AVX-512] Add support for commuting masked instructions in findCommutedOpIndices. The default implementation doesn't skip the mask input or the preserved input.
llvm-svn: 280781
2016-09-07 04:46:11 +00:00
Simon Pilgrim 1b4462b7c1 [SelectionDAG] Simplify extract_subvector( insert_subvector ( Vec, In, Idx ), Idx ) -> In
If we are extracting a subvector that has just been inserted then we should just use the original inserted subvector.

This has come up in certain several x86 shuffle lowering cases where we are crossing 128-bit lanes.

Differential Revision: https://reviews.llvm.org/D24254

llvm-svn: 280715
2016-09-06 16:42:05 +00:00
Craig Topper 4fa3b50fc3 [AVX-512] Fix masked VPERMI2PS isel when the index comes from a bitcast.
We need to bitcast the index operand to a floating point type so that it matches the result type. If not then the passthru part of the DAG will be a bitcast from the index's original type to the destination type. This makes it very difficult to match. The other option would be to add 5 sets of patterns for every other possible type.

llvm-svn: 280696
2016-09-06 06:56:59 +00:00
Craig Topper cf9f1b8dfa [AVX-512] Add a test case to show that we don't select masked vpermi2ps when the index operand comes from a bitcast.
It doesn't work because we're looking for a bitcast from the v4i32 index operand to v4f32 for the passthru part of the DAG. But since the index is bitcasted from v2i64 and bitcasts fold, we actually have a bitcast from v2i64 to v4f32 in the passthru part of the DAG.

Taken from optimized output from clang's test case.

llvm-svn: 280695
2016-09-06 05:45:27 +00:00
Craig Topper 62d0a5e7d3 [AVX-512] Fix v8i64 shift by immediate lowering on 32-bit targets.
llvm-svn: 280684
2016-09-06 00:31:10 +00:00
Craig Topper dfc4fc9f02 [AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions for 128/256-bit vectors and scalar single/double.
Still need to fix the register classes to allow the extended range of registers.

llvm-svn: 280682
2016-09-05 23:58:40 +00:00
Craig Topper 70e1348031 [X86] Update fast-isel store test to have more 256 and 512-bit test cases. Add command lines for AVX and AVX512 feature sets.
llvm-svn: 280681
2016-09-05 23:58:37 +00:00
Craig Topper f54ebca2dd [X86] Update fast-isel vector load test to have more 256 and 512-bit test cases. Add a command line for SKX features too.
llvm-svn: 280680
2016-09-05 23:58:32 +00:00
Simon Pilgrim 6dd4b3526a [X86][SSE] Add test cases for PR29078
'Failure to recognise i64 sitofp/uitofp conversions that can be performed as i32'

llvm-svn: 280671
2016-09-05 18:11:17 +00:00
Simon Pilgrim efab350967 [X86][SSE] Add test cases for PR29079
'Failure to recognise uitofp conversions that can be performed as sitofp'

llvm-svn: 280670
2016-09-05 18:04:38 +00:00
Simon Pilgrim 5f14ff75ec [X86][SSE] Regenerate odd shuffle tests with common prefixes
llvm-svn: 280661
2016-09-05 14:15:38 +00:00