Commit Graph

2188 Commits

Author SHA1 Message Date
Anna Thomas c858faa244 Revert "Invariant start/end intrinsics overloaded for address space"
This reverts commit r276316.

llvm-svn: 276320
2016-07-21 19:06:28 +00:00
Anna Thomas 29b24dfe44 Invariant start/end intrinsics overloaded for address space
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address space.

With this change, these intrinsics are overloaded for any adddress space for memory objects
and we can use these llvm invariant intrinsics in non-default address spaces.

Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)

This overloaded intrinsic is needed for representing final or invariant memory in managed languages.

Reviewers: tstellarAMD, reames, apilipenko

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22519

llvm-svn: 276316
2016-07-21 18:41:44 +00:00
Sanjay Patel 0753c06d9c [InstCombine] LogicOpc (zext X), C --> zext (LogicOpc X, C) (PR28476)
The benefits of this change include:
1. Remove DeMorgan-matching code that was added specifically to work-around 
   the missing transform in http://reviews.llvm.org/rL248634.
2. Makes the DeMorgan transform work for vectors too.
3. Fix PR28476: https://llvm.org/bugs/show_bug.cgi?id=28476

Extending this transform to other casts and other associative operators may
be useful too. See https://reviews.llvm.org/D22421 for a prerequisite for
doing that though.

Differential Revision: https://reviews.llvm.org/D22271

llvm-svn: 276221
2016-07-21 00:24:18 +00:00
Sanjay Patel 5f3c70307d [InstSimplify][InstCombine] don't crash when folding vector selects of icmp
Differential Revision: https://reviews.llvm.org/D22602

llvm-svn: 276209
2016-07-20 23:40:01 +00:00
Sanjay Patel c0812702f8 minimize tests and auto-generate checks
llvm-svn: 276147
2016-07-20 17:58:20 +00:00
Benjamin Kramer b4d64cf27d Revert "[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))"
Makes InstCombine infloop when compiling v8.

This reverts commit r275989 and r276105.

llvm-svn: 276106
2016-07-20 11:40:16 +00:00
Tobias Grosser 8c6201b49f [InstCombine] Provide more test cases for cast-folding [NFC]
Summary: In r275989 we enabled the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))`. Here we add more test cases to assure this folding works for all logical operations `and`/`or`/`xor`.

Reviewers: grosser

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22561

Contributed-by: Matthias Reisinger
llvm-svn: 276105
2016-07-20 11:24:27 +00:00
Sanjay Patel d4ea94eb94 regenerate checks
llvm-svn: 276042
2016-07-19 22:32:15 +00:00
Sanjay Patel 2d477e59e8 [InstCombine] fold add(zext(xor X, C), C) --> sext X when C is INT_MIN in the source type
The pattern may look more obviously like a sext if written as:

  define i32 @g(i16 %x) {
    %zext = zext i16 %x to i32
    %xor = xor i32 %zext, 32768
    %add = add i32 %xor, -32768
    ret i32 %add
  }

We already have that fold in visitAdd().

Differential Revision: https://reviews.llvm.org/D22477

llvm-svn: 276035
2016-07-19 22:09:34 +00:00
Tobias Grosser 1c38262279 [InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))
Summary:
Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:

> Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.

Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check:

`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`

This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation

Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`).

As an example, consider the following IR snippet

```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```

which would now be transformed to

```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```

This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.

Reviewers: grosser, vtjnash, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22511

Contributed-by: Matthias Reisinger
llvm-svn: 275989
2016-07-19 16:39:17 +00:00
Sanjay Patel dbf44f5016 add tests for missed sext transform
llvm-svn: 275908
2016-07-18 20:37:51 +00:00
Sanjay Patel 79acd2a96b [InstCombine] allow X + signbit --> X ^ signbit for vector splats
llvm-svn: 275691
2016-07-16 18:29:26 +00:00
Sanjay Patel 040bd16e56 add vector test to show missing transform
llvm-svn: 275690
2016-07-16 18:24:18 +00:00
Sanjay Patel eb50476f77 update tests to use FileCheck, consolidate tests, fix comments
llvm-svn: 275688
2016-07-16 18:08:22 +00:00
Sanjay Patel e540a31f59 update test to use FileCheck
llvm-svn: 275687
2016-07-16 16:31:58 +00:00
Sanjay Patel 972a53fb42 auto-generate checks
llvm-svn: 275686
2016-07-16 16:27:58 +00:00
Sanjay Patel 309f98ef3a auto-ggenerate checks
llvm-svn: 275685
2016-07-16 16:24:06 +00:00
Sanjay Patel f9d2b20daf [InstCombine] reassociate logic ops with constants separated by a zext
This is a partial implementation of a general fold for associative+commutative operators:
(op (cast (op X, C2)), C1) --> (cast (op X, op (C1, C2)))
(op (cast (op X, C2)), C1) --> (op (cast X), op (C1, C2))

There are 7 associative operators and 13 cast types, so this could potentially go a lot further.

Differential Revision: https://reviews.llvm.org/D22421

llvm-svn: 275684
2016-07-16 15:20:19 +00:00
Sanjay Patel 27fefb2fcf add tests for associative ops blocked by a cast
These are more generalized versions of the cases added in
r275302 and r275297.

llvm-svn: 275594
2016-07-15 18:39:02 +00:00
David Majnemer 666aa945a5 [InstCombine] Masked loads with undef masks can fold to normal loads
We were able to fold masked loads with an all-ones mask to a normal
load.  However, we couldn't turn a masked load with a mask with mixed
ones and undefs into a normal load.

llvm-svn: 275380
2016-07-14 06:58:42 +00:00
Sanjay Patel eff2aa70fc add more tests for zexty xor sandwiches
...mmm sandwiches

llvm-svn: 275302
2016-07-13 18:58:55 +00:00
Sanjay Patel 904a88025a add test for zexty xor sandwich
llvm-svn: 275297
2016-07-13 18:40:38 +00:00
Sanjay Patel c00e48a3db [InstCombine] extend vector select matching for non-splat constants
In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean
way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the
direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could
move that helper function to a higher level.

There is an open question as to which is of these forms should be considered the canonical IR:
  %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
  %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>

Differential Revision: http://reviews.llvm.org/D22114

llvm-svn: 275289
2016-07-13 18:07:02 +00:00
David Majnemer 1b3db33e3d [ConstantFolding] Don't treat negative GEP offsets as positive
GEP offsets are signed, don't treat them as huge positive numbers.

llvm-svn: 275251
2016-07-13 05:16:16 +00:00
Sanjay Patel 4a6a751dce add tests for missing DeMorgan's Law folds
llvm-svn: 275192
2016-07-12 17:05:04 +00:00
Sanjay Patel 3900191ecc auto-generate checks
llvm-svn: 275188
2016-07-12 16:21:55 +00:00
Sanjay Patel 93dffe629a auto-generate checks
llvm-svn: 275187
2016-07-12 16:17:30 +00:00
Sanjay Patel 6d1f227e6b auto-generate checks
llvm-svn: 275186
2016-07-12 16:13:04 +00:00
Hal Finkel 6fd5e1f02b Teach computeKnownBits to look through returned-argument functions
If a function is known to return one of its arguments, we can use that in order
to compute known bits of the return value.

Differential Revision: http://reviews.llvm.org/D9397

llvm-svn: 275036
2016-07-11 02:25:14 +00:00
Anna Thomas 9ad45adfd7 Revert "InstCombine rule to fold truncs whose value is available"
This reverts commit r274853.
Caused failure in ppcBE build

llvm-svn: 274943
2016-07-08 22:15:08 +00:00
Sanjay Patel 664514f7fe [InstCombine] don't form select from bitcasted logic ops if bitcasts have >1 use
This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?), 
but we'll try to err on the conservative side by going with the case that has
less IR instructions.

Note: This question came up in http://reviews.llvm.org/D22114 , but this part is
independent of that patch proposal, so I'm making this small change ahead of that
one. 

See also:
http://reviews.llvm.org/rL274926

llvm-svn: 274932
2016-07-08 21:17:51 +00:00
Sanjay Patel 5246482c7a add another multi-use test for logic->select transform
llvm-svn: 274929
2016-07-08 21:08:16 +00:00
Sanjay Patel f4a08ede03 [InstCombine] don't form select from logic ops if it's unlikely that we'll eliminate any ops
llvm-svn: 274926
2016-07-08 20:53:29 +00:00
Sanjay Patel 297a0e67b6 adjust test so it won't completely optimize away
llvm-svn: 274925
2016-07-08 20:35:53 +00:00
Sanjay Patel 0733e6b61c add tests for multi-use folding to select
llvm-svn: 274922
2016-07-08 20:22:27 +00:00
Sanjay Patel 1b6b824548 [InstCombine] check for one-use before turning simple logic op into a select
llvm-svn: 274891
2016-07-08 17:26:47 +00:00
Sanjay Patel 910ce0d511 add test to show multi-use output
llvm-svn: 274887
2016-07-08 17:12:27 +00:00
Sanjay Patel cbfca9e8ef [InstCombine] allow or(sext(A), B) --> A ? -1 : B transform for vectors
llvm-svn: 274883
2016-07-08 17:01:15 +00:00
Sanjay Patel 647174c8a4 add vector tests to show missing transform
llvm-svn: 274876
2016-07-08 16:39:53 +00:00
Sanjay Patel 46df968326 minimize tests
The cmp and load aren't required.

llvm-svn: 274864
2016-07-08 16:11:48 +00:00
Sanjay Patel e1acad9b61 regenerate checks
llvm-svn: 274860
2016-07-08 16:06:38 +00:00
Anna Thomas 3124f6273a InstCombine rule to fold truncs whose value is available
We can fold truncs whose operand feeds from a load, if the trunc value
is available through a prior load/store.

This change is from: http://reviews.llvm.org/D21246, which folded the
trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW
call, when the load type didnt match the prior load/store type.

Differential Revision: http://reviews.llvm.org/D21791

llvm-svn: 274853
2016-07-08 15:18:56 +00:00
Sjoerd Meijer 17c08dc701 Code size optimisation: don't rewrite fputs to fwrite when optimising for size
because fwrite requires more arguments and thus extra MOVs are required.

llvm-svn: 274753
2016-07-07 13:56:23 +00:00
Sanjay Patel 65a51c25c1 [InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors
By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we
allow these transforms for splat vectors.

Differential Revision: http://reviews.llvm.org/D21899

llvm-svn: 274696
2016-07-06 22:23:01 +00:00
Sanjay Patel cbaac41856 [InstCombine] enable vector select of bools -> logic folds
llvm-svn: 274465
2016-07-03 14:34:39 +00:00
Sanjay Patel 42396ae0ea add vector bool select tests and regenerate checks for scalar bool select tests
llvm-svn: 274460
2016-07-03 13:26:02 +00:00
Sanjay Patel 7c6eab5777 [InstCombine] shrink switch conditions better (PR24766)
https://llvm.org/bugs/show_bug.cgi?id=24766#c2

This removes a hack that was added for the benefit of x86 codegen. 
It prevented shrinking the switch condition even to smaller legal (DataLayout) types.
We have a safety mechanism in CGP after:
http://reviews.llvm.org/rL251857
...so we're free to use the optimal (smallest) IR type now.

Differential Revision: http://reviews.llvm.org/D12965

llvm-svn: 274233
2016-06-30 14:51:21 +00:00
Sanjay Patel 7ad98babfa [InstCombine] extend matchSelectFromAndOr() to work with i1 scalar types
If the incoming types are i1, then we don't have to pattern match any sext ops.

Differential Revision: http://reviews.llvm.org/D21740

llvm-svn: 274228
2016-06-30 14:18:18 +00:00
Sanjay Patel 348111f4b9 add vector tests to show missing transform
llvm-svn: 274192
2016-06-30 00:09:13 +00:00
Sanjay Patel c3701e8b92 regenerate checks
llvm-svn: 274188
2016-06-29 23:58:39 +00:00