Commit Graph

12956 Commits

Author SHA1 Message Date
Philip Reames 5a637cbdc7 [LoopPred] Extend LFTR normalization to the inverse EQ case
A while back, I added support for NE latches formed by LFTR.  I didn't think that quite through, as LFTR will also produce the inverse EQ form for some loops and I hadn't handled that.  This change just adds handling for that case as well.

llvm-svn: 365419
2019-07-09 01:27:45 +00:00
Johannes Doerfert accd3e8747 [Attributor] Deduce the "returned" argument attribute
Deduce the "returned" argument attribute by collecting all potentially
returned values.

Not only the unique return value, if any, can be used by subsequent
attributes but also the set of all potentially returned values as well
as the mapping from returned values to return instructions that they
originate from (see AAReturnedValues::checkForallReturnedValues).

Change in statistics (-stats) for LLVM-TS + Spec2006, totaling ~19% more "returned" arguments.

  ADDED: attributor                   NumAttributesManifested                  n/a ->        637
  ADDED: attributor                   NumAttributesValidFixpoint               n/a ->      25545
  ADDED: attributor                   NumFnArgumentReturned                    n/a ->        637
  ADDED: attributor                   NumFnKnownReturns                        n/a ->      25545
  ADDED: attributor                   NumFnUniqueReturned                      n/a ->      14118
CHANGED: deadargelim                  NumRetValsEliminated                     470 ->        449 (    -4.468%)
REMOVED: functionattrs                NumReturned                              535 ->        n/a
CHANGED: indvars                      NumElimIdentity                          138 ->        164 (   +18.841%)

Reviewers: homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames, efriedma, chandlerc

Subscribers: hiraditya, bollu, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D59919

llvm-svn: 365407
2019-07-08 23:27:20 +00:00
Sanjay Patel 3dee113ebc [InstCombine] fold insertelement into splat of same scalar
Forming the canonical splat shuffle improves analysis and
may allow follow-on transforms (although some possibilities
are missing as shown in the test diffs).

The backend generically turns these patterns into build_vector,
so there should be no codegen regressions. All targets are
expected to be able to lower splats efficiently.

llvm-svn: 365379
2019-07-08 19:48:52 +00:00
Sanjay Patel 77ccc04700 [InstCombine] add tests for insert of same splatted scalar; NFC
llvm-svn: 365362
2019-07-08 18:03:22 +00:00
Sanjay Patel 0b59103a73 [InstCombine] canonicalize insert+splat to/from element 0 of vector
We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue()
and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts
to that form for better matching.

The backend generically turns these patterns into build_vector,
so there should be no codegen difference.

llvm-svn: 365342
2019-07-08 16:26:48 +00:00
Brian Homerding b4b21d807e Add, and infer, a nofree function attribute
This patch adds a function attribute, nofree, to indicate that a function does
not, directly or indirectly, call a memory-deallocation function (e.g., free,
C++'s operator delete).

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D49165

llvm-svn: 365336
2019-07-08 15:57:56 +00:00
Sanjay Patel 320a28200f [InstCombine] fix typo in test; NFC
I added this test in rL365325, but didn't mean to create an undef insert.

llvm-svn: 365333
2019-07-08 15:38:03 +00:00
Sanjay Patel 74cbaa37b6 [InstCombine] add tests for splat shuffles; NFC
llvm-svn: 365325
2019-07-08 14:49:21 +00:00
Cameron McInally 771769be90 [Float2Int] Add support for unary FNeg to Float2Int
Differential Revision: https://reviews.llvm.org/D63941

llvm-svn: 365324
2019-07-08 14:46:07 +00:00
Petr Hosek e28fca29fe Revert "[IRBuilder] Fold consistently for or/and whether constant is LHS or RHS"
This reverts commit r365260 which broke the following tests:

    Clang :: CodeGenCXX/cfi-mfcall.cpp
    Clang :: CodeGenObjC/ubsan-nullability.m
    LLVM :: Transforms/LoopVectorize/AArch64/pr36032.ll

llvm-svn: 365284
2019-07-07 22:12:01 +00:00
Nikita Popov a01502f1ba [LFTR] Regenerate test checks; NFC
llvm-svn: 365262
2019-07-06 08:54:15 +00:00
Philip Reames 9812668d77 [IRBuilder] Fold consistently for or/and whether constant is LHS or RHS
Without this, we have the unfortunate property that tests are dependent on the order of operads passed the CreateOr and CreateAnd functions.  In actual usage, we'd promptly optimize them away, but it made tests slightly more verbose than they should have been.

llvm-svn: 365260
2019-07-06 04:28:00 +00:00
Sanjay Patel f3481b8c9a [InferFunctionAttrs] add tests for 'dereferenceable' argument attribute; NFC
llvm-svn: 365227
2019-07-05 17:49:53 +00:00
Sanjay Patel 75b5edf6a1 [InstCombine] allow undef elements when forming splat from chain of insertelements
We allow forming a splat (broadcast) shuffle, but we were conservatively limiting
that to cases where all elements of the vector are specified. It should be safe
from a codegen perspective to allow undefined lanes of the vector because the
expansion of a splat shuffle would become the chain of inserts again.

Forming splat shuffles can reduce IR and help enable further IR transforms.
Motivating bugs:
https://bugs.llvm.org/show_bug.cgi?id=42174
https://bugs.llvm.org/show_bug.cgi?id=16739

Differential Revision: https://reviews.llvm.org/D63848

llvm-svn: 365147
2019-07-04 16:45:34 +00:00
David Bolvansky 5f73e37af8 [NFC] Added tests for D64099
llvm-svn: 365141
2019-07-04 13:48:32 +00:00
Chen Zheng 469f30abab [PowerPC] Hardware Loop branch instruction's condition may not be icmp.
This fixes pr42492.
Differential Revision: https://reviews.llvm.org/D64124

llvm-svn: 365104
2019-07-04 01:51:47 +00:00
Eli Friedman 41ee3977c4 [JumpThreading] Fix threading with unusual PHI nodes.
If the block being cloned contains a PHI node, in general, we need to
clone that PHI node, even though it's trivial. If the operand of the PHI
is an instruction in the block being cloned, the correct value for the
operand doesn't exist until SSAUpdater constructs it.

We usually don't hit this issue because we try to avoid threading across
loop headers, but it's possible to hit this in some cases involving
irreducible CFGs.  I added a flag to allow threading across loop headers
to make the testcase easier to understand.

Thanks to Brian Rzycki for reducing the testcase.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42085.

Differential Revision: https://reviews.llvm.org/D63913

llvm-svn: 365094
2019-07-03 23:12:39 +00:00
Philip Reames ea06d63c35 [LFTR] Use SCEVExpander for the pointer limit case instead of manual IR gen
As noted in the test change, this is not trivially NFC, but all of the changes in output are cases where the SCEVExpander form is more canonical/optimal than the hand generation.  

llvm-svn: 365075
2019-07-03 20:03:46 +00:00
Philip Reames 83cca94194 [LFTR] Hoist extend expressions outside of loops w/o waiting for LICM
The motivation for this is two fold:
1) Make the output (and thus tests)  a bit more readable to a human trying to understand the result of the transform
2) Reduce spurious diffs in a potential future change to restructure all of this logic to use SCEVExpander (which hoists by default)

llvm-svn: 365066
2019-07-03 18:18:36 +00:00
Roman Lebedev 826db453d1 [NFC][InstCombine] onehot_merge.ll: add last few tests in the state they regress to in D62818
llvm-svn: 365056
2019-07-03 16:48:53 +00:00
Sanjay Patel c1c86adb16 [SLP] add tests for bitcasted vector pointer load; NFC
I'm not sure if this falls within the scope of SLP,
but we could create vector loads for some of these
patterns.

llvm-svn: 365055
2019-07-03 16:46:14 +00:00
Roman Lebedev 9f0c83902d [InstCombine] Y - ~X --> X + Y + 1 fold (PR42457)
Summary:
I *think* we'd want this new variant, because we obviously
have better handling for `add` as compared to `sub`/`not`.

https://rise4fun.com/Alive/WMn

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]

Reviewers: spatel, nikic, huihuiz, efriedma

Reviewed By: spatel

Subscribers: RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63992

llvm-svn: 365011
2019-07-03 09:41:50 +00:00
Eugene Leviant ac407a7b4a [SCEV][LSR] Prevent using undefined value in binops
On some occasions ReuseOrCreateCast may convert previously
expanded value to undefined. That value may be passed by
SCEVExpander as an argument to InsertBinop making IV chain
undefined.

Differential revision: https://reviews.llvm.org/D63928 

llvm-svn: 365009
2019-07-03 09:36:32 +00:00
Jordan Rupprecht 02647f73d4 Revert [InlineCost] cleanup calculations of Cost and Threshold
This reverts r364422 (git commit 1a3dc76186)

The inlining cost calculation is incorrect, leading to stack overflow due to large stack frames from heavy inlining.

llvm-svn: 365000
2019-07-03 04:01:51 +00:00
Vasileios Porpodas cf47ff5ffb [SLP] Recommit: Look-ahead operand reordering heuristic.
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk

Reviewed By: RKSimon, dtemirbulatov

Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60897

llvm-svn: 364964
2019-07-02 20:20:28 +00:00
David Bolvansky cb1a5a705c [SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n)
Summary:
Partially solves https://bugs.llvm.org/show_bug.cgi?id=42190



Reviewers: spatel, nikic, efriedma

Reviewed By: efriedma

Subscribers: efriedma, nikic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63038

llvm-svn: 364940
2019-07-02 15:58:45 +00:00
Roman Lebedev 0bde7c6527 [InstCombine] Shift amount reassociation: fixup constantexpr handling (PR42484)
I was actually wondering if there was some nicer way than m_Value()+cast,
but apparently what i was really "subconsciously" thinking about
was correctness issue.

hasNoUnsignedWrap()/hasNoUnsignedWrap() exist for Instruction,
not for BinaryOperator, so let's just use m_Instruction(),
thus both avoiding a cast, and a crash.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42484,
      https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587

llvm-svn: 364915
2019-07-02 12:54:48 +00:00
Roman Lebedev 7928fea4a7 [NFC][InstCombine] Revisit tests for "redundant shift input masking" (PR42456)
llvm-svn: 364897
2019-07-02 10:02:25 +00:00
Roman Lebedev 377dfb0226 [NFC][InstCombine] Add tests for "redundant shift input masking" (PR42456)
https://bugs.llvm.org/show_bug.cgi?id=42456
https://rise4fun.com/Alive/Vf1p

llvm-svn: 364894
2019-07-02 09:27:34 +00:00
Reid Kleckner d72163947a [PGO] Update ICP pass for recent byval type changes
Fixes verifier errors encountered in PR42413.

Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D63842

llvm-svn: 364861
2019-07-01 22:43:39 +00:00
Huihui Zhang 8e1051b3a0 [InstCombine][NFCI] Update test cases in onehot_merge.ll
Use both one bit and signbit shifting to check for one bit merge.

Reviewers: lebedev.ri, spatel, efriedma, craig.topper

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63903

llvm-svn: 364857
2019-07-01 22:00:32 +00:00
Sanjay Patel ddc1b40f26 [InstCombine] reduce more checks for power-of-2-or-zero using ctpop
Extends the transform from:
rL364341
...to include another (more common?) pattern that tests whether a
value is a power-of-2 (including or excluding zero).

llvm-svn: 364856
2019-07-01 22:00:00 +00:00
Jordan Rupprecht a7972dc04a Revert [SLP] Look-ahead operand reordering heuristic.
This reverts r364478 (git commit 574cb0eb3a)

The patch is causing compilation timeouts.

llvm-svn: 364846
2019-07-01 21:10:43 +00:00
Roman Lebedev 975120a21b [NFC][InstCombine] More commutative tests for "shift direction in bittest" (PR42466)
'and' is commutative, if we don't want to touch shift-of-const,
we still need to check the other hand of 'and'.

llvm-svn: 364844
2019-07-01 20:33:56 +00:00
Roman Lebedev e62857786f [NFC][InstCombine] Add tests for "shift direction in bittest" (PR42466)
https://rise4fun.com/Alive/8O1
https://bugs.llvm.org/show_bug.cgi?id=42466

llvm-svn: 364824
2019-07-01 18:11:32 +00:00
Roman Lebedev 04d3d3bbff [InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)
Summary:
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

This does address the regression being exposed in D63992.

https://godbolt.org/z/7DGltU
https://rise4fun.com/Alive/EPO0

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]

Reviewers: spatel, nikic, huihuiz

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63993

llvm-svn: 364792
2019-07-01 15:55:24 +00:00
Roman Lebedev 72b8d41ce8 [InstCombine] Shift amount reassociation in bittest (PR42399)
Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0`  iff `(Q+K) u< bitwidth(x)`

It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but i'm not sure what rules should there be
to avoid endless combine loops.

We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63829

llvm-svn: 364791
2019-07-01 15:55:15 +00:00
Roman Lebedev 34a0b16e29 [NFC][InstCombine] Better commutative tests for "shift amount reassociation in bittest" pattern.
As discussed in https://reviews.llvm.org/D63829
*if* *both* shifts are one-use, we'd most likely want to produce `lshr`,
and not rely on ordering.

Also, there should likely be a *separate* fold to do this reordering.

llvm-svn: 364772
2019-07-01 14:28:24 +00:00
Roman Lebedev 9f3645869c [NFC][InstCombine] Improve test coverage for ((~x) + y) + 1 -> y - x fold fold (PR42459)
So we indeed to have this fold, but only if +1 is not the last operation..

llvm-svn: 364764
2019-07-01 13:31:06 +00:00
Roman Lebedev d5c3e34cb7 [NFC][InstCombine] Tests for ((~x) + y) + 1 -> y - x fold fold (PR42459)
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

https://bugs.llvm.org/show_bug.cgi?id=42459
https://rise4fun.com/Alive/EPO0

llvm-svn: 364749
2019-07-01 12:22:06 +00:00
Roman Lebedev 4f878fe3a7 [NFC][InstCombine] Tests for x - ~(y) -> x + y + 1 fold (PR42457)
https://bugs.llvm.org/show_bug.cgi?id=42457
https://rise4fun.com/Alive/iFhE

llvm-svn: 364739
2019-07-01 09:57:53 +00:00
Roman Lebedev f55818e3a7 [InstCombine] Omit 'urem' where possible
This was added in D63390 / rL364286 to backend,
but it makes sense to also handle it in middle-end.
https://rise4fun.com/Alive/Zsln

llvm-svn: 364738
2019-07-01 09:41:43 +00:00
Roman Lebedev 0f82f64c83 [NFC][InstCombine] Copy test for omit urem when possible from TargetLowering
Was added in D63390 / rL364286 to backend, but it makes sense to also handle it here.
https://rise4fun.com/Alive/Zsln

llvm-svn: 364737
2019-07-01 09:41:27 +00:00
Yevgeny Rouban d4097b4a93 [SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst
Differential Revision: https://reviews.llvm.org/D60606

llvm-svn: 364734
2019-07-01 08:43:53 +00:00
Sam Parker 98722691b0 [ARM] WLS/LE Code Generation
Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:
1) Use TTI to communicate to the HardwareLoop pass that we should try
   to generate intrinsics that guard the loop entry, as well as setting
   the loop trip count.
2) Lower the BRCOND that uses said intrinsic to an Arm specific node:
   ARMWLS.
3) ISelDAGToDAG the node to a new pseudo instruction:
   t2WhileLoopStart.
4) Add support in ArmLowOverheadLoops to handle the new pseudo
   instruction.

Differential Revision: https://reviews.llvm.org/D63816

llvm-svn: 364733
2019-07-01 08:21:28 +00:00
Sanjay Patel 706b48251f [InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics
This is the opposite direction of D62158 (we have to choose 1 form or the other).
Now that we have FMF on the select, this becomes more palatable. And the benefits
of having a single IR instruction for this operation (less chances of missing folds
based on extra uses, etc) overcome my previous comments about the potential advantage
of larger pattern matching/analysis.

Differential Revision: https://reviews.llvm.org/D62414

llvm-svn: 364721
2019-06-30 13:40:31 +00:00
Sanjay Patel 77dc1e8568 [InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum
This transform came up in D62414, but we should deal with it first.
We have LLVM intrinsics that correspond exactly to libm calls (unlike
most libm calls, these libm calls never set errno).
This holds without any fast-math-flags, so we should always canonicalize
to those intrinsics directly for better optimization.
Currently, we convert to fcmp+select only when we have FMF (nnan) because
fcmp+select does not preserve the semantics of the call in the general case.

Differential Revision: https://reviews.llvm.org/D63214

llvm-svn: 364714
2019-06-29 14:28:54 +00:00
Roman Lebedev e3a94ba4a9 [InstCombine] Shift amount reassociation (PR42391)
Summary:
Given pattern:
`(x shiftopcode Q) shiftopcode K`
we should rewrite it as
`x shiftopcode (Q+K)`  iff `(Q+K) u< bitwidth(x)`
This is valid for any shift, but they must be identical.

* https://rise4fun.com/Alive/9E2
* exact on both lshr => exact https://rise4fun.com/Alive/plHk
* exact on both ashr => exact https://rise4fun.com/Alive/QDAA
* nuw on both shl => nuw https://rise4fun.com/Alive/5Uk
* nsw on both shl => nsw https://rise4fun.com/Alive/0plg

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42391 | PR42391]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: nikic

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63812

llvm-svn: 364712
2019-06-29 11:51:50 +00:00
Nikita Popov 2d756c4feb [LFTR] Fix post-inc pointer IV with truncated exit count (PR41998)
Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we
have a truncated exit count we'll truncate the IV when comparing
against the limit, in which case exit count overflow in post-inc
form doesn't matter. However, for pointer IVs we don't do that, so
we have to be careful about incrementing the IV in the wide type.

I'm fixing this by removing the IVCount variable (which was
ExitCount or ExitCount+1) and replacing it with a UsePostInc flag,
and then moving the actual limit adjustment to the individual cases
(which are: pointer IV where we add to the wide type, integer IV
where we add to the narrow type, and constant integer IV where we
add to the wide type).

Differential Revision: https://reviews.llvm.org/D63686

llvm-svn: 364709
2019-06-29 09:24:12 +00:00
Cameron McInally b671535983 [NFC][NewGVN] Explicitly check fpmath metadata in fpmath.ll
Suggested in D63933.

llvm-svn: 364685
2019-06-28 21:39:08 +00:00