Commit Graph

192 Commits

Author SHA1 Message Date
Nikita Popov 68b24c3b44 [CVP] Simplify comparisons without constant operand
CVP currently only tries to simplify comparisons if there is a
constant operand. However, even if both are non-constant, we may
be able to determine the result of the comparison based on range
information.

IPSCCP is already capable of doing this, but because it runs very
early, it may miss some cases.

Differential Revision: https://reviews.llvm.org/D137253
2022-11-03 15:35:27 +01:00
Yuanbo Li ebd0249fcf [DebugInfo] Missing debug location after replacement in processSRem function
This patch fixes an issue in which CorrelatedValuePropagation::processSRem
would create new instructions to represent the SRem instruction, but would not
correctly copy any existing debug location metadata to the new instruction.

Differential Revision: https://reviews.llvm.org/D132218
2022-09-01 13:18:17 +01:00
Kazu Hirata 0e37ef0186 [Transforms] Fix comment typos (NFC) 2022-08-07 23:55:24 -07:00
Nuno Lopes 373571dbb4 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-06-30 23:01:43 +01:00
Simon Moll b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Jay Foad 6bec3e9303 [APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.

The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.

Differential Revision: https://reviews.llvm.org/D125557
2022-05-19 11:23:13 +01:00
Craig Topper 4b36d9bde7 [CVP] Preserve exact name when converting sext->zext and ashr->lshr.
Previously we took the old name and always appended a numberic suffix.
Since we're doing a 1:1 replacement, it's clearer to keep the original
name exactly.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D125281
2022-05-10 09:13:59 -07:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Dmitry Vassiliev 90a3b31091 [Transforms] Enhance CorrelatedValuePropagation to handle both values of select
The "Correlated Value Propagation" pass was missing a case when handling select instructions. It was only handling the "false" constant value, while in NVPTX the select may have the condition (and thus the branches) inverted, for example:
```
loop:
	%phi = phi i32* [ %sel, %loop ], [ %x, %entry ]
	%f = tail call i32* @f(i32* %phi)
	%cmp1 = icmp ne i32* %f, %y
	%sel = select i1 %cmp1, i32* %f, i32* null
	%cmp2 = icmp eq i32* %sel, null
	br i1 %cmp2, label %return, label %loop
```
But the select condition can be inverted:
```
	%cmp1 = icmp eq i32* %f, %y
	%sel = select i1 %cmp1, i32* null, i32* %f
```
The fix is to enhance "Correlated Value Propagation" to handle both branches of the select instruction.

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D119643
2022-02-23 00:11:20 +04:00
Nikita Popov 18bf42c0a6 [CVP] Extract helper from phi processing (NFC)
So we can use early returns and avoid those awkward !V checks.
2022-02-14 10:51:34 +01:00
Fabian Wolff 45ecfed6c6 [CVP] Remove ashr of -1 or 0
Fixes PR#52190. There is already a check for converting ashr instructions with non-negative left-hand sides into lshr; this patch adds an optimization to remove ashr altogether if the left-hand side is known to be in the range [-1, 1).

Differential Revision: https://reviews.llvm.org/D113835
2021-11-29 15:32:09 -08:00
Artur Pilipenko aa60d169ea [CVP] Add a cl::opt for canonicalization of signed relational comparisons
This canonicalization breaks the ability to discard checks in some cases.
Add a command line option to disable it. This option is on by default,
so the change is NFC.

See for details:
https://reviews.llvm.org/D112895#3149487
2021-11-24 13:52:38 -08:00
Jun Ma 07333810ca Revert "Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."""""
This reverts commit c93f93b2e3.
2021-11-24 10:26:37 +08:00
Zarko Todorovski 0d3add216f [llvm][NFC] Inclusive language: Reword replace uses of sanity in llvm/lib/Transform comments and asserts
Reworded some comments and asserts to avoid usage of `sanity check/test`

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D114372
2021-11-23 13:22:55 -05:00
Roman Lebedev b554e41e2d
[CVP] Canonicalize signed relational comparisons of scalar integers to unsigned comparison predicates
Now that the reasoning was added to ConstantRange in D90924,
this replicates IndVars variant of this transform (D111836)
in a pass that uses value range reasoning for the transform.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D112895
2021-11-01 12:16:05 +03:00
Jun Ma c93f93b2e3 Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""""
This reverts commit 3a998c06a8.
2021-11-01 15:31:59 +08:00
Jun Ma 3a998c06a8 Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."""
This reverts commit 8ba2adcf9e.
2021-09-27 20:39:05 +08:00
Kazu Hirata 8e86c0e4f4 [Scalar] Use make_early_inc_range (NFC) 2021-09-12 08:17:18 -07:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Jun Ma 8ba2adcf9e Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""
Differential Revision: https://reviews.llvm.org/D106056
2021-09-09 16:53:33 +08:00
Sanjay Patel ec54e275f5 Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."
This reverts commit 9934a5b2ed.
This patch may cause miscompiles because it missed a constraint
as shown in the examples from:
https://llvm.org/PR51531
2021-08-19 08:43:51 -04:00
Jun Ma 9934a5b2ed [CVP] processSwitch: Remove default case when switch cover all possible values.
Differential Revision: https://reviews.llvm.org/D106056
2021-08-18 10:23:13 +08:00
Nikita Popov 9c91614959 [CVP] Guard against poison in common phi value transform (PR50399)
The common phi value transform replaces constants with values that
have the same value as the constant on a given edge. However, LVI
generally only provides information that is correct up to poison,
so this can end up replacing a well-defined value with poison.
D69442 addressed an instance of this problem by clearing poison
flags on the generating instruction, which was sufficient at the
time. rGa917fb89dc28 made LVI's edge value analysis slightly more
powerful, and clearing poison flags is no longer sufficient.

This patch changes the transform to instead explicitly guard against
a poison value instead. This should be satisfied for most cases due
to a prior branch on poison.

Fixes https://bugs.llvm.org/show_bug.cgi?id=50399.

Differential Revision: https://reviews.llvm.org/D102966
2021-05-25 20:47:17 +02:00
Arthur Eubanks 6b9524a05b [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Evgeniy Brevnov 35e95c6817 [CVP] processCallSite returns wrong status
Recently processMinMaxIntrinsic has been added and we started to observe a number of analysis get invalidated after CVP. The problem is CVP conservatively returns 'true'  even if there were no modifications to IR. I found one more place besides processMinMaxIntrinsic  which has the same problem. I think processMinMaxIntrinsic and similar should better have boolean return status to prevent similar issue reappear in future.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D100538
2021-04-19 12:13:22 +07:00
Roman Lebedev 9829f5e6b1
[CVP] @llvm.[us]{min,max}() intrinsics handling
If we can tell that either one of the arguments is taken,
bypass the intrinsic.

Notably, we are indeed fine with non-strict predicate:
* UL: https://alive2.llvm.org/ce/z/69qVW9 https://alive2.llvm.org/ce/z/kNFTKf
      https://alive2.llvm.org/ce/z/AvaPw2 https://alive2.llvm.org/ce/z/oxo53i
* UG: https://alive2.llvm.org/ce/z/wxHeGH https://alive2.llvm.org/ce/z/Lf76qx
* SL: https://alive2.llvm.org/ce/z/hkeTGS https://alive2.llvm.org/ce/z/eR_b-W
* SG: https://alive2.llvm.org/ce/z/wEqRm7 https://alive2.llvm.org/ce/z/FpAsVr

Much like with all other comparison handling in CVP,
while we could sort-of handle two Value's,
at least for plain ICmpInst it does not appear to be worthwhile.

This only fires 78 times on test-suite + dt + rs,
but we don't canonicalize to these yet. (only SCEV produces them)
2021-04-11 00:33:47 +03:00
Roman Lebedev a407738def
[NFC][CVP] Add statistic for function pointer argument non-null-ness deduction 2021-04-10 21:23:20 +03:00
Roman Lebedev fe7b3ad8d5
[CVP] LVI: Use in-block values when checking value signedness domain
This has a huge positive impact on all the folds that use these helpers,
as it can be seen on vanilla test-suite + rawspeed + darktable:
correlated-value-propagation.NumSRems             +75.68% (+ 28)
correlated-value-propagation.NumAShrs             +63.87% (+198)
correlated-value-propagation.NumSDivs             +49.42% (+127)
correlated-value-propagation.NumSExt              + 8.85% (+593)
correlated-value-propagation.NumUDivURemsNarrowed	+ 8.65% (+34)

... while having pretty minimal compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=e8c7f43e2c2c6f3581ec1c6489ec21ad9f98958a&to=4cd197711e58ee1b2faeee0c35eea54540185569&stat=instructions
2021-04-10 21:10:59 +03:00
Roman Lebedev 257eda0794
[NFC][LVI] getPredicateAt(): drop default value for UseBlockValue
The default is likely wrong.
Out of all the callees, only a single one needs to pass-in false (JumpThread),
everything else either already passes true, or should pass true.

Until the default is flipped, at least make it harder to unintentionally
add new callees with UseBlockValue=false.
2021-04-10 20:46:01 +03:00
Roman Lebedev c329a47d9e
[CVP] @llvm.abs() handling
Iff we know the sigdness domain of the argument,
we can either skip @llvm.abs, or do negation directly.

Notably, INT_MIN can belong to either domain:
* X u<= INT_MIN --> X  is always fine
  https://alive2.llvm.org/ce/z/QB8j-C https://alive2.llvm.org/ce/z/7sFKpS
* X s<= 0 --> -X  is always fine
  https://alive2.llvm.org/ce/z/QbGSyq https://alive2.llvm.org/ce/z/APsN84

If all else fails, try to inferr NSW flag:
https://alive2.llvm.org/ce/z/qCJfYm
2021-04-10 16:47:31 +03:00
Nikita Popov 2b494f85f1 [CVP] Remove -cvp-dont-add-nowrap-flags option
This option was originally added to work around a bug in LFTR.
The bug has long since been fixed.
2021-03-07 18:19:31 +01:00
Nikita Popov afbb6d97b5 [CVP] Simplify and generalize switch handling
CVP currently handles switches by checking an equality predicate
on all edges from predecessor blocks. Of course, this can only
work if the value being switched over is defined in a different block.

Replace this implementation with a call to getPredicateAt(), which
also does the predecessor edge predicate check (if not defined in
the same block), but can also do quite a bit more: It can reason
about phi-nodes by checking edge predicates for incoming values,
it can reason about assumes, and it can reason about block values.

As such, this makes the implementation both simpler and more
powerful. The compile-time impact on CTMark is in the noise.
2020-12-12 21:12:27 +01:00
Philip Reames e46d74b589 [CVP] Allow two transforms in one invocation
For a call site which had both constant deopt operands and nonnull arguments, we were missing the opportunity to recognize the later by bailing early.

This is somewhat of a speculative fix.  Months ago, I'd had a private report of performance and compile time regressions from the deopt operand folding.  I never received a test case.  However, the only possibility I see was that after that change CVP missed the nonnull fold, and we end up with a pass ordering/missed simplification issue.  So, since it's a real issue, fix it and hope.
2020-09-28 15:11:42 -07:00
Nikita Popov fe79061be2 [LVI][CVP] Use block value when simplifying icmps
Add a flag to getPredicateAt() that allows making use of the block
value. This allows us to take into account range information from
the current block, rather than only information that is threaded
over edges, making the icmp simplification in CVP a lot more
powerful.

I'm not changing getPredicateAt() to use the block value
unconditionally to avoid any impact on the JumpThreading pass,
which is somewhat picky about LVI query order.

Most test changes here are just icmps that now get dropped (while
previously only a result used in a return was replaced). The three
tests in icmp.ll show some representative improvements. Some of
the folds this enables have been covered by IPSCCP in the meantime,
but LVI can reason about some cases which are hard to support in
IPSCCP, such as in test_br_cmp_with_offset.

The compile-time time cost of doing this is fairly minimal, with
a ~0.05% CTMark regression for ReleaseThinLTO:
https://llvm-compile-time-tracker.com/compare.php?from=709d03f8af4da4204849a70f01798e7cebba2e32&to=6236fd503761f43c99f4537121e057a01056f185&stat=instructions

This is because the block values will typically already be queried
and cached by other CVP optimizations anyway.

Differential Revision: https://reviews.llvm.org/D69686
2020-09-27 20:25:16 +02:00
Nikita Popov 9b959b59df [LVI] Require context instruction in external API (NFCI)
Require CxtI in getConstant() and getConstantRange() APIs.
Accordingly drop the BB parameter, as it is implied by
CxtI->getParent().

This makes sure we don't forget to pass the context instruction,
and makes the API contract clearer (also clean up the comments to
that effect -- the value holds at the context instruction, not
the end of the block).
2020-09-27 18:07:24 +02:00
Nikita Popov c8abf1c12d [CVP] Pass context instruction when narrowing div/rem
This fold was the only place not passing the context instruction.
The tests worked around that fact by introducing a basic block split,
which is now no longer necessary.
2020-09-27 17:51:30 +02:00
Martin Storsjö b90132399a [CVP] Remove a redundant trailing semicolon, fixing GCC warnings. NFC. 2020-09-23 09:03:01 +03:00
Roman Lebedev b289dc5306
[CVP] Narrow SDiv/SRem to the smallest power-of-2 that's sufficient to contain its operands
This is practically identical to what we already do for UDiv/URem:
  https://rise4fun.com/Alive/04K

Name: narrow udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv i8 %t0, %t1
%r = zext i8 %t2 to i16

Name: narrow exact udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv exact i8 %t0, %t1
%r = zext i8 %t2 to i16

Name: narrow urem
Pre: C0 u<= 255 && C1 u<= 255
%r = urem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = urem i8 %t0, %t1
%r = zext i8 %t2 to i16

... only here we need to look for 'min signed bits', not 'active bits',
and there's an UB to be aware of:
  https://rise4fun.com/Alive/KG86
  https://rise4fun.com/Alive/LwR

Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv exact i9 %t0, %t1
%r = sext i9 %t2 to i16

Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = srem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = srem i9 %t0, %t1
%r = sext i9 %t2 to i16


Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv i8 %t0, %t1
%r = sext i8 %t2 to i16

Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv exact i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv exact i8 %t0, %t1
%r = sext i8 %t2 to i16

Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = srem i16 C0, C1
  =>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = srem i8 %t0, %t1
%r = sext i8 %t2 to i16


The ConstantRangeTest.losslessSignedTruncationSignext test sanity-checks
the logic, that we can losslessly truncate ConstantRange to
`getMinSignedBits()` and signext it back, and it will be identical
to the original CR.

On vanilla llvm test-suite + RawSpeed, this fires 1262 times,
while the same fold for UDiv/URem only fires 384 times. Sic!

Additionally, this causes +606.18% (+1079) extra cases of
aggressive-instcombine.NumDAGsReduced, and +473.14% (+1145)
of aggressive-instcombine.NumInstrsReduced folds.
2020-09-22 21:37:30 +03:00
Roman Lebedev 4977eadee5
[NFC][CVP] Give a better name STATISTIC() counting udiv i16 -> udiv i8 xforms 2020-09-22 21:37:30 +03:00
Roman Lebedev ba5afe5588
[NFC][CVP] processUDivOrURem(): refactor to use ConstantRange::getActiveBits()
As an exhaustive test shows, this logic is fully identical to the old
implementation, with exception of the case where both of the operands
had empty ranges:

```
TEST_F(ConstantRangeTest, CVP_UDiv) {
  unsigned Bits = 4;
  EnumerateConstantRanges(Bits, [&](const ConstantRange &CR0) {
    if(CR0.isEmptySet())
      return;
    EnumerateConstantRanges(Bits, [&](const ConstantRange &CR1) {
      if(CR0.isEmptySet())
        return;

      unsigned MaxActiveBits = 0;
      for (const ConstantRange &CR : {CR0, CR1})
        MaxActiveBits = std::max(MaxActiveBits, CR.getActiveBits());

      ConstantRange OperandRange(Bits, /*isFullSet=*/false);
      for (const ConstantRange &CR : {CR0, CR1})
        OperandRange = OperandRange.unionWith(CR);
      unsigned NewWidth = OperandRange.getUnsignedMax().getActiveBits();

      EXPECT_EQ(MaxActiveBits, NewWidth) << CR0 << " " << CR1;
    });
  });
}
```
2020-09-22 21:37:29 +03:00
Roman Lebedev 4eeeb356fc
[CVP] Enhance SRem -> URem fold to work not just on non-negative operands
This is a continuation of 8d487668d0,
the logic is pretty much identical for SRem:

Name: pos pos
Pre: C0 >= 0 && C1 >= 0
%r = srem i8 C0, C1
  =>
%r = urem i8 C0, C1

Name: pos neg
Pre: C0 >= 0 && C1 <= 0
%r = srem i8 C0, C1
  =>
%r = urem i8 C0, -C1

Name: neg pos
Pre: C0 <= 0 && C1 >= 0
%r = srem i8 C0, C1
  =>
%t0 = urem i8 -C0, C1
%r = sub i8 0, %t0

Name: neg neg
Pre: C0 <= 0 && C1 <= 0
%r = srem i8 C0, C1
  =>
%t0 = urem i8 -C0, -C1
%r = sub i8 0, %t0

https://rise4fun.com/Alive/Vd6

Now, this new logic does not result in any new catches
as of vanilla llvm test-suite + RawSpeed.
but it should be virtually compile-time free,
and it may be important to be consistent in their handling,
because if we had a pair of sdiv-srem, and only converted one of them,
-divrempairs will no longer see them as a pair,
and thus not "merge" them.
2020-09-22 21:37:28 +03:00
Nikita Popov 25af353b0e [NewPM][LVI] Abandon LVI after CVP
As mentioned on D70376, LVI can currently cause performance issues
when running under NewPM. The problem is that, unlike the legacy
pass manager, NewPM will not immediately discard the LVI analysis
if the following pass does not need it. This is a problem, because
LVI has a high memory requirement, and mass invalidation of LVI
values is very inefficient. LVI should only be alive during passes
that actively interact with it.

This patch addresses the issue by explicitly abandoning LVI after CVP,
which gets us back to the LegacyPM behavior.

Differential Revision: https://reviews.llvm.org/D84959
2020-08-01 23:47:46 +02:00
Roman Lebedev 9dceb32f30
[NFC][CVP] processSDiv(): pacify gcc compilers 2020-07-18 19:41:43 +03:00
Roman Lebedev 8d487668d0
[CVP] Soften SDiv into a UDiv as long as we know domains of both of the operands.
Yes, if operands are non-positive this comes at the extra cost
of two extra negations. But  a. division is already just
ridiculously costly, two more subtractions can't hurt much :)
and  b. we have better/more analyzes/folds for an unsigned division,
we could end up narrowing it's bitwidth, converting it to lshr, etc.

This is essentially a take two on 0fdcca07ad,
which didn't fix the potential regression i was seeing,
because ValueTracking's computeKnownBits() doesn't make use
of dominating conditions in it's analysis.
While i could teach it that, this seems like the more general fix.

This big hammer actually does catch said potential regression.

Over vanilla test-suite + RawSpeed + darktable
(10M IR instrs, 1M IR BB, 1M X86 ASM instrs), this fires/converts 5 more
(+2%) SDiv's, the total instruction count at the end of middle-end pipeline
is only +6, so out of +10 extra negations, ~half are folded away,
and asm instr count is only +1, so practically speaking all extra
negations are folded away and are therefore free.
Sadly, all these new UDiv's remained, none folded away.
But there are two less basic blocks.

https://rise4fun.com/Alive/VS6

Name: v0
Pre: C0 >= 0 && C1 >= 0
%r = sdiv i8 C0, C1
  =>
%r = udiv i8 C0, C1

Name: v1
Pre: C0 <= 0 && C1 >= 0
%r = sdiv i8 C0, C1
  =>
%t0 = udiv i8 -C0, C1
%r = sub i8 0, %t0

Name: v2
Pre: C0 >= 0 && C1 <= 0
%r = sdiv i8 C0, C1
  =>
%t0 = udiv i8 C0, -C1
%r = sub i8 0, %t0

Name: v3
Pre: C0 <= 0 && C1 <= 0
%r = sdiv i8 C0, C1
  =>
%r = udiv i8 -C0, -C1
2020-07-18 17:59:56 +03:00
Roman Lebedev 45b7388824
[NFC][CVP] Rename predicates - s/positive/non negative/ to better note that zero is ok 2020-07-18 17:59:32 +03:00
Roman Lebedev 2cde6984d8
[NFC][CVP] Refactor isPositive() out of hasPositiveOperands() 2020-07-18 17:59:32 +03:00
Mircea Trofin ceb7f308b8 [llvm][NFC][CallSite] Removed CallSite from few implementation details
Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78724
2020-04-23 10:36:36 -07:00
Florian Hahn b37543750c [ValueLattice] Distinguish between constant ranges with/without undef.
This patch updates ValueLattice to distinguish between ranges that are
guaranteed to not include undef and ranges that may include undef.

A constant range guaranteed to not contain undef can be used to simplify
instructions to arbitrary values. A constant range that may contain
undef can only be used to simplify to a constant. If the value can be
undef, it might take a value outside the range. For example, consider
the snipped below

define i32 @f(i32 %a, i1 %c) {
  br i1 %c, label %true, label %false
true:
  %a.255 = and i32 %a, 255
  br label %exit
false:
  br label %exit
exit:
  %p = phi i32 [ %a.255, %true ], [ undef, %false ]
  %f.1 = icmp eq i32 %p, 300
  call void @use(i1 %f.1)
  %res = and i32 %p, 255
  ret i32 %res
}

In the exit block, %p would be a constant range [0, 256) including undef as
%p could be undef. We can use the range information to replace %f.1 with
false because we remove the compare, effectively forcing the use of the
constant to be != 300. We cannot replace %res with %p however, because
if %a would be undef %cond may be true but the  second use might not be
< 256.

Currently LazyValueInfo uses the new behavior just when simplifying AND
instructions and does not distinguish between constant ranges with and
without undef otherwise. I think we should address the remaining issues
in LVI incrementally.

Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D76931
2020-03-31 12:50:20 +01:00
Enna1 03bc311a16 [CorrelatedValuePropagation] Remove redundant if statement in processSelect()
This statement

    if (ReplaceWith == S) ReplaceWith = UndefValue::get(S->getType());

is introduced in https://reviews.llvm.org/rG35609d97ae89b8e13f40f4e6b9b056954f8baa83
to fix a case where unreachable code can cause select instruction
simplification to fail. In https://reviews.llvm.org/rGd10480657527ffb44ea213460fb3676a6b1300aa,
we begin to perform a depth-first walk of basic blocks. This means
we will not visit unreachable blocks. So we do not need this the
special check any more.

Differential Revision: https://reviews.llvm.org/D76753
2020-03-28 18:01:17 +01:00
Nikita Popov 9d9633fb70 [CVP] Simplify cmp of local phi node
CVP currently does not simplify cmps with instructions in the same
block, because LVI getPredicateAt() currently does not provide
much useful information for that case (D69686 would change that,
but is stuck.) However, if the instruction is a Phi node, then
LVI can compute the result of the predicate by threading it into
the predecessor blocks, which allows it simplify some conditions
that nothing else can handle. Relevant code:
6d6a4590c5/llvm/lib/Analysis/LazyValueInfo.cpp (L1904-L1927)

Differential Revision: https://reviews.llvm.org/D72169
2020-02-26 20:36:41 +01:00