Commit Graph

844 Commits

Author SHA1 Message Date
Nikita Popov c63a3175c2 [AttrBuilder] Remove ctor accepting AttributeList and Index
Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should extract the AttributeSet from the
AttributeList. Moving this out of the AttrBuilder generally results
in cleaner code.
2022-01-15 22:39:31 +01:00
Caroline Concatto 8e5a5b619d [InstCombine] Fold for masked scatters to a uniform address
When a masked scatter intrinsic performs a uniform store (all lanes target
the same destination address) from a source vector and the mask is all ones,
this patch replaces the masked scatter with an extract of the last lane of
the source vector and a scalar store of that element to the destination
address.
This patch also folds the case where the scattered value is a splat. In that
case, provided the mask is known not to be all zeroes, the scatter folds to a
scalar store of the splatted value to the destination pointer.
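
For illustration, a hypothetical IR sketch of the all-ones-mask case (value
names and types are invented, not taken from the patch or its tests; current
opaque-pointer syntax):

    %p0   = insertelement <4 x ptr> poison, ptr %dst, i64 0
    %ptrs = shufflevector <4 x ptr> %p0, <4 x ptr> poison, <4 x i32> zeroinitializer
    call void @llvm.masked.scatter.v4i32.v4p0(<4 x i32> %val, <4 x ptr> %ptrs, i32 4, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
    ; since every lane stores to %dst, only the last lane survives:
    ;   %last = extractelement <4 x i32> %val, i64 3
    ;   store i32 %last, ptr %dst, align 4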

Differential Revision: https://reviews.llvm.org/D115724
2022-01-14 09:44:34 +00:00
Philip Reames 5265ac72c6 [MemoryBuiltin] Add an API for checking if an unused allocation can be removed [NFC]
Not all allocation functions are removable if unused.  An example of a non-removable allocation would be a direct call to the replaceable global allocation function in C++.  An example of a removable one - at least according to historical practice - would be malloc.
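
For illustration, an IR sketch of the distinction (hypothetical; _Znwm is the
Itanium-mangled name of the replaceable global operator new):

    %a = call ptr @malloc(i64 16)  ; unused: historically treated as removable
    %b = call ptr @_Znwm(i64 16)   ; unused direct call to the replaceable global
                                   ; allocation function: not removable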
2022-01-10 15:43:39 -08:00
Bryce Wilson fb936595fa [MemoryBuiltins] Add field for alignment argument [NFC]
There are a few places where the alignment argument for AlignedAllocLike functions was previously hardcoded. This patch adds a getAllocAlignment function and a change to the MemoryBuiltin table to allow alignment arguments to be found generically.

This will shortly allow alignment inference on operator new's with align_val params and an extension to Attributor's HeapToStack.  The former will follow shortly - I split Bryce's patch for the purpose of having the large change be NFC.  The latter will be reviewed separately.

Differential Revision: https://reviews.llvm.org/D116851 (part 1 of 2)
2022-01-10 09:15:20 -08:00
Philip Reames f4c54683d6 [instcombine] Infer alignment for aligned_alloc with potentially zero size
This change removes a previous restriction where we had to prove the allocation performed by aligned_alloc was non-zero in size before using the align parameter to annotate the result.  I believe this was conservatism around the C11 specification of this routine which allowed UB when size was not a multiple of alignment, but if so, it was a partial one at best.  (ex: align 32, size 16  was equally UB, but not restricted)  The spec has since been clarified to require nullptr return, not UB.

A nullptr - the documented return for this function on failure in all cases, now that the UB mentioned above has been removed - is trivially aligned for any power of two.  This isn't totally new behavior even for this transform; we'd previously annotate potentially failing allocs (e.g. huge sizes), meaning we were putting align on potentially null pointers anyway.  This change simply does the same for all failure modes.
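
For illustration (hypothetical IR, not taken from the patch's tests):

    %p = call ptr @aligned_alloc(i64 32, i64 %n)
    ; the result can now be annotated even though %n may be zero, because
    ; every failure mode returns null and null is trivially aligned:
    ;   %p = call align 32 ptr @aligned_alloc(i64 32, i64 %n)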
2022-01-10 08:48:49 -08:00
Serge Guelton d2cc6c2d0c Use a sorted array instead of a map to store AttrBuilder string attributes
Using an std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies an extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions

Differential Revision: https://reviews.llvm.org/D116599
2022-01-10 14:49:53 +01:00
Philip Reames 2cafbcb560 [instcombine] Key deref vs deref_or_null annotation of allocation sites off nonnull attribute
Goal is to remove use of isOpNewLike.  I looked at a couple approaches to this, and this turned out to be the cheapest one.  Just letting deref_or_null be generated causes a bunch of test diffs, and I couldn't convince myself there wasn't a real regression somewhere.  A generic instcombine to convert deref_or_null + nonnull to deref is annoyingly complicated since you have to mix facts from the callsite and the declaration while manipulating only existing call site attributes.  It just wasn't worth the code complexity.

Note that the change in new-delete-itanium.ll is a real regression.  If you have a callsite which overrides the builtin status of a nobuiltin declaration, *and* you don't put the appropriate attributes on that callsite, you may lose the deref fact.  I decided this didn't matter; if anyone disagrees, you can add this case to the generic non-null inference.
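
A sketch of the intended outcome (illustrative IR, not taken from the patch):

    ; result known nonnull at the call site -> the stronger form applies:
    %p = call nonnull dereferenceable(32) ptr @_Znwm(i64 32)
    ; no nonnull -> only the weaker form is inferred:
    %q = call dereferenceable_or_null(32) ptr @malloc(i64 32)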
2022-01-08 10:33:54 -08:00
Philip Reames dcbc91f40c [instcombine] Delete duplicate object size logic
InstCombine appears to duplicate the allocation size logic used inside getObjectSize when figuring out which attributes are safe to place on the callsite. We can use the existing utility function instead.

The test change is correct. With aligned_alloc, a zero alignment is required to return nullptr; as such, deref_or_null is the correct attribute to use.

Differential Revision: https://reviews.llvm.org/D116816
2022-01-07 10:32:26 -08:00
Nick Desaulniers 95ba0e4563 [SimplifyLibCalls] propagate tail flags on CallInsts
I noticed we weren't propagating tail flags on calls when
FortifiedLibCallSimplifier.optimizeCall() was replacing runtime-checked
calls with their non-checked routines (when safe to do so). Make
sure to check for this before replacing the original calls!
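
An illustrative IR sketch (hypothetical, assuming the usual fold of a
fortified call whose object-size argument is the unknown value -1):

    %r = tail call ptr @__strcpy_chk(ptr %dst, ptr %src, i64 -1)
    ; when simplified to the plain routine, the tail flag should be kept:
    ;   %r = tail call ptr @strcpy(ptr %dst, ptr %src)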

Also, avoid any libcall transforms when notail/musttail is present.

PR46734
Fixes: https://github.com/llvm/llvm-project/issues/46079

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D107872
2021-12-13 11:18:30 -08:00
Zarko Todorovski 0d3add216f [llvm][NFC] Inclusive language: Reword/replace uses of sanity in llvm/lib/Transform comments and asserts
Reworded some comments and asserts to avoid usage of `sanity check/test`

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D114372
2021-11-23 13:22:55 -05:00
Itay Bookstein f9059efa0d [InstCombine] Extend stacksave/restore elimination
Previously, InstCombine detected a pair of llvm.stacksave/stackrestore
instructions that are adjacent modulo debug instructions in order to
eliminate the llvm.stackrestore. This misses situations where
intervening instructions (e.g. loads) prevent the llvm.stacksave and
llvm.stackrestore from becoming adjacent. This commit extends the logic
and allows eliminating the llvm.stackrestore when the range of
instructions between them does not include any alloca or side-effect
causing instructions.
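
An illustrative IR sketch (hypothetical, not from the patch's tests):

    %ss = call ptr @llvm.stacksave()
    %v  = load i32, ptr %p              ; intervening, but no alloca or side effects
    call void @llvm.stackrestore(ptr %ss)
    ; -> the llvm.stackrestore (and then the now-unused llvm.stacksave) can be removed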

Signed-off-by: Itay Bookstein <itay.bookstein@nextsilicon.com>

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D113105
2021-11-10 10:41:58 +02:00
Itay Bookstein fe7491d32f [InstCombine][NFC] Refactor llvm.stackrestore handling
Hoist the instruction classification logic outside the loop
in preparation for reuse in a future commit.

Signed-off-by: Itay Bookstein <itay.bookstein@nextsilicon.com>

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D113464
2021-11-10 10:41:56 +02:00
Hongtao Yu 098a0d8fbc [CSSPGO] Unblock optimizations with pseudo probe instrumentation part 3.
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.

Unlike debug intrinsics, the PseudoProbe intrinsic has attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are:
- Flipped the default param of the getFirstNonPHIOrDbg API to skip pseudo probes
- Unblocked CSE by keeping pseudo probes from clobbering MemorySSA
- Unblocked induction variable simplification
- Allowed empty loop deletion by treating the probe intrinsic as droppable (isDroppable)
- Some refactoring.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110847
2021-10-12 09:44:12 -07:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Kazu Hirata 4f0225f6d2 [Transforms] Migrate from getNumArgOperands to arg_size (NFC)
Note that getNumArgOperands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-10-01 09:57:40 -07:00
Sanjay Patel 6063e6b499 [InstCombine] move add after min/max intrinsic
This is another regression noted with the proposal to canonicalize
to the min/max intrinsics in D98152.

Here are Alive2 attempts to show correctness without specifying
exact constants:
https://alive2.llvm.org/ce/z/bvfCwh (smax)
https://alive2.llvm.org/ce/z/of7eqy (smin)
https://alive2.llvm.org/ce/z/2Xtxoh (umax)
https://alive2.llvm.org/ce/z/Rm4Ad8 (umin)
(if you comment out the assume and/or no-wrap, you should see failures)

The different output for the umin test is due to a fold added with
c4fc2cb5b2:

// umin(x, 1) == zext(x != 0)

We probably want to adjust that, so it applies more generally
(umax --> sext or patterns where we can fold to select-of-constants).
Some folds that were ok when starting with cmp+select may increase
instruction count for the equivalent intrinsic, so we have to decide
if it's worth altering a min/max.

Differential Revision: https://reviews.llvm.org/D110038
2021-09-26 09:49:10 -04:00
Florian Hahn e08a5dc86f
[InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC).
InstCombine's worklist can be re-used by other passes like
VectorCombine. Move it to llvm/Transform/Utils and rename it to
InstructionWorklist.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D110181
2021-09-22 08:47:21 +01:00
Usman Nadeem f417d9d821 [InstCombine] Eliminate vector reverse if all inputs/outputs to an instruction are reverses
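
An illustrative IR sketch (hypothetical, using the experimental reverse
intrinsic name of that time):

    %ra = call <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32> %a)
    %rb = call <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32> %b)
    %s  = add <4 x i32> %ra, %rb
    %r  = call <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32> %s)
    ; -> %r = add <4 x i32> %a, %b
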
Differential Revision: https://reviews.llvm.org/D109808

Change-Id: I1a10d2bc33acbe0ea353c6cb3d077851391fe73e
2021-09-20 18:32:24 -07:00
Dávid Bolvanský a4a426c9e0 [InstCombine] Added llvm.powi optimizations
If power is even:
powi(-x, p) -> powi(x, p)
powi(fabs(x), p) -> powi(x, p)
powi(copysign(x, y), p) -> powi(x, p)
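
For example, with a constant even power (illustrative IR, names invented):

    %a = call double @llvm.fabs.f64(double %x)
    %r = call double @llvm.powi.f64.i32(double %a, i32 4)
    ; -> %r = call double @llvm.powi.f64.i32(double %x, i32 4)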
2021-09-16 19:42:21 +02:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Arthur Eubanks b81fc14f2d [NFC][InstCombine] Make check for sret in a vararg function clearer
We're trying to get the parameter index of sret and see if it's part of
a function's varargs.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D109335
2021-09-07 11:19:27 -07:00
Roman Lebedev 3f1f08f0ed
Revert @llvm.isnan intrinsic patchset.
Please refer to
https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html
(and that whole thread.)

TLDR: the original patch had no prior RFC, yet it had some changes that
really need a proper RFC discussion. It won't be productive to discuss
such an RFC, once it's actually posted, while said patch is already
committed, because that introduces bias towards already-committed stuff,
and the tree is potentially in a broken state meanwhile.

While the end result of discussion may lead back to the current design,
it may also not lead to the current design.

Therefore I take it upon myself
to revert the tree back to the last known good state.

This reverts commit 4c4093e6e3.
This reverts commit 0a2b1ba33a.
This reverts commit d9873711cb.
This reverts commit 791006fb8c.
This reverts commit c22b64ef66.
This reverts commit 72ebcd3198.
This reverts commit 5fa6039a5f.
This reverts commit 9efda541bf.
This reverts commit 94d3ff09cf.
2021-09-02 13:53:56 +03:00
Sanjay Patel 8c7a7e1f67 [InstCombine] allow more min/max with 'not' folds for intrinsics
isFreeToInvert allows min/max with 'not' on both operands,
so easing the argument restriction catches the case where
that operand has one use.
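
For illustration, the 'not' on both operands case as a hypothetical IR sketch:

    %nx = xor i8 %x, -1
    %ny = xor i8 %y, -1
    %m  = call i8 @llvm.smax.i8(i8 %nx, i8 %ny)
    ; -> %t = call i8 @llvm.smin.i8(i8 %x, i8 %y)
    ;    %m = xor i8 %t, -1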

We already handle the sub-patterns when there are fewer uses:
https://alive2.llvm.org/ce/z/8Jatm_

...but this is another step towards parity with the
equivalent icmp+select idioms ( D98152 ).

Differential Revision: https://reviews.llvm.org/D109059
2021-09-01 14:40:00 -04:00
Sanjay Patel 8a10f4a0f6 [InstCombine] use isFreeToInvert to generalize min/max with 'not'
This mimics the code for the corresponding cmp-select idiom.

This also prevents an infinite loop because isFreeToInvert
does not match constant expressions.

So this patch solves the same problem as D108814 and obsoletes
it, but my main motivation is to enhance the pattern matching
to allow more invertible ops. That change will be a follow-up
patch on top of this one.

Differential Revision: https://reviews.llvm.org/D109058
2021-09-01 14:34:22 -04:00
Arthur Eubanks 3f4d00bc3b [NFC] More get/removeAttribute() cleanup 2021-08-17 21:05:41 -07:00
Sanjay Patel 50c1138796 [InstCombine] add TODO about another min/max fold; NFC
Suggested in post-commit for d0975b7cb0
2021-08-17 14:14:25 -04:00
Sanjay Patel e73f4e1123 [InstCombine] remove unused function argument; NFC 2021-08-17 08:10:42 -04:00
Sanjay Patel d0975b7cb0 [InstCombine] fold signed min/max intrinsics with negated operands
If both operands are negated, we can invert the min/max and do
the negation after:
smax (neg nsw X), (neg nsw Y) --> neg nsw (smin X, Y)
smin (neg nsw X), (neg nsw Y) --> neg nsw (smax X, Y)
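
As a concrete (hypothetical) IR instance of the first form:

    %nx = sub nsw i8 0, %x
    %ny = sub nsw i8 0, %y
    %m  = call i8 @llvm.smax.i8(i8 %nx, i8 %ny)
    ; -> %t = call i8 @llvm.smin.i8(i8 %x, i8 %y)
    ;    %m = sub nsw i8 0, %t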

This is visible as a remaining regression in D98152. I don't see
a way to generalize this for 'unsigned' or adapt Negator to
handle it. This only appears to be safe with 'nsw':
https://alive2.llvm.org/ce/z/GUy1zJ

Differential Revision: https://reviews.llvm.org/D108165
2021-08-17 08:10:42 -04:00
David Green c6b7db015f [InstCombine] Add call to matchSAddSubSat from min/max
This adds a call to matchSAddSubSat from the smin/smax intrinsics, allowing
the same patterns to match when the canonical form of a min/max is an
intrinsic, not an icmp/select.

Differential Revision: https://reviews.llvm.org/D108077
2021-08-15 17:25:16 +01:00
Arthur Eubanks 80ea2bb574 [NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes()
This is more consistent with similar methods.
2021-08-13 11:16:52 -07:00
Arthur Eubanks a0c42ca56c [NFC] Remove AttributeList::hasParamAttribute()
It's the same as AttributeList::hasParamAttr().
2021-08-13 10:58:21 -07:00
Sanjay Patel 14eefa57f2 [InstCombine] factorize min/max intrinsic ops with common operand (2nd try)
This is a re-try of 6de1dbbd09 which was reverted because
it missed a null check. Extra test for that failure added.

Original commit message:
This is an adaptation of D41603 and another step on the way
to canonicalizing to the intrinsic forms of min/max.

See D98152 for status.
2021-08-12 16:32:07 -04:00
Amy Huang 427520a8fa Revert "[InstCombine] factorize min/max intrinsic ops with common operand"
This reverts commit 6de1dbbd09 because it causes a
compiler crash.
2021-08-12 12:36:25 -07:00
Sanjay Patel cd44cc86e3 [InstCombine] remove unused function argument; NFC
This was just added with 6de1dbbd09, and I missed
pulling the extra arg from the final revision.
2021-08-12 11:47:25 -04:00
Sanjay Patel 6de1dbbd09 [InstCombine] factorize min/max intrinsic ops with common operand
This is an adaptation of D41603 and another step on the way
to canonicalizing to the intrinsic forms of min/max.

See D98152 for status.
2021-08-12 11:19:09 -04:00
Roman Lebedev 0a241e90d4
[NFC][InstCombine] `vector_reduce_xor(?ext(<n x i1>))` --> `?ext(vector_reduce_add(<n x i1>))`
Instead of expanding it ourselves,
we can just forward to `?ext(vector_reduce_add(<n x i1>))`, as per alive2:
https://alive2.llvm.org/ce/z/ymz7zE (self)
https://alive2.llvm.org/ce/z/eKu2v2 (skipped zext)
https://alive2.llvm.org/ce/z/c3BXgc (skipped sext)
2021-08-07 17:31:33 +03:00
Roman Lebedev c6ff867f92
[NFC][InstCombine] Simplify emitted IR for `vector_reduce_xor(?ext(<n x i1>))`
Now that we canonicalize low bit splatting to the form we were emitting
here ourselves, emit simpler IR that will be canonicalized later.

See 1e801439be for proofs:
https://alive2.llvm.org/ce/z/MjCm5W (self)
https://alive2.llvm.org/ce/z/kgqF4M (skipped zext)
https://alive2.llvm.org/ce/z/pgy3HP (skipped sext)
2021-08-07 17:31:24 +03:00
Serge Pavlov 4c4093e6e3 Introduce intrinsic llvm.isnan
This is a recommit of the patch 16ff91ebcc,
reverted in 0c28a7c990 because it had
an error in a call to getFastMathFlags (the base type should be FPMathOperator
rather than Instruction). The original commit message is duplicated below:

    Clang has builtin function '__builtin_isnan', which implements C
    library function 'isnan'. This function is now implemented entirely in
    clang codegen, which expands the function into a set of IR operations.
    There are three mechanisms by which the expansion can be made.

    * The most common mechanism is using an unordered comparison made by
      instruction 'fcmp uno'. This simple solution is target-independent
      and works well in most cases. However, it is not suitable if floating
      point exceptions are tracked. The corresponding IEEE 754 operation and C
      function must never raise an FP exception, even if the argument is a
      signaling NaN. Compare instructions usually do not have this
      property; they raise the 'invalid' exception in that case. So this
      mechanism is unsuitable when exception behavior is strict. In
      particular, it could result in unexpected trapping if the argument is an SNaN.

    * Another solution was implemented in https://reviews.llvm.org/D95948.
      It is used in cases where raising FP exceptions by 'isnan' is not
      allowed. This solution implements 'isnan' using integer operations.
      It solves the problem of exceptions, but it offers one solution for all
      targets, even though some could do the check in a more efficient way.

    * The solution implemented by https://reviews.llvm.org/D96568 introduced a
      hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
      specific code into IR. Currently only SystemZ implements this hook, and it
      generates a call to a target-specific intrinsic function.

    Although these mechanisms allow 'isnan' to be implemented efficiently
    enough, expanding 'isnan' in clang has drawbacks:

    * The operation 'isnan' is hidden behind generic integer operations or
      target-specific intrinsics. It complicates analysis and can prevent
      some optimizations.

    * IR can be created by tools other than clang; in that case the treatment
      of 'isnan' has to be duplicated in each such tool.

    Another issue with the current implementation of 'isnan' comes from the
    use of the options '-ffast-math' or '-fno-honor-nans'. If such an option is
    specified, 'fcmp uno' may be optimized to 'false'. This is a valid
    optimization in general, but it results in 'isnan' always returning
    'false'. For example, in some libc++ implementations the following code
    returns 'false':

        std::isnan(std::numeric_limits<float>::quiet_NaN())

    The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
    operands are never NaNs. This assumption, however, should not be applied
    to functions that check FP number properties, including 'isnan'. If
    such a function returns the expected result instead of actually making the
    check, it becomes useless in many cases. The option '-ffast-math' is
    often used for performance-critical code, as it can speed up execution
    at the expense of manual treatment of corner cases. If 'isnan' returns
    an assumed result, a user cannot use it in the manual treatment of NaNs
    and has to invent replacements, like making the check using integer
    operations. There is a discussion in https://reviews.llvm.org/D18513#387418
    which also expresses the opinion that limitations imposed by
    '-ffast-math' should be applied only to 'math' functions but not to
    'tests'.

    To overcome these drawbacks, this change introduces a new IR intrinsic
    function 'llvm.isnan', which implements the check as specified by the IEEE-754
    and C standards in a target-agnostic way. During IR transformations it
    does not undergo undesirable optimizations. It reaches instruction
    selection, where it is lowered in a target-dependent way. The lowering can
    vary depending on options like '-ffast-math' or '-ffp-model', so the
    resulting code satisfies the requested semantics.

    Differential Revision: https://reviews.llvm.org/D104854
2021-08-06 14:32:27 +07:00
Serge Pavlov 0c28a7c990 Revert "Introduce intrinsic llvm.isnan"
This reverts commit 16ff91ebcc.
Several errors were reported, mainly in test-suite execution time. Reverted
for investigation.
2021-08-04 17:18:15 +07:00
Serge Pavlov 16ff91ebcc Introduce intrinsic llvm.isnan
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function is now implemented entirely in
clang codegen, which expands the function into a set of IR operations.
There are three mechanisms by which the expansion can be made.

* The most common mechanism is using an unordered comparison made by
  instruction 'fcmp uno'. This simple solution is target-independent
  and works well in most cases. However, it is not suitable if floating
  point exceptions are tracked. The corresponding IEEE 754 operation and C
  function must never raise an FP exception, even if the argument is a
  signaling NaN. Compare instructions usually do not have this
  property; they raise the 'invalid' exception in that case. So this
  mechanism is unsuitable when exception behavior is strict. In
  particular, it could result in unexpected trapping if the argument is an SNaN.

* Another solution was implemented in https://reviews.llvm.org/D95948.
  It is used in cases where raising FP exceptions by 'isnan' is not
  allowed. This solution implements 'isnan' using integer operations.
  It solves the problem of exceptions, but it offers one solution for all
  targets, even though some could do the check in a more efficient way.

* The solution implemented by https://reviews.llvm.org/D96568 introduced a
  hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
  specific code into IR. Currently only SystemZ implements this hook, and it
  generates a call to a target-specific intrinsic function.

Although these mechanisms allow 'isnan' to be implemented efficiently
enough, expanding 'isnan' in clang has drawbacks:

* The operation 'isnan' is hidden behind generic integer operations or
  target-specific intrinsics. It complicates analysis and can prevent
  some optimizations.

* IR can be created by tools other than clang; in that case the treatment
  of 'isnan' has to be duplicated in each such tool.

Another issue with the current implementation of 'isnan' comes from the
use of the options '-ffast-math' or '-fno-honor-nans'. If such an option is
specified, 'fcmp uno' may be optimized to 'false'. This is a valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':

    std::isnan(std::numeric_limits<float>::quiet_NaN())

The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption, however, should not be applied
to functions that check FP number properties, including 'isnan'. If
such a function returns the expected result instead of actually making the
check, it becomes useless in many cases. The option '-ffast-math' is
often used for performance-critical code, as it can speed up execution
at the expense of manual treatment of corner cases. If 'isnan' returns
an assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418
which also expresses the opinion that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.

To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which implements the check as specified by the IEEE-754
and C standards in a target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where it is lowered in a target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model', so the
resulting code satisfies the requested semantics.

Differential Revision: https://reviews.llvm.org/D104854
2021-08-04 15:27:49 +07:00
Roman Lebedev 4ba3326f17
[InstCombine] `vector_reduce_{or,and}(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)
This allows the expansion logic to actually trigger if the argument
was extended from i1 element type, like the rest of the reductions expect.
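
For illustration (hypothetical IR for the zext/or case):

    %e = zext <4 x i1> %m to <4 x i32>
    %r = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %e)
    ; -> %o = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %m)
    ;    %r = zext i1 %o to i32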

Alive2 agrees:
https://alive2.llvm.org/ce/z/wcfews (or zext)
https://alive2.llvm.org/ce/z/FCXNFx (or sext)
https://alive2.llvm.org/ce/z/f26zUY (and zext)
https://alive2.llvm.org/ce/z/jprViN (and sext)
2021-08-03 00:54:35 +03:00
Roman Lebedev 554fc9ad0a
[InstCombine] `vector_reduce_smax(?ext(<n x i1>))` --> `?ext(vector_reduce_{and,or}(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/3oqir9 (self)
https://alive2.llvm.org/ce/z/6cuI5m (zext)
https://alive2.llvm.org/ce/z/4FL8rD (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-03 00:29:06 +03:00
Roman Lebedev f47b7b6d10
[InstCombine] `vector_reduce_smin(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/noXtZ8 (self)
https://alive2.llvm.org/ce/z/JNrN6C (zext)
https://alive2.llvm.org/ce/z/58snuN (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-03 00:29:06 +03:00
Roman Lebedev b9b7162b8b
[InstCombine] `vector_reduce_umax(?ext(<n x i1>))` --> `?ext(vector_reduce_or(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/NbBaeT (self)
https://alive2.llvm.org/ce/z/iEaig4 (zext)
https://alive2.llvm.org/ce/z/meGb3y (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-02 23:02:23 +03:00
Roman Lebedev 0c13798056
[InstCombine] `vector_reduce_umin(?ext(<n x i1>))` --> `?ext(vector_reduce_and(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/XxUScW (self)
https://alive2.llvm.org/ce/z/3usTF- (zext)
https://alive2.llvm.org/ce/z/GVxwQz (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-02 23:02:22 +03:00
Roman Lebedev 469793efa7
[InstCombine] `vector_reduce_mul(?ext(<n x i1>))` --> `zext(vector_reduce_and(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/PDansB (self)
https://alive2.llvm.org/ce/z/55D-Xc (zext)
https://alive2.llvm.org/ce/z/LxG3-r (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-02 21:57:51 +03:00
Roman Lebedev 1e801439be
[InstCombine] `xor` reduction w/ i1 elt type is a parity check
For i1 element type, `xor` and `add` are interchangeable
(https://alive2.llvm.org/ce/z/e77hhQ), so we should treat it just like
an `add` reduction and consistently transform them both:
https://alive2.llvm.org/ce/z/MjCm5W (self)
https://alive2.llvm.org/ce/z/kgqF4M (skipped zext)
https://alive2.llvm.org/ce/z/pgy3HP (skipped sext)

Though, let's emit the IR that is similar to the one we produce for
`vector_reduce_add(<n x i1>)`.
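
For illustration, a hypothetical IR sketch of the parity lowering (roughly the
form produced for the add reduction as well):

    %r = call i1 @llvm.vector.reduce.xor.v8i1(<8 x i1> %m)
    ; -> %b = bitcast <8 x i1> %m to i8
    ;    %c = call i8 @llvm.ctpop.i8(i8 %b)
    ;    %r = trunc i8 %c to i1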

See https://bugs.llvm.org/show_bug.cgi?id=51259
2021-08-02 20:21:37 +03:00
Jun Ma 958dddf7df [NFC][InstCombine] Fix typo 2021-07-27 11:33:10 +08:00
Alexey Bataev 8af69975af [InstCombine][NFC]Use only `replaceInstUsesWith`, NFC. 2021-07-08 13:58:30 -07:00
Alexey Bataev b5113bff46 [InstCombine] Transform reduction+(sext/zext(<n x i1>) to <n x im>) to [-]zext/trunc(ctpop(bitcast <n x i1> to in)) to im.
Some of the SPEC tests end up with reduction+(sext/zext(<n x i1>) to <n x im>) pattern, which can be transformed to [-]zext/trunc(ctpop(bitcast <n x i1> to in)) to im.
Also, reduction+(<n x i1>) can be transformed to ctpop(bitcast <n x i1> to in) & 1 != 0.
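
For illustration (hypothetical IR for the zext/add case):

    %z = zext <8 x i1> %m to <8 x i16>
    %r = call i16 @llvm.vector.reduce.add.v8i16(<8 x i16> %z)
    ; -> %b = bitcast <8 x i1> %m to i8
    ;    %c = call i8 @llvm.ctpop.i8(i8 %b)
    ;    %r = zext i8 %c to i16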

Differential Revision: https://reviews.llvm.org/D105587
2021-07-08 07:56:41 -07:00