Commit Graph

210 Commits

Author SHA1 Message Date
Rosie Sumpter c2441b6b89 [LoopVectorize] Add vector reduction support for fmuladd intrinsic
Enables LoopVectorize to handle reduction patterns involving the
llvm.fmuladd intrinsic.

Differential Revision: https://reviews.llvm.org/D111555
2021-11-24 08:50:04 +00:00
Kazu Hirata d1abf481da [llvm] Use range-based for loops (NFC) 2021-11-19 21:12:13 -08:00
Kazu Hirata 0d182d9d1e [Transforms] Use make_early_inc_range (NFC) 2021-11-07 17:03:15 -08:00
Florian Hahn 8977bd5806
[IndVars] Invalidate SCEV when IR is changed in rewriteLoopExitValue.
At the moment, rewriteLoopExitValue forgets the current phi node in the
loop that collects phis to rewrite. A few lines after the value is
forgotten, SCEV is used again to analyze incoming values and
potentially expand SCEV expression. This means that another SCEV is
created for PN, before the IR is actually updated in the next loop.

This leads to accessing invalid cached expression in combination with
D71539.

PN should only be changed once the actual incoming exit value is set in
the next loop. Moving invalidation there should ensure that PN is
invalidated in all relevant cases.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D111495
2021-10-20 20:48:33 +01:00
Florian Hahn e844f05397
[LoopUtils] Simplify addRuntimeCheck to return a single value.
This simplifies the return value of addRuntimeCheck from a pair of
instructions to a single `Value *`.

The existing users of addRuntimeChecks were ignoring the first element
of the pair, hence there is not reason to track FirstInst and return
it.

Additionally all users of addRuntimeChecks use the second returned
`Instruction *` just as `Value *`, so there is no need to return an
`Instruction *`. Therefore there is no need to create a redundant
dummy `and X, true` instruction any longer.

Effectively this change should not impact the generated code because the
redundant AND will be folded by later optimizations. But it is easy to
avoid creating it in the first place and it allows more accurately
estimating the cost of the runtime checks.
2021-10-18 18:03:09 +01:00
David Sherwood 26b7d9d622 [LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve
selecting an integer value based on a previous comparison. Consider the
following C++ loop:

  int r = a;
  for (int i = 0; i < n; i++) {
    if (src[i] > 3) {
      r = b;
    }
    src[i] += 2;
  }

We should be able to vectorise this loop because all we are doing is
selecting between two states - 'a' and 'b' - both of which are loop
invariant. This just involves building a vector of values that contain
either 'a' or 'b', where the final reduced value will be 'b' if any lane
contains 'b'.

The IR generated by clang typically looks like this:

  %phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ]
  ...
  %pred = icmp ugt i32 %val, i32 3
  %phi.update = select i1 %pred, i32 %b, i32 %phi

We already detect min/max patterns, which also involve a select + cmp.
However, with the min/max patterns we are selecting loaded values (and
hence loop variant) in the loop. In addition we only support certain
cmp predicates. This patch adds a new pattern matching function
(isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp.
We only support selecting values that are integer and loop invariant,
however we can support any kind of compare - integer or float.

Tests have been added here:

  Transforms/LoopVectorize/AArch64/sve-select-cmp.ll
  Transforms/LoopVectorize/select-cmp-predicated.ll
  Transforms/LoopVectorize/select-cmp.ll

Differential Revision: https://reviews.llvm.org/D108136
2021-10-11 09:41:38 +01:00
Simon Pilgrim 0dcd2b40e6 [TTI] Remove default condition type and predicate arguments from getCmpSelInstrCost
We need to be better at exposing the comparison predicate to getCmpSelInstrCost calls as some targets (e.g. X86 SSE) have very different costs for different comparisons (PR48337), and we can't always rely on the optional Instruction argument.

This initial commit requires explicit condition type and predicate arguments. The next step will be to review a lot of the existing getCmpSelInstrCost calls which have used BAD_ICMP_PREDICATE even when the predicate is known.

Differential Revision: https://reviews.llvm.org/D111024
2021-10-06 15:40:35 +01:00
Krasimir Georgiev 685f1bfd0a Revert "[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns"
It appears to cause stage2 clang build failures, e.g.,
https://lab.llvm.org/buildbot/#/builders/74/builds/7145.

This reverts commit 1fb37334bd.
2021-10-01 11:39:43 +02:00
David Sherwood 1fb37334bd [LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve
selecting an integer value based on a previous comparison. Consider the
following C++ loop:

  int r = a;
  for (int i = 0; i < n; i++) {
    if (src[i] > 3) {
      r = b;
    }
    src[i] += 2;
  }

We should be able to vectorise this loop because all we are doing is
selecting between two states - 'a' and 'b' - both of which are loop
invariant. This just involves building a vector of values that contain
either 'a' or 'b', where the final reduced value will be 'b' if any lane
contains 'b'.

The IR generated by clang typically looks like this:

  %phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ]
  ...
  %pred = icmp ugt i32 %val, i32 3
  %phi.update = select i1 %pred, i32 %b, i32 %phi

We already detect min/max patterns, which also involve a select + cmp.
However, with the min/max patterns we are selecting loaded values (and
hence loop variant) in the loop. In addition we only support certain
cmp predicates. This patch adds a new pattern matching function
(isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp.
We only support selecting values that are integer and loop invariant,
however we can support any kind of compare - integer or float.

Tests have been added here:

  Transforms/LoopVectorize/AArch64/sve-select-cmp.ll
  Transforms/LoopVectorize/select-cmp-predicated.ll
  Transforms/LoopVectorize/select-cmp.ll

Differential Revision: https://reviews.llvm.org/D108136
2021-10-01 08:41:03 +01:00
Philip Reames c3b3aa277a Fix a missing MemorySSA update in breakLoopBackedge
This is a case I'd missed in 6a8237. The odd bit here is that missing the edge removal update seems to produce MemorySSA which verifies, but is still corrupt in a way which bothers following passes. I wasn't able to reduce a single pass test case, which is why the reported test case is taken as is.

Differential Revision: https://reviews.llvm.org/D109068
2021-09-01 16:59:01 -07:00
Roman Lebedev 795d142d23
[NFCI][IndVars] rewriteLoopExitValues(): don't expand SCEV's until needed
Previously, we'd expand *ALL* the SCEV's eagerly, because we needed to
check with `isValidRewrite()`, and discard bad rewrite candidates,
but now that we do not do that, we also don't need to always expand.

In particular, this avoids expanding potentially-huge SCEV's that we
would discard anyways because they are high-cost and we aren't
rewriting aggressively.
2021-08-30 12:28:24 +03:00
Roman Lebedev 7b0d59da9a
[IndVars] Drop check for the validity of rewrite
`isValidRewrite()` checks that the both the original SCEV,
and the rewrite SCEV have the same base pointer.
I //believe//, after all the recent SCEV improvements,
this invariant is already enforced by SCEV itself.

I originally tried changing it into an assert in D108043,
but that showed that it triggers on e.g. https://reviews.llvm.org/D108043#2946621,
where SCEV manages to forward the store to load,
test added.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D108655
2021-08-30 12:06:58 +03:00
Philip Reames 6a82376012 Special case common branch patterns in breakLoopBackedge (try 2)
Changes since aec08e:
* Adjust placement of a closing brace so that the general case actually runs.  Turns out we had *no* coverage of the switch case.  I added one in eae90fd.
* Drop .llvm.loop.* metadata from the new branch as there is no longer a loop to annotate.

Original commit message:

This special cases an unconditional latch and a conditional branch latch exit to improve codegen and test readability. I am hoping to reuse this function in the runtime unroll code, but without this change, the test diffs are far too complex to assess.
2021-08-27 10:27:16 -07:00
Philip Reames 1e07f19bfc Revert "Special case common branch patterns in breakLoopBackedge"
This reverts commit aec08e8600.

Several problems have been reported with malformed loopinfo after this change, see discussion on https://reviews.llvm.org/rGaec08e86004b.
2021-08-24 08:53:42 -07:00
Philip Reames aec08e8600 Special case common branch patterns in breakLoopBackedge
This special cases an unconditional latch and a conditional branch latch exit to improve codegen and test readability.  I am hoping to reuse this function in the runtime unroll code, but without this change, the test diffs are far too complex to assess.
2021-08-22 10:42:23 -07:00
Roman Lebedev febcedf18c
Revert "[NFCI][IndVars] rewriteLoopExitValues(): nowadays SCEV should not change `GEP` base pointer"
https://bugs.llvm.org/show_bug.cgi?id=51490 was filed.

This reverts commit 35a8bdc775.
2021-08-16 14:30:29 +03:00
David Sherwood 9b19b77883 [NFC] Remove unused code in llvm::createSimpleTargetReduction 2021-08-16 09:50:45 +01:00
Roman Lebedev 35a8bdc775
[NFCI][IndVars] rewriteLoopExitValues(): nowadays SCEV should not change `GEP` base pointer
Currently/previously, while SCEV guaranteed that it produces the same value,
the way it was produced may be illegal IR, so we have an ugly check that
the replacement is valid.

But now that the SCEV strictness wrt the pointer/integer types has been improved,
i believe this invariant is already upheld by the SCEV itself, natively.

I think we should add an assertion, wait for a week, and then, if all is good,
rip out all this checking.
Or we could just do the latter directly i guess.

This reverts commit rL127839.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D108043
2021-08-15 18:59:32 +03:00
Johannes Doerfert 25a3130d89 [Local] Do not introduce a new `llvm.trap` before `unreachable`
This is the second attempt to remove the `llvm.trap` insertion after
https://reviews.llvm.org/rGe14e7bc4b889dfaffb7180d176a03311df2d4ae6
reverted the first one. It is not clear what the exact issue was back
then and it might already be gone by now, it has been >5 years after
all.

Replaces D106299.

Differential Revision: https://reviews.llvm.org/D106308
2021-07-26 23:33:36 -05:00
Florian Hahn 6d753b0751
[LAA] Remove RuntimeCheckingPtrGroup::RtCheck member (NFC).
This patch removes RtCheck from RuntimeCheckingPtrGroup to make it
possible to construct RuntimeCheckingPtrGroup objects without a
RuntimePointerChecking object. This should make it easier to
re-use the code to generate runtime checks, e.g. in D102834.

RtCheck was only used to access the pointer info for a given index.
Instead, the start and end expressions can be passed directly.

For code-gen, we also need to know the address space to use. This can
also be explicitly passed at construction.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D105481
2021-07-26 17:38:10 +01:00
Mindong Chen e908e063d1 [LoopUtils] Fix incorrect RT check bounds of loop-invariant mem accesses
This fixes the lower and upper bound calculation of a
RuntimeCheckingPtrGroup when it has more than one loop
invariant pointers. Resolves PR50686.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D104148
2021-07-19 19:38:24 +08:00
Simon Pilgrim 5e6bfb661e [Analysis] Pass RecurrenceDescriptor as const reference. NFCI.
We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.).

Differential Revision: https://reviews.llvm.org/D104029
2021-06-11 10:24:14 +01:00
Philip Reames 7629b2a09c [LI] Add a cover function for checking if a loop is mustprogress [nfc]
Essentially, the cover function simply combines the loop level check and the function level scope into one call.  This simplifies several callers and is (subjectively) less error prone.
2021-06-10 13:37:32 -07:00
Philip Reames b6ee5f2b1d Move code for checking loop metadata into Analysis [nfc]
I need the mustprogress loop metadata in ScalarEvolution and it makes sense to keep all the accessors for quering loop metadate together.
2021-06-10 13:01:22 -07:00
Bardia Mahjour ddb3b26a12 [LV] Consider Loop Unroll Hints When Making Interleave Decisions
This patch causes the loop vectorizer to not interleave loops that have
nounroll loop hints (llvm.loop.unroll.disable and llvm.loop.unroll_count(1)).
Note that if a particular interleave count is being requested
(through llvm.loop.interleave_count), it will still be honoured, regardless
of the presence of nounroll hints.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D101374
2021-04-28 17:27:52 -04:00
Benjamin Kramer ce4acb01b3 Avoid unused variable warning in Release builds 2021-04-06 16:25:19 +02:00
Kerry McLaughlin 7344f3d39a [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization
Previously we could only vectorize FP reductions if fast math was enabled, as this allows us to
reorder FP operations. However, it may still be beneficial to vectorize the loop by moving
the reduction inside the vectorized loop and making sure that the scalar reduction value
be an input to the horizontal reduction, e.g:

  %phi = phi float [ 0.0, %entry ], [ %reduction, %vector_body ]
  %load = load <8 x float>
  %reduction = call float @llvm.vector.reduce.fadd.v8f32(float %phi, <8 x float> %load)

This patch adds a new flag (IsOrdered) to RecurrenceDescriptor and makes use of the changes added
by D75069 as much as possible, which already teaches the vectorizer about in-loop reductions.
For now in-order reduction support is off by default and controlled with the `-enable-strict-reductions` flag.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D98435
2021-04-06 14:45:34 +01:00
Jingu Kang e4abb64100 [LoopUnswitch] Use reference variables instead of pointer one
Differential Revision: https://reviews.llvm.org/D99496
2021-03-29 13:08:46 +01:00
Jingu Kang cfe87d4edd [NFC][LoopUnswitch] Move hasPartialIVCondition to LoopUtils
Differential revision: https://reviews.llvm.org/D99490
2021-03-29 10:29:45 +01:00
Sanjay Patel 79b1b4a581 [Vectorizers][TTI] remove option to bypass creation of vector reduction intrinsics
The vector reduction intrinsics started life as experimental ops, so backend support
was lacking. As part of promoting them to 1st-class intrinsics, however, codegen
support was added/improved:
D58015
D90247

So I think it is safe to now remove this complication from IR.

Note that we still have an IR-level codegen expansion pass for these as discussed
in D95690. Removing that is another step in simplifying the logic. Also note that
x86 was already unconditionally forming reductions in IR, so there should be no
difference for x86.

I spot checked a couple of the tests here by running them through opt+llc and did
not see any asm diffs.

If we do find functional differences for other targets, it should be possible
to (at least temporarily) restore the shuffle IR with the ExpandReductions IR
pass.

Differential Revision: https://reviews.llvm.org/D96552
2021-02-12 08:13:50 -05:00
Sanjay Patel bbed5f2f8a [LoopVectorize] improve IR fast-math-flags propagation in reductions
This is another step (see D95452) towards correcting fast-math-flags
bugs in vector reductions.

There are multiple bugs visible in the test diffs, and this is still
not working as it should. We still use function attributes (rather
than FMF) to drive part of the logic, but we are not checking for
the correct FP function attributes.

Note that FMF may not be propagated optimally on selects (example
in https://llvm.org/PR35607 ). That's why I'm proposing to union the
FMF of a fcmp+select pair and avoid regressions on existing vectorizer
tests.

Differential Revision: https://reviews.llvm.org/D95690
2021-02-01 16:21:36 -05:00
Florian Hahn 28410d17f5
[LoopUtils] Pass SCEVExpander instead SE to addRuntimeChecks.
This gives the user control over which expander to use, which in turn
allows the user to decide what to do with the expanded instructions.

Used in D75980.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D94295
2021-01-27 17:36:19 +00:00
Sanjay Patel 09b1c56366 [LoopUtils] do not initialize Cmp predicate unnecessarily; NFC
The switch must set the predicate correctly; anything else
should lead to unreachable/assert.

I'm trying to fix FMF propagation here and the callers,
so this is a preliminary cleanup.
2021-01-26 11:22:51 -05:00
Philip Reames ef51eed37b [LoopDeletion] Handle inner loops w/untaken backedges
This builds on the restricted after initial revert form of D93906, and adds back support for breaking backedges of inner loops. It turns out the original invalidation logic wasn't quite right, specifically around the handling of LCSSA.

When breaking the backedge of an inner loop, we can cause blocks which were in the outer loop only because they were also included in a sub-loop to be removed from both loops. This results in the exit block set for our original parent loop changing, and thus a need for new LCSSA phi nodes.

This case happens when the inner loop has an exit block which is also an exit block of the parent, and there's a block in the child which reaches an exit to said block without also reaching an exit to the parent loop.

(I'm describing this in terms of the immediate parent, but the problem is general for any transitive parent in the nest.)

The approach implemented here involves a potentially expensive LCSSA rebuild.  Perf testing during review didn't show anything concerning, but we may end up needing to revert this if anyone encounters a practical compile time issue.

Differential Revision: https://reviews.llvm.org/D94378
2021-01-22 16:31:29 -08:00
Kazu Hirata dc300beba7 [STLExtras] Add a default value to drop_begin
This patch adds the default value of 1 to drop_begin.

In the llvm codebase, 70% of calls to drop_begin have 1 as the second
argument.  The interface similar to with std::next should improve
readability.

This patch converts a couple of calls to drop_begin as examples.

Differential Revision: https://reviews.llvm.org/D94858
2021-01-18 10:16:34 -08:00
Kazu Hirata 8a20e2b3d3 [llvm] Use Optional::getValueOr (NFC) 2021-01-12 21:43:50 -08:00
Philip Reames 4739dd67e7 [LoopDeletion] Break backedge of outermost loops when known not taken
This is a resubmit of dd6bb367 (which was reverted due to stage2 build failures in 7c63aac), with the additional restriction added to the transform to only consider outer most loops.

As shown in the added test case, ensuring LCSSA is up to date when deleting an inner loop is tricky as we may actually need to remove blocks from any outer loops, thus changing the exit block set.   For the moment, just avoid transforming this case.  I plan to return to this case in a follow up patch and see if we can do better.

Original commit message follows...

The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.

Differential Revision: https://reviews.llvm.org/D93906
2021-01-10 16:02:33 -08:00
Atmn Patel f88a797521 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2021-01-05 09:56:16 -05:00
Sanjay Patel 36263a7ccc [LoopUtils] remove redundant opcode parameter; NFC
While here, rename the inaccurate getRecurrenceBinOp()
because that was also used to get CmpInst opcodes.

The recurrence/reduction kind should always refer to the
expected opcode for a reduction. SLP appears to be the
only direct caller of createSimpleTargetReduction(), and
that calling code ideally should not be carrying around
both an opcode and a reduction kind.

This should allow us to generalize reduction matching to
use intrinsics instead of only binops.
2021-01-04 17:05:28 -05:00
Sanjay Patel 9766957524 [LoopUtils] reduce code for creatng reduction; NFC
We can return from each case instead creating a temporary
variable just to have a common return.
2021-01-04 16:05:03 -05:00
Sanjay Patel 58b6c5d932 [LoopUtils] reorder logic for creating reduction; NFC
If we are using a shuffle reduction, we don't need to
go through the switch on opcode - return early.
2021-01-04 16:05:02 -05:00
Philip Reames 7c63aac7bd Revert "[LoopDeletion] Break backedge of loops when known not taken"
This reverts commit dd6bb367d1.

Multi-stage builders are showing an assertion failure w/LCSSA not being preserved on entry to IndVars.  Reason isn't clear, reverting while investigating.
2021-01-04 09:50:47 -08:00
Philip Reames dd6bb367d1 [LoopDeletion] Break backedge of loops when known not taken
The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.

Differential Revision: https://reviews.llvm.org/D93906
2021-01-04 09:19:29 -08:00
Sanjay Patel c74e8539ff [Analysis] flatten enums for recurrence types
This is almost all mechanical search-and-replace and
no-functional-change-intended (NFC). Having a single
enum makes it easier to match/reason about the
reduction cases.

The goal is to remove `Opcode` from reduction matching
code in the vectorizers because that makes it harder to
adapt the code to handle intrinsics.

The code in RecurrenceDescriptor::AddReductionVar() is
the only place that required closer inspection. It uses
a RecurrenceDescriptor and a second InstDesc to sometimes
overwrite part of the struct. It seem like we should be
able to simplify that logic, but it's not clear exactly
which cmp+sel patterns that we are trying to handle/avoid.
2021-01-01 12:20:16 -05:00
Bogdan Graur 8bee4d4e8f Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops"
Test clang/test/Misc/loop-opt-setup.c fails when executed in Release.

This reverts commit 6f1503d598.

Reviewed By: SureYeaah

Differential Revision: https://reviews.llvm.org/D93956
2020-12-31 11:47:49 +00:00
Atmn Patel 6f1503d598 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2020-12-30 21:43:01 -05:00
Sanjay Patel 8ca60db40b [LoopUtils] reduce FMF and min/max complexity when forming reductions
I don't know if there's some way this changes what the vectorizers
may produce for reductions, but I have added test coverage with
3567908 and 5ced712 to show that both passes already have bugs in
this area. Hopefully this does not make things worse before we can
really fix it.
2020-12-30 15:22:26 -05:00
Sanjay Patel e90ea76380 [IR] remove 'NoNan' param when creating FP reductions
This is no-functional-change-intended (AFAIK, we can't
isolate this difference in a regression test).

That's because the callers should be setting the IRBuilder's
FMF field when creating the reduction and/or setting those
flags after creating. It doesn't make sense to override this
one flag alone.

This is part of a multi-step process to clean up the FMF
setting/propagation. See PR35538 for an example.
2020-12-30 09:51:23 -05:00
Juneyoung Lee 9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Sanjay Patel 8d18bc8e6d [Utils] reduce code in createTargetReduction(); NFC
The switch duplicated the translation in getRecurrenceBinOp().
This code is still weird because it translates to the TTI
ReductionFlags for min/max, but then createSimpleTargetReduction()
converts that back to RecurrenceDescriptor::MinMaxRecurrenceKind.
2020-12-29 15:56:19 -05:00