Commit Graph

5 Commits

Author SHA1 Message Date
Florian Hahn 23c2f2e6b2
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable
of the vector loop as NUW when not folding the tail.

If the tail is not folded, we know that End - Start >= Step (either
statically or through the minimum iteration checks). We also know that both
Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV +
%Step == %End. Hence we must exit the loop before %IV + %Step unsigned
overflows and we can mark the induction increment as NUW.

This should make SCEV return more precise bounds for the created vector
loops, used by later optimizations, like late unrolling.

At the moment quite a few tests still need to be updated, but before
doing so I'd like to get initial feedback to make sure I am not missing
anything.

Note that this could probably be further improved by using information
from the original IV.

Attempt of modeling of the assumption in Alive2:
https://alive2.llvm.org/ce/z/H_DL_g

Part of a set of fixes required for PR50412.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D103255
2021-06-07 10:47:52 +01:00
Sander de Smalen 4f86aa650c [LV] Add -scalable-vectorization=<option> flag.
This patch adds a new option to the LoopVectorizer to control how
scalable vectors can be used.

Initially, this suggests three levels to control scalable
vectorization, although other more aggressive options can be added in
the future.

The possible options are:
- Disabled:   Disables vectorization with scalable vectors.
- Enabled:    Vectorize loops using scalable vectors or fixed-width
              vectors, but favors fixed-width vectors when the cost
              is a tie.
- Preferred:  Like 'Enabled', but favoring scalable vectors when the
              cost-model is inconclusive.

Reviewed By: paulwalker-arm, vkmr

Differential Revision: https://reviews.llvm.org/D101945
2021-05-19 10:40:56 +01:00
Cullen Rhodes 1e7efd397a [LV] Legalize scalable VF hints
In the following loop:

  void foo(int *a, int *b, int N) {
    for (int i=0; i<N; ++i)
      a[i + 4] = a[i] + b[i];
  }

The loop dependence constrains the VF to a maximum of (4, fixed), which
would mean using <4 x i32> as the vector type in vectorization.
Extending this to scalable vectorization, a VF of (4, scalable) implies
a vector type of <vscale x 4 x i32>. To determine if this is legal
vscale must be taken into account. For this example, unless
max(vscale)=1, it's unsafe to vectorize.

For SVE, the number of bits in an SVE register is architecturally
defined to be a multiple of 128 bits with a maximum of 2048 bits, thus
the maximum vscale is 16. In the loop above it is therefore unfeasible
to vectorize with SVE. However, in this loop:

  void foo(int *a, int *b, int N) {
    #pragma clang loop vectorize_width(X, scalable)
    for (int i=0; i<N; ++i)
      a[i + 32] = a[i] + b[i];
  }

As long as max(vscale) multiplied by the number of lanes 'X' doesn't
exceed the dependence distance, it is safe to vectorize. For SVE a VF of
(2, scalable) is within this constraint, since a vector of <16 x 2 x 32>
will have no dependencies between lanes. For any number of lanes larger
than this it would be unsafe to vectorize.

This patch extends 'computeFeasibleMaxVF' to legalize scalable VFs
specified as loop hints, implementing the following behaviour:
  * If the backend does not support scalable vectors, ignore the hint.
  * If scalable vectorization is unfeasible given the loop
    dependence, like in the first example above for SVE, then use a
    fixed VF.
  * Accept scalable VFs if it's safe to do so.
  * Otherwise, clamp scalable VFs that exceed the maximum safe VF.

Reviewed By: sdesmalen, fhahn, david-arm

Differential Revision: https://reviews.llvm.org/D91718
2021-01-08 10:49:44 +00:00
Juneyoung Lee 278aa65cc4 [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93793
2020-12-30 04:21:04 +09:00
Sander de Smalen d568cff696 [LoopVectorizer][SVE] Vectorize a simple loop with with a scalable VF.
* Steps are scaled by `vscale`, a runtime value.
* Changes to circumvent the cost-model for now (temporary)
  so that the cost-model can be implemented separately.

This can vectorize the following loop [1]:

   void loop(int N, double *a, double *b) {
     #pragma clang loop vectorize_width(4, scalable)
     for (int i = 0; i < N; i++) {
       a[i] = b[i] + 1.0;
     }
   }

[1] This source-level example is based on the pragma proposed
separately in D89031. This patch only implements the LLVM part.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D91077
2020-12-09 11:25:21 +00:00