Commit Graph

21 Commits

Author SHA1 Message Date
Juneyoung Lee 4a8e6ed2f7 [SLP,LV] Use poison constant vector for shufflevector/initial insertelement
This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef.
This is a part of efforts for using poison vector instead of undef to represent "doesn't care" vector.
The goal is to make nice shufflevector optimizations valid that is currently incorrect due to the tricky interaction between undef and poison (see https://bugs.llvm.org/show_bug.cgi?id=44185 ).

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94061
2021-01-06 11:22:50 +09:00
Juneyoung Lee 278aa65cc4 [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93793
2020-12-30 04:21:04 +09:00
Philip Reames 0129cd5035 Use deref facts derived from minimum object size of allocations
This change should be fairly straight forward. If we've reached a call, check to see if we can tell the result is dereferenceable from information about the minimum object size returned by the call.

To control compile time impact, I'm only adding the call for base facts in the routine. getObjectSize can also do recursive reasoning, and we don't want that general capability here.

As a follow up patch (without separate review), I will plumb through the missing TLI parameter. That will have the effect of extending this to known libcalls - malloc, new, and the like - whereas currently this only covers calls with the explicit allocsize attribute.

Differential Revision: https://reviews.llvm.org/D90341
2020-12-03 15:01:14 -08:00
Philip Reames 4e4abd16a7 [Deref] Use maximum trip count instead of exact trip count
When trying to prove that a memory access touches only dereferenceable memory across all iterations of a loop, use the maximum exit count rather than an exact one.  In many cases we can't prove exact exit counts whereas we can prove an upper bound.

The test included is for a single exit loop with a min(C,V) exit count, but the true motivation is support for multiple exits loops.  It's just really hard to write a test case for multiple exits because the vectorizer (the primary user of this API), bails far before this.  For multiple exits, this allows a mix of analyzeable and unanalyzable exits when only analyzeable exits are needed to prove deref.
2020-10-28 14:33:30 -07:00
Nikita Popov 0dda633317 [SCEV] Strength nowrap flags after constant folding
We should first try to constant fold the add expression and only
strengthen nowrap flags afterwards. This allows us to determine
stronger flags if e.g. only two operands are left after constant
folding (and thus "guaranteed no wrap region" code applies) or the
resulting operands are non-negative and thus nsw->nuw strengthening
applies.
2020-10-25 18:00:22 +01:00
Amara Emerson 322d0afd87 [llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.

The autoupgrader will handle legacy intrinsics.

Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html

Differential Revision: https://reviews.llvm.org/D88787
2020-10-07 10:36:44 -07:00
Sanjay Patel e50059f6b6 [x86] form reduction intrinsics from vectorizers instead of raw IR
Motivating examples are seen in the PhaseOrdering tests based on:
https://bugs.llvm.org/show_bug.cgi?id=43953#c2 - if we have
intrinsics there, some pass can fold them.

The intrinsics are still named "experimental" at this point, but
if there is no fallout from this patch, that will be a good
indicator that it is safe to finalize them.

Differential Revision: https://reviews.llvm.org/D80867
2020-06-05 12:38:49 -04:00
Sanjay Patel 9d1f95bf9f [LoopVectorize] regenerate test checks; NFC
Align attributes are now visible.
2020-05-29 13:01:35 -04:00
Sjoerd Meijer 9529597cf4 Recommit #2: "[LV] Induction Variable does not remain scalar under tail-folding."
This was reverted because of a miscompilation. At closer inspection, the
problem was actually visible in a changed llvm regression test too. This
one-line follow up fix/recommit will splat the IV, which is what we are trying
to avoid if unnecessary in general, if tail-folding is requested even if all
users are scalar instructions after vectorisation. Because with tail-folding,
the splat IV will be used by the predicate of the masked loads/stores
instructions. The previous version omitted this, which caused the
miscompilation. The original commit message was:

If tail-folding of the scalar remainder loop is applied, the primary induction
variable is splat to a vector and used by the masked load/store vector
instructions, thus the IV does not remain scalar. Because we now mark
that the IV does not remain scalar for these cases, we don't emit the vector IV
if it is not used. Thus, the vectoriser produces less dead code.

Thanks to Ayal Zaks for the direction how to fix this.
2020-05-13 13:50:09 +01:00
Benjamin Kramer f936457f80 Revert "Recommit "[LV] Induction Variable does not remain scalar under tail-folding.""
This reverts commit ae45b4dbe7. It
causes miscompilations, test case on the mailing list.
2020-05-08 14:49:10 +02:00
Sjoerd Meijer ae45b4dbe7 Recommit "[LV] Induction Variable does not remain scalar under tail-folding."
With 3 llvm regr tests fixed/updated that I had missed.
2020-05-07 11:52:20 +01:00
Sjoerd Meijer 20d67ffeae Revert "[LV] Induction Variable does not remain scalar under tail-folding."
This reverts commit 617aa64c84.

while I investigate buildbot failures.
2020-05-07 09:29:56 +01:00
Sjoerd Meijer 617aa64c84 [LV] Induction Variable does not remain scalar under tail-folding.
If tail-folding of the scalar remainder loop is applied, the primary induction
variable is splat to a vector and used by the masked load/store vector
instructions, thus the IV does not remain scalar. Because we now mark
that the IV does not remain scalar for these cases, we don't emit the vector IV
if it is not used. Thus, the vectoriser produces less dead code.

Thanks to Ayal Zaks for the direction how to fix this.

Differential Revision: https://reviews.llvm.org/D78911
2020-05-07 09:15:23 +01:00
Philip Reames 0e8d5085ac Remove a duplicate test
Turns out I'd already added exactly the same test under the name non_unit_stride.

llvm-svn: 371777
2019-09-12 21:40:15 +00:00
Florian Hahn 0741810077 [LV] Update test case after r371768.
llvm-svn: 371769
2019-09-12 20:07:17 +00:00
Philip Reames e0cab70718 Precommit tests for generalization of load dereferenceability in loop
llvm-svn: 371747
2019-09-12 17:09:01 +00:00
Philip Reames b90f94f42e [LV] Support invariant addresses in speculation logic
Implement a TODO from rL371452, and handle loop invariant addresses in predicated blocks. If we can prove that the load is safe to speculate into the header, then we can avoid using a masked.load in favour of a normal load.

This is mostly about vectorization robustness. In the common case, it's generally expected that LICM/LoadStorePromotion would have eliminated such loads entirely.

Differential Revision: https://reviews.llvm.org/D67372

llvm-svn: 371745
2019-09-12 16:49:10 +00:00
Philip Reames b8cddb7611 [Tests] Fix a typo in a test
llvm-svn: 371456
2019-09-09 21:33:59 +00:00
Philip Reames 847fbf7013 [Tests] Precommit test case for D67372
llvm-svn: 371455
2019-09-09 21:32:16 +00:00
Philip Reames 7403569be7 [LoopVectorize] Leverage speculation safety to avoid masked.loads
If we're vectorizing a load in a predicated block, check to see if the load can be speculated rather than predicated.  This allows us to generate a normal vector load instead of a masked.load.

To do so, we must prove that all bytes accessed on any iteration of the original loop are dereferenceable, and that all loads (across all iterations) are properly aligned.  This is equivelent to proving that hoisting the load into the loop header in the original scalar loop is safe.

Note: There are a couple of code motion todos in the code.  My intention is to wait about a day - to be sure this sticks - and then perform the NFC motion without furthe review.

Differential Revision: https://reviews.llvm.org/D66688

llvm-svn: 371452
2019-09-09 20:54:13 +00:00
Philip Reames 2de9788815 Preland test cases for D66688 to make diffs clear.
llvm-svn: 369959
2019-08-26 20:37:06 +00:00