Commit Graph

210 Commits

Author SHA1 Message Date
David Sherwood 71bd59f0cb [SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute
In this patch I have added support for a new loop hint called
vectorize.scalable.enable that says whether we should enable scalable
vectorization or not. If a user wants to instruct the compiler to
vectorize a loop with scalable vectors they can now do this as
follows:

  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2
  ...
  !2 = !{!2, !3, !4}
  !3 = !{!"llvm.loop.vectorize.width", i32 8}
  !4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}

Setting the hint to false simply reverts the behaviour back to the
default, using fixed width vectors.

Differential Revision: https://reviews.llvm.org/D88962
2020-12-02 13:23:43 +00:00
David Green c7e275388e [ARM] Don't aggressively unroll vector remainder loops
We already do not unroll loops with vector instructions under MVE, but
that does not include the remainder loops that the vectorizer produces.
These remainder loops will be rarely executed and are not worth
unrolling, as the trip count is likely to be low if they get executed at
all. Luckily they get llvm.loop.isvectorized to make recognizing them
simpler.

We have wanted to do this for a while but hit issues with low overhead
loops being reverted due to difficult registry allocation. With recent
changes that seems to be less of an issue now.

Differential Revision: https://reviews.llvm.org/D90055
2020-11-10 17:01:31 +00:00
Atmn Patel 04a0896487 Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops"
This reverts commit 0b17c6e447. This patch
causes a compile-time error in SCEV.
2020-11-07 00:32:12 -05:00
Atmn Patel 0b17c6e447 [LoopDeletion] Allows deletion of possibly infinite side-effect free loops
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.

This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.

Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86844
2020-11-06 22:06:58 -05:00
Atmn Patel babc224c5d [LoopDeletion] Remove dead loops with no exit blocks
Currently, LoopDeletion refuses to remove dead loops with no exit blocks
because it cannot statically determine the control flow after it removes
the block. This leads to miscompiles if the loop is an infinite loop and
should've been removed.

Differential Revision: https://reviews.llvm.org/D90115
2020-11-06 17:08:34 -05:00
Nikita Popov 20b386aae0 [LoopUtils] Fix neutral value for vector.reduce.fadd
Use -0.0 instead of 0.0 as the start value. The previous use of 0.0
was fine for all existing uses of this function though, as it is
always generated with fast flags right now, and thus nsz.
2020-10-29 21:45:13 +01:00
Florian Hahn ad5541045a [LoopDeletion] Remove over-eager SCEV verification.
60b852092c introduced SCEV verification to
deleteDeadLoop, but it appears this check is currently a bit over-eager
and some users of deleteDeadLoop appear to only patch up SE after
calling it (e.g. PR47753).

Remove the extra check for now. We can consider adding it back after we
tracked down the source of the inconsistency for PR47753.
2020-10-12 16:18:30 +01:00
Florian Hahn 7bae2bc5a8 [LoopUtils] Only verify SE in builds with assertions.
Follow up to 60b852092c.
2020-09-29 13:39:23 +01:00
Florian Hahn 60b852092c [LoopDeletion] Forget loop before setting values to undef
After D71539, we need to forget the loop before setting the incoming
values of phi nodes in exit blocks, because we are looking through those
phi nodes now and the SCEV expression could depend on the loop phi. If
we update the phi nodes before forgetting the loop, we miss those users
during invalidation.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D88167
2020-09-29 10:38:44 +01:00
Vitaly Buka 89051ebace [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
Nicolai Hähnle 76c5cb05a3 DomTree: Remove getChildren() accessor
Summary:
Avoid exposing details about how children are stored. This will enable
subsequent type-erasure changes.

New methods are introduced to cover common access patterns.

Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f

Reviewers: arsenm, RKSimon, mehdi_amini, courbet

Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83083
2020-07-06 21:58:11 +02:00
Serguei Katkov eae0d2e9b2 Revert "[Peeling] Extend the scope of peeling a bit"
This reverts commit 29b2c1ca72.

The patch causes the DT verifier failure like:
DominatorTree is different than a freshly computed one!

Not sure the patch itself it wrong but revert to investigate the failure.
2020-06-22 17:48:29 +07:00
Serguei Katkov 29b2c1ca72 [Peeling] Extend the scope of peeling a bit
Currently we allow peeling of the loops if there is a exiting latch block
and all other exits are blocks ending with deopt.

Actually we want that exit would end up with deopt unconditionally but
it is not required that exit itself ends with deopt.

Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev
Reviewed By: apilipenko
Subscribers: hiraditya, zzheng, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D81140
2020-06-22 12:17:44 +07:00
Christopher Tetreault 8d11ec66b6 [SVE] Remove calls to VectorType::getNumElements from Transforms/Utils
Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea

Reviewed By: david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82057
2020-06-18 13:39:14 -07:00
Sanjay Patel 46a285ad9e [IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC
The predicate can always be used to distinguish between icmp and fcmp,
so we don't need to keep repeating this check in the callers.
2020-06-18 15:47:06 -04:00
Alina Sbirlea 519b019a0a Verify MemorySSA after all updates.
Verify after completing all updates.
Resolves PR46275.
2020-06-11 18:48:41 -07:00
Whitney Tsang 5d6c5b463c [LoopUtils] Use llvm::find
Summary: Fixes this build error:

llvm/lib/Transforms/Utils/LoopUtils.cpp:679:26: error: no matching
function for call to 'find'
      Loop::iterator I = find(ParentLoop->begin(), ParentLoop->end(),
L);
                         ^~~~
Authored By: orivej
Reviewer: Whitney
Reviewed By: Whitney
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80473
2020-05-25 13:34:56 +00:00
Roman Lebedev b2df961231
[IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835)
Summary:
Currently, `rewriteLoopExitValues()`'s logic is roughly as following:
> Loop over each incoming value in each PHI node.
> Query whether the SCEV for that incoming value is high-cost.
> Expand the SCEV.
> Perform sanity check (`isValidRewrite()`, D51582)
> Record the info
> Afterwards, see if we can drop the loop given replacements.
> Maybe perform replacements.

The problem is that we interleave SCEV cost checking and expansion.
This is A Problem, because `isHighCostExpansion()` takes special care
to not bill for the expansions that were already expanded, and we can reuse.

While it makes sense in general - if we know that we will expand some SCEV,
all the other SCEV's costs should account for that, which might cause
some of them to become non-high-cost too, and cause chain reaction.

But that isn't what we are doing here. We expand *all* SCEV's, unconditionally.
So every next SCEV's cost will be affected by the already-performed expansions
for previous SCEV's. Even if we are not planning on keeping
some of the expansions we performed.

Worse yet, this current "bonus" depends on the exact PHI node
incoming value processing order. This is completely wrong.

As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have
a PHI node with two(!) identical high-cost incoming values for the same basic blocks,
we would decide first time around that it is high-cost, expand it,
and immediately decide that it is not high-cost because we have an expansion
that we could reuse (because we expanded it right before, temporarily),
and replace the second incoming value but not the first one;
thus resulting in a broken PHI.

What we instead should do for now, is not perform any expansions
until after we've queried all the costs.

Later, in particular after `isValidRewrite()` is an assertion (D51582)
we could improve upon that, but in a more coherent fashion.

See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 | PR45835 ]]

Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma

Reviewed By: dmajor, mkazantsev

Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79787
2020-05-21 13:05:55 +03:00
Florian Hahn bcbd26bfe6 [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.

This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.

The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.

Reviewers: sanjoy.google, efriedma, reames

Reviewed By: sanjoy.google

Differential Revision: https://reviews.llvm.org/D71537
2020-05-20 10:53:40 +01:00
Florian Hahn 8528186b9b [LAA] Move runtime-check generation to Transforms/Utils/loopUtils (NFC)
Currently LAA's uses of ScalarEvolutionExpander blocks moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78460
2020-05-10 17:39:26 +01:00
Florian Hahn 616657b39c [LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC).
This allows forward declarations of PointerCheck, which in turn reduce
the number of times LoopAccessAnalysis needs to be included.

Ultimately this helps with moving runtime check generation to
Transforms/Utils/LoopUtils.h, without having to include it there.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78458
2020-04-28 21:47:31 +01:00
Florian Hahn 7a87e8f90b [LoopUtils] Clean up includes, use forward decls if appropriate (NFC).
Most of the includes in LoopUtils.h are not required in the header and
they can be replaced by forward declarations.

Unfortunately includes of TargetTransformInfo.h and IVDescriptors.h pull
in a bunch of additional things, but there is no easy way to get rid of
them at the moment I think.
2020-04-19 19:44:29 +01:00
Benjamin Kramer 6f64daca8f Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints
No functionality change intended.
2020-04-15 12:51:38 +02:00
Christopher Tetreault 00a1032412 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: rriddle, sdesmalen, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77260
2020-04-09 13:35:41 -07:00
Roman Lebedev 7d572ef2dd
Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)"
As discussed in post-commit review in https://reviews.llvm.org/D73501
if the goal of this is to help vectorizer, then we should actually
be teaching vectorizer to do this, because right now this rewrite
is still budget-limited, which isn't what we'd want.

Additionally, while the rest of the patch series was universally profitable,
this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171)
exposing cost-modeling issues on ARM.

So let's just back this particular patch out. Once there's an undo transform,
this could be considered for reintegration.

This reverts commit 44edc6fd2c.
2020-04-03 20:15:04 +03:00
David Green ec7e4a9a80 [LoopVectorizer] Add reduction tests for inloop reductions. NFC
Also adds a force-reduction-intrinsics option for testing, for forcing
the generation of reduction intrinsics even when the backend is not
requesting them.
2020-03-03 10:54:00 +00:00
Arkady Shlykov 3dcaf296ae [Loop Peeling] Add possibility to enable peeling on loop nests.
Summary:
Current peeling implementation bails out in case of loop nests.
The patch introduces a field in TargetTransformInfo structure that
certain targets can use to relax the constraints if it's
profitable (disabled by default).
Also additional option is added to enable peeling manually for
experimenting and testing purposes.

Reviewers: fhahn, lebedev.ri, xbolva00

Reviewed By: xbolva00

Subscribers: RKSimon, xbolva00, hiraditya, zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D70304
2020-03-02 08:37:11 -08:00
Roman Lebedev 44edc6fd2c
[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)
Summary:
Replacing uses of IV outside of the loop is likely generally useful,
but `rewriteLoopExitValues()` is cautious, and if it isn't told to always
perform the replacement, and there are hard uses of IV in loop,
it doesn't replace.

In [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]],
that prevents `-indvars` from replacing uses of induction variable
after the loop, which might be one of the optimization failures
preventing that code from being vectorized.

Instead, now that the cost model is fixed, i believe we should be
a little bit more optimistic, and also perform replacement
if we believe it is within our budget.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]].

Reviewers: reames, mkazantsev, asbirlea, fhahn, skatkov

Reviewed By: mkazantsev

Subscribers: nikic, hiraditya, zzheng, javed.absar, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73501
2020-02-25 23:05:59 +03:00
Roman Lebedev b99c91a087
[NFC][SCEV] Piping to pass new SCEVCheapExpansionBudget option into SCEVExpander::isHighCostExpansionHelper()
Summary:
In future patches`SCEVExpander::isHighCostExpansionHelper()` will respect the budget allocated by performing TTI cost modelling.
This is a fully NFC patch to make things reviewable.

Reviewers: reames, mkazantsev, wmi, sanjoy

Reviewed By: mkazantsev

Subscribers: hiraditya, zzheng, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73705
2020-02-25 23:05:57 +03:00
Roman Lebedev 0789f28048
[NFC][SCEV] Piping to pass TTI into SCEVExpander::isHighCostExpansionHelper()
Summary:
Future patches will make use of TTI to perform cost-model-driven `SCEVExpander::isHighCostExpansionHelper()`
This is a fully NFC patch to make things reviewable.

Reviewers: reames, mkazantsev, wmi, sanjoy

Reviewed By: mkazantsev

Subscribers: hiraditya, zzheng, javed.absar, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73704
2020-02-25 23:05:56 +03:00
Nikita Popov 28ffe38bba [LoopUtils] Accept IRBuilderBase; NFC 2020-02-18 17:58:46 +01:00
Alina Sbirlea 67904db23c [IRCE] Make IRCE a Function pass.
Summary: Make InductiveRangeCheckElimination a FunctionPass.

Reviewers: reames, mkazantsev

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73592
2020-02-05 09:22:41 -08:00
Alina Sbirlea 388de9dfcd [LoopUtils] Make duplicate method a utility. [NFCI]
Summary:
Method appendLoopsToWorklist is duplicate in LoopUnroll and in the
LoopPassManager as an internal method. Make it an utility.

Reviewers: dmgreen, chandlerc, fedor.sergeev, yamauchi

Subscribers: mehdi_amini, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73569
2020-02-03 10:24:18 -08:00
Sanjay Patel bc1148e7bc [PATCH] D73727: [SLP] drop poison-generating flags for shuffle reduction ops (PR44536)
We may calculate reassociable math ops in arbitrary order when creating a shuffle reduction,
so there's no guarantee that things like 'nsw' hold on those intermediate values. Drop all
poison-generating flags for safety.

This change is limited to shuffle reductions because I don't think we have a problem in the
general case (where we intersect flags of each scalar op that goes into a vector op), but if
there's evidence of other cases being wrong, we can extend this fix to cover those cases.

https://bugs.llvm.org/show_bug.cgi?id=44536

Differential Revision: https://reviews.llvm.org/D73727
2020-01-31 09:54:35 -05:00
Alina Sbirlea efb130fc93 [LoopDeletion] Teach LoopDeletion to preserve MemorySSA if available.
If MemorySSA analysis is analysis, LoopDeletion now preserves it.
2020-01-22 11:38:38 -08:00
Evgeniy Brevnov af7e158872 [LV] Vectorizer should adjust trip count in profile information
Summary: Vectorized loop processes VFxUF number of elements in one iteration thus total number of iterations decreases proportionally. In addition epilog loop may not have more than VFxUF - 1 iterations. This patch updates profile information accordingly.

Reviewers: hsaito, Ayal, fhahn, reames, silvas, dcaballe, SjoerdMeijer, mkuper, DaniilSuchkov

Reviewed By: Ayal, DaniilSuchkov

Subscribers: fedor.sergeev, hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67905
2020-01-20 18:36:28 +07:00
Evgeniy Brevnov cfe97681cd [NFC][LoopUtils] Minor change in comment according to review D71990. 2020-01-20 17:10:10 +07:00
Evgeniy Brevnov 10357e1c89 [LoopUtils] Better accuracy for getLoopEstimatedTripCount.
Summary: Current implementation of getLoopEstimatedTripCount returns 1 iteration less than it should. The reason is that in bottom tested loop first iteration is executed before first back branch is taken. For example for loop with !{!"branch_weights", i32 1 // taken, i32 1 // exit} metadata getLoopEstimatedTripCount gives 1 while actual number of iterations is 2.

Reviewers: Ayal, fhahn

Reviewed By: Ayal

Subscribers: mgorny, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71990
2020-01-20 16:58:07 +07:00
Sjoerd Meijer 93175a5caa [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFCI
This moves `rewriteLoopExitValues()` from IndVarSimplify to LoopUtils thus
making it a generic loop utility function.  This allows to rewrite loop exit
values by just calling this function without running the whole IndVarSimplify
pass.

We use this in D72714 to rematerialise the iteration count in exit blocks, so
that we can clean-up loop update expressions inside the hardware-loops later.

Differential Revision: https://reviews.llvm.org/D72602
2020-01-20 09:05:00 +00:00
Evgeniy Brevnov f0abe820ee [LoopUtils][NFC] Minor refactoring in getLoopEstimatedTripCount. 2020-01-09 16:49:15 +07:00
Florian Hahn 99f74a64a2 [SCEV] Remove unused ScalarEvolutionExpander.h includes (NFC). 2020-01-04 18:29:35 +00:00
Whitney Tsang 9883d7edc6 [LoopUtils] Updated deleteDeadLoop() to handle loop nest.
Reviewer: kariddi, sanjoy, reames, Meinersbur, bmahjour, etiotto,
kbarton
Reviewed By: Meinersbur
Subscribers: mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D70939
2019-12-18 15:59:45 +00:00
Whitney Tsang ec4749e3b8 Revert "[LoopUtils] Updated deleteDeadLoop() to handle loop nest."
This reverts commit cd09fee3d6.
This reverts commit c066ff11d8.
2019-12-17 03:51:41 +00:00
Whitney Tsang c066ff11d8 [LoopUtils] Updated deleteDeadLoop() to handle loop nest.
Reviewer: kariddi, sanjoy, reames, Meinersbur, bmahjour, etiotto,
kbarton
Reviewed By: Meinersbur
Subscribers: mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D70939
2019-12-17 01:06:14 +00:00
Reid Kleckner 05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Alina Sbirlea 6da79ce1fe [MemorySSA] Re-enable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370957
2019-09-04 19:16:04 +00:00
Alina Sbirlea ccb1862bc9 [MemorySSA] Disable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370821
2019-09-03 21:20:46 +00:00
Alina Sbirlea e331d50534 [MemorySSA] Re-enable MemorySSA use.
Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370811
2019-09-03 19:28:37 +00:00
Alina Sbirlea 4b87023bae Revert enabling MemorySSA.
Breaks sanitizers bots.

Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370397
2019-08-29 19:01:23 +00:00
Alina Sbirlea 6289ee941d [MemorySSA & LoopPassManager] Enable MemorySSA as loop dependency. Update tests.
Summary:
I'm not planning to check this in at the moment, but feedback is very welcome, in particular how this affects performance.
The feedback obtains here will guide the next steps towards enabling this.

This patch enables the use of MemorySSA in the loop pass manager.

Passes that currently use MemorySSA:
 - EarlyCSE
Passes that use MemorySSA after this patch:
 - EarlyCSE
 - LICM
 - SimpleLoopUnswitch
Loop passes that update MemorySSA (and do not use it yet, but could use it after this patch):
 - LoopInstSimplify
 - LoopSimplifyCFG
 - LoopUnswitch
 - LoopRotate
 - LoopSimplify
 - LCSSA
Loop passes that do *not* update MemorySSA:
 - IndVarSimplify
 - LoopDelete
 - LoopIdiom
 - LoopSink
 - LoopUnroll
 - LoopInterchange
 - LoopUnrollAndJam
 - LoopVectorize
 - LoopReroll
 - IRCE

Reviewers: chandlerc, george.burgess.iv, davide, sanjoy, gberry

Subscribers: jlebar, Prazek, dmgreen, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58311

llvm-svn: 370384
2019-08-29 17:08:13 +00:00
Tim Corringham 4f64f1ba3c Add llvm.licm.disable metadata
For some targets the LICM pass can result in sub-optimal code in some
cases where it would be better not to run the pass, but it isn't
always possible to suppress the transformations heuristically.

Where the front-end has insight into such cases it is beneficial
to attach loop metadata to disable the pass - this change adds the
llvm.licm.disable metadata to enable that.

Differential Revision: https://reviews.llvm.org/D64557

llvm-svn: 368296
2019-08-08 13:46:17 +00:00
Serguei Katkov 7f8c809592 [Loop Utils] Extend the scope of addStringMetadataToLoop.
To avoid duplicates in loop metadata, if the string to add is
already there, just update the value.

Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D65265

llvm-svn: 367087
2019-07-26 07:04:34 +00:00
Serguei Katkov 3c3a76527e [Loop Utils] Move utilty addStringMetadataToLoop to LoopUtils.cpp. NFC.
Just move the utility function to LoopUtils.cpp to re-use it in loop peeling.

Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, asbirlea, llvm-commits
Differential Revision: https://reviews.llvm.org/D65264

llvm-svn: 367085
2019-07-26 06:10:08 +00:00
Serguei Katkov 45c43e7d04 [LoopUtils] Extend the scope of getLoopEstimatedTripCount
With this patch the getLoopEstimatedTripCount function will
accept also the loops where there are more than one exit but
all exits except latch block should ends up with a call to deopt.

This side exits should not impact the estimated trip count.

Reviewers: reames, mkuper, danielcdh
Reviewed By: reames
Subscribers: fhahn, lebedev.ri, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64553

llvm-svn: 366042
2019-07-15 06:42:39 +00:00
Sander de Smalen cbeb563cfb Change semantics of fadd/fmul vector reductions.
This patch changes how LLVM handles the accumulator/start value
in the reduction, by never ignoring it regardless of the presence of
fast-math flags on callsites. This change introduces the following
new intrinsics to replace the existing ones:

  llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
  llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul

and adds functionality to auto-upgrade existing LLVM IR and bitcode.

Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D60261

llvm-svn: 363035
2019-06-11 08:22:10 +00:00
Sanjay Patel ad62a3a299 [LoopUtils][SLPVectorizer] clean up management of fast-math-flags
Instead of passing around fast-math-flags as a parameter, we can set those
using an IRBuilder guard object. This is no-functional-change-intended.

The motivation is to eventually fix the vectorizers to use and set the
correct fast-math-flags for reductions. Examples of that not behaving as
expected are:
https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast')
https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0)
D61802 (should be able to reduce with IR-level FMF)

Differential Revision: https://reviews.llvm.org/D62272

llvm-svn: 362612
2019-06-05 14:58:04 +00:00
Roman Lebedev e5be660e25 [NFC][Utils] deleteDeadLoop(): add an assert that exit block has some non-PHI instruction
Summary:
If `deleteDeadLoop()` is called on such a loop, that has "bad" exit block,
one that e.g. has no terminator instruction, the `DIBuilder::insertDbgValueIntrinsic()`
will be told to insert the Dbg Value Intrinsic after `nullptr`
(since there is no first non-PHI instruction), which will cause it to not insert
those instructions into any basic block. The instructions will be parent-less,
and IR verifier will complain. It is rather obvious to track down the root cause
when that happens, so let's just assert it never happens.

Reviewers: sanjoy, davide, vsk

Reviewed By: vsk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61008

llvm-svn: 359993
2019-05-05 18:59:12 +00:00
Sanjoy Das 3f5ce18658 Reland "Relax constraints for reduction vectorization"
Change from original commit: move test (that uses an X86 triple) into the X86
subdirectory.

Original description:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.

Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal

Reviewed By: sdesmalen

Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57728

llvm-svn: 355889
2019-03-12 01:31:44 +00:00
Sanjoy Das 2136a5bc49 Revert "Relax constraints for reduction vectorization"
This reverts commit r355868.  Breaks hexagon.

llvm-svn: 355873
2019-03-11 22:37:31 +00:00
Sanjoy Das 93f8cc186a Relax constraints for reduction vectorization
Summary:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.

Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal

Reviewed By: sdesmalen

Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57728

llvm-svn: 355868
2019-03-11 21:36:41 +00:00
Chijun Sima f131d6110e [DTU] Deprecate insertEdge*/deleteEdge*
Summary: This patch converts all existing `insertEdge*/deleteEdge*` to `applyUpdates` and marks `insertEdge*/deleteEdge*` as deprecated.

Reviewers: kuhar, brzycki

Reviewed By: kuhar, brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58443

llvm-svn: 354652
2019-02-22 05:41:43 +00:00
Alina Sbirlea 97468e9282 [MemorySSA & LoopPassManager] Update MemorySSA in formDedicatedExitBlocks.
MemorySSA is now updated when forming dedicated exit blocks.
Resolves PR40037.

llvm-svn: 354623
2019-02-21 21:13:34 +00:00
Craig Topper 784929d045 Implementation of asm-goto support in LLVM
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html

This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.

This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.

There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.

Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii

Differential Revision: https://reviews.llvm.org/D53765

llvm-svn: 353563
2019-02-08 20:48:56 +00:00
Richard Trieu 5f436fc57a Move DomTreeUpdater from IR to Analysis
DomTreeUpdater depends on headers from Analysis, but is in IR.  This is a
layering violation since Analysis depends on IR.  Relocate this code from IR
to Analysis to fix the layering violation.

llvm-svn: 353265
2019-02-06 02:52:52 +00:00
Michael Kruse 70560a0a2c [WarnMissedTransforms] Do not warn about already vectorized loops.
LoopVectorize adds llvm.loop.isvectorized, but leaves
llvm.loop.vectorize.enable. Do not consider such a loop for user-forced
vectorization since vectorization already happened -- by prioritizing
llvm.loop.isvectorized except for TM_SuppressedByUser.

Fixes http://llvm.org/PR40546

Differential Revision: https://reviews.llvm.org/D57542

llvm-svn: 353082
2019-02-04 19:55:59 +00:00
Alina Sbirlea f9027e554a Check bool attribute value in getOptionalBoolLoopAttribute.
Summary:
Check the bool value of the attribute in getOptionalBoolLoopAttribute
not just its existance.
Eliminates the warning noise generated when vectorization is explicitly disabled.

Reviewers: Meinersbur, hfinkel, dmgreen

Subscribers: jlebar, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D57260

llvm-svn: 352555
2019-01-29 22:33:20 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Max Kazantsev a78dc4d6c8 [NFC] Move some functions to LoopUtils
llvm-svn: 351179
2019-01-15 09:51:34 +00:00
Michael Kruse 978ba61536 Introduce llvm.loop.parallel_accesses and llvm.access.group metadata.
The current llvm.mem.parallel_loop_access metadata has a problem in that
it uses LoopIDs. LoopID unfortunately is not loop identifier. It is
neither unique (there's even a regression test assigning the some LoopID
to multiple loops; can otherwise happen if passes such as LoopVersioning
make copies of entire loops) nor persistent (every time a property is
removed/added from a LoopID's MDNode, it will also receive a new LoopID;
this happens e.g. when calling Loop::setLoopAlreadyUnrolled()).
Since most loop transformation passes change the loop attributes (even
if it just to mark that a loop should not be processed again as
llvm.loop.isvectorized does, for the versioned and unversioned loop),
the parallel access information is lost for any subsequent pass.

This patch unlinks LoopIDs and parallel accesses.
llvm.mem.parallel_loop_access metadata on instruction is replaced by
llvm.access.group metadata. llvm.access.group points to a distinct
MDNode with no operands (avoiding the problem to ever need to add/remove
operands), called "access group". Alternatively, it can point to a list
of access groups. The LoopID then has an attribute
llvm.loop.parallel_accesses with all the access groups that are parallel
(no dependencies carries by this loop).

This intentionally avoid any kind of "ID". Loops that are clones/have
their attributes modifies retain the llvm.loop.parallel_accesses
attribute. Access instructions that a cloned point to the same access
group. It is not necessary for each access to have it's own "ID" MDNode,
but those memory access instructions with the same behavior can be
grouped together.

The behavior of llvm.mem.parallel_loop_access is not changed by this
patch, but should be considered deprecated.

Differential Revision: https://reviews.llvm.org/D52116

llvm-svn: 349725
2018-12-20 04:58:07 +00:00
Davide Italiano 9737096bb1 [LoopUtils] Use i32 instead of `void`.
The actual type of the first argument of the @dbg intrinsic
doesn't really matter as we're setting it to `undef`, but the
bitcode reader is picky about `void` types.

llvm-svn: 349069
2018-12-13 18:37:23 +00:00
Davide Italiano 8ee59ca653 [LoopUtils] Prefer a set over a map. NFCI.
llvm-svn: 348999
2018-12-13 01:11:52 +00:00
Davide Italiano 744c3c327f [LoopDeletion] Update debug values after loop deletion.
When loops are deleted, we don't keep track of variables modified inside
the loops, so the DI will contain the wrong value for these.

e.g.

int b() {

int i;
for (i = 0; i < 2; i++)
  ;
patatino();
return a;
-> 6 patatino();

7     return a;
8   }
9   int main() { b(); }
(lldb) frame var i
(int) i = 0

We mark instead these values as unavailable inserting a
@llvm.dbg.value(undef to make sure we don't end up printing an incorrect
value in the debugger. We could consider doing something fancier,
for, e.g. constants, in the future.

PR39868.
rdar://problem/46418795)

Differential Revision: https://reviews.llvm.org/D55299

llvm-svn: 348988
2018-12-12 23:32:35 +00:00
Michael Kruse 7244852557 [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.
When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g.

    #pragma clang loop unroll_and_jam(enable)
    #pragma clang loop distribute(enable)

is the same as

    #pragma clang loop distribute(enable)
    #pragma clang loop unroll_and_jam(enable)

and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.

This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.enable"}
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.distribute.enable"}

defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.

Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.

For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations.

Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.

To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.

With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).

Reviewed By: hfinkel, dmgreen

Differential Revision: https://reviews.llvm.org/D49281
Differential Revision: https://reviews.llvm.org/D55288

llvm-svn: 348944
2018-12-12 17:32:52 +00:00
Reid Kleckner 41390b47de Revert r346810 "Preserve loop metadata when splitting exit blocks"
It broke the Windows self-host:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457

llvm-svn: 346823
2018-11-14 01:47:32 +00:00
Craig Topper 3c87c2a3c5 Preserve loop metadata when splitting exit blocks
LoopUtils.cpp contains a utility that splits an loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block which is inserted on this path, is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.)

Patch by Chang Lin (clin1)

Differential Revision: https://reviews.llvm.org/D53876

llvm-svn: 346810
2018-11-13 23:06:49 +00:00
Vikram TV 7e98d69847 Break LoopUtils into an Analysis file.
Summary:
The InductionDescriptor and RecurrenceDescriptor classes basically analyze the IR to identify the respective IVs. So, it is better to have them in the "Analysis" directory instead of the "Transforms" directory.

The rationale for this is to make the Induction and Recurrence descriptor classes available for analysis passes. Currently including them in an analysis pass produces link error (http://lists.llvm.org/pipermail/llvm-dev/2018-July/124456.html).

Induction and Recurrence descriptors are moved from Transforms/Utils/LoopUtils.h|cpp to Analysis/IVDescriptors.h|cpp.

Reviewers: dmgreen, llvm-commits, hfinkel

Reviewed By: dmgreen

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D51153

llvm-svn: 342016
2018-09-12 01:59:43 +00:00
Vikram TV 09be521d4d Move a transformation routine from LoopUtils to LoopVectorize.
Summary:
Move InductionDescriptor::transform() routine from LoopUtils to its only uses in LoopVectorize.cpp.
Specifically, the function is renamed as InnerLoopVectorizer::emitTransformedIndex().

This is a child to D51153.

Reviewers: dmgreen, llvm-commits

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D51837

llvm-svn: 341776
2018-09-10 06:16:44 +00:00
Vikram TV 6594dc377d Move createMinMaxOp() out of RecurrenceDescriptor.
Reviewers: dmgreen, llvm-commits

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D51838

llvm-svn: 341773
2018-09-10 05:05:08 +00:00
Alina Sbirlea ab6f84f763 Update MemorySSA in BasicBlockUtils.
Summary:
Extend BasicBlocksUtils to update MemorySSA.

Subscribers: sanjoy, arsenm, nhaehnle, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D45300

llvm-svn: 340365
2018-08-21 23:32:03 +00:00
David Green 6cb6478739 [UnJ] Rename hasInvariantIterationCount to hasIterationCountInvariantInParent NFC
This hopefully describes the API of the function more precisely.

llvm-svn: 339762
2018-08-15 10:59:41 +00:00
David Green 395b80cd3c [UnJ] Create a hasInvariantIterationCount function. NFC
Pulled out a separate function for some code that calculates
if an inner loop iteration count is invariant to it's outer
loop.

Differential Revision: https://reviews.llvm.org/D50063

llvm-svn: 339500
2018-08-11 06:57:28 +00:00
Chijun Sima 21a8b605a1 [Dominators] Convert existing passes and utils to use the DomTreeUpdater class
Summary:
This patch is the second in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].

It converts passes (e.g. adce/jump-threading) and various functions which currently accept DDT in local.cpp and BasicBlockUtils.cpp to use the new DomTreeUpdater class.
These converted functions in utils can accept DomTreeUpdater with either UpdateStrategy and can deal with both DT and PDT held by the DomTreeUpdater.

Reviewers: brzycki, kuhar, dmgreen, grosser, davide

Reviewed By: brzycki

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48967

llvm-svn: 338814
2018-08-03 05:08:17 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Simon Pilgrim 23c2182c2b Support generic expansion of ordered vector reduction (PR36732)
Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsic expansions must be expanded in order.

This patch scalarizes the reduction, applying the accumulator at the start of the sequence: ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ) ... + Scl[NumElts-1]

Differential Revision: https://reviews.llvm.org/D45366

llvm-svn: 329585
2018-04-09 15:44:20 +00:00
Simon Pilgrim a74f4ae404 Strip trailing whitespace. NFCI.
llvm-svn: 329421
2018-04-06 17:01:54 +00:00
Philip Reames 23aed5ef6f [MustExecute] Move isGuaranteedToExecute and related rourtines to Analysis
Next step is to actually merge the implementations and get both implementations tested through the new printer.

llvm-svn: 328055
2018-03-20 22:45:23 +00:00
Philip Reames 8a106272e8 [LICM/mustexec] Extend first iteration must execute logic to fcmps
This builds on the work from https://reviews.llvm.org/D44287. It turned out supporting fcmp was much easier than I realized, so let's do that now.

As an aside, our -O3 handling of a floating point IVs leaves a lot to be desired. We do convert the float IV to an integer IV, but do so late enough that many other optimizations are missed (e.g. we don't vectorize).

Differential Revision: https://reviews.llvm.org/D44542

llvm-svn: 327722
2018-03-16 16:33:49 +00:00
Philip Reames a21d5f1e18 [LICM] Ignore exits provably not taken on first iteration when computing must execute
It is common to have conditional exits within a loop which are known not to be taken on some iterations, but not necessarily all. This patches extends our reasoning around guaranteed to execute (used when establishing whether it's safe to dereference a location from the preheader) to handle the case where an exit is known not to be taken on the first iteration and the instruction of interest *is* known to be taken on the first iteration.

This case comes up in two major ways:
* If we have a range check which we've been unable to eliminate, we frequently know that it doesn't fail on the first iteration.
* Pass ordering. We may have a check which will be eliminated through some sequence of other passes, but depending on the exact pass sequence we might never actually do so or we might miss other optimizations from passes run before the check is finally eliminated.

The initial version (here) is implemented via InstSimplify. At the moment, it catches a few cases, but misses a lot too. I added test cases for missing cases in InstSimplify which I'll follow up on separately. Longer term, we should probably wire SCEV through to here to get much smarter loop aware simplification of the first iteration predicate.

Differential Revision: https://reviews.llvm.org/D44287

llvm-svn: 327664
2018-03-15 21:04:28 +00:00
Philip Reames fbffd126b8 [NFC] Factor out a helper function for checking if a block has a potential early implicit exit.
llvm-svn: 327065
2018-03-08 21:25:30 +00:00
David Green 0d5f9651f2 Move llvm::computeLoopSafetyInfo from LICM.cpp to LoopUtils.cpp. NFC
Move computeLoopSafetyInfo, defined in Transforms/Utils/LoopUtils.h,
into the corresponding LoopUtils.cpp, as opposed to LICM where it resides
at the moment. This will allow other functions from Transforms/Utils
to reference it.

llvm-svn: 325151
2018-02-14 18:34:53 +00:00
Chad Rosier a097bc69df [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking
The type-shrinking logic in reduction detection, although narrow in scope, is
also rather ad-hoc, which has led to bugs (e.g., PR35734). This patch modifies
the approach to rely on the demanded bits and value tracking analyses, if
available. We currently perform type-shrinking separately for reductions and
other instructions in the loop. Long-term, we should probably think about
computing minimal bit widths in a more complete way for the loops we want to
vectorize.

PR35734
Differential Revision: https://reviews.llvm.org/D42309

llvm-svn: 324195
2018-02-04 15:42:24 +00:00
Hiroshi Inoue d24ddcd6c4 [NFC] fix trivial typos in comments
"the the" -> "the"

llvm-svn: 322934
2018-01-19 10:55:29 +00:00
Serguei Katkov a757d65cec [LoopDeletion] Handle users in unreachable block
This is a fix for PR35884.

When we want to delete dead loop we must clean uses in unreachable blocks
otherwise we'll get an assert during deletion of instructions from the loop.

Reviewers: anna, davide
Reviewed By: anna
Subscribers: llvm-commits, lebedev.ri
Differential Revision: https://reviews.llvm.org/D41943

llvm-svn: 322357
2018-01-12 07:24:43 +00:00
Benjamin Kramer c7fc81e659 Use phi ranges to simplify code. No functionality change intended.
llvm-svn: 321585
2017-12-30 15:27:33 +00:00
Benjamin Kramer 802e6255b2 Make helpers static. No functionality change.
llvm-svn: 321425
2017-12-24 12:46:22 +00:00
Dorit Nuzman 4750c785b3 [LV] Support efficient vectorization of an induction with redundant casts
D30041 extended SCEVPredicateRewriter to improve handling of Phi nodes whose
update chain involves casts; PSCEV can now build an AddRecurrence for some
forms of such phi nodes, under the proper runtime overflow test. This means
that we can identify such phi nodes as an induction, and the loop-vectorizer
can now vectorize such inductions, however inefficiently. The vectorizer
doesn't know that it can ignore the casts, and so it vectorizes them.

This patch records the casts in the InductionDescriptor, so that they could
be marked to be ignored for cost calculation (we use VecValuesToIgnore for
that) and ignored for vectorization/widening/scalarization (i.e. treated as
TriviallyDead).

In addition to marking all these casts to be ignored, we also need to make
sure that each cast is mapped to the right vector value in the vector loop body
(be it a widened, vectorized, or scalarized induction). So whenever an
induction phi is mapped to a vector value (during vectorization/widening/
scalarization), we also map the respective cast instruction (if exists) to that
vector value. (If the phi-update sequence of an induction involves more than one
cast, then the above mapping to vector value is relevant only for the last cast
of the sequence as we allow only the "last cast" to be used outside the
induction update chain itself).

This is the last step in addressing PR30654.

llvm-svn: 320672
2017-12-14 07:56:31 +00:00
Sanjay Patel 3e069f5724 [LoopUtils] simplify createTargetReduction(); NFCI
llvm-svn: 319946
2017-12-06 19:37:00 +00:00
Sanjay Patel 1ea7b6f7a1 [LoopUtils] fix variable name to match FMF vocabulary; NFC
llvm-svn: 319928
2017-12-06 19:11:23 +00:00
Sanjay Patel 629c411538 [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.

Differential Revision: https://reviews.llvm.org/D39304

llvm-svn: 317488
2017-11-06 16:27:15 +00:00