312 Commits

Author SHA1 Message Date
Siddharth Bhat 286c916dde [Polly] [ScopDetection] Allow passing multiple functions to `-polly-only-func`.
- This is useful to run optimisations on only certain functions.
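
A minimal sketch of how a list of function names could act as a filter. It assumes each value is treated as a regular expression and that a function is selected when any pattern matches; `isSelected` and its behaviour for an empty list are hypothetical and not taken from Polly's sources.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns true if `name` matches any of the patterns.
// An empty pattern list means "no restriction" (an assumption of this sketch).
bool isSelected(const std::string &name,
                const std::vector<std::string> &patterns) {
  if (patterns.empty())
    return true;
  for (const std::string &p : patterns) {
    assert(!p.empty() && "sketch assumes non-empty patterns");
    if (std::regex_search(name, std::regex(p)))
      return true;
  }
  return false;
}
```

With the patterns `foo` and `bar`, a function named `foo` would be selected and one named `baz` would be skipped.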

Differential Revision: https://reviews.llvm.org/D33990

llvm-svn: 305060
2017-06-09 08:23:40 +00:00
Michael Kruse a6d48f59a1 Fix a lot of typos. NFC.
llvm-svn: 304974
2017-06-08 12:06:15 +00:00
Tobias Grosser 1e55db30d5 Delinearize memory accesses that reference parameters coming from function calls
Certain affine memory accesses which we model today might contain products of
parameters which we might combine into a new parameter to be able to create an
affine expression that represents these memory accesses. Especially in the
context of OpenCL, this approach loses information as memory accesses such as
A[get_global_id(0) * N + get_global_id(1)] are assumed to be linear. We
correctly recover their multi-dimensional structure by assuming that parameters
that are the result of a function call at IR level likely are not parameters,
but indeed induction variables. The resulting access is now
A[get_global_id(0)][get_global_id(1)] for an array A[][N].
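
The relation between the linear and the multi-dimensional view can be sketched in plain C++. Here `i` and `j` stand in for the two `get_global_id` results; the helper names are illustrative and this is not Polly's delinearization code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Linear view: one flat subscript that multiplies by the parameter N.
int loadLinear(const std::vector<int> &A, std::size_t N, std::size_t i,
               std::size_t j) {
  return A[i * N + j];
}

// Recover the two subscripts (i, j) from a flat offset into A[][N].
// Only meaningful under the assumption that the inner subscript is below N.
std::pair<std::size_t, std::size_t> delinearize(std::size_t offset,
                                                std::size_t N) {
  assert(N > 0 && "inner dimension must be non-empty");
  return {offset / N, offset % N};
}
```

For N = 4, the flat offset 2 * 4 + 3 maps back to the subscripts (2, 3), as long as the inner subscript stays below N.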

llvm-svn: 304075
2017-05-27 15:18:53 +00:00
Philip Pfaffe 1a0128faaa [Polly] Add handling of Top Level Regions
Summary:
My goal is to make the newly added `AllowWholeFunctions` option more usable/powerful.

The changes to ScopBuilder.cpp are exclusively checks to prevent `Region.getExit()` from being dereferenced, since Top Level Regions (TLRs) don't have an exit block.

In ScopDetection's `isValidCFG`, I removed a check that disallowed ReturnInstructions to have return values. This might of course have been intentional, so I would welcome your feedback on this and maybe a small explanation why return values are forbidden. Maybe it can be done but needs more changes elsewhere?

The remaining changes in ScopDetection are simply to consider the AllowWholeFunctions option in more places, i.e. allow TLRs when it is set and once again avoid dereferencing `getExit()` if it doesn't exist.

Finally, in ScopHelper.cpp I extended `polly::isErrorBlock` to handle regions without exit blocks as well: The original check was if a given BasicBlock dominates all predecessors of the exit block. Therefore I do the same for TLRs by regarding all BasicBlocks terminating with a ReturnInst as predecessors of a "virtual" function exit block.

Patch by: Lukas Boehm

Reviewers: philip.pfaffe, grosser, Meinersbur

Reviewed By: grosser

Subscribers: pollydev, llvm-commits, bollu

Tags: #polly

Differential Revision: https://reviews.llvm.org/D33411

llvm-svn: 303790
2017-05-24 18:39:39 +00:00
Tobias Grosser d8945baa0a [ScopDetection] Allow detection of full functions
This is useful when only analyzing functions.

llvm-svn: 303420
2017-05-19 12:13:02 +00:00
Philip Pfaffe 5cc87e3ab3 [Polly][NewPM] Port ScopDetection to the new PassManager
Summary: This is a proof of concept of how to port polly-passes to the new PassManager architecture.  This approach works ootb for Function-Passes, but might not be directly applicable to Scop/Region-Passes. While we could just run the Analyses/Transforms over functions instead, we'd surrender the nice pipelining behaviour we have now.

Reviewers: Meinersbur, grosser

Reviewed By: grosser

Subscribers: pollydev, sanjoy, nemanjai, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D31459

llvm-svn: 302902
2017-05-12 14:37:29 +00:00
Tobias Grosser 3f25a7e8ee [ScopDetection] Check for already known required-invariant loads [NFC]
For certain test cases we spent over 50% of the scop detection time in
checking if a load is likely invariant. We can avoid most of these checks by
testing early on if a load is expected to be invariant. Doing this reduces
scop-detection time on a large benchmark from 52 seconds to just 25 seconds.
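
The idea of testing early for already known loads can be sketched as a set lookup placed in front of the costly check. All names below are hypothetical, loads are represented by plain integer IDs, and the "expensive" check is a stand-in rather than Polly's analysis.

```cpp
#include <cassert>
#include <set>

struct InvariantLoadChecker {
  std::set<int> knownInvariant; // IDs of loads already required invariant
  int expensiveChecks = 0;      // how often the slow path ran

  // Stand-in for the costly analysis; here every even ID counts as invariant.
  bool expensiveIsLikelyInvariant(int loadId) {
    ++expensiveChecks;
    return loadId % 2 == 0;
  }

  bool isRequiredInvariant(int loadId) {
    if (knownInvariant.count(loadId))
      return true; // early exit: nothing to recompute
    if (!expensiveIsLikelyInvariant(loadId))
      return false;
    knownInvariant.insert(loadId);
    return true;
  }
};
```

Querying the same load twice runs the slow path only once in this sketch.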

No functional change is expected.

llvm-svn: 302134
2017-05-04 10:16:20 +00:00
Tobias Grosser 7b5a4dfd46 Exploit BasicBlock::getModule to shorten code
Suggested-by: Roman Gareev <gareevroman@gmail.com>
llvm-svn: 299914
2017-04-11 04:59:13 +00:00
Tobias Grosser 8bd7f3c0a5 [ScopDetect/Info] Allow unconditional hoisting of loads from dereferenceable ptrs
In case LLVM pointers are annotated with !dereferenceable attributes/metadata
or LLVM can look at the allocation from which a pointer is derived, we can know
that dereferencing pointers is safe and can be done unconditionally. We use this
information to prove certain pointers as safe to hoist and then hoist them
unconditionally.

llvm-svn: 297375
2017-03-09 11:36:00 +00:00
Michael Kruse 6744efa8d8 [ScopDetection] Only allow SCoP-wide available base pointers.
Simplify ScopDetection::isInvariant(). Essentially deny everything that
is defined within the SCoP and is not load-hoisted.

The previous understanding of "invariant" has a few holes:

- Expressions without side-effects with only invariant arguments, but
  are defined within the SCoP's region with the exception of selects
  and PHIs. These should be part of the index expression derived by
  ScalarEvolution and not of the base pointer.

- Function calls that are !mayHaveSideEffects() (typically
  functions with "readnone nounwind" attributes). An example is given
  below.

      @C = external global i32
      declare float* @getNextBasePtr(float*, float*) readnone nounwind
      ...
      %ptr = call float* @getNextBasePtr(float* %A, float* %B)

  The call might return:

  * %A, so %ptr aliases with it in the SCoP
  * %B, so %ptr aliases with it in the SCoP
  * @C, so %ptr aliases with it in the SCoP
  * a new pointer every time it is called, such as malloc()
  * a pointer into the allocated block of one of the aforementioned
  * any of the above, at random at each call

  Hence, and in contrast to a comment in the base_pointer.ll regression
  test, %ptr is not necessarily the same all the time. It might also
  alias with anything and no AliasAnalysis can tell otherwise if the
  definition is external. It is hence not suitable in the role of a
  base pointer.

The practical problem with base pointers defined in SCoP statements is
that they are not available globally in the SCoP. The statement instance
must be executed first before the base pointer can be used. This is no
problem if the base pointer is transferred as a scalar value between
statements. Uses of MemoryAccess::setNewAccessRelation may add a use of
the base pointer anywhere in the array. setNewAccessRelation is used by
JSONImporter, DeLICM and D28518. Indeed, BlockGenerator currently
assumes that base pointers are available globally and generates invalid
code for new access relation (referring to the base pointer of the
original code) if not, even if the base pointer would be available in
the statement.

This could be fixed with some added complexity and restrictions. The
ExprBuilder must look up the local BBMap and code that calls
setNewAccessRelation must check whether the base pointer is available
first.

The code would still be incorrect in the presence of aliasing. There
is the switch -polly-ignore-aliasing to explicitly allow this, but
it is hardly a justification for the additional complexity. It would
still be mostly useless because in most cases either getNextBasePtr()
has external linkage in which case the readnone nounwind attributes
cannot be derived in the translation unit itself, or is defined in the
same translation unit and gets inlined.

Reviewed By: grosser

Differential Revision: https://reviews.llvm.org/D30695

llvm-svn: 297281
2017-03-08 15:14:46 +00:00
Michael Kruse 5a4ec5c42b [ScopDetection] Require LoadInst base pointers to be hoisted.
Only when load-hoisted can we be sure the base pointer is invariant
during the SCoP's execution. Most of the time it would be added to
the required hoists for the alias checks anyway, except with
-polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if
AliasAnalysis is already sure it doesn't alias with anything
(for instance if there is no other pointer to alias with).

Two more parts in Polly assume that this load-hoisting took place:
- setNewAccessRelation() which contains an assert which tests this.
- BlockGenerator which would use the base ptr from the original
  code if not load-hoisted (if the access expression is regenerated)

Differential Revision: https://reviews.llvm.org/D30694

llvm-svn: 297195
2017-03-07 20:28:43 +00:00
Tobias Grosser 134a572951 [ScopDetection] Do not detect scops that exit to an unreachable
Scops that exit with an unreachable are today still permitted, but make little
sense to optimize. We therefore can already skip them during scop detection.
This speeds up scop detection in certain cases and also ensures that bugpoint
does not introduce unreachables when reducing test cases.

In practice this change should have little impact, as the performance of
unreachable code is unlikely to matter.

This commit is part of a series that makes Polly more robust in the presence
of unreachables.

llvm-svn: 297151
2017-03-07 15:50:43 +00:00
Tobias Grosser 1c787e0b49 [ScopDetection] Do not allow required-invariant loads in non-affine region
These loads cannot be safely hoisted as the condition guarding the
non-affine region cannot be duplicated to also protect the hoisted load
later on. Today they are dropped in ScopInfo. By checking for this early, we
do not even try to model them and possibly can still optimize smaller regions
not containing this specific required-invariant load.

llvm-svn: 296744
2017-03-02 12:15:37 +00:00
Michael Kruse 52ab4943b4 Remove all references to PostDominators. NFC.
Marking a pass as preserved is necessary if any Polly pass uses it, even
if it is not preserved within the generated code. Not marking it would
cause the Polly pass chain to be interrupted. It is not used by any
Polly pass anymore, hence we can remove all references to it.

llvm-svn: 295983
2017-02-23 15:16:22 +00:00
Tobias Grosser cd01a363d6 [ScopInfo] Add statistics to count loops after scop modeling
llvm-svn: 295431
2017-02-17 08:12:36 +00:00
Tobias Grosser 65ce9362b8 [ScopDetection] Compute the maximal loop depth correctly
Before this change, we obtained loop depth numbers that were deeper than the
actual loop depth.

llvm-svn: 295430
2017-02-17 08:08:54 +00:00
Tobias Grosser 9fe37df27c [ScopDetection] Add statistics to count the maximal number of scops in loop
llvm-svn: 294893
2017-02-12 10:52:57 +00:00
Tobias Grosser 21a059af09 Adjust formatting to commit r292110 [NFC]
llvm-svn: 292123
2017-01-16 14:08:10 +00:00
Johannes Doerfert bda814350a Allow to disable unsigned operations (zext, icmp ugt, ...)
Unsigned operations are often useful to support but the heuristics are
not yet tuned. This option allows disabling them if necessary.

llvm-svn: 288521
2016-12-02 17:55:41 +00:00
Johannes Doerfert a94ae1aede Do not allow multiple possibly aliasing ptrs in an expression
Relational comparisons should not involve multiple potentially
  aliasing pointers. Similarly this should hold for switch conditions
  and the two conditions involved in equality comparisons (separately!).
  This is a heuristic based on the C semantics that does only allow such
  operations when the base pointers do point into the same object.
  Since this makes aliasing likely we will bail out early instead of
  producing a probably failing runtime check.

llvm-svn: 288516
2016-12-02 17:49:52 +00:00
Tobias Grosser b45ae5601b [ScopDetect] Expand statistics of the detected scops
We now collect:

  Number of total loops
  Number of loops in scops
  Number of scops
  Number of scops with maximal loop depth 1
  Number of scops with maximal loop depth 2
  Number of scops with maximal loop depth 3
  Number of scops with maximal loop depth 4
  Number of scops with maximal loop depth 5
  Number of scops with maximal loop depth 6 and larger
  Number of loops in scops (profitable scops only)
  Number of scops (profitable scops only)
  Number of scops with maximal loop depth 1 (profitable scops only)
  Number of scops with maximal loop depth 2 (profitable scops only)
  Number of scops with maximal loop depth 3 (profitable scops only)
  Number of scops with maximal loop depth 4 (profitable scops only)
  Number of scops with maximal loop depth 5 (profitable scops only)
  Number of scops with maximal loop depth 6 and larger (profitable scops only)

These statistics are certainly not completely accurate as we might drop scops
when building up their polyhedral representation, but they should give a good
indication of the number of scops we detect.
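
The depth buckets listed above, including the shared "6 and larger" bucket, could be kept in a small counter like the following. This is only an illustrative sketch; `ScopDepthStats` is a made-up name and Polly itself relies on LLVM's statistics infrastructure.

```cpp
#include <array>
#include <cassert>

// Buckets 0..4 hold maximal depths 1..5; bucket 5 holds "6 and larger".
struct ScopDepthStats {
  std::array<unsigned, 6> scopsByMaxDepth{};
  unsigned numScops = 0;

  void record(unsigned maxLoopDepth) {
    assert(maxLoopDepth >= 1 && "sketch only counts scops containing a loop");
    ++numScops;
    unsigned bucket = maxLoopDepth >= 6 ? 5 : maxLoopDepth - 1;
    ++scopsByMaxDepth[bucket];
  }
};
```

A second instance of the same structure could hold the "profitable scops only" variants of these counters.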

llvm-svn: 287973
2016-11-26 07:37:46 +00:00
Tobias Grosser 1f0236d8e5 [ScopDetect] Use mayReadOrWriteMemory to shorten condition
llvm-svn: 287525
2016-11-21 09:07:30 +00:00
Tobias Grosser b94e9b31d0 [ScopDetect] Remove unnecessary namespace qualifier
llvm-svn: 287524
2016-11-21 09:04:45 +00:00
Johannes Doerfert 6cd59e9076 Probably overwritten loads should not be considered hoistable
Do not assume a load to be hoistable/invariant if the pointer is used by
another instruction in the SCoP that might write to memory and that is
always executed.

llvm-svn: 287272
2016-11-17 22:25:17 +00:00
Tobias Grosser 70d2709b1a [ScopDetect] Conservatively handle inaccessible memory alias attributes
Commit r286294 introduced support for inaccessiblememonly and
inaccessiblemem_or_argmemonly attributes to BasicAA, which we need to
support to avoid undefined behavior. This change just refuses all calls
which are annotated with these attributes, which is conservatively correct.
In the future we may consider to model and support such function calls
in Polly.

llvm-svn: 286771
2016-11-13 19:27:24 +00:00
Tobias Grosser a2f8fa33aa [ScopDetect] Evaluate and verify branches at branch condition, not icmp
The validity of a branch condition must be verified at the location of the
branch (the branch instruction), not the location of the icmp that is
used in the branch instruction. When verifying at the wrong location, we
may accept an icmp that is defined within a loop which itself dominates, but
does not contain the branch instruction. Such loops cannot be modeled as
we only introduce domain dimensions for surrounding loops. To address this
problem we change the scop detection to evaluate and verify SCEV expressions at
the right location.

This issue has been around since at least r179148 "scop detection: properly
instantiate SCEVs to the place where they are used", where we explicitly
set the scope to the wrong location. Before this commit the scope
was not explicitly set, which probably also resulted in the scope around the
ICmp to be chosen.

This resolves http://llvm.org/PR30989

Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286769
2016-11-13 19:27:04 +00:00
Tobias Grosser bbaeda3fe5 Do not allow switch statements in loop latches
In r248701 "Allow switch instructions in SCoPs" support for switch statements
has been introduced, but support for switch statements in loop latches was
incomplete. This change completely disables switch statements in loop latches.

The original commit changed addLoopBoundsToHeaderDomain to support non-branch
terminator instructions, but this change was incorrect: it added a check for
BI != null to the if-branch of a condition, but BI was used in the else branch
as well. As a result, when a non-branch terminator instruction is encountered a
nullptr dereference is triggered. Due to missing test coverage, this bug was
overlooked.
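
The bug pattern described above, a null check that covers only one side of an if/else, can be sketched as follows. `Branch` and `pickSuccessor` are hypothetical stand-ins and do not mirror the code in addLoopBoundsToHeaderDomain.

```cpp
#include <cassert>

// Hypothetical stand-in for a branch terminator with two successors.
struct Branch {
  int trueSucc;
  int falseSucc;
};

// BI is null when the block's terminator is not a branch (for example a
// switch). Checking it once, before the if/else, protects both paths; a
// check placed only inside the if-branch leaves the else-branch exposed.
bool pickSuccessor(const Branch *BI, bool takeTrue, int &succ) {
  if (!BI)
    return false; // reject non-branch terminators up front
  if (takeTrue)
    succ = BI->trueSucc;
  else
    succ = BI->falseSucc; // safe: BI was checked above
  return true;
}
```

Rejecting the non-branch case up front matches the spirit of the fix, which disallows switch statements in loop latches rather than trying to model them.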

r249273 "[FIX] Approximate non-affine loops correctly" added code to disallow
switch statements for non-affine loops, if they appear in either a loop latch
or a loop exit. We adapt this code to now prohibit switch statements in
loop latches even if the control condition is affine.

We could possibly add support for switch statements in loop latches, but such
support should be evaluated and tested separately.

This fixes llvm.org/PR30952

Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286426
2016-11-10 05:20:29 +00:00
Tobias Grosser ebb626e4b7 [ScopDetect] Use SCEVRewriteVisitor to simplify SCEVRemoveSMax rewriter
ScalarEvolution got at some point a SCEVRewriteVisitor. Use it to simplify
our SCEVRemoveSMax visitor.

llvm-svn: 285491
2016-10-29 06:19:34 +00:00
Michael Kruse 6a19d592da [ScopDetect] Depend transitively on ScalarEvolution.
ScopDetection might be queried by -dot-scops or -view-scops passes for which
it accesses ScalarEvolution.

llvm-svn: 284385
2016-10-17 13:29:20 +00:00
Tobias Grosser 349d1c3368 [ScopDetection] Remove redundant checks for endless loops
Summary:
Both `canUseISLTripCount()` and `addOverApproximatedRegion()` contained checks
to reject endless loops which are now removed and replaced by a single check
in `isValidLoop()`.

For reporting such loops the `ReportLoopOverlapWithNonAffineSubRegion` is
renamed to `ReportLoopHasNoExit`. The test case
`ReportLoopOverlapWithNonAffineSubRegion.ll` is adapted and renamed as well.

The schedule generation in `buildSchedule()` is based on the following
assumption:

Given some block B that is contained in a loop L and a SESE region R,
we assume that L is contained in R or the other way around.
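
The nesting assumption can be illustrated with plain sets of block IDs. This is a simplified model with hypothetical names; it treats a loop and a region as nothing more than the set of blocks they contain.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

using BlockSet = std::set<int>;

// True if every block of `inner` is also in `outer`.
bool containsAll(const BlockSet &outer, const BlockSet &inner) {
  return std::includes(outer.begin(), outer.end(), inner.begin(), inner.end());
}

// For a block B that lies in both L and R, the assumption holds if one of
// the two sets fully contains the other.
bool nestingHolds(const BlockSet &L, const BlockSet &R, int B) {
  assert(L.count(B) && R.count(B) && "B must be in both L and R");
  return containsAll(R, L) || containsAll(L, R);
}
```

A loop with blocks {1, 2, 4} and a region with blocks {1, 2, 3} share block 1 but neither contains the other, which is the kind of overlap the assumption rules out.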

However, this assumption is broken in the presence of endless loops that are
nested inside other loops. Therefore, in order to prevent erroneous behavior
in `buildSchedule()`, r265280 introduced a corresponding check in
`canUseISLTripCount()` to reject endless loops. Unfortunately, it was possible
to bypass this check with -polly-allow-nonaffine-loops which was fixed by adding
another check to reject endless loops in `addOverApproximatedRegion()` in
r273905. Hence there existed two separate locations that handled this case.

Thank you Johannes Doerfert for helping to provide the above background
information.

Reviewers: Meinersbur, grosser

Subscribers: _jdoerfert, pollydev

Differential Revision: https://reviews.llvm.org/D24560

Contributed-by: Matthias Reisinger <d412vv1n@gmail.com>
llvm-svn: 281987
2016-09-20 17:05:22 +00:00
Tobias Grosser b316dc166f ScopDetection: Make sure we do not accidentally divide by zero
This code path is likely never triggered, but by still handling this case
locally we avoid warnings in clang's static analyzer.

llvm-svn: 280939
2016-09-08 14:08:05 +00:00
Tobias Grosser c80d6979bd Drop '@brief' from doxygen comments
LLVM's coding guideline suggests to not use @brief for one-sentence doxygen
comments to improve readability. Switch this once and for all to ensure people
do not copy @brief comments from other parts of Polly, when writing new code.

llvm-svn: 280468
2016-09-02 06:33:33 +00:00
Tobias Grosser 74814e1a07 Disable invariant load hoisting temporarily
With invariant load hoisting enabled the LLVM buildbots currently show some
miscompiles, which are possibly caused by invariant load hoisting itself.
Confirming and fixing this requires a more in-depth analysis. To meanwhile get
back green buildbots that allow us to observe other regressions, we disable
invariant code hoisting temporarily. The relevant bug is tracked at:

http://llvm.org/PR28985

llvm-svn: 278681
2016-08-15 16:43:36 +00:00
Michael Kruse a6cc0d3a2d [ScopDetection] Remove unused DetectionContexts during expansion.
The function expandRegion() frees Region* objects again when it determines that
these are not valid SCoPs. However, the DetectionContext added to the
DetectionContextMap still holds a reference. The validity is checked using the
ValidRegions lookup table. When a new Region is added to that list, it might
share the same address, such that the DetectionContext contains two
Region* associations that are in ValidRegions, but that are unrelated and of
which one has already been freed.

Also remove the DetectionContext when not a valid expansion.
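
The hazard of keeping a map entry keyed by the address of a freed object, and the fix of erasing the entry together with the object, might be sketched like this. `Region`, `Context` and `Detector` are simplified placeholders, not Polly's classes.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Region { std::string name; };
struct Context { std::string note; };

struct Detector {
  std::map<const Region *, Context> contexts; // keyed by region address

  // Discard a region that turned out not to be a valid expansion. Erasing
  // the map entry first ensures that a later Region allocated at the same
  // address cannot be associated with the stale context.
  void discard(std::unique_ptr<Region> R) {
    assert(R && "expected a region to discard");
    contexts.erase(R.get());
  } // R is freed here, after its context is gone
};
```

After `discard`, no entry refers to the freed address, so a reused address starts with a clean slate.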

llvm-svn: 278062
2016-08-08 22:39:32 +00:00
Tobias Grosser 629109b633 GPGPU: Mark kernel functions as polly.skip
Otherwise, we would try to re-optimize them with Polly-ACC and possibly even
generate kernels that try to offload themselves, which does not work as the
GPURuntime is not available on the accelerator and also does not make any
sense.

llvm-svn: 277589
2016-08-03 12:00:07 +00:00
Weiming Zhao 7614e178cb Fix a build warning of unhandled enum in switch
Summary: LLVM adds a new value FMRB_DoesNotReadMemory in the enumeration.

Reviewers: andrew.w.kaylor, chrisj, zinob, grosser, jdoerfert

Subscribers: Meinersbur, pollydev

Differential Revision: http://reviews.llvm.org/D22109

llvm-svn: 275085
2016-07-11 18:27:52 +00:00
Michael Kruse a1a303f31e Add comment on why loops/regions can overlap. NFC.
The case is described in llvm.org/PR28071 which was fixed in the
previous commit.

llvm-svn: 273906
2016-06-27 19:00:55 +00:00
Michael Kruse 41f046a282 Fix assertion due to loop overlap with nonaffine region.
Reject and report regions that contain loops overlapping a nonaffine region.
This situation typically happens in the presence of infinite loops.

This addresses bug llvm.org/PR28071.

Differential Revision: http://reviews.llvm.org/D21312

Contributed-by: Huihui Zhang <huihuiz@codeaurora.org>
llvm-svn: 273905
2016-06-27 19:00:49 +00:00
Tobias Grosser 8dd653d983 clang-tidy: apply modern-use-nullptr fixes
Instead of using 0 or NULL use the C++11 nullptr symbol when referencing null
pointers.
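
One reason the keyword is preferred shows up in overload resolution. The two `which` overloads below are made up purely for illustration.

```cpp
#include <cassert>
#include <string>

std::string which(int) { return "int"; }
std::string which(const char *) { return "pointer"; }

void example() {
  assert(which(0) == "int");           // the literal 0 is an int first
  assert(which(nullptr) == "pointer"); // nullptr only converts to pointers
}
```

With `0`, the integer overload is selected even when a null pointer was meant; `nullptr` removes that ambiguity.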

This cleanup was suggested by Eugene Zelenko <eugene.zelenko@gmail.com> in
http://reviews.llvm.org/D21488 and was split out to increase readability.

llvm-svn: 273435
2016-06-22 16:22:00 +00:00
Tobias Grosser ef6ae7030d ScopDetection: Make enum function-local
The 'Color' enum is only used for irreducible control flow detection. Johannes
already moved this enum in r270054 from ScopDetection.h to ScopDetection.cpp to
limit its scope to a single cpp file. We now move it into the only function
where this enum is needed to make clear that it is only needed locally in this
single function.
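
A function-local enum in a colouring depth-first search might look like the sketch below. It only detects cycles in a directed graph, which is simpler than detecting irreducible control flow, and the function name and graph representation are assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns true if the directed graph given as successor lists has a cycle.
bool hasCycle(const std::vector<std::vector<std::size_t>> &succ) {
  enum Color { WHITE, GREY, BLACK }; // visible only inside this function
  std::vector<Color> color(succ.size(), WHITE);
  // Explicit DFS stack of (node, index of the next successor to visit).
  std::vector<std::pair<std::size_t, std::size_t>> stack;
  for (std::size_t root = 0; root < succ.size(); ++root) {
    if (color[root] != WHITE)
      continue;
    color[root] = GREY;
    stack.push_back(std::make_pair(root, std::size_t(0)));
    while (!stack.empty()) {
      std::size_t node = stack.back().first;
      std::size_t idx = stack.back().second;
      if (idx == succ[node].size()) { // all successors done
        color[node] = BLACK;
        stack.pop_back();
        continue;
      }
      ++stack.back().second;
      std::size_t next = succ[node][idx];
      assert(next < succ.size() && "successor must be a valid node");
      if (color[next] == GREY)
        return true; // edge back into the active DFS path
      if (color[next] == WHITE) {
        color[next] = GREY;
        stack.push_back(std::make_pair(next, std::size_t(0)));
      }
    }
  }
  return false;
}
```

Because `Color` is declared inside `hasCycle`, no other function in the file can depend on it.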

Thanks to Johannes for pointing out this cleanup opportunity.

llvm-svn: 272462
2016-06-11 09:00:37 +00:00
Johannes Doerfert 1dafea4114 Make the detection context non-constant [NFC]
llvm-svn: 270410
2016-05-23 09:07:08 +00:00
Johannes Doerfert 469db6a247 Move internal enum out of class declaration [NFC]
llvm-svn: 270054
2016-05-19 12:36:43 +00:00
Johannes Doerfert ffd222f2d6 Propagate the DetectionContext to the SCoP [NFC]
The SCoP now holds a reference to the ScopDetection::DetectionContext
  which allows to simplify the type of various methods and remove code.

llvm-svn: 270053
2016-05-19 12:34:57 +00:00
Johannes Doerfert f5841a66af Remove leftover debug output [NFC]
llvm-svn: 270051
2016-05-19 12:32:54 +00:00
Johannes Doerfert e6e3c9246a Check late for profitability
Before this patch we only expanded valid __and__ profitable regions. Therefore
  we did not allow the expansion to create a profitable region from a
  non-profitable one.  With this patch we will remember and expand all valid
  regions and check for profitability only at the end.

  This patch increases the number of valid SCoPs in the LLVM-TS and SPEC
  2000/2006 by 28% (from 303 to 390), including the hot loop in hmmer.

llvm-svn: 269343
2016-05-12 20:21:50 +00:00
Johannes Doerfert 6c7639b380 Cleanup rejection log handling [NFC]
This patch cleans up the rejection log handling during the
  ScopDetection. It consists of two interconnected parts:
    - We keep all detection contexts for a function in order to provide
      more information to the user, e.g., about the rejection of
      extended/intermediate regions.
    - We remove the mutable "RejectLogs" member as the information is
      available through the detection contexts.

llvm-svn: 269323
2016-05-12 18:50:01 +00:00
Johannes Doerfert bf9473b2d8 Weaken profitability constraints during ScopDetection
Regions with one affine loop can be profitable if the loop is
  distributable. To this end we will allow them to be treated as
  profitable if they contain at least two non-trivial basic blocks.

llvm-svn: 269064
2016-05-10 14:42:30 +00:00
Johannes Doerfert 792374b941 Allow unsigned comparisons
With this patch we will optimistically assume that the result of an unsigned
  comparison is the same as the result of the same comparison interpreted as
  signed.
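
When the optimistic assumption holds and when it does not can be seen with 32-bit integers. The helper names are illustrative; the sketch only compares the two interpretations and does not model how Polly records or checks the assumption.

```cpp
#include <cassert>
#include <cstdint>

bool lessSigned(std::int32_t a, std::int32_t b) { return a < b; }

// Reinterpret both operands as unsigned before comparing.
bool lessUnsigned(std::int32_t a, std::int32_t b) {
  return static_cast<std::uint32_t>(a) < static_cast<std::uint32_t>(b);
}

// True if the signed and the unsigned comparison give the same answer.
bool interpretationsAgree(std::int32_t a, std::int32_t b) {
  return lessSigned(a, b) == lessUnsigned(a, b);
}

void example() {
  assert(interpretationsAgree(3, 7));   // both non-negative: same result
  assert(!interpretationsAgree(-1, 7)); // -1 is a huge value when unsigned
}
```

The two interpretations agree whenever both operands have the same sign, and can differ only when exactly one operand is negative.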

llvm-svn: 267559
2016-04-26 14:33:12 +00:00
Johannes Doerfert 517d8d2f94 Check only loop control of loops that are part of the region
This also removes a duplicated line of code in the region generator
  that caused a SPEC benchmark to fail with the new SCoPs.

llvm-svn: 267404
2016-04-25 13:37:24 +00:00
Johannes Doerfert ec8a217729 Remove unnecessary argument of the SCEVValidator [NFC]
llvm-svn: 267400
2016-04-25 13:32:36 +00:00