Commit Graph

949 Commits

Author SHA1 Message Date
Michael Kruse 1a53b732e6 Compile-fix after StringRef's conversion operator has been made explicit.
Commit 777180a "[ADT] Make StringRef's std::string conversion operator explicit"
caused Polly's GPU code generator to not compile anymore. The rest of
Polly has already been fixed in commit
0257a9 "Fix polly build after StringRef change."
2020-02-05 22:28:05 -06:00
Eli Friedman 0257a9218b Fix polly build after StringRef change. 2020-01-28 19:44:20 -08:00
Guillaume Chatelet 07c9d53266 [Alignment][NFC] Use Align with CreateAlignedLoad
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73449
2020-01-27 10:58:36 +01:00
Guillaume Chatelet 805c157e8a [Alignment][NFC] Deprecate Align::None()
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.

Reviewers: xbolva00, courbet, bollu

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73099
2020-01-24 12:53:58 +01:00
Guillaume Chatelet 59f95222d4 [Alignment][NFC] Use Align with CreateAlignedStore
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73274
2020-01-23 17:34:32 +01:00
Heejin Ahn 0133dc3983 [IR] Include more target specific intrinsic headers
After D71320, target-specific intrinsic headers should be included.
2019-12-14 19:19:35 -08:00
Heejin Ahn 5368f35efa [IR] Include target specific intrinsic headers
After D71320, target-specific intrinsic headers should be included.
2019-12-12 14:54:31 -08:00
Michael Kruse f7b3ae65c8 [GPGPU] Fix depricated warning.
setAlignment(unsigned) was deprecated in commit:

0e62011df8
[Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned)
2019-11-14 15:24:52 -06:00
Michael Kruse 2c831971bf [GPGPU] Fix #includes.
Adapt for 05da2fe521 "Sink all InitializePasses.h includes" which
forgot the GPGPU files (presumably because POLLY_ENABLE_GPGPU_CODEGEN
is OFF by default).
2019-11-14 14:39:28 -06:00
Reid Kleckner 1dfede3122 Move CodeGenFileType enum to Support/CodeGen.h
Avoids the need to include TargetMachine.h from various places just for
an enum. Various other enums live here, such as the optimization level,
TLS model, etc. Data suggests that this change probably doesn't matter,
but it seems nice to have anyway.
2019-11-13 16:39:34 -08:00
Reid Kleckner 05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Michael Kruse 0aff3174dc [CodeGen] Fix getArrayAccessFor crashes as in bug 32534 with -polly-vectorizer=polly.
Root cause is VectorBlockGenerator::copyStmt iterates all instructions
in basic block, however some load instructions may be not unnecessary
thus removed by simplification. As a result, these load instructions
don't have a corresponding array.

Looking at BlockGenerator::copyBB, it only iterates instructions list
of ScopStmt. Given it must be a block type scop in case of
vectorization, I think we should do the same in
VectorBlockGenerator::copyStmt.

Patch by bin.narwal <bin.narwal@gmail.com>

Differential Revision: https://reviews.llvm.org/D70076
2019-11-12 13:58:28 -06:00
Guillaume Chatelet 0e62011df8 [Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, mehdi_amini, jvesely, nhaehnle, hiraditya, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68944

llvm-svn: 374880
2019-10-15 11:24:36 +00:00
Guillaume Chatelet d400d45150 [Alignment][NFC] Remove StoreInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu, jdoerfert

Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68268

llvm-svn: 373595
2019-10-03 13:17:21 +00:00
Guillaume Chatelet ab11b9188d [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68141

llvm-svn: 373207
2019-09-30 13:34:44 +00:00
Guillaume Chatelet 725efb35c7 [Alignment] Fix polly build
llvm-svn: 373199
2019-09-30 11:14:00 +00:00
Michael Kruse 241b02e762 [CodeGen] Handle outlining of CopyStmts.
Since the removal of extensions nodes from schedule trees in r362257 it
is possible to emit parallel code for SCoPs containing
matrix-multiplications. However, the code looking for references used in
outlined statement was not prepared to handle CopyStmts introduced by
the matrix-matrix multiplication detection.

In this case, CopyStmts do not introduce references in addition to the
ones captured by MemoryAccesses, i.e. we change the assertion to accept
CopyStmts and add a regression test for this case.

This fixes llvm.org/PR43164

llvm-svn: 372188
2019-09-17 22:59:43 +00:00
Michael Kruse aa8a976174 [ScheduleOptimizer] Hoist extension nodes after schedule optimization.
Extension nodes make schedule trees are less flexible: Many operations,
such as rescheduling, do not work on such schedule trees with extension.
As such, some functionality such as determining parallel loops in isl's
AST are disabled.

Currently, only the pattern-matching generalized matrix-matrix
multiplication optimization adds extension nodes (to add copy-in
statements).

This patch removes all extension nodes as the last step of the schedule
optimization by hoisting the extension node's added domain up to the
root domain node. All following passes can assume that schedule trees
work without restrictions, including the parallelism test. Mark the
outermost loop of the optimized matrix-matrix multiplication as parallel
such that -polly-parallel is able to parallelize that loop.

Differential Revision: https://reviews.llvm.org/D58202

llvm-svn: 362257
2019-05-31 19:26:57 +00:00
Michael Kruse c4c679c232 [CodeGen] Fix order of PHINode and MA Write generation.
At the end of a region statement, the PHINode must be generated
while the current IRBuilder's block is the region's exit node. For
obvious reasons: The PHINode references the region's exiting block.
A partial write would insert new control flow, i.e. insert new basic
blocks between the exiting blocks and the current block.

We fix this by generating the PHI nodes (region exit values) before
generating any MemoryAccess's stores.

This should fix the AOSP buildbot.

Reported-by: Eli Friedman <efriedma@quicinc.com>
llvm-svn: 361204
2019-05-20 22:31:09 +00:00
Michael Kruse 031bb16556 Apply include-what-you-use #include removal suggestions. NFC.
This removes unused includes (and forward declarations) as
suggested by include-what-you-use. If a transitive include of a removed
include is required to compile a file, I added the required header (or
forward declaration if suggested by include-what-you-use).

This should reduce compilation time and reduce the number of iterative
recompilations when a header was changed.

llvm-svn: 357209
2019-03-28 20:19:49 +00:00
Michael Kruse 89251edefc [CodeGen] LLVM OpenMP Backend.
The ParallelLoopGenerator class is changed such that GNU OpenMP specific
code was removed, allowing to use it as super class in a
template-pattern. Therefore, the code has been reorganized and one may
not use the ParallelLoopGenerator directly anymore, instead specific
implementations have to be provided. These implementations contain the
library-specific code. As such, the "GOMP" (code completely taken from
the existing backend) and "KMP" variant were created.

For "check-polly" all tests that involved "GOMP": equivalents were added
that test the new functionalities, like static scheduling and different
chunk sizes. "docs/UsingPollyWithClang.rst" shows how the alternative
backend may be used.

Patch by Michael Halkenhäuser <michaelhalk@web.de>

Differential Revision: https://reviews.llvm.org/D59100

llvm-svn: 356434
2019-03-19 03:18:21 +00:00
James Y Knight ae2f951219 [opaque pointer types] Update calls to CreateCall to pass the function
type in lldb and polly.

llvm-svn: 353549
2019-02-08 19:30:46 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Chandler Carruth e303c87e19 [TI removal] Make `getTerminator()` return a generic `Instruction`.
This removes the primary remaining API producing `TerminatorInst` which
will reduce the rate at which code is introduced trying to use it and
generally make it much easier to remove the remaining APIs across the
codebase.

Also clean up some of the stragglers that the previous mechanical update
of variables missed.

Users of LLVM and out-of-tree code generally will need to update any
explicit variable types to handle this. Replacing `TerminatorInst` with
`Instruction` (or `auto`) almost always works. Most of these edits were
made in prior commits using the perl one-liner:
```
perl -i -ple 's/TerminatorInst(\b.* = .*getTerminator\(\))/Instruction\1/g'
```

This also my break some rare use cases where people overload for both
`Instruction` and `TerminatorInst`, but these should be easily fixed by
removing the `TerminatorInst` overload.

llvm-svn: 344504
2018-10-15 10:42:50 +00:00
Michael Kruse 7860c5fe4e [IslAst] Fix InParallelFor nesting.
IslAst could mark two nested outer loops as "OutermostParallel". It
caused that the code generator tried to OpenMP-parallelize both loops,
which it is not prepared loop.

It was because the recursive AST build algorithm managed a flag
"InParallelFor" to ensure that no nested loop is also marked as
"OutermostParallel". Unfortunatetly the same flag was used by nodes
marked as SIMD, and reset to false after the SIMD node. Since loops can
be marked as SIMD inside "OutermostParallel" loops, the recursive
algorithm again tried to mark loops as "OutermostParellel" although
still nested inside another "OutermostParallel" loop.

The fix exposed another bug: The function "astScheduleDimIsParallel" was
only called when a loop was potentially "OutermostParallel" or
"InnermostParallel", but as a side-effect also determines the minimum
dependence distance. Hence, changing when we need to know whether a loop
is "OutermostParallel" also changed which loop was annotated with
"#pragma minimal dependence distance".

Moreover, some complex condition linked with "InParallelFor" determined
whether a loop should be an "InnermostParallel" loop. It missed some
situations where it would not use mark as such although being inside an
SIMD mark node, and therefore not be annotated using "#pragma simd".

The changes in particular:

1. Split the "InParallelFor" flag into an "InParallelFor" and an
   "InSIMD" flag.

2. Unconditionally call "astScheduleDimIsParallel" for its side-effects
   and store the result in "InParallel" for later use.

3. Simplify the condition when a loop is "InnermostParallel".

Fixes llvm.org/PR33153 and llvm.org/PR38073.

llvm-svn: 343212
2018-09-27 13:39:37 +00:00
Tobias Grosser 4beb2f964b [PerfMonitor] Fix rdtscp callsites
Summary:
Update all rdtscp callsites in PerfMonitor so that they conform with the signature changes introduced in r341698.

Reviewers: grosser, bollu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51928

llvm-svn: 341946
2018-09-11 14:17:44 +00:00
Tobias Grosser 5f865032db PPCG codegen
The latest version of the isl C++ bindings does not export the 'set'
method yet. Fall back to the C interface until this method can be
exported.

llvm-svn: 338512
2018-08-01 10:48:38 +00:00
Michael Kruse 42ad818265 [Polly-ACC] Fix compilation after r338450. NFC.
llvm-svn: 338462
2018-08-01 00:27:29 +00:00
Michael Kruse e873673b0c [CodeGen] Convert IslNodeBuilder::getNumberOfIterations to isl++. NFC.
llvm-svn: 338451
2018-07-31 23:01:50 +00:00
Michael Kruse f16378b080 [CodeGen] Convert IslNodeBuilder::createForSequential to isl++. NFC.
llvm-svn: 338450
2018-07-31 22:43:04 +00:00
Michael Kruse ade2242e7e [CodeGen] Convert IslNodeBuilder::getUpperBound to isl++. NFC.
llvm-svn: 338449
2018-07-31 22:42:59 +00:00
Tobias Grosser 670482db8b [IslNodeBuilder] Use isl++ to replace foreach_set with for loop
llvm-svn: 337247
2018-07-17 07:08:01 +00:00
Michael Kruse 9f305371d9 [CodeGen] Fix potential null pointer dereference. NFC.
ScalarEvolution::getSCEV dereferences its argument, s.t. passing nullptr
leads to undefined behaviour.

Check for nullptr before calling it instead of checking its argument
afterwards.

llvm-svn: 336350
2018-07-05 13:44:50 +00:00
Siddharth Bhat 936c74ad0d [PPCGCodeGen] Change printf to outs() to prevent garbled output. [NFC]
Summary:
It appears that llvm uses unbuffered C++ streams. So, we should not
mix C and C++ stream operations, because that will give us mixed
up output.

Reviewers: efriedma, jdoerfert, Meinersbur, gareevroman, sebpop, zinob, huihuiz, pollydev, grosser, singam-sanjay, philip.pfaffe

Reviewed By: philip.pfaffe

Subscribers: nemanjai, kbarton

Differential Revision: https://reviews.llvm.org/D40126

llvm-svn: 336288
2018-07-04 16:51:27 +00:00
Philip Pfaffe d71493cb06 [polly-acc] change cl_get_* return types to 32/64bit
Summary:
This patch changes the return types for ocl_get_* functions during SPIR code generation. Because these functions return size_t types, the return type needs to be changed to the actual size of size_t on the device.

Based on work by Michal Babej and Pekka Jääskeläinen

Patch by: Alain Denzler

Reviewers: grosser, philip.pfaffe, bollu

Reviewed By: grosser, philip.pfaffe

Subscribers: nemanjai, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D48774

llvm-svn: 336080
2018-07-02 07:40:47 +00:00
Philip Pfaffe 66a05ad672 Simplify the implementation of getCUDALibDeviceFunction. NFC.
Summary:
The function is currently awfully complicated. Drop the IILE and use
StringRef over std::string.

Reviewers: Meinersbur, grosser, bollu

Reviewed By: Meinersbur

Subscribers: nemanjai, kbarton, bollu, llvm-commits, pollydev

Differential Revision: https://reviews.llvm.org/D48070

llvm-svn: 334695
2018-06-14 08:54:55 +00:00
Philip Pfaffe e6e1828004 Run clang-format
llvm-svn: 334172
2018-06-07 08:32:13 +00:00
Philip Pfaffe 30c5e4ad35 Fix a missing lambda return type that tripped the builders
llvm-svn: 334166
2018-06-07 07:50:55 +00:00
Tobias Grosser 6a6d9df78e getDependences to new C++ interface
Reviewers: Meinersbur, grosser, bollu, cs15btech11044, jdoerfert

Reviewed By: grosser

Subscribers: pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D47786

llvm-svn: 334092
2018-06-06 13:10:32 +00:00
Tobias Grosser a998f98ba6 Fix formatting
llvm-svn: 333988
2018-06-05 09:03:46 +00:00
David Blaikie 4490465db7 Update for a header file move in LLVM
llvm-svn: 333956
2018-06-04 21:23:32 +00:00
Philip Pfaffe c06a6380a0 [Acc] Re-land r326643 to finally fix PR33208.
Other than before, don't clear out LI entirely but only those relevant
loops.

llvm-svn: 333089
2018-05-23 14:52:35 +00:00
Peter Collingbourne 9a45114b3c CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47089

llvm-svn: 332881
2018-05-21 20:16:41 +00:00
Nicola Zaghen 349506a926 [polly] Update uses of DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM

Differential Revision: https://reviews.llvm.org/D44978

llvm-svn: 332352
2018-05-15 13:37:17 +00:00
Tobias Grosser 8dae41a1cb Remove another set or release() calls
llvm-svn: 331129
2018-04-29 00:57:38 +00:00
Tobias Grosser d3d3d6b75d Remove the last uses of isl::give and isl::take
llvm-svn: 331126
2018-04-29 00:28:26 +00:00
Michael Kruse e819fffee3 [CodeGen] Print executed statement instances at runtime.
Add the options -polly-codegen-trace-stmts and
-polly-codegen-trace-scalars. When enabled, adds a call to the
beginning of every generated statement that prints the executed
statement instance. With -polly-codegen-trace-scalars, it also prints
the value of all scalars that are used in the statement, and PHIs
defined in the beginning of the statement.

Differential Revision: https://reviews.llvm.org/D45743

llvm-svn: 330864
2018-04-25 19:43:49 +00:00
David Blaikie 60dc462b04 Fixup Polly for an LLVM header file change.
llvm-svn: 330679
2018-04-24 02:23:41 +00:00
Tobias Grosser c49f115b27 [RuntimeDebugBuilder] Do not break for 64 bit integers
In r330292 this assert was turned incorrectly into an unreachable, but
the correct behavior (thanks Michael) is to assert for anything that is
not 64 bit, but falltrough for 64 bit. I document this in the source
code.

llvm-svn: 330309
2018-04-19 05:38:12 +00:00
Tobias Grosser b20ae44ed0 [RuntimeDebugBuilder] Turn assert into an unreachable
llvm-svn: 330289
2018-04-18 20:18:43 +00:00
Michael Kruse 4485ae0890 [CodeGen] Allow undefined loads in statement instances outside context.
A check in assert-builds was meant to verify that a load provides a
value in all statement instances (i.e. its domain).  The domain is
commonly gist'ed within the parameter context to contain fewer
constraints.  However, statement instances outside the context are
no valid executions, hence the value provided can be undefined.

Refine the check for valid loads to only needed to be defined within
the SCoP context.

In addition, the JSONImporter had to be changed to allow importing
access relations that are broader than the current access relation,
but still defined over all statement instances.

This should fix the compiler crash in test-suite's oggenc of the
-polly-process-unprofitable buildbot.

llvm-svn: 329655
2018-04-10 01:20:51 +00:00
Michael Kruse 388730c9e0 [CodeGen] Convert BlockGenerator::generateScalarLoads to isl++. NFC.
llvm-svn: 329654
2018-04-10 01:20:47 +00:00
Huihui Zhang 71e54ccd06 [Polly][IslAst] Fix minimal dependence distance.
Summary:
When checking the parallelism of a scheduling dimension, we first check if excluding reduction dependences the loop is parallel or not.
If the loop is not parallel, then we need to return the minimal dependence distance of all data dependences, including the previously subtracted reduction dependences.


Reviewers: grosser, Meinersbur, efriedma, eli.friedman, jdoerfert, bollu

Reviewed By: Meinersbur

Subscribers: llvm-commits, pollydev

Tags: #polly

Differential Revision: https://reviews.llvm.org/D45236

llvm-svn: 329214
2018-04-04 18:08:13 +00:00
Reid Kleckner 757c8cf615 Fix polly build after r328717
llvm-svn: 328728
2018-03-28 19:56:26 +00:00
David Blaikie fd94eee3b9 Update for LLVM header movement
llvm-svn: 328169
2018-03-21 23:21:10 +00:00
Tobias Grosser 3a99893618 Adjust to clang-format changes
llvm-svn: 328005
2018-03-20 17:16:32 +00:00
Philip Pfaffe 4d50ab86e6 Revert "[Acc] Fix for PR33208"
This reverts commit r326643. Fix didn't really fix anything.

llvm-svn: 326656
2018-03-03 15:34:49 +00:00
Philip Pfaffe a8f7cc8ec9 [Acc] Fix for PR33208
During codegen, Polly attempts to clear all loops from ScalarEvolution
and LoopInfo, and it does so one block at a time. This causes undefined
behaviour, since this way a loop header might be removed from a loop
before the entire loop is erased, causing ScalarEvolution to run into an
error.

Instead, just delete the entire loop atomically. This fixes currently
failing testcases.

llvm-svn: 326643
2018-03-03 10:47:37 +00:00
Tobias Grosser 718d04c653 Use isl::manage_copy to simplify calls to isl::manage(isl_.._copy())
As part of this cleanup a couple of unnecessary isl::manage(obj.copy()) pattern
are eliminated as well.

We checked for all potential cleanups by scanning for:

  "grep -R isl::manage\( lib/ | grep copy"

llvm-svn: 325558
2018-02-20 07:26:58 +00:00
Michael Kruse 271deb17b0 [CodeGen] Fix noalias annotations for memcpy/memmove.
Memory transfer instructions take two pointers. It is not defined to
which of those a noalias annotation applies. To ensure correctness,
do not add noalias annotations to memcpy/memmove instructions anymore.

The caused a miscompile with test-suite's MultiSource/Applications/obsequi.
Since r321138, the MemCpyOpt pass would remove memcpy/memmove calls if
known to copy uninitialized memory. In that case, it was initialized
by another memcpy, but the annotation for the target pointer said
it would not alias. The annotation was actually meant for the source
pointer, which was was an alloca and could not alias with the target
pointer.

llvm-svn: 321371
2017-12-22 17:44:53 +00:00
Michael Kruse 163cacb469 [CodeGen] Detect empty domain because of parameters context.
Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an
empty domain (i.e. has no pieces). We already detected the case if the
isl_pw_aff comes with an empty domain.

isl_ast_build also considers the domain empty if it is disjoint with the
parameter context (e.g. parameters values that we exclude by runtime
versioning).

Intersect the access relation domain with the parameter context to
also detect such practically empty access domains. The effective
pointer used in the generated code is unimportand because it will never
be executed.

This fixes llvm.org/PR35362

llvm-svn: 318806
2017-11-21 22:11:10 +00:00
Michael Kruse 58166b13e0 Run polly-update-format. NFC.
polly-check-format has been failing since at least r318517,
due to more than one cause.

llvm-svn: 318795
2017-11-21 19:25:26 +00:00
Philip Pfaffe 00fd43b327 Port ScopInfo to the isl cpp bindings
Summary:
Most changes are mechanical, but in one place I changed the program semantics
by fixing a likely bug:

In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the
error-case. Before, when the call to `addNonEmptyDomainConstraints()`
returned a null set, this (probably) accidentally worked because
isl_bool_error converts to true. I'm checking for nullptr now.

Reviewers: grosser, Meinersbur, bollu

Reviewed By: Meinersbur

Subscribers: nemanjai, kbarton, pollydev, llvm-commits

Differential Revision: https://reviews.llvm.org/D39971

llvm-svn: 318632
2017-11-19 22:13:34 +00:00
Mandeep Singh Grang 02e789c9bf [polly] Remove redundant return [NFC]
Reviewers: grosser, bollu

Reviewed By: grosser

Subscribers: nemanjai, kbarton, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D39916

llvm-svn: 317922
2017-11-10 20:33:08 +00:00
Michael Kruse 06618bf71a [OpenMP] Fix reference collection of latest base ptrs.
When collecting base pointers that need to be made available in parallel
subfunctions, use the base pointer associated with the latest
ScopArrayInfo, instead of the original one.

llvm-svn: 316983
2017-10-31 10:28:22 +00:00
Philip Pfaffe 53c803871e [Acc] Do not statically dispatch into IslNodeBuilder's createFor
Summary:
When GPUNodeBuilder creates loops inside the kernel, it dispatches to
IslNodeBuilder. This however is surprisingly dangerous, since it accesses the
AST Node's user through the wrong type. This patch fixes this problem by
overriding createFor correctly.

This fixes PR35010.

Reviewers: grosser, bollu, Meinersbur

Reviewed By: Meinersbur

Subscribers: Meinersbur, nemanjai, pollydev, llvm-commits, kbarton

Differential Revision: https://reviews.llvm.org/D39364

llvm-svn: 316872
2017-10-29 21:36:34 +00:00
Tobias Grosser c52b71db15 [GPGPU] Make sure escaping invariant load hoisted scalars are preserved
We make sure that the final reload of an invariant scalar memory access uses the
same stack slot into which the invariant memory access was stored originally.
Earlier, this was broken as we introduce a new stack slot aside of the preload
stack slot, which remained uninitialized and caused our escaping loads to
contain garbage. This happened due to us clearing the pre-populated values
in EscapeMap after kernel code generation. We address this issue by preserving
the original host values and restoring them after kernel code generation.
EscapeMap is not expected to be used during kernel code generation, hence we
clear it during kernel generation to make sure that any unintended uses are
noticed.

llvm-svn: 314894
2017-10-04 10:24:23 +00:00
Tobias Grosser 2fb847fbf6 [GPGPU] Set Polly's RTC to false in case invariant load hoisting fails
This matches the behavior we already have in lib/Codegen/CodeGeneration.cpp and
makes sure that we fall back to the original code. It seems when invariant load
hoisting was introduced to the GPGPU backend we missed to reset the RTC flag,
such that kernels where invariant load hoisting failed executed the 'optimized'
SCoP, which however is set to a simple 'unreachable'. Unsurprisingly, this
results in hard to debug issues that are a lot of fun to debug.

llvm-svn: 314624
2017-10-01 12:39:14 +00:00
Philip Pfaffe 859ef1c09e Fix the build after r314375
r314375 privatized Loop's constructor and replaced it with an Allocator.

llvm-svn: 314412
2017-09-28 12:20:24 +00:00
Tobias Grosser 75d133f0ac [IslExprBuilder] Do not generate RTC with more than 64 bit
Such RTCs may introduce integer wrapping intrinsics with more than 64 bit,
which are translated to library calls on AOSP that are not part of the
runtime and will consequently cause linker errors.

Thanks to Eli Friedman for reporting this issue and reducing the test case.

llvm-svn: 314065
2017-09-23 15:32:07 +00:00
Michael Kruse 0e370cf1a7 Check whether IslAstInfo and DependenceInfo were computed for the same Scop.
Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo,
we might get those analysis that were computed by a different ScopInfo for a
different Scop structure. This would be unfortunate because DependenceInfo and
IslAstInfo hold references to resources allocated by
ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and
DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable
things can happen.

When the ScopInfo/Scop object is freed, there is a high probability that the
new ScopInfo/Scop object will be created at the same heap position with the
same address. Comparing whether the Scop or ScopInfo address is the expected
therefore is unreliable.

Instead, we compare the address of the isl_ctx object. Both, DependenceInfo
and IslAstInfo must hold a reference to the isl_ctx object to ensure it is
not freed before the destruction of those analyses which might happen after
the destruction of the Scop/ScopInfo they refer to.  Hence, the isl_ctx
will not be freed and its address not reused as long there is a
DependenceInfo or IslAstInfo around.

This fixes llvm.org/PR34441

llvm-svn: 313842
2017-09-21 00:01:13 +00:00
Michael Kruse 0481d78c6c [CodegenCleanup] Update cleanup passes according (old) PassManagerBuilder.
Update CodegenCleanup using the function-level passes added by
populatePassManager that run between EP_EarlyAsPossible and
EP_VectorizerStart in -O3.

The changes in particular are:
- Added pass create arguments, e.g. ExpensiveCombines for InstCombine.
- Remove reroll pass. The option -reroll-loops is disabled by default.
- Add passes run with UnitAtATime, which is the default.
- Add instances of LibCallsShrinkWrap, TailCallElimination, SCCP
  (sparse conditional constant propagation), Float2Int
  that did not run before.
- Add instances of GVN as in the default pipeline.

Notes:
- GVNHoist, GVNSink, NewGVN are still disabled in the -O3 pipeline.
- The optimization level and other optimization parameters are not
  accessible outside of PassManagerBuilder, hence we cannot add passes
  depending on these.

Differential Revision: https://reviews.llvm.org/D37571

llvm-svn: 312875
2017-09-09 21:43:49 +00:00
Michael Kruse 2f5cbc449a [CodeGen] Bitcast scalar writes to actual value.
The type of NewValue might change due to ScalarEvolution
looking though bitcasts. The synthesized NewValue therefore
becomes the type before the bitcast.

llvm-svn: 312718
2017-09-07 12:15:01 +00:00
Siddharth Bhat e2950f46c6 [PPCGCodeGen] Document pre-composition with Zero in getExtent. [NFC]
It's weird at first glance that we do this, so I wrote up some
documentation on why we need to perform this process.

llvm-svn: 312715
2017-09-07 11:57:33 +00:00
Tobias Grosser 1a695b1d6c [CodegenCleanup] Use old GVN pass instead of NewGVN
It seems NewGVN still has some problems: llvm.org/PR34452, we will switch back
after they have been resolved.

llvm-svn: 312480
2017-09-04 11:04:33 +00:00
Tobias Grosser 701d943d12 [IslAst] Do not assert in case of empty min/max alias locations
In certain situations, the context in the isl_ast_build could result for the
min/max locations of our alias sets to become empty, which would cause an
internal error in isl, which is then unable to derive a value for these
expressions. Check these conditions before code generating expressions and
instead assume that alias check succeeded. This is valid, as the corresponding
memory accesses will not be executed under any valid context.

This fixed llvm.org/PR34432. Thanks to Qirun Zhang for reporting.

llvm-svn: 312455
2017-09-03 19:47:19 +00:00
Tobias Grosser 6b1e461329 [IslAst] Move buildCondition to isl++
llvm-svn: 312452
2017-09-03 18:31:44 +00:00
Siddharth Bhat 3928e3f50a [ISLNodeBuilder] Materialize Fortran array sizes of arrays without memory accesses.
In Polly, we specifically add a paramter to represent the outermost dimension
 size of fortran arrays. We do this because this information is statically
 available from the fortran metadata generated by dragonegg.
 However, we were only materializing these parameters (meaning, creating an
 llvm::Value to back the isl_id) from *memory accesses*. This is wrong,
 we should materialize parameters from *scop array info*.

 It is wrong because if there is a case where we detect 2 fortran arrays,
 but only one of them is accessed, we may not materialize the other array's
 dimensions at all.

 This is incorrect. We fix this by looping over all
 `polly::ScopArrayInfo` in a scop, rather that just all `polly::MemoryAccess`.

 Differential Revision: https://reviews.llvm.org/D37379

llvm-svn: 312350
2017-09-01 18:55:43 +00:00
Roman Gareev 1cb3491620 Run GVN during the cleanup
Currently, GVN can be necessary to eliminate redundant instructions in case
of, for instance, GEMM and float type. This patch makes GVN be run during
the cleanup.

Reviewed-by: Tobias Grosser <tobias@grosser.es>,
             Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D37340

llvm-svn: 312307
2017-09-01 06:52:28 +00:00
Siddharth Bhat 56572c6a5e [PPCGCodeGen] Convert intrinsics to libdevice functions whenever possible.
This is useful when we face certain intrinsics such as `llvm.exp.*`
which cannot be lowered by the NVPTX backend while other intrinsics can.

So, we would need to keep blacklists of intrinsics that cannot be
handled by the NVPTX backend. It is much simpler to try and promote
all intrinsics to libdevice versions.

This patch makes function/intrinsic very uniform, and will always try to use
a libdevice version if it exists.

Differential Revision: https://reviews.llvm.org/D37056

llvm-svn: 312239
2017-08-31 13:03:37 +00:00
Tobias Grosser c43d0360cc [BlockGenerator] Generate entry block of regions from instruction lists
The adds code generation support for the previous commit.

This patch has been re-applied, after the memory issue in the previous patch
has been fixed.

llvm-svn: 312211
2017-08-31 03:17:35 +00:00
Tobias Grosser 6f1f5cbb5b Revert "[BlockGenerator] Generate entry block of regions from instruction lists"
This reverts commit r312129. It caused some memory issues.

llvm-svn: 312208
2017-08-31 02:43:27 +00:00
Tobias Grosser 1e34508bcc [BlockGenerator] Generate entry block of regions from instruction lists
The adds code generation support for the previous commit.

llvm-svn: 312129
2017-08-30 15:08:30 +00:00
Tobias Grosser ee8ad1c0ff [IslAst] Do not compare arrays in alias check which are known to be identical
This possibly helps to avoid run-time check failures in the COSMO kernels.

llvm-svn: 311920
2017-08-28 20:17:02 +00:00
Michael Kruse a4f447c2a4 [PM] Properly require and preserve OptimizationRemarkEmitter. NFCI.
Properly require and preserve the OptimizationRemarkEmitter for use in
ScopPass. Previously one had to get the ORE from ScopDetection because
CodeGeneration did not mark it as preserved. It would need to be
recomputed which results in the legacy PM to throw away all previous
SCoP analysis.

This also changes the implementation of ScopPass::getAnalysisUsage to
not unconditionally preserve all passes, but only those needed to be
preserved by any SCoP pass (at least when using the legacy PM). This
allows invalidating DependenceInfo (and IslAstInfo) in case the pass
would cause them to change (e.g. OpTree, DeLICM, MaximalArrayExpansion)

JSONImporter should also invalidate the DependenceInfo. In this patch
it marks DependenceInfo as preserved anyway because some regression
tests depend on it.

Differential Revision: https://reviews.llvm.org/D37010

llvm-svn: 311888
2017-08-28 14:07:33 +00:00
Eugene Zelenko 9248fde53a [Polly] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311704
2017-08-24 21:22:41 +00:00
Michael Kruse b795bfc0d4 [CodeGen] Detect impossible partial write conditions more reliably.
Whether a partial write is tautological/unsatisfiable not only
depends on the access domain, but also on the domain covered
by its node in the AST.

In the example below, there are two instances of Stmt_cond_false. It may have a partial write access that is not executed in instance Stmt_cond_false(0).

      for (int c0 = 0; c0 < tmp5; c0 += 1) {
        Stmt_for_body344(c0);
        if (tmp5 >= c0 + 2)
          Stmt_cond_false(c0);
        Stmt_cond_end(c0);
      }
      if (tmp5 <= 0) {
        Stmt_for_body344(0);
        Stmt_cond_false(0);
        Stmt_cond_end(0);
      }

Isl cannot derive a subscript for an array element that is never accessed.
This caused an error in that no subscript expression has been generated
in IslNodeBuilder::createNewAccesses, but BlockGenerator expected one
to exist because there is an execution of that write, just not in that
ast node.

Fixed by instead of determining whether the access domain is empty,
inspect whether isl generated a constant "false" ast expression in
the current ast node.

This should fix a compiler crash of the aosp buildbot.

llvm-svn: 311663
2017-08-24 14:51:35 +00:00
Siddharth Bhat 78027437e6 [Polly] [PPCGCodeGeneration] Mild refactoring of checking validity of functions in a kernel.
This is a stylistic change to make the function a little more readable.
Also add a debug print to show what instruction contains a use of a
function we don't understand in the kernel.

Differential Revision: https://reviews.llvm.org/D37058

llvm-svn: 311648
2017-08-24 09:54:15 +00:00
Michael Kruse 06ed529205 Add more statistics.
Add statistics about
- Which optimizations are applied
- Number of loops in Scops at various stages
- Number of scalar/singleton writes at various stages representative
  for scalar false dependencies
- Number of parallel loops

These will be useful to find regressions due to moving Polly further
down of LLVM's pass pipeline.

Differential Revision: https://reviews.llvm.org/D37049

llvm-svn: 311553
2017-08-23 13:50:30 +00:00
Michael Kruse 3044dc51cf [PPCGCodeGen] Fix compiler warning: '<': signed/unsigned mismatch. NFC.
MSVC warns about comparison between a signed and unsigned integer.
The rules of C(++) define that an unsigned comparison has to be
carried-out in this case. This is unlikely to be intended.

Fix by assigning the loop's upper bound to a signed integer first.
This also avoids repeated evaluation of the invariant upper bound.

llvm-svn: 311548
2017-08-23 12:45:25 +00:00
Tobias Grosser 4a07bbe3f6 [IRBuilder] Only emit alias scop metadata for arrays, but not scalars
Summary:
There is no need to emit alias metadata for scalars, as basicaa will easily
distinguish them from arrays. This reduces the size of the metadata we generate.
This is especially useful after we moved to -polly-position=before-vectorizer,
where a lot more scalar dependences are introduced, which increased the size of
the alias analysis metadata and made us commonly reach the limits after which
we do not emit alias metadata that have been introduced to prevent quadratic
growth of this alias metadata.

This improves 2mm performance from 1.5 seconds to 0.17 seconds.

Reviewers: Meinersbur, bollu, singam-sanjay

Reviewed By: Meinersbur

Subscribers: pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D37028

llvm-svn: 311498
2017-08-22 21:58:48 +00:00
Roman Gareev 6bfeba24d3 [NFC] Fix the broken comment.
llvm-svn: 311477
2017-08-22 17:43:03 +00:00
Roman Gareev 0956a606ff Disable the Loop Vectorizer in case of GEMM
Currently, in case of GEMM and the pattern matching based optimizations, we
use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop
Vectorizer can get in the way of optimal code generation, we disable the Loop
Vectorizer for the innermost loop using mark nodes and emitting the
corresponding metadata.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D36928

llvm-svn: 311473
2017-08-22 17:38:46 +00:00
Siddharth Bhat cb5155bf6d [ManagedMemoryRewrite] Use `unit64_t` to store size, not `int`.
llvm-svn: 311440
2017-08-22 09:30:37 +00:00
Siddharth Bhat 603544863f [ManagedMemoryRewrite] Get size in bytes rather than in bits and dividing by 8.
llvm-svn: 311439
2017-08-22 09:27:41 +00:00
Siddharth Bhat a8c329b0eb [ManagedMemoryRewrite] slightly tweak debug output style. [NFC]
llvm-svn: 311361
2017-08-21 18:58:33 +00:00
Siddharth Bhat 557ce3a8b0 [ManagedMemoryRewrite] Print reasons for skipping global array to dbgs(). [NFC]
llvm-svn: 311360
2017-08-21 18:52:15 +00:00
Siddharth Bhat 0a198dc18a [ManagedMemoryRewrite] hide debug output behing DEBUG(...). [NFC]
llvm-svn: 311331
2017-08-21 12:51:57 +00:00
Siddharth Bhat 7b9f5ca27e [PPCGCodeGeneration] Enable `polly-codegen-perf-monitoring` for PPCGCodegen.
This feature was not enabled for `PPCGCodeGeneration`. Now that this is
enabled, we can benchmark Scops that have been optimised with
`-polly-codegen-ppcg` with the `-polly-codegen-perf-monitoring` option.

Differential Revision: https://reviews.llvm.org/D36934

llvm-svn: 311328
2017-08-21 11:44:01 +00:00
Tobias Grosser b09bd74da8 [GPGPU] Add llvm.powi to the libdevice supported functions
These intrinsics are used in COSMO.

llvm-svn: 311324
2017-08-21 09:52:08 +00:00