Commit Graph

70 Commits

Author SHA1 Message Date
Arthur Eubanks c384b20b55 [opt] Remove temporary legacy pass name translations
And update corresponding tests.
2022-10-07 11:09:46 -07:00
Francis Visoiu Mistrih 0fcc99ade4 [Matrix] Add tests for addition transpose optimizations
Tests before transpose optimizations around additions.

Differential Revision: https://reviews.llvm.org/D133656
2022-09-26 13:27:03 -07:00
Francis Visoiu Mistrih 81bdb4068d [Matrix] Simplify matmuls with scalars
If one of the operands is a transposed splat, the transpose can be
removed.

This is useful to simplify when transposes are distributed to operands
of a matmul:

* k^T -> k
* (A * k)^t -> A^t * k

Differential Revision: https://reviews.llvm.org/D130177
2022-09-02 15:50:25 -07:00
Vir Narula 625877b0ef
[Matrix] Add tests dot product with varied strides
Add more tests with varied strides. Changes to lowering upcoming in https://reviews.llvm.org/D131125

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D131444
2022-08-11 19:09:21 +01:00
Nuno Lopes 022bd92c78 [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] 2022-07-03 12:32:19 +01:00
Nuno Lopes 7c4f45f87a Revert [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC]
This reverts commits 47e6f98f84 and 3e701bcd2a
2022-07-01 23:53:41 +01:00
Nuno Lopes 3e701bcd2a attempt to fix aarch64 build bot 2022-07-01 23:43:48 +01:00
Nuno Lopes 47e6f98f84 [LowerMatrixMultiplication] Switch dummy values from undef to poison [NFC] 2022-07-01 23:31:31 +01:00
Florian Hahn 7c0089d735
[Matrix] Check if iterator is at beginning of BB in optimizeTranspose.
If an instruction at the beginning of a block is erased,  this may
trigger crash due to dereferencing an invalid iterator.

Check if II is at the end before dereferencing it.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D127736
2022-06-14 21:37:02 +01:00
Vir Narula 210c851327
[Matrix] Add dot product tests
LLVM LIT tests for our upcoming dot product lowering change

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D126942
2022-06-03 20:02:42 +01:00
Johannes Doerfert a81fff8afd Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes"
This reverts commit c5f789050d and
reapplies 7aea3ea8c3 with additional test
changes.
2022-03-25 09:36:50 -05:00
Andrew Wei 0af3e6a22d [InstCombine] Sink instructions with multiple users in a successor block.
This patch tries to sink instructions when they are only used in a successor block.

This is a further enhancement patch based on Anna's commit:
D109700, which allows sinking an instruction having multiple uses in a single user.

In this patch, sink instructions with multiple users in a single successor block will be supported.
It could fix a known issue from rust:
  https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610

Reviewed By: nikic, reames

Differential Revision: https://reviews.llvm.org/D121585
2022-03-18 11:53:45 +08:00
Arthur Eubanks dec9be85cc [test][LowerMatrixIntrinsics] Use new PM RUN lines 2022-03-08 13:39:18 -08:00
Florian Hahn b339bbdb19
[Matrix] Use ArrayType for allocas instead of VectorType.
When creating an alloca to copy a matrix due to memory conflicts, those
allocas used to use VectorTypes, which forced them to have huge
alignments for large vectors.

This patch updates LowerMatrixIntrinsics to use a corresponding array
type, like Clang already does, to get more manageable alignments.

Reviewed By: anemet, thegameg

Differential Revision: https://reviews.llvm.org/D118239
2022-01-28 10:47:52 +00:00
Nikita Popov 80110aafa0 [Tests] Fix incorrect noalias metadata
Mostly this fixes cases where !noalias or !alias.scope were passed
a scope rather than a scope list. In some cases I opted to drop
the metadata entirely instead, because it is not really relevant
to the test.
2021-09-18 20:51:00 +02:00
Bjorn Pettersson d52f506192 [NewPM] Use parameterized syntax for a couple of more passes
A couple of passes that are parameterized in new-PM used different
pass names (in cmd line interface) while using the same pass class
name. This patch updates the PassRegistry to model pass parameters
more properly using PASS_WITH_PARAMS.

Reason for the change is to ensure that we have a 1-1 mapping
between class name and pass name (when disregarding the params).
With a 1-1 mapping it is more obvious which pass name to use in
options such as -debug-only, -print-after etc.

The opt -passes syntax is changed for the following passes:
  early-cse-memssa => early-cse<memssa>
  post-inline-ee-instrument => ee-instrument<post-inline>
  loop-extract-single => loop-extract<single>
  lower-matrix-intrinsics-minimal => lower-matrix-intrinsics<minimal>

This patch is not updating pass names in docs/Passes.rst. Not quite
sure what the status is for that document (e.g. when it comes to
listing pass paramters). It is only loop-extract-single that is
mentioned in Passes.rst today, out of the passes mentioned above.

Differential Revision: https://reviews.llvm.org/D108362
2021-08-20 14:59:21 +02:00
Florian Hahn f999312872
Recommit "[Matrix] Overload stride arg in matrix.columnwise.load/store."
This reverts the revert 28c04794df.

The failing MLIR test that caused the revert should be fixed  in this
version.

Also includes a PPC test fix previously in 1f87c7c478.
2021-08-12 18:31:57 +01:00
Mehdi Amini 28c04794df Revert "[Matrix] Overload stride arg in matrix.columnwise.load/store."
This reverts commit a1ef81de35.

Broke the MLIR buildbot.
2021-08-12 11:57:19 +00:00
Florian Hahn a1ef81de35
[Matrix] Overload stride arg in matrix.columnwise.load/store.
This patch adjusts the intrinsics definition of
llvm.matrix.column.major.load and llvm.matrix.column.major.store to
allow overloading the type of the stride. The bitwidth of the stride is
used to perform the offset computation.

This fixes a crash when using __builtin_matrix_column_major_load or
__builtin_matrix_column_major_store on 32 bit platforms. The stride argument
of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit
platforms.

Note that we still perform offset computations with 64 bit width on 32
bit platforms for accesses that do not take a user-specified stride.
This can be fixed separately.

Fixes PR51304.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D107349
2021-08-12 10:45:25 +01:00
Adam Nemet d87d3615f7 [Matrix] Fix shape for factored transpose
The shape of the input is C x R.

Differential Revision: https://reviews.llvm.org/D106722
2021-07-27 11:36:13 -07:00
Adam Nemet bf7eb48454 [Matrix] RAUW should only replace an instruction in ShapeMap if supportsShapeInfo
As an instruction is replaced in optimizeTransposes RAUW will replace it in
the ShapeMap (ShapeMap is ValueMap so that uses are updated).  In
finalizeLowering however we skip updating uses if they are in the ShapeMap
since they will be lowered separately at which point we pick up the lowered
operands.

In the testcase what happened was that since we replaced the doubled-transpose
with the shuffle, it ended up in the ShapeMap.  As we lowered the
columnwise-load the use in the shuffle was not updated.  Then as we removed
the original columnwise-load we changed that to an undef.  I.e. we ended up
with:

```
%shuf = shufflevector <8 x double> undef, <8 x double> poison, <6 x i32>
                                   ^^^^^
                                  <i32 0, i32 1, i32 2, i32 4, i32 5, i32 6>
```

Besides the fix itself, I have fortified this last bit.  As we change uses to
undef when removing instruction we track the undefed instruction to make sure
we eventually remove those too.  This would have caught the issue at compile
time.

Differential Revision: https://reviews.llvm.org/D106714
2021-07-27 11:36:13 -07:00
Adam Nemet ce5b1320a7 [Matrix] Fix miscompile for NT matmul if the transpose has other use
We should only add the fake lowering entry for the matrix remark if the
transpose is not lowered on its own.  `MapVector::insert` is used to insert
the entry during proper lowering which does not overwrite the fake entry in
the map.

We actually had test coverage for this but the reference output code was
wrong; it was storing undef rather than the transposed column.

Also add an assert that would have caught this.

Differential Revision: https://reviews.llvm.org/D106457
2021-07-22 10:45:56 -07:00
Florian Hahn a3ca578eb9
[Matrix] Fix crash during fusion if the same load is re-used.
This patch fixes a crash when the same load is used for both operands of
a fuseable multiply.
2021-07-02 14:00:17 +01:00
Florian Hahn 7655061cc6
[Matrix] Hoist address computation before multiply to enable fusion.
If the store address does not dominate the matrix multiply, try to hoist
address computation instructions without side-effects and/or memory
reads before the multiply, to allow fusion.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D105193
2021-07-02 09:52:11 +01:00
Florian Hahn 8db9cb262f
[Matrix] Add tests for hoisting address computations. 2021-06-30 14:31:00 +01:00
Adam Nemet e0efebb8eb [Matrix] In transpose opts, handle a^t * a^t
Without the fix the testcase crashes because we remove the same instruction
twice.

Differential Revision: https://reviews.llvm.org/D104127
2021-06-11 09:29:43 -07:00
Florian Hahn 87c99d2b97
[Matrix] Add -matrix-allow-contract=false to tests.
Explicitly specify contract behavior, so the tests are independent of
the current default of the flag.
2021-06-07 12:13:20 +01:00
Adam Nemet ffde966cd9 [Matrix] Fix transpose-multiply folding if transpose has multiple uses
Don't add it to FusedInsts in this case.

Differential Revision: https://reviews.llvm.org/D103627
2021-06-04 10:55:03 -07:00
Hamza Mahfooz 83235b07e3
[Matrix] Preserve existing fast-math flags during lowering
This patch makes it so, floating-point instructions created in
LowerMatrixIntrinsics retain fast-math flags from instructions that are
higher up the chain.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49738

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D103233
2021-06-03 15:29:31 +01:00
Adam Nemet dfd1bbd00a [Matrix] Factor and distribute transposes across multiplies
Now that we can fold some transposes into multiplies (CM: A * B^t and RM:
A^t * B), we want to move them around to create the optimal expressions:

* fold away double transposes while still using them to assert the shape
* sink transposes hoping they cancel out
* lift transposes when both operands are transposed

This also modifies the matrix remarks to include the number of exposed
transposes (i.e. transposes that we couldn't fold into a multiply).

The adjustment to the test remarks-inlining is a bit subtle: I am changing the
double transpose to a single transpose so that we don't remove it completely.
More importantly this changes some of the total instruction count, most
notable stores because we can no longer use a vector store.

Differential Revision: https://reviews.llvm.org/D102733
2021-05-25 11:12:20 -07:00
Adam Nemet fcffd087c6 [Matrix] Fold the transpose into the matmul operand used to fetch scalars
For column-major this is:
  A * B^t
whereas for row-major:
  A^t * B

Differential Revision: https://reviews.llvm.org/D101762
2021-05-17 17:40:46 -07:00
Dávid Bolvanský cd54c57919 Reland "[Libcalls, Attrs] Annotate libcalls with noundef"
Fixed Clang tests.
2021-02-20 06:18:48 +01:00
Dávid Bolvanský 94d034fb86 Revert "[Libcalls, Attrs] Annotate libcalls with noundef"
This reverts commit 33b0c63775. Bots are failing. Some Clang tests need to be updated too.
2021-02-20 04:18:42 +01:00
Dávid Bolvanský 33b0c63775 [Libcalls, Attrs] Annotate libcalls with noundef
I think we can use here same logic as for nonnull.

strlen(X) - X must be noundef => valid pointer.

for libcalls with size arg, we add noundef only if size is known and greater than 0 - so pointers must be noundef (valid ones)

Reviewed By: jdoerfert, aqjune

Differential Revision: https://reviews.llvm.org/D95122
2021-02-20 04:10:07 +01:00
Francis Visoiu Mistrih 0cc38acfc4 [Matrix] Propagate shape information through fneg
Similar to binary operators like fadd/fmul/fsub, propagate shape info
through unary operators (fneg is the only one?).

Differential Revision: https://reviews.llvm.org/D95252
2021-01-22 14:34:28 -08:00
Juneyoung Lee 9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Juneyoung Lee 278aa65cc4 [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93793
2020-12-30 04:21:04 +09:00
Arthur Eubanks 8d5673ffd8 [test] Fix multiply-minimal.ll 2020-11-19 18:16:35 -08:00
Arthur Eubanks 513d165b80 Port -lower-matrix-intrinsics-minimal to NPM
This reuses the existing lower-matrix-intrinsics pass rather than going
the legacy pass route of creating a new pass.

Use this new variant in the NPM -O0 pipeline.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91811
2020-11-19 17:42:48 -08:00
Arthur Eubanks aa6c305344 [LowerMatrixIntrinsics][NewPM] Fix PreservedAnalyses result
PreservedCFGCheckerInstrumentation was saying that LowerMatrixIntrinsics
didn't properly preserve CFG even though it claimed to. The legacy pass
says it doesn't. Match the legacy pass's preserved analyses.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D89175
2020-10-21 12:42:16 -07:00
sstefan1 fbfb1c7909 [IR] Make nosync, nofree and willreturn default for intrinsics.
D70365 allows us to make attributes default. This is a follow up to
actually make nosync, nofree and willreturn default. The approach we
chose, for now, is to opt-in to default attributes to avoid introducing
problems to target specific intrinsics. Intrinsics with default
attributes can be created using `DefaultAttrsIntrinsic` class.
2020-10-20 11:57:19 +02:00
Florian Hahn f13a59bcff [Matrix] Use TileInfo to create tiled loop nest for matrix multiply.
This patch uses the TileInfo introduced in D77550 to generate a loop
nest for tiled matrix multiplication, instead of generating the
unrolled code for the whole multiplication. This makes code-generation
more scalable for larger matrixes.

Initially loops are only used if both the number of rows and columns are
divisible by the tile size. Other cases will be added as follow-up.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D81308
2020-07-20 21:11:53 +01:00
Florian Hahn dc1087d408 [Matrix] Add minimal lowering pass that only requires TTI.
This patch adds a new variant of the matrix lowering pass that only does
a minimal lowering and only depends on TTI. The main purpose of this pass
is to have a pass with minimal dependencies to run as part of the backend
pipeline.

At the moment, the only difference to the regular lowering pass is that it
does not support remarks. But in subsequent patches add support for tiling
to the lowering pass which will require more analysis, which we do not want
to run in the backend, as the lowering should happen in the middle-end in
practice and running it in the backend is mostly for convenience when
running llc.

Reviewers: anemet, Gerolf, efriedma, hfinkel

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D76867
2020-07-20 11:16:11 +01:00
Sjoerd Meijer 2b3c505d0f [Matrix] Intrinsic descriptions
This changes the matrix load/store intrinsic definitions to load/store from/to
a pointer, and not from/to a pointer to a vector, as discussed in D83477.

This also includes the recommit of "[Matrix] Tighten LangRef definitions and
Verifier checks" which adds improved language reference descriptions of the
matrix intrinsics and verifier checks.

Differential Revision: https://reviews.llvm.org/D83785
2020-07-14 19:58:16 +01:00
Dmitry Polukhin 9e7fddbd36 [yaml][clang-tidy] Fix multiline YAML serialization
Summary:
New line duplication logic introduced in https://reviews.llvm.org/D63482
has two issues: (1) there is no logic that removes duplicate newlines
when clang-apply-replacment reads YAML and (2) in general such logic
should be applied to all strings and should happen on string
serialization level instead in YAML parser.

This diff changes multiline strings quotation from single quote `'` to
double `"`. It solves problems with internal newlines because now they are
escaped. Also double quotation solves the problem with leading whitespace after
newline. In case of single quotation YAML parsers should remove leading
whitespace according to specification. In case of double quotation these
leading are internal space and they are preserved. There is no way to
instruct YAML parsers to preserve leading whitespaces after newline so
double quotation is the only viable option that solves all problems at
once.

Test Plan: check-all

Reviewers: gribozavr, mgehre, yvvan

Subscribers: xazax.hun, hiraditya, cfe-commits, llvm-commits

Tags: #clang-tools-extra, #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80301
2020-07-09 02:41:58 -07:00
Florian Hahn 1669fddc9f [Matrix] Use alignment info when lowering loads/stores.
This patch updates LowerMatrixIntrinsics to preserve the alignment
specified at the original load/stores and the align attribute for the
pointer argument of the column.major.load/store intrinsics.

We can always use the specified alignment for the load of the first
column. For subsequent columns, the alignment may need to be reduced.

For ConstantInt strides, compute the offset for the start of the column in
bytes and use commonAlignment to get the largest valid alignment.

For non-ConstantInt strides, we need to take the common alignment of the
initial alignment and the element size in bytes.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D81960
2020-06-18 13:19:31 +01:00
Florian Hahn d88acd8f7d [Matrix] Preserve volatile when loading loads/stores.
Currently the matrix lowering turns volatile loads/stores into
non-volatile ones. This patch updates the lowering to preserve the
volatile bit.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D81498
2020-06-18 12:14:19 +01:00
Florian Hahn 9ce89b3b64 [Matrix] Add volatile load/store tests (NFC). 2020-06-18 09:57:13 +01:00
Florian Hahn 6d18c2067e [Matrix] Update load/store intrinsics.
This patch adjust the load/store matrix intrinsics, formerly known as
llvm.matrix.columnwise.load/store, to improve the naming and allow
passing of extra information (volatile).

The patch performs the following changes:
 * Rename columnwise.load/store to column.major.load/store. This is more
   expressive and also more in line with the naming in Clang.
 * Changes the stride arguments from i32 to i64. The stride can be
   larger than i32 and this makes things more uniform with the way
   things are handled in Clang.
 * A new boolean argument is added to indicate whether the load/store
   is volatile. The lowering respects that when emitting vector
   load/store instructions
 * MatrixBuilder is updated to require both Alignment and IsVolatile
   arguments, which are passed through to the generated intrinsic. The
   alignment is set using the `align` attribute.

The changes are grouped together in a single patch, to have a single
commit that breaks the compatibility. We probably should be fine with
updating the intrinsics, as we did not yet officially support them in
the last stable release. If there are any concerns, we can add
auto-upgrade rules for the columnwise intrinsics though.

Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse

Reviewed By: anemet, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D81472
2020-06-18 09:44:52 +01:00
Florian Hahn 08f62ff8ef [Matrix] Add align info to some more loads/stores (NFC).
Some tests were missing alignment info. Subsequent changes properly
preserve the set alignment. Set it properly beforehand, to avoid
unnecessary test changes.
2020-06-16 20:42:59 +01:00