Commit Graph

8 Commits

Author SHA1 Message Date
Bing1 Yu 56d5c46b49 [X86] Support __tile_stream_loadd intrinsic for new AMX interface
Adding support for __tile_stream_loadd intrinsic.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D103784
2021-06-11 17:28:43 +08:00
Bing1 Yu 320b72e9cd [X86][AMX] Rename amx-bf16 intrinsic according to correct naming convention
__tile_tdpbf16ps should be renamed with __tile_dpbf16ps

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D98685
2021-03-17 11:22:52 +08:00
Liu, Chen3 4bc7c8631a [X86] Support amx-bf16 intrinsic.
Adding support for intrinsics of AMX-BF16.
This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong
predicate.

Differential Revision: https://reviews.llvm.org/D97358
2021-02-25 09:06:48 +08:00
Liu, Chen3 f8b9035aae [X86] Support amx-int8 intrinsic.
Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD.

Differential Revision: https://reviews.llvm.org/D97259
2021-02-23 17:08:05 +08:00
Luo, Yuanke 7e1d2224b4 [X86][AMX] Fix the typo.
The dpbsud should be dpbssd.

Differential Revision: https://reviews.llvm.org/D94943
2021-01-19 16:57:34 +08:00
Luo, Yuanke 08665b1805 Support tilezero intrinsic and c interface for AMX.
Differential Revision: https://reviews.llvm.org/D92837
2020-12-31 13:24:57 +08:00
Luo, Yuanke 981a0bd858 [X86] Add x86_amx type for intel AMX.
The x86_amx is used for AMX intrisics. <256 x i32> is bitcast to x86_amx when
it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it
is used by load/store instruction. So amx intrinsics only operate on type x86_amx.
It can help to separate amx intrinsics from llvm IR instructions (+-*/).
Thank Craig for the idea. This patch depend on https://reviews.llvm.org/D87981.

Differential Revision: https://reviews.llvm.org/D91927
2020-12-30 13:52:13 +08:00
Luo, Yuanke f80b29878b [X86] AMX programming model.
This patch implements amx programming model that discussed in llvm-dev
 (http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html).
 Thank Hal for the good suggestion in the RA. The fast RA is not in the patch yet.
 This patch implemeted 7 components.

1. The c interface to end user.
2. The AMX intrinsics in LLVM IR.
3. Transform load/store <256 x i32> to AMX intrinsics or split the
   type into two <128 x i32>.
4. The Lowering from AMX intrinsics to AMX pseudo instruction.
5. Insert psuedo ldtilecfg and build the def-use between ldtilecfg to amx
   intruction.
6. The register allocation for tile register.
7. Morph AMX pseudo instruction to AMX real instruction.

Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0

Differential Revision: https://reviews.llvm.org/D87981
2020-12-10 17:01:54 +08:00