Farhana Aleen
9250c92d0e
[AMDGPU] Match udot4 pattern.
...
Summary: D.u32 = S0.u8[0] * S1.u8[0] +
S0.u8[1] * S1.u8[1] +
S0.u8[2] * S1.u8[2] +
S0.u8[3] * S1.u8[3] + S2.u32
Author: FarhanaAleen
Reviewed By: arsenm
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D50921
llvm-svn: 340936
2018-08-29 16:31:18 +00:00
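Note: the udot4 summary above can be read as a small reference model. The following is a minimal sketch in Python, not the backend's implementation. The function name `udot4_ref` is hypothetical, and wrapping the result to 32 bits (no clamping) is an assumption that the summary does not state.

```python
def udot4_ref(s0: int, s1: int, s2: int) -> int:
    """Hypothetical reference model of D.u32 = sum(S0.u8[i] * S1.u8[i]) + S2.u32."""
    acc = s2 & 0xFFFFFFFF
    for i in range(4):
        a = (s0 >> (8 * i)) & 0xFF  # S0.u8[i]
        b = (s1 >> (8 * i)) & 0xFF  # S1.u8[i]
        acc += a * b
    # Assumption: the result wraps modulo 2**32 (clamp bit not modeled).
    return acc & 0xFFFFFFFF


# Bytes of s0 are [4, 3, 2, 1], bytes of s1 are [8, 7, 6, 5] (lowest byte first).
result = udot4_ref(0x01020304, 0x05060708, 10)
```

Here `result` is 4*8 + 3*7 + 2*6 + 1*5 + 10 = 80.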
Farhana Aleen
3528c80378
[AMDGPU] Support idot2 pattern.
...
Summary: Transform add (mul ((i32)S0.x, (i32)S1.x),
add (mul ((i32)S0.y, (i32)S1.y), (i32)S3)) => i/udot2((v2i16)S0, (v2i16)S1, (i32)S3)
Author: FarhanaAleen
Reviewed By: arsenm
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D50024
llvm-svn: 340295
2018-08-21 16:21:15 +00:00
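Note: the idot2 transform above replaces two multiplies and two adds over the 16-bit halves of S0 and S1 with a single dot-product operation. The following is a minimal sketch of the arithmetic in Python. The names `pack16x2`, `sdot2_ref`, and `udot2_ref` are hypothetical, and 32-bit wraparound without clamping is an assumption.

```python
def pack16x2(x: int, y: int) -> int:
    """Pack two 16-bit lanes (x low, y high) into one 32-bit value."""
    return (x & 0xFFFF) | ((y & 0xFFFF) << 16)


def _sext16(v: int) -> int:
    """Sign-extend the low 16 bits of v."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def sdot2_ref(s0: int, s1: int, acc: int) -> int:
    """Signed form: S0.x * S1.x + S0.y * S1.y + acc, lanes read as i16."""
    lo = _sext16(s0) * _sext16(s1)
    hi = _sext16(s0 >> 16) * _sext16(s1 >> 16)
    # Assumption: the sum wraps modulo 2**32 (clamp bit not modeled).
    return (lo + hi + acc) & 0xFFFFFFFF


def udot2_ref(s0: int, s1: int, acc: int) -> int:
    """Unsigned form: the same sum with lanes read as u16."""
    lo = (s0 & 0xFFFF) * (s1 & 0xFFFF)
    hi = ((s0 >> 16) & 0xFFFF) * ((s1 >> 16) & 0xFFFF)
    return (lo + hi + acc) & 0xFFFFFFFF


signed_result = sdot2_ref(pack16x2(-2, 3), pack16x2(4, 5), 7)
unsigned_result = udot2_ref(pack16x2(2, 3), pack16x2(4, 5), 7)
```

With these inputs, `signed_result` is -8 + 15 + 7 = 14 and `unsigned_result` is 8 + 15 + 7 = 30.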
Konstantin Zhuravlyov
bb30ef7af4
AMDGPU: Add clamp bit to dot intrinsics
...
Differential Revision: https://reviews.llvm.org/D49874
llvm-svn: 338470
2018-08-01 01:31:30 +00:00
Farhana Aleen
c370d7b33d
[AMDGPU] Support a fdot2 pattern.
...
Summary: Optimize fma((float)S0.x, (float)S1.x, fma((float)S0.y, (float)S1.y, z))
-> fdot2((v2f16)S0, (v2f16)S1, (float)z)
Author: FarhanaAleen
Reviewed By: rampitec, b-sumner
Subscribers: AMDGPU
Differential Revision: https://reviews.llvm.org/D49146
llvm-svn: 337198
2018-07-16 18:19:59 +00:00
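Note: the fdot2 pattern above folds two nested fma operations over the f16 halves of S0 and S1 into one dot product with an f32 accumulator. The following is a rough sketch of the value being computed, in Python. The names `pack_half2`, `unpack_half2`, and `fdot2_ref` are hypothetical. The arithmetic runs in Python's double precision, so it is an assumption-laden approximation and not a bit-exact model of the hardware's rounding.

```python
import struct


def pack_half2(x: float, y: float) -> int:
    """Pack two floats as IEEE half-precision lanes (x low, y high)."""
    return int.from_bytes(struct.pack("<2e", x, y), "little")


def unpack_half2(v: int) -> tuple:
    """Unpack a 32-bit value into its two half-precision lanes."""
    return struct.unpack("<2e", (v & 0xFFFFFFFF).to_bytes(4, "little"))


def fdot2_ref(s0: int, s1: int, z: float) -> float:
    """Approximate model of fma(S0.x, S1.x, fma(S0.y, S1.y, z))."""
    x0, y0 = unpack_half2(s0)
    x1, y1 = unpack_half2(s1)
    # Not bit-exact: computed in double precision, no f32 rounding modeled.
    return x0 * x1 + (y0 * y1 + z)


value = fdot2_ref(pack_half2(1.5, 2.0), pack_half2(2.0, 0.5), 1.0)
```

With these inputs every intermediate is exactly representable, so `value` is 1.5*2.0 + 2.0*0.5 + 1.0 = 5.0.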
Konstantin Zhuravlyov
f13c9969fc
AMDGPU: Fix v_dot{4, 8}* instruction encoding
...
Differential Revision: https://reviews.llvm.org/D46848
llvm-svn: 332387
2018-05-15 19:32:47 +00:00
Matt Arsenault
0084adc516
AMDGPU: Add Vega12 and Vega20
...
Changes by
Matt Arsenault
Konstantin Zhuravlyov
llvm-svn: 331215
2018-04-30 19:08:16 +00:00
Nicolai Haehnle
4f850eabb6
AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classes
...
Differential revision: https://reviews.llvm.org/D44820
Change-Id: I732979e2964006aa15d78a333d8886e6855f319a
llvm-svn: 328496
2018-03-26 13:56:53 +00:00
Matt Arsenault
28f52e51f1
AMDGPU: Add max-mix-insts subtarget feature
...
llvm-svn: 316553
2017-10-25 07:00:51 +00:00
Matt Arsenault
90c7593a75
AMDGPU: Remove global isGCN predicates
...
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.
Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns attached to an instruction definition,
but not to free-standing Pats. Add a wrapper around Pat
so the special PredicateControl requirements can be appended
to the final predicate list, as Mips does.
llvm-svn: 314742
2017-10-03 00:06:41 +00:00
Matt Arsenault
8cbb4884a5
AMDGPU: Start selecting v_mad_mixhi_f16
...
llvm-svn: 313814
2017-09-20 21:01:24 +00:00
Matt Arsenault
e135c4c6a6
AMDGPU: Add tied operands to v_mad_mix{lo|hi}_f16
...
These write to the low and high half of the destination
register and leave the other 16 bits unchanged. This is true
for most 16-bit instructions on gfx9, but we don't make use of
that yet.
llvm-svn: 313812
2017-09-20 20:53:49 +00:00
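Note: the partial-update behavior described above is why the destination must also be modeled as an input (a tied operand): the new register value depends on the old one. The following is a minimal sketch of the register update in Python. The names `write_lo16` and `write_hi16` are hypothetical and only model the bit merge, not the mad_mix arithmetic itself.

```python
def write_lo16(old_dst: int, half_bits: int) -> int:
    """Replace the low 16 bits of a 32-bit register, keep the high 16 bits."""
    return (old_dst & 0xFFFF0000) | (half_bits & 0xFFFF)


def write_hi16(old_dst: int, half_bits: int) -> int:
    """Replace the high 16 bits of a 32-bit register, keep the low 16 bits."""
    return (old_dst & 0x0000FFFF) | ((half_bits & 0xFFFF) << 16)


lo_written = write_lo16(0xAAAABBBB, 0x1234)  # 0xAAAA1234
hi_written = write_hi16(0xAAAABBBB, 0x1234)  # 0x1234BBBB
```

In both cases the result cannot be computed without `old_dst`, which is the dependency a tied operand makes visible to the register allocator.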
Matt Arsenault
76935122cc
AMDGPU: Start selecting v_mad_mixlo_f16
...
Also add some tests that should be able to use v_mad_mixhi_f16,
but do not yet. This is trickier because we don't really model
the partial update of the register done by 16-bit instructions.
llvm-svn: 313806
2017-09-20 20:28:39 +00:00
Matt Arsenault
644883ff07
AMDGPU: Fix encoding of op_sel for mad_mix* opcodes
...
llvm-svn: 313797
2017-09-20 19:09:28 +00:00
Matt Arsenault
c8f8cda0cd
AMDGPU: Correct operand types for v_mad_mix*
...
These aren't really packed instructions, so the default
op_sel_hi should be 0 since this indicates a conversion.
The operand types are scalar values that behave similar
to an f16 scalar that may be converted to f32.
Doesn't change the default printing for op_sel_hi, just
the parsing.
llvm-svn: 312179
2017-08-30 22:18:40 +00:00
Dmitry Preobrazhensky
095ec3da81
[AMDGPU][MC] Added missing VOP3P opcodes
...
Added support of the following opcodes:
v_pk_sub_u16
v_pk_mad_i16
v_pk_mad_u16
See Bug 33593: https://bugs.llvm.org//show_bug.cgi?id=33593
Reviewers: vpykhtin, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D34890
llvm-svn: 308281
2017-07-18 09:24:10 +00:00
Dmitry Preobrazhensky
b2d24e23ce
[AMDGPU][mc][gfx9] Added support of op_sel/op_sel_hi for V_MAD_MIX*
...
See https://bugs.llvm.org//show_bug.cgi?id=33595
Reviewers: vpykhtin, artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D35021
llvm-svn: 307402
2017-07-07 14:29:06 +00:00
Matt Arsenault
eb522e68bc
AMDGPU: Support v2i16/v2f16 packed operations
...
llvm-svn: 296396
2017-02-27 22:15:25 +00:00
Matt Arsenault
9be7b0d485
AMDGPU: Add VOP3P instruction format
...
Add a few non-VOP3P instructions related to packed operations.
Includes a hack with dummy operands for the benefit of the assembler.
llvm-svn: 296368
2017-02-27 18:49:11 +00:00