Commit Graph

56 Commits

Author SHA1 Message Date
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Tim Corringham 4c4d2fe280 [AMDGPU] Add new Mode Register pass
A new pass to manage the Mode register.

Currently this just manages the floating point double precision
rounding requirements, but is intended to be easily extended to
encompass all Mode register settings.

The immediate motivation comes from the requirement to use the
round-to-zero rounding mode for the 16 bit interpolation
instructions, where the rounding mode setting is shared between
16 and 64 bit operations.

llvm-svn: 348754
2018-12-10 12:06:10 +00:00
Valery Pykhtin 3d9afa273f [AMDGPU] Combine DPP mov with use instructions (VOP1/2/3)
Introduces DPP pseudo instructions and the pass that combines DPP mov with subsequent uses.

Differential revision: https://reviews.llvm.org/D53762

llvm-svn: 347993
2018-11-30 14:21:56 +00:00
Matt Arsenault 687ec75d10 DAG: Change behavior of fminnum/fmaxnum nodes
Introduce new versions that follow the IEEE semantics
to help with legalization that may need quieted inputs.

There are some regressions from inserting unnecessary
canonicalizes when these are matched from fast math
fcmp + select which should be fixed in a future commit.

llvm-svn: 344914
2018-10-22 16:27:27 +00:00
Konstantin Zhuravlyov 5f1b8181ad AMDGPU: Split HasExt into HasExtDPP/SDWA/SDWA9
llvm-svn: 343264
2018-09-27 20:49:00 +00:00
Konstantin Zhuravlyov 9da26b20da AMDGPU: Split VOP2Inst into VOP2Inst_e32/e64/sdwa
llvm-svn: 343259
2018-09-27 19:46:41 +00:00
Konstantin Zhuravlyov 7d424aae13 AMDGPU/NFC: Simplify VOP_MAC_F16/F32
llvm-svn: 343254
2018-09-27 19:24:05 +00:00
Alexander Timofeev 36617f0160 [AMDGPU] Divergence driven instruction selection. Part 1.
Summary: This change is the first part of the AMDGPU target description
    change. The aim of it is the effective splitting the vector and scalar
    flows at the selection stage. Selection uses predicate functions based
    on the framework implemented earlier - https://reviews.llvm.org/D35267

    Differential revision: https://reviews.llvm.org/D52019

    Reviewers: rampitec

llvm-svn: 342719
2018-09-21 10:31:22 +00:00
Nicolai Haehnle 283b995097 AMDGPU: Fix getInstSizeInBytes
Summary:
Add some optional code to validate getInstSizeInBytes for emitted
instructions. This flushed out some issues which are fixed by this
patch:

- Streamline getInstSizeInBytes
- Properly define the VI readlane/writelane instruction as VOP3
- Fix the inline constant determination. Specifically, this change
  fixes an issue where a 32-bit value of 0xffffffff was recorded
  as unsigned. This is equal to -1 when restricting to a 32-bit
  comparison, and an inline constant can be used.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D50629

Change-Id: Id87c3b7975839da0de8156a124b0ce98c5fb47f2
llvm-svn: 340903
2018-08-29 07:46:09 +00:00
Matt Arsenault 709374d186 AMDGPU: Improve hack for packing conversion ops
Mutate the node type during selection when it
doesn't matter. This avoids an intermediate bitcast
node on targets with legal i16/f16.

Also fixes missing output modifiers on v_cvt_pkrtz_f32_f16,
which I assume are OK.

llvm-svn: 338619
2018-08-01 20:13:58 +00:00
Matt Arsenault 0084adc516 AMDGPU: Add Vega12 and Vega20
Changes by
  Matt Arsenault
  Konstantin Zhuravlyov

llvm-svn: 331215
2018-04-30 19:08:16 +00:00
Dmitry Preobrazhensky 4c45e6ff0e [AMDGPU][MC][VI][GFX9] Added support of SDWA/DPP for v_cndmask_b32
See bug 36356: https://bugs.llvm.org/show_bug.cgi?id=36356

Differential Revision: https://reviews.llvm.org/D45446

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 330123
2018-04-16 12:41:38 +00:00
Nicolai Haehnle 4f850eabb6 AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classes
Differential revision: https://reviews.llvm.org/D44820

Change-Id: I732979e2964006aa15d78a333d8886e6855f319a
llvm-svn: 328496
2018-03-26 13:56:53 +00:00
Tim Renouf 2a99fa2c08 [AMDGPU] added writelane intrinsic
Summary:
For use by LLPC SPV_AMD_shader_ballot extension.

The v_writelane instruction was already implemented for use by SGPR
spilling, but I had to add an extra dummy operand tied to the
destination, to represent that all lanes except the selected one keep
the old value of the destination register.

.ll test changes were due to schedule changes caused by that new
operand.

Differential Revision: https://reviews.llvm.org/D42838

llvm-svn: 326353
2018-02-28 19:10:32 +00:00
Marek Olsak 13e4741275 AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16}
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D41663

llvm-svn: 323908
2018-01-31 20:18:04 +00:00
Stanislav Mekhanoshin f630047ef6 [AMDGPU] Copy impdefs from pseudo to real instructions
In some cases we do not copy implicit defs from pseudo to real
VOP instructions. It has no visible impact at the moment thus no
tests are affected or added.

Differential Revision: https://reviews.llvm.org/D41783

llvm-svn: 322496
2018-01-15 17:55:35 +00:00
Dmitry Preobrazhensky 1ac7177abb [AMDGPU][MC][GFX9] Corrected mapping of GFX9 v_add/sub/subrev_u32
When translating pseudo to MC, v_add/sub/subrev_u32 shall be mapped via a separate table as GFX8 has opcodes with the same names.
These instructions shall also be labelled as renamed for pseudoToMCOpcode to handle them correctly.

Reviewers: arsenm

Differential Revision: https://reviews.llvm.org/D40550

llvm-svn: 319311
2017-11-29 13:33:40 +00:00
Dmitry Preobrazhensky a0342dc9eb [AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev}
See bug 34765: https://bugs.llvm.org//show_bug.cgi?id=34765

Reviewers: tamazov, SamWot, arsenm, vpykhtin

Differential Revision: https://reviews.llvm.org/D40088

llvm-svn: 318675
2017-11-20 18:24:21 +00:00
Matt Arsenault 90c7593a75 AMDGPU: Remove global isGCN predicates
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.

Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns applied to an instruction definition,
but not to free standing Pats. Add a wrapper around Pat
so the special PredicateControls requirements can be appended
to the final predicate list like how Mips does it.

llvm-svn: 314742
2017-10-03 00:06:41 +00:00
Dmitry Preobrazhensky ff64aa514b [AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodes
See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152

Reviewers: SamWot, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D36674

llvm-svn: 311006
2017-08-16 13:51:56 +00:00
Connor Abbott 79f3ade51a [AMDGPU] Add pseudo "old" source to all DPP instructions
Summary:
All instructions with the DPP modifier may not write to certain lanes of
the output if bound_ctrl=1 is set or any bits in bank_mask or row_mask
aren't set, so the destination register may be both defined and modified.
The right way to handle this is to add a constraint that the destination
register is the same as one of the inputs. We could tie the destination
to the first source, but that would be too restrictive for some use-cases
where we want the destination to be some other value before the
instruction executes. Instead, add a fake "old" source and tie it to the
destination. Effectively, the "old" source defines what value unwritten
lanes will get. We'll expose this functionality to users with a new
intrinsic later.

Also, we want to use DPP instructions for computing derivatives, which
means we need to set WQM for them. We also need to enable the entire
wavefront when using DPP intrinsics to implement nonuniform subgroup
reductions, since otherwise we'll get incorrect results in some cases.
To accomodate this, add a new operand to all DPP instructions which will
be interpreted by the SI WQM pass. This will be exposed with a new
intrinsic later. We'll also add support for Whole Wavefront Mode later.

I also fixed llvm.amdgcn.mov.dpp to overwrite the source and fixed up
the test. However, I could also keep the old behavior (where lanes that
aren't written are undefined) if people want it.

Reviewers: tstellar, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D34716

llvm-svn: 310283
2017-08-07 19:10:56 +00:00
Matt Arsenault c37fe66ec5 AMDGPU: Add encoding for carryless add/sub instructions
llvm-svn: 308639
2017-07-20 17:42:47 +00:00
Sam Kolton 4685b70a77 [AMDGPU] resubmit r308179: CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions
llvm-svn: 308310
2017-07-18 14:23:26 +00:00
Chandler Carruth 9a7442d088 Revert r308179 which causes tablegen to spam stderr on every build.
Original commit log:
[AMDGPU] CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions

llvm-svn: 308270
2017-07-18 07:40:47 +00:00
Sam Kolton a2b9e2f755 [AMDGPU] CodeGen: check dst operand type to determine if omod is supported for VOP3 instructions
Summary:
Previously, CodeGen checked first src operand type to determine if omod is supported by instruction. This isn't correct for some instructions: e.g. V_CMP_EQ_F32 has floating-point src operands but desn't support omod.
Changed .td files to check if dst operand instead of src operand.

Reviewers: arsenm, vpykhtin

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D35350

llvm-svn: 308179
2017-07-17 14:23:38 +00:00
Sam Kolton ca5a30ed74 [AMDGPU] SDWA: remove support for VOP2 instructions that have only 64-bit encoding
Summary:
Despite that this instructions are listed in VOP2, they are treated as VOP3 in specs. They should not support SDWA.
There are no real instructions for them, but there are pseudo instructions.

Reviewers: arsenm, vpykhtin, cfang

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D34403

llvm-svn: 305999
2017-06-22 12:42:14 +00:00
Stanislav Mekhanoshin e3eb42cef6 [AMDGPU] simplify add x, *ext (setcc) => addc|subb x, 0, setcc
This simplification allows to avoid generating v_cndmask_b32
to serialize condition code between compare and use.

Differential Revision: https://reviews.llvm.org/D34300

llvm-svn: 305962
2017-06-21 22:05:06 +00:00
Sam Kolton 549c89d2c9 [AMDGPU] SDWA: merge VI and GFX9 pseudo instructions
Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9.

Reviewers: dp, arsenm, vpykhtin

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov

Differential Revision: https://reviews.llvm.org/D34026

llvm-svn: 305886
2017-06-21 08:53:38 +00:00
Sam Kolton f7659d71eb [AMDGPU] SDWA: Add assembler support for GFX9
Summary:
Added separate pseudo and real instruction for GFX9 SDWA instructions.
Currently supports only in assembler.
Depends D32493

Reviewers: vpykhtin, artem.tamazov

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D33132

llvm-svn: 303620
2017-05-23 10:08:55 +00:00
Dmitry Preobrazhensky 167f8b69e3 [AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64
See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33123

llvm-svn: 303070
2017-05-15 14:28:23 +00:00
Dmitry Preobrazhensky da61a7f9ef [AMDGPU][MC] Corrected v_madak/madmk to avoid printing "_e32" in disassembler output
See bug 32927: https://bugs.llvm.org//show_bug.cgi?id=32927

Reviewers: vpykhtin, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D32913

llvm-svn: 302648
2017-05-10 13:00:28 +00:00
Matt Arsenault 678e111e11 AMDGPU: Fix crash when disassembling VOP3 mac
The unused dummy src2_modifiers is missing, so it crashes
when trying to print it.

I tried to fully remove src2_modifiers, but there are some
irritations in the places where it is converted to mad since
it starts to require modifying use lists while iterating over
them.

llvm-svn: 299861
2017-04-10 17:58:06 +00:00
Dmitry Preobrazhensky 45db65037f [AMDGPU][MC] Fix for Bug 28167 + LIT tests
Corrected src0 for v_writelane_b32:
- Enabled inline constants and literals for SI/CI (VOP2)
- Enabled inline constants for VI (VOP3)

Reviewers: vpykhtin, arsenm

https://reviews.llvm.org/D31463

llvm-svn: 299555
2017-04-05 16:08:21 +00:00
Dmitry Preobrazhensky 03880f8d24 [AMDGPU][MC] Fix for Bug 30829 + LIT tests
Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction).
Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code).
Added LIT tests.

llvm-svn: 296873
2017-03-03 14:31:06 +00:00
Matt Arsenault 9be7b0d485 AMDGPU: Add VOP3P instruction format
Add a few non-VOP3P but instructions related to packed.

Includes hack with dummy operands for the benefit of the assembler

llvm-svn: 296368
2017-02-27 18:49:11 +00:00
Matt Arsenault 1f17c66890 AMDGPU: Add cvt.pkrtz intrinsic
Convert llvm.SI.packf16 test uses

llvm-svn: 295797
2017-02-22 00:27:34 +00:00
Matt Arsenault b4493e909f AMDGPU: Fix trailing whitespace
llvm-svn: 294694
2017-02-10 02:42:31 +00:00
Matt Arsenault af635240d5 AMDGPU: Undo sub x, c -> add x, -c canonicalization
This is worse if the original constant is an inline immediate.

This should also be done for 64-bit adds, but requires fixing
operand folding bugs first.

llvm-svn: 293540
2017-01-30 19:30:24 +00:00
Sam Kolton 07dbde214b [AMDGPU] Add subtarget features for SDWA/DPP
Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28900

llvm-svn: 292596
2017-01-20 10:01:25 +00:00
Sam Kolton 9772eb3907 [AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and immediate operands
Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28157

llvm-svn: 291668
2017-01-11 11:46:30 +00:00
Sam Kolton e66365e07d [AMDGPU] Assembler: support SDWA and DPP for VOP2b instructions
Reviewers: nhaustov, artem.tamazov, vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28051

llvm-svn: 290599
2016-12-27 10:06:42 +00:00
Matt Arsenault 941632839f AMDGPU: Use i16 for i16 shift amount
llvm-svn: 290351
2016-12-22 16:36:25 +00:00
Sam Kolton a568e3dde7 [AMDGPU] Add pseudo SDWA instructions
Summary: This is needed for later SDWA support in CodeGen.

Reviewers: vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27412

llvm-svn: 290338
2016-12-22 12:57:41 +00:00
Sam Kolton a6792a39c4 [AMDGPU] Disassembler: fix for disaasembling v_mac_f32/16_dpp/sdwa
Summary: Real instruction should copy constraints from real instruction. This allows auto-generated disassembler to correctly process tied operands.

Reviewers: nhaustov, vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27847

llvm-svn: 290336
2016-12-22 11:30:48 +00:00
Matt Arsenault 55e7d65b12 AMDGPU: Fix name for v_ashrrev_i16
llvm-svn: 289967
2016-12-16 17:40:11 +00:00
Matt Arsenault 4bd7236193 AMDGPU: Fix handling of 16-bit immediates
Since 32-bit instructions with 32-bit input immediate behavior
are used to materialize 16-bit constants in 32-bit registers
for 16-bit instructions, determining the legality based
on the size is incorrect. Change operands to have the size
specified in the type.

Also adds a workaround for a disassembler bug that
produces an immediate MCOperand for an operand that
is supposed to be OPERAND_REGISTER.

The assembler appears to accept out of bounds immediates and
truncates them, but this seems to be an issue for 32-bit
already.

llvm-svn: 289306
2016-12-10 00:39:12 +00:00
Matt Arsenault 27c062932a AMDGPU: Select i16 instructions to VOP3 forms
These were selecting directly to the VOP2 form instead
of VOP3 like the i32 instructions. Fixes regressions in
future commits where an immediate isn't folded because it was
initially used for the second operand.

Because uniform 16-bit operations are promoted to i32, it's
difficult to get a simple testcase where this matters. Fold
failures in SIFoldOperands here tend to be hidden by commute
and fold in SIShrinkInstructions.

llvm-svn: 289189
2016-12-09 06:19:12 +00:00
Matt Arsenault 6c06a6f48a AMDGPU: Fix commuting v_sub_u16
The correct commutable opcode was set to itself, so this
was simply swapping the operands to commute instead of also
changing the opcode to v_subrev_u16.

llvm-svn: 289093
2016-12-08 19:52:38 +00:00
Tom Stellard 01e65d2cfc AMDGPU/SI: Remove zero_extend patterns for i16 ops selected to 32-bit insts
Summary:
The 32-bit instructions don't zero the high 16-bits like the 16-bit
instructions do.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D26828

llvm-svn: 287342
2016-11-18 13:53:34 +00:00
Tom Stellard d23de360db AMDGPU/SI: Fix pattern for i16 = sign_extend i1
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D26670

llvm-svn: 287035
2016-11-15 21:25:56 +00:00