Commit Graph

234 Commits

Author SHA1 Message Date
Matt Arsenault 7d67742160 AMDGPU/GlobalISel: Fix import of zext of s16 op patterns 2020-01-09 10:29:32 -05:00
Matt Arsenault e71af77568 AMDGPU/GlobalISel: Add IMMPopCount xform
Partially fixes BFE pattern import.
2020-01-09 10:29:32 -05:00
Matt Arsenault 79450a4ea2 AMDGPU/GlobalISel: Add selectVOP3Mods_nnan
This doesn't enable any new imports yet, but moves the fmed patterns
from failing on this to hitting the "complex suboperand referenced
more than once" limitation in tablegen.
2020-01-09 10:29:32 -05:00
Matt Arsenault d964086c62 AMDGPU/GlobalISel: Add equiv xform for bitcast_fpimm_to_i32
Only partially fixes one pattern import.
2020-01-09 10:29:31 -05:00
Matt Arsenault 3952748ffd AMDGPU/GlobalISel: Fix add of neg inline constant pattern 2020-01-09 10:29:31 -05:00
Matt Arsenault c3a10faadc AMDGPU: Remove VOP3Mods0Clamp0OMod
Now that overridable default operands work, there's no reason to use
complex patterns to just produce 0s.
2020-01-07 15:10:08 -05:00
Matt Arsenault d4c9e13324 AMDGPU/GlobalISel: Select G_UADDE/G_USUBE 2020-01-06 18:27:52 -05:00
Matt Arsenault 4e85ca9562 AMDGPU/GlobalISel: Replace handling of boolean values
This solves selection failures with generated selection patterns,
which would fail due to inferring the SGPR reg bank for virtual
registers with a set register class instead of VCC bank. Use
instruction selection would constrain the virtual register to a
specific class, so when the def was selected later the bank no longer
was set to VCC.

Remove the SCC reg bank. SCC isn't directly addressable, so it
requires copying from SCC to an allocatable 32-bit register during
selection, so these might as well be treated as 32-bit SGPR values.

Now any scalar boolean value that will produce an outupt in SCC should
be widened during RegBankSelect to s32. Any s1 value should be a
vector boolean during selection. This makes the vcc register bank
unambiguous with a normal SGPR during selection.

Summary of how this should now work:

- G_TRUNC is always a no-op, and never should use a vcc bank result.

- SALU boolean operations should be promoted to s32 in RegBankSelect
  apply mapping

- An s1 value means vcc bank at selection. The exception is for
  legalization artifacts that use s1, which are never VCC. All other
  contexts should infer the VCC register classes for s1 typed
  registers. The LLT for the register is now needed to infer the
  correct register class. Extensions with vcc sources should be
  legalized to a select of constants during RegBankSelect.

- Copy from non-vcc to vcc ensures high bits of the input value are
  cleared during selection.

- SALU boolean inputs should ensure the inputs are 0/1. This includes
  select, conditional branches, and carry-ins.

There are a few somewhat dirty details. One is that G_TRUNC/G_*EXT
selection ignores the usual register-bank from register class
functions, and can't handle truncates with VCC result banks. I think
this is OK, since the artifacts are specially treated anyway. This
does require some care to avoid producing cases with vcc. There will
also be no 100% reliable way to verify this rule is followed in
selection in case of register classes, and violations manifests
themselves as invalid copy instructions much later.

Standard phi handling also only considers the bank of the result
register, and doesn't insert copies to make the source banks
match. This doesn't work for vcc, so we have to manually correct phi
inputs in this case. We should add a verifier check to make sure there
are no phis with mixed vcc and non-vcc register bank inputs.

There's also some duplication with the LegalizerHelper, and some code
which should live in the helper. I don't see a good way to share
special knowledge about what types to use for intermediate operations
depending on the bank for example. Using the helper to replace
extensions with selects also seems somewhat awkward to me.

Another issue is there are some contexts calling
getRegBankFromRegClass that apparently don't have the LLT type for the
register, but I haven't yet run into a real issue from this.

This also introduces new unnecessary instructions in most cases, since
we don't yet try to optimize out the zext when the source is known to
come from a compare.
2020-01-06 18:26:42 -05:00
Matt Arsenault 14d25052a2 AMDGPU: Use ImmLeaf for inline immediate predicates 2020-01-06 17:21:51 -05:00
Matt Arsenault e4464bf3d4 AMDGPU/GlobalISel: Select scalar v2s16 G_BUILD_VECTOR 2020-01-06 11:19:33 -05:00
Matt Arsenault f1c85ecdfc AMDGPU/GlobalISel: Select more G_EXTRACTs correctly
This assumed a 32-bit extract size, which would produce invalid copies
with 64-bit extracts. Handle the easy case. Ideally we would have a
way to get the proper subreg index for any 32-bit offset, but there
should probably be a tablegenerated way of getting the subreg index
for any size and offset.
2020-01-06 11:10:13 -05:00
Matt Arsenault 9861a8538c AMDGPU/GlobalISel: Add new utils file
There are some things that are shareable between the legalizer,
regbankselect, and the selector that don't have an obvious place to
go.
2020-01-03 15:25:50 -05:00
Matt Arsenault 53fc484067 AMDGPU/GlobalISel: Fix off by one in operand index
This should be looking at the RHS of the add for a constant.
2020-01-03 10:30:30 -05:00
Matt Arsenault 25e7da0c24 AMDGPU/GlobalISel: Remove manual G_FENCE selection
The tablegen emitter now handles the immediate operand correctly, so
let the generatedd matcher works.
2020-01-02 17:16:10 -05:00
Matt Arsenault 48e0e68edb AMDGPU/GlobalISel: Re-use MRI available in selector 2019-12-30 13:00:17 -05:00
Matt Arsenault dff3f8d742 AMDGPU/GlobalISel: Fix missing scc imp-def on scalar and/or/xor 2019-12-21 04:55:36 -05:00
Matt Arsenault bc276c6379 GlobalISel: Lower s1 source G_SITOFP/G_UITOFP 2019-11-15 13:37:20 +05:30
Daniel Sanders e74c5b9661 [globalisel] Rename G_GEP to G_PTR_ADD
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD

Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm

Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69734
2019-11-05 10:31:17 -08:00
Matt Arsenault f9a42ed0a7 AMDGPU: Relax 32-bit SGPR register class
Mostly use SReg_32 instead of SReg_32_XM0 for arbitrary values. This
will allow the register coalescer to do a better job eliminating
copies to m0.

For GlobalISel, as a terrible hack, use SGPR_32 for things that should
use SCC until booleans are solved.

llvm-svn: 375267
2019-10-18 18:26:37 +00:00
Matt Arsenault 538b73b797 AMDGPU/GlobalISel: Handle more G_INSERT cases
Start manually writing a table to get the subreg index. TableGen
should probably generate this, but I'm not sure what it looks like in
the arbitrary case where subregisters are allowed to not fully cover
the super-registers.

llvm-svn: 373947
2019-10-07 19:16:26 +00:00
Matt Arsenault 0b2ea91d6d AMDGPU/GlobalISel: Use S_MOV_B64 for inline constants
This hides some defects in SIFoldOperands when the immediates are
split.

llvm-svn: 373943
2019-10-07 19:07:19 +00:00
Matt Arsenault b4cbf9862c AMDGPU/GlobalISel: Select more G_INSERT cases
At minimum handle the s64 insert type, which are emitted in real cases
during legalization.

We really need TableGen to emit something to emit something like the
inverse of composeSubRegIndices do determine the subreg index to use.

llvm-svn: 373938
2019-10-07 18:43:31 +00:00
Matt Arsenault 27269054d2 GlobalISel: Add target pre-isel instructions
Allows targets to introduce regbankselectable
pseudo-instructions. Currently the closet feature to this is an
intrinsic. However this requires creating a public intrinsic
declaration. This litters the public intrinsic namespace with
operations we don't necessarily want to expose to IR producers, and
would rather leave as private to the backend.

Use a new instruction bit. A previous attempt tried to keep using enum
value ranges, but it turned into a mess.

llvm-svn: 373937
2019-10-07 18:43:29 +00:00
Matt Arsenault e59296a051 AMDGPU/GlobalISel: Fall back on weird G_EXTRACT offsets
llvm-svn: 373842
2019-10-06 01:41:22 +00:00
Matt Arsenault 412e0bf8f3 AMDGPU/GlobalISel: Select G_PTRTOINT
llvm-svn: 373715
2019-10-04 08:35:37 +00:00
Piotr Sobczak 265e94e657 [AMDGPU] Extend buffer intrinsics with swizzling
Summary:
Extend cachepolicy operand in the new VMEM buffer intrinsics
to supply information whether the buffer data is swizzled.
Also, propagate this information to MIR.

Intrinsics updated:
int_amdgcn_raw_buffer_load
int_amdgcn_raw_buffer_load_format
int_amdgcn_raw_buffer_store
int_amdgcn_raw_buffer_store_format
int_amdgcn_raw_tbuffer_load
int_amdgcn_raw_tbuffer_store
int_amdgcn_struct_buffer_load
int_amdgcn_struct_buffer_load_format
int_amdgcn_struct_buffer_store
int_amdgcn_struct_buffer_store_format
int_amdgcn_struct_tbuffer_load
int_amdgcn_struct_tbuffer_store

Furthermore, disable merging of VMEM buffer instructions
in SI Load/Store optimizer, if the "swizzled" bit on the instruction
is on.

The default value of the bit is 0, meaning that data in buffer
is linear and buffer instructions can be merged.

There is no difference in the generated code with this commit.
However, in the future it will be expected that front-ends
use buffer intrinsics with correct "swizzled" bit set.

Reviewers: arsenm, nhaehnle, tpr

Reviewed By: nhaehnle

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, arphaman, jfb, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68200

llvm-svn: 373491
2019-10-02 17:22:36 +00:00
Matt Arsenault 86f864dace AMDGPU/GlobalISel: Use getIntrinsicID helper
llvm-svn: 373417
2019-10-02 01:02:27 +00:00
Matt Arsenault fdea5e02ce AMDGPU/GlobalISel: Select s1 src G_SITOFP/G_UITOFP
llvm-svn: 373298
2019-10-01 02:23:20 +00:00
Matt Arsenault 59b91aa93e AMDGPU/GlobalISel: Add support for init.exec intrinsics
TThe existing wave32 behavior seems broken and incomplete, but this
reproduces it.

llvm-svn: 373296
2019-10-01 02:07:25 +00:00
Matt Arsenault 54167ea316 AMDGPU/GlobalISel: Select G_UADDO/G_USUBO
llvm-svn: 373288
2019-10-01 01:23:13 +00:00
Matt Arsenault 76f44f6b53 AMDGPU/GlobalISel: Avoid getting MRI in every function
Store it in AMDGPUInstructionSelector to avoid boilerplate in nearly
every select function.

llvm-svn: 373139
2019-09-28 03:41:13 +00:00
Bjorn Pettersson 169cb63478 [AMDGPU] Use std::make_tuple to make some toolchains happy again
My toolchain stopped working (LLVM 8.0 , libstdc++ 5.4.0) after
r372338.

The same problem was seen in clang-cuda-build buildbots:

clang-cuda-build/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:763:12:
error: chosen constructor is explicit in copy-initialization
    return {Reg, 0, nullptr};
           ^~~~~~~~~~~~~~~~~
/usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19:
note: explicit constructor declared here
        constexpr tuple(_UElements&&... __elements)
                  ^

This commit adds explicit calls to std::make_tuple to work around
the problem.

llvm-svn: 372384
2019-09-20 12:13:12 +00:00
Matt Arsenault 3ecab8e455 Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)

This was missing one switch to getTargetConstant in an untested case.

llvm-svn: 372338
2019-09-19 16:26:14 +00:00
Hans Wennborg 13bdae8541 Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"
This broke the Chromium build, causing it to fail with e.g.

  fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>

See llvm-commits thread of r372285 for details.

This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.

> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.

llvm-svn: 372314
2019-09-19 12:33:07 +00:00
Matt Arsenault 494243597b AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.format
This needs special handling due to some subtargets that have a
nonstandard register layout for f16 vectors

Also reject some illegal types on other targets.

llvm-svn: 372293
2019-09-19 02:35:08 +00:00
Matt Arsenault 67f1f6ff8c AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store
llvm-svn: 372292
2019-09-19 02:30:27 +00:00
Matt Arsenault d8399d12cd GlobalISel: Don't materialize immarg arguments to intrinsics
Encode them directly as an imm argument to G_INTRINSIC*.

Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.

This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.

SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.

Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.

The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.

This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.

llvm-svn: 372285
2019-09-19 01:33:14 +00:00
Matt Arsenault fb51e64eac AMDGPU/GlobalISel: Fail select of G_INSERT non-32-bit source
This was producing an illegal copy which would hit an assert
later. Error on selection for now until this is implemented.

llvm-svn: 371993
2019-09-16 14:26:14 +00:00
Matt Arsenault 3b7ffc6ae7 AMDGPU/GlobalISel: Fix assert on multi-return side effect intrinsics
llvm.amdgcn.else hits this.

llvm-svn: 371812
2019-09-13 04:12:12 +00:00
Matt Arsenault 77e3e9cafd AMDGPU/GlobalISel: Select llvm.amdgcn.class
Also fixes missing SubtargetPredicate on f16 class instructions.

llvm-svn: 371436
2019-09-09 18:29:45 +00:00
Matt Arsenault d6c1f5bb15 AMDGPU/GlobalISel: Select fmed3
llvm-svn: 371435
2019-09-09 18:29:37 +00:00
Matt Arsenault c34b4036ff AMDGPU/GlobalISel: Select G_PTR_MASK
llvm-svn: 371412
2019-09-09 15:46:13 +00:00
Matt Arsenault 2dd088ec7d AMDGPU/GlobalISel: Use known bits for selection
llvm-svn: 371409
2019-09-09 15:39:32 +00:00
Matt Arsenault d50f937378 AMDGPU/GlobalISel: Try generated matcher before add/sub code
This will allow optimization patterns which fold adds away to work.

llvm-svn: 371406
2019-09-09 15:20:44 +00:00
Matt Arsenault d51a3746d0 AMDGPU/GlobalISel: Fix assert on load from constant address
llvm-svn: 371006
2019-09-05 02:20:25 +00:00
Matt Arsenault a8bbcbd006 AMDGPU/GlobalISel: Fix constraining scalar and/or/xor
If the result register already had a register class assigned, the
sources may not have been properly constrained.

llvm-svn: 370150
2019-08-28 02:11:03 +00:00
Daniel Sanders 0c47611131 Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041
2019-08-15 19:22:08 +00:00
Amara Emerson e14c91b71a [GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained.
Currently we can't keep any state in the selector object that we get from
subtarget. As a result we have to plumb through all our variables through
multiple functions. This change makes it non-const and adds a virtual init()
method to allow further state to be captured for each target.

AArch64 makes use of this in this patch to cache a call to hasFnAttribute()
which is expensive to call, and is used on each selection of G_BRCOND.

Differential Revision: https://reviews.llvm.org/D65984

llvm-svn: 368652
2019-08-13 06:26:59 +00:00
Daniel Sanders 2bea69bf65 Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC
llvm-svn: 367633
2019-08-01 23:27:28 +00:00
Matt Arsenault 57495268ac AMDGPU/GlobalISel: Remove manual store select code
This regresses the weird types that are newly treated as legal load
types, but fixes incorrectly using flat instrucions on SI.

llvm-svn: 367512
2019-08-01 03:52:40 +00:00
Matt Arsenault 26cb53b260 AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADD
llvm-svn: 367509
2019-08-01 03:33:15 +00:00
Matt Arsenault da5b9bfa95 AMDGPU/GlobalISel: Allow selection of DS atomicrmw
llvm-svn: 367507
2019-08-01 03:29:01 +00:00
Matt Arsenault 3baf4d3418 AMDGPU/GlobalISel: Select simple local stores
llvm-svn: 367504
2019-08-01 03:09:15 +00:00
Matt Arsenault 3594011de0 AMDGPU/GlobalISel: Select local loads
llvm-svn: 367498
2019-08-01 00:53:38 +00:00
Matt Arsenault 0e7d8698b5 AMDGPU/GlobalISel: Don't assume instruction can be erased when selecting exts
The G_ANYEXT handling can end up reaching selectCOPY, which mutates
the instruction in place.

llvm-svn: 366915
2019-07-24 16:05:53 +00:00
Matt Arsenault 937d0ee5d8 AMDGPU/GlobalISel: Remove unnecessary code
The minnum/maxnum case are dead, and the cvt is handled by the
default.

llvm-svn: 366685
2019-07-22 13:05:25 +00:00
Matt Arsenault 7161fb0be5 AMDGPU/GlobalISel: Select private loads
llvm-svn: 366248
2019-07-16 19:22:21 +00:00
Matt Arsenault dad1f89210 AMDGPU/GlobalISel: Select flat stores
llvm-svn: 366246
2019-07-16 18:42:53 +00:00
Matt Arsenault 35c96598b1 AMDGPU/GlobalISel: Select flat loads
Now that the patterns use the new PatFrag address space support, the
only blocker to importing most load patterns is the addressing mode
complex patterns.

llvm-svn: 366237
2019-07-16 18:05:29 +00:00
Matt Arsenault 22c4a147a9 AMDGPU/GlobalISel: Fix test failures in release build
Apparently the check for legal instructions during instruction
select does not happen without an asserts build, so these would
successfully select in release, and fail in debug.

Make s16 and/or/xor legal. These can just be selected directly
to the 32-bit operation, as is already done in SelectionDAG, so just
make them legal.

llvm-svn: 366210
2019-07-16 14:28:30 +00:00
Matt Arsenault c8291c94f8 AMDGPU/GlobalISel: Select G_AND/G_OR/G_XOR
llvm-svn: 366121
2019-07-15 19:50:07 +00:00
Matt Arsenault ad19b50c00 AMDGPU/GlobalISel: Don't constrain source register of VCC copies
This is a hack until I come up with a better way of dealing with the
pseudo-register banks used for boolean values. If the use instruction
constrains the register, the selector for the def instruction won't
see that the bank was VCC. A 1-bit SReg_32 is could ambiguously have
been SCCRegBank or VCCRegBank in wave32.

This is necessary to successfully select branches with and and/or/xor
condition.

llvm-svn: 366120
2019-07-15 19:48:36 +00:00
Matt Arsenault e1b52f4180 AMDGPU/GlobalISel: Fix selecting vcc->vcc bank copies
The extra test change is correct, although how it arrives there is a
bug that needs work. With wave32, the test for isVCC ambiguously
reports true for an SCC or VCC source. A new allocatable pseudo
register class for SCC may be necesssary.

llvm-svn: 366119
2019-07-15 19:46:48 +00:00
Matt Arsenault 3bfdb54d88 AMDGPU/GlobalISel: Fix not constraining result reg of copies to VCC
llvm-svn: 366118
2019-07-15 19:45:49 +00:00
Matt Arsenault 18b7133843 AMDGPU/GlobalISel: Fix handling of sgpr (not scc bank) s1 to VCC
This was emitting a copy from a 32-bit register to a 64-bit.

llvm-svn: 366117
2019-07-15 19:44:07 +00:00
Matt Arsenault 5dfd466032 AMDGPU/GlobalISel: Fix G_ICMP for wave32
llvm-svn: 366114
2019-07-15 19:39:31 +00:00
Matt Arsenault 53fa759ff5 AMDGPU/GlobalISel: Handle llvm.amdgcn.if.break
llvm-svn: 366102
2019-07-15 18:25:24 +00:00
Matt Arsenault b390121efb AMDGPU/GlobalISel: Select llvm.amdgcn.end.cf
llvm-svn: 366099
2019-07-15 18:18:46 +00:00
Matt Arsenault a65913e752 AMDGPU/GlobalISel: Select easy cases for G_BUILD_VECTOR
llvm-svn: 366087
2019-07-15 17:26:43 +00:00
Matt Arsenault e6d10f97dd AMDGPU/GlobalISel: Select G_SUB
llvm-svn: 365484
2019-07-09 14:05:11 +00:00
Matt Arsenault 872f38be7e AMDGPU/GlobalISel: Select G_UNMERGE_VALUES
llvm-svn: 365483
2019-07-09 14:02:26 +00:00
Matt Arsenault 9b7ffc4e55 AMDGPU/GlobalISel: Select G_MERGE_VALUES
llvm-svn: 365482
2019-07-09 14:02:20 +00:00
Matt Arsenault 50be3481d4 AMDGPU/GlobalISel: Try generated matcher with intrinsics
llvm-svn: 364933
2019-07-02 14:52:16 +00:00
Matt Arsenault 70a4d3f67c AMDGPU/GlobalISel: Fix G_GEP with mixed SGPR/VGPR operands
The register bank for the destination of the sample argument copy was
wrong. We shouldn't be constraining each source to the result register
bank. Allow constraining the original register to the right size.

llvm-svn: 364928
2019-07-02 14:40:22 +00:00
Matt Arsenault ed63399244 AMDGPU/GlobalISel: Select G_FENCE
Manually select to workaround tablegen emitter emitting checks for
G_CONSTANT.

llvm-svn: 364927
2019-07-02 14:17:38 +00:00
Matt Arsenault 9e8e8c60fa AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsics
llvm-svn: 364835
2019-07-01 18:49:01 +00:00
Matt Arsenault a310727830 AMDGPU/GlobalISel: Fail instead of assert when selecting loads
llvm-svn: 364807
2019-07-01 16:36:39 +00:00
Matt Arsenault 0a52e9d026 AMDGPU/GlobalISel: Complete implementation of G_GEP
Also works around tablegen defect in selecting add with unused carry,
but if we have to manually select GEP, might as well handle add
manually.

llvm-svn: 364806
2019-07-01 16:34:48 +00:00
Matt Arsenault e1006259d8 AMDGPU/GlobalISel: Select G_PHI
llvm-svn: 364805
2019-07-01 16:32:47 +00:00
Tom Stellard 9e9dd30de3 AMDGPU/GlobalISel: Implement select for 32-bit G_ADD
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58804

llvm-svn: 364797
2019-07-01 16:09:33 +00:00
Matt Arsenault 2ab25f9ceb AMDGPU/GlobalISel: Select G_BRCOND for vcc
llvm-svn: 364795
2019-07-01 16:06:02 +00:00
Matt Arsenault cda82f0bb6 AMDGPU/GlobalISel: Select G_FRAME_INDEX
llvm-svn: 364789
2019-07-01 15:48:18 +00:00
Matt Arsenault fdf36729c7 AMDGPU/GlobalISel: Make s16 select legal
This is easy to handle and avoids legalization artifacts which are
likely to obscure combines.

llvm-svn: 364787
2019-07-01 15:42:47 +00:00
Matt Arsenault 6464280eb0 AMDGPU/GlobalISel: Select G_BRCOND for scc conditions
llvm-svn: 364786
2019-07-01 15:39:27 +00:00
Matt Arsenault 1daad91af6 AMDGPU/GlobalISel: Tolerate copies with no type set
isVCC has the same bug, but isn't used in a context where it can cause
a problem.

llvm-svn: 364784
2019-07-01 15:23:04 +00:00
Matt Arsenault 4f64ade04c AMDGPU/GlobalISel: Select src modifiers
llvm-svn: 364782
2019-07-01 15:18:56 +00:00
Matt Arsenault 89fc8bcdd6 AMDGPU/GlobalISel: Fail on store to 32-bit address space
llvm-svn: 364766
2019-07-01 13:37:39 +00:00
Matt Arsenault 3b7668ae4b AMDGPU/GlobalISel: Improve icmp selection coverage.
Select s64 eq/ne scalar icmp.

llvm-svn: 364765
2019-07-01 13:34:26 +00:00
Matt Arsenault 9f992c238a AMDGPU/GlobalISel: Fix scc->vcc copy handling
This was checking the size of the register with the value of the size,
which happens to be exec. Also fix assuming VCC is 64-bit to fix
wave32.

Also remove some untested handling for physical registers which is
skipped. This doesn't insert the V_CNDMASK_B32 if SCC is the physical
copy source. I'm not sure if this should be trying to handle this
special case instead of dealing with this in copyPhysReg.

llvm-svn: 364761
2019-07-01 13:22:07 +00:00
Matt Arsenault 5dafcb9b11 AMDGPU/GlobalISel: Use and instead of BFE with inline immediate
Zext from s1 is the only case where this should do anything with the
current legal extensions.

llvm-svn: 364760
2019-07-01 13:22:06 +00:00
Matt Arsenault d7ffa2a948 AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXT
llvm-svn: 364308
2019-06-25 13:18:11 +00:00
Matt Arsenault dbb6c03175 AMDGPU/GlobalISel: Select G_TRUNC
llvm-svn: 364215
2019-06-24 18:02:18 +00:00
Matt Arsenault f8a841b88e AMDGPU/GlobalISel: Fix selecting G_IMPLICIT_DEF for s1
Try to fail for scc, since I don't think that should ever be produced.

llvm-svn: 364199
2019-06-24 16:24:03 +00:00
Matt Arsenault fee1949b35 AMDGPU/GlobalISel: Account for multiple defs when finding intrinsic ID
llvm-svn: 363578
2019-06-17 17:01:27 +00:00
Tom Stellard 8b1c53b528 AMDGPU/GlobalISel: Implement select for G_ICMP and G_SELECT
Reviewers: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60640

llvm-svn: 363576
2019-06-17 16:27:43 +00:00
Stanislav Mekhanoshin a6322941ff [AMDGPU] gfx1010 VMEM and SMEM implementation
Differential Revision: https://reviews.llvm.org/D61330

llvm-svn: 359621
2019-04-30 22:08:23 +00:00
Tom Stellard 33634d1b25 AMDGPU/GlobalISel: Implement select for G_INSERT
Re-commit r344310.

Reviewers: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D53116

llvm-svn: 355159
2019-03-01 00:50:26 +00:00
Tom Stellard 41f32196a0 AMDGPU/GlobalISel: Implement select for G_EXTRACT
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49714

llvm-svn: 355156
2019-02-28 23:37:48 +00:00
Tom Stellard 79b5c3842b AMDGPU/GlobalISel: Move SMRD selection logic to TableGen
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52922

llvm-svn: 354516
2019-02-20 21:02:37 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Tom Stellard a894043910 Revert "AMDGPU/GlobalISel: Implement select for G_INSERT"
This reverts commit r344310.

The test case was failing on some bots.

llvm-svn: 344317
2018-10-11 23:36:46 +00:00
Tom Stellard 4733be6e7b AMDGPU/GlobalISel: Implement select for G_INSERT
Reviewers: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D53116

llvm-svn: 344310
2018-10-11 22:49:54 +00:00
Tom Stellard 7c65078f04 AMDGPU/GlobalISel: Add support for G_INTTOPTR
Summary: This is a no-op.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52916

llvm-svn: 343839
2018-10-05 04:34:09 +00:00
Matt Arsenault 0da6350dc8 AMDGPU: Remove remnants of old address space mapping
llvm-svn: 341165
2018-08-31 05:49:54 +00:00
Tom Stellard ac68471326 AMDGPU/GlobalISel: Implement select() for 32-bit @llvm.minnun and @llvm.maxnum
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D46172

llvm-svn: 337056
2018-07-13 22:16:03 +00:00
Tom Stellard 390a5f4774 AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.exp
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45882

llvm-svn: 337046
2018-07-13 21:05:14 +00:00
Tom Stellard 5bfbae5cb1 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

llvm-svn: 336851
2018-07-11 20:59:01 +00:00
Matt Arsenault b1cc4f52ff AMDGPU/GlobalISel: Add support for llvm.amdgcn.kernarg.segment.ptr
Note a normal select test is not currently possible because this
relies on input registers tracked in SIMachineFunctionInfo which
are not currently serializable in MIR, but this does work end-to-end
from the IR.

llvm-svn: 335490
2018-06-25 16:17:48 +00:00
Tom Stellard 6af7307650 AMDGPU/GlobalISel: Default to using TableGen'd instruction selector
Summary:
We can select all instructions that are marked as legal in a full piglit run,
so now is a good time to make the TableGen'd instruction selector default
for all opcodes.  This is NFC for a full piglit run, which is why there are
no tests.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48198

llvm-svn: 335319
2018-06-22 03:04:35 +00:00
Tom Stellard 26fac0f8e1 AMDGPU/GlobalISel: legalize and select 32-bit G_ASHR
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D48196

llvm-svn: 335318
2018-06-22 02:54:57 +00:00
Tom Stellard 9a6535718e AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFP
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48195

llvm-svn: 335316
2018-06-22 02:34:29 +00:00
Tom Stellard 7712ee8891 AMDGPU/GlobalISel: Implement select() for COPY
Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D46151

llvm-svn: 335315
2018-06-22 00:44:29 +00:00
Tom Stellard 3f1c6fe156 AMDGPU/GlobalISel: Implement select() for G_IMPLICIT_DEF
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D46150

llvm-svn: 335307
2018-06-21 23:38:20 +00:00
Tom Stellard a92847359a AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.cvt.pkrtz
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45907

llvm-svn: 334757
2018-06-14 19:26:37 +00:00
Tom Stellard 46bbbc33c0 AMDGPU/GlobalISel: Implement select() for 32-bit G_FADD and G_FMUL
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D46171

llvm-svn: 334665
2018-06-13 22:30:47 +00:00
Tom Stellard 44b30b4537 AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

llvm-svn: 332930
2018-05-22 02:03:23 +00:00
Tom Stellard a91ce17b5f AMDGPU/GlobalISel: Address post-commit review comments for r332379
MCRegisterInfo::getPhysRegSize() will be deprecated.

llvm-svn: 332856
2018-05-21 17:49:31 +00:00
Tom Stellard e182b28ae4 AMDGPU/GlobalISel: Implement select() for G_FCONSTANT
Summary: Also clean up G_CONSTANT selection.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D46170

llvm-svn: 332379
2018-05-15 17:57:09 +00:00
Tom Stellard 655fdd3f82 AMDGPU/GlobalISel: Implement select() for >32-bit G_STORE
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D46153

llvm-svn: 332154
2018-05-11 23:12:49 +00:00
Tom Stellard dcc95e9385 AMDGPU/GlobalISel: Implement select() for 32-bit G_FPTOUI
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45883

llvm-svn: 332082
2018-05-11 05:44:16 +00:00
Tom Stellard 1e0edad4bb AMDGPU/GlobalISel: Implement select() for G_BITCAST s32 <--> <2 x s16>
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45881

llvm-svn: 332042
2018-05-10 21:20:10 +00:00
Tom Stellard 1dc90204bf AMDGPU/GlobalISel: Enable TableGen'd instruction selector
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, mgorny, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45994

llvm-svn: 332039
2018-05-10 20:53:06 +00:00
Matt Arsenault 923712b6b5 Reapply "AMDGPU: Add 32-bit constant address space"
This reverts r324494 and reapplies r324487.

llvm-svn: 324747
2018-02-09 16:57:57 +00:00
Rafael Espindola f4e3f3e31c Revert "AMDGPU: Add 32-bit constant address space"
This reverts commit r324487.

It broke clang tests.

llvm-svn: 324494
2018-02-07 18:09:35 +00:00
Marek Olsak 871c30e540 AMDGPU: Add 32-bit constant address space
Note: This is a candidate for LLVM 6.0, because it was planned to be
      in that release but was delayed due to a long review period.

Merge conflict in release_60 - resolution:
    Add "-p6:32:32" into the second (non-amdgiz) string.

Only scalar loads support 32-bit pointers. An address in a VGPR will
fail to compile. That's OK because the results of loads will only be used
in places where VGPRs are forbidden.

Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC.
The tests cover all uses cases we need for Mesa.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D41651

llvm-svn: 324487
2018-02-07 16:01:00 +00:00
Aditya Nandakumar 18b3f9d384 [GISel] Make constrainSelectedInstRegOperands() available to the legalizer. NFC
https://reviews.llvm.org/D42149

llvm-svn: 322743
2018-01-17 19:31:33 +00:00
Daniel Sanders f76f315436 [globalisel][tablegen] Generate rule coverage and use it to identify untested rules
Summary:
This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.

This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.

Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler

Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
  step due to a lack of a portable 'cat' command. It should be the
  concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
  changes

Depends on D39742

Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka

Reviewed By: rovka

Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39747

llvm-svn: 318356
2017-11-16 00:46:35 +00:00
Matt Arsenault fd02314113 AMDGPU: Start adding offset fields to flat instructions
llvm-svn: 305194
2017-06-12 15:55:58 +00:00
Matt Arsenault 47ccafe787 AMDGPU: Remove tfe bit from flat instruction definitions
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.

Additionally actually using it requires changing the output register
class, which wasn't done anyway.

llvm-svn: 302814
2017-05-11 17:38:33 +00:00
Yaxun Liu 1a14bfa022 [AMDGPU] Get address space mapping by target triple environment
As we introduced target triple environment amdgiz and amdgizcl, the address
space values are no longer enums. We have to decide the value by target triple.

The basic idea is to use struct AMDGPUAS to represent address space values.
For address space values which are not depend on target triple, use static
const members, so that they don't occupy extra memory space and is equivalent
to a compile time constant.

Since the struct is lightweight and cheap, it can be created on the fly at
the point of usage. Or it can be added as member to a pass and created at
the beginning of the run* function.

Differential Revision: https://reviews.llvm.org/D31284

llvm-svn: 298846
2017-03-27 14:04:01 +00:00
Tom Stellard 124f5cc8c2 AMDGPU/SI: Fix inst-select-load-smrd.mir on some builds
Summary:
For some reason instructions are being inserted in the wrong order with some
builds.  I'm not sure why this is happening.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D29325

llvm-svn: 293639
2017-01-31 15:24:11 +00:00
Tom Stellard ca16621b2a Re-commit AMDGPU/GlobalISel: Add support for simple shaders
Fix build when global-isel is disabled and fix a warning.

Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.

Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm

Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D26730

llvm-svn: 293551
2017-01-30 21:56:46 +00:00
Tom Stellard 7a19d56f73 Revert "AMDGPU/GlobalISel: Add support for simple shaders"
This reverts commit r293503.

Revert while I investigate some of the buildbot failures.

llvm-svn: 293509
2017-01-30 17:42:41 +00:00
Tom Stellard e48f60aec8 AMDGPU/GlobalISel: Add support for simple shaders
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.

Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm

Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D26730

llvm-svn: 293503
2017-01-30 17:09:15 +00:00