Commit Graph

1680 Commits

Author SHA1 Message Date
Matt Arsenault ea956685a1 GlobalISel: Implement s32->s64 G_FPTOSI lowering
Port directly from DAG version.

The lowering for G_FPTOUI used to fail on AMDGPU because it uses
G_FPTOSI.
2020-01-30 08:47:07 -05:00
Dominik Montada dc141af755 [GlobalISel] (fix) Use pointer type size for offset constant when lowering stores
Commit 9965b12fd1 was supposed to change the offset constant when
lowering load/stores, but only introduced this change for loads. This
patch adds the same fix for stores.
2020-01-30 08:32:35 -05:00
Matt Arsenault c5fffa4da3 GlobalISel: Add observer argument to legalizeIntrinsic
This is passed to legalizeCustom, but not intrinsic. Also remove the
MRI argument, since you can get that from the MachineIRBuilder.

I'm not sure why MachineIRBuilder has a private observer member, and
this is passed separately.
2020-01-29 18:33:45 -05:00
Amara Emerson c12f046eb9 [GlobalISel] Add new combine to convert scalar G_MUL to G_SHL.
For pow2 constants we should use G_SHL for pattern matching (and perf)
purposes later.

Vector support not yet implemented.

Differential Revision: https://reviews.llvm.org/D73659
2020-01-29 13:39:00 -08:00
Amara Emerson 0da937bb5c [GlobalISel][IRTranslator] Follow convention and put constant offset of getelementptr arithmetic on RHS.
We were needlessly putting known constant values on the LHS of a G_MUL, which
is suboptimal.

Differential Revision: https://reviews.llvm.org/D73650
2020-01-29 11:37:19 -08:00
Matt Arsenault b63629a58d GlobalISel: Fix mask computation in lowerInsert
This is supposed to be the high bit index, not the width. Use the
wrapping form of getBitsSet and avoid the bitflip.
2020-01-29 08:25:36 -08:00
Matt Arsenault f717483acd GlobalISel: Assert on invalid bitcast in MIRBuilder
The other casts validate, so this should too.
2020-01-29 07:49:39 -08:00
Matt Arsenault c5c1bb3374 GlobalISel: Lower G_WRITE_REGISTER 2020-01-29 06:48:24 -08:00
David Stenberg 6a2413c435 [ARM64] Debug info for structure argument missing DW_AT_location
Summary:
Prevent eliminating dbg_val due to COPY.

Fixes this
https://bugs.llvm.org/show_bug.cgi?id=40709

Patch by: Kamlesh Kumar (kamleshbhalui)

Reviewers: aprantl, dblaikie, vsk, dsanders

Reviewed By: dsanders

Subscribers: dstenb, kristof.beyls, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D73159
2020-01-29 10:56:23 +01:00
Jay Foad cbbbd5b5f6 [GlobalISel] Make use of KnownBits::computeForAddSub
Summary:
This is mostly NFC. computeForAddSub may give more precise results in
some cases, but that doesn't seem to affect any existing GlobalISel
tests.

Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73431
2020-01-27 22:22:56 +00:00
Dominik Montada 9965b12fd1 Use pointer type size for offset constant when lowering load/stores 2020-01-27 06:55:32 -08:00
Matt Arsenault 2a160ba5b0 GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results
Only use shifts if the requested type exactly matches the source type,
and create sub-unmerges otherwise.
2020-01-27 06:18:26 -08:00
Matt Arsenault 06d9230fef GlobalISel: Translate vector GEPs 2020-01-27 05:35:05 -08:00
Petar Avramovic cbf03aee6d [MIPS GlobalISel] Select population count (popcount)
G_CTPOP is generated from llvm.ctpop.<type> intrinsics, clang generates
these intrinsics from __builtin_popcount and __builtin_popcountll.
Add lower and narrow scalar for G_CTPOP.
Lower G_CTPOP for MIPS32.

Differential Revision: https://reviews.llvm.org/D73216
2020-01-27 09:59:50 +01:00
Petar Avramovic 8bc7ba5b9e [MIPS GlobalISel] Select count trailing zeros
llvm.cttz.<type> intrinsic has additional i1 argument is_zero_undef,
it tells whether zero as the first argument produces a defined result.
G_CTTZ is generated from llvm.cttz.<type> (<type> <src>, i1 false)
intrinsics, clang generates these intrinsics from __builtin_ctz and
__builtin_ctzll.
G_CTTZ_ZERO_UNDEF comes from llvm.cttz.<type> (<type> <src>, i1 true).
Clang generates such intrinsics as parts of expansion of builtin_ffs
and builtin_ffsll. It is also traditionally part of and many
algorithms that are now predicated on avoiding zero-value inputs.

Add narrow scalar (algorithm uses G_CTTZ_ZERO_UNDEF) for G_CTTZ.
Lower G_CTTZ and G_CTTZ_ZERO_UNDEF for MIPS32.

Differential Revision: https://reviews.llvm.org/D73215
2020-01-27 09:51:06 +01:00
Petar Avramovic 2b66d32f3f [MIPS GlobalISel] Select count leading zeros
llvm.ctlz.<type> intrinsic has additional i1 argument is_zero_undef,
it tells whether zero as the first argument produces a defined result.
MIPS clz instruction returns 32 for zero input.
G_CTLZ is generated from llvm.ctlz.<type> (<type> <src>, i1 false)
intrinsics, clang generates these intrinsics from __builtin_clz and
__builtin_clzll.
G_CTLZ_ZERO_UNDEF can also be generated from llvm.ctlz with true as
second argument. It is also traditionally part of and many algorithms
that are now predicated on avoiding zero-value inputs.

Add narrow scalar for G_CTLZ (algorithm uses G_CTLZ_ZERO_UNDEF).
Lower G_CTLZ_ZERO_UNDEF and select G_CTLZ for MIPS32.

Differential Revision: https://reviews.llvm.org/D73214
2020-01-27 09:43:38 +01:00
Quentin Colombet 5d87b5d202 [GISelKnownBits] Add support for PHIs
Teach the GISelKnowBits analysis how to deal with PHI operations.
PHIs are essentially COPYs happening on edges, so we can just reuse
the code for COPY.

This is NFC COPY-wise has we leave Depth untouched when calling
computeKnownBitsImpl for COPYs, like it was before this patch.
Increasing Depth is however required for PHIs as they may loop back to
themselves and we would end up in an infinite loop if we were not
increasing Depth.

Differential Revision: https://reviews.llvm.org/D73317
2020-01-24 16:43:52 -08:00
Guillaume Chatelet 805c157e8a [Alignment][NFC] Deprecate Align::None()
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.

Reviewers: xbolva00, courbet, bollu

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73099
2020-01-24 12:53:58 +01:00
Jay Foad b482e1bfe2 [CodeGen] Make use of MachineInstrBuilder::getReg
Reviewers: arsenm

Subscribers: wdng, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73262
2020-01-23 13:38:13 +00:00
Quentin Colombet ff1f3cc1a1 [GISelKnownBits] Make the max depth a parameter of the analysis
Allow users of that analysis to define the cut off depth of the
analysis instead of hardcoding 6.

NFC as the default parameter is 6.
2020-01-21 11:35:31 -08:00
Matt Arsenault a66d2817ca GlobalISel: Don't ignore requested ext narrowing type
This was assuming the narrow target was the source type. Respect the
requested type when these don't match by using intermediate
merges. This avoids producing very wide, illegal shift expansions.
2020-01-16 14:29:37 -05:00
Matt Arsenault be31a7b7ee GlobalISel: Move extension scalar narrowing to separate function
Also rename a few things. Handling a different requested type will
require this to become much more complex.
2020-01-16 14:29:37 -05:00
Matt Arsenault d0943537e1 GlobalISel: Apply target MMO flags to atomics
Unify MMO flag handling with SelectionDAG like with loads and stores.
2020-01-16 13:49:43 -05:00
Matt Arsenault 0d0fce42b0 GlobalISel: Preserve load/store metadata in IRTranslator
This was dropping the invariant metadata on dead argument loads, so
they weren't deleted.

Atomics still need to be fixed the same way. Also, apparently store
was never preserving dereferencable which should also be fixed.
2020-01-16 13:49:43 -05:00
Jay Foad 885260d5d8 [GlobalISel] Don't arbitrarily limit a mask to 64 bits
Reviewers: arsenm

Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72853
2020-01-16 16:13:20 +00:00
Jay Foad 63f73545dd [GlobalISel] Pass MachineOperands into MachineIRBuilder helper methods
Reviewers: arsenm, aditya_nandakumar, aemerson

Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72849
2020-01-16 16:04:21 +00:00
Jay Foad 28bb43bdf8 [GlobalISel] Use more MachineIRBuilder helper methods
Reviewers: arsenm, nhaehnle

Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72833
2020-01-16 15:34:51 +00:00
Matt Arsenault 25e9938a45 GlobalISel: Handle more cases of G_SEXT narrowing
This now develops the same problem G_ZEXT/G_ANYEXT have where the
requested type is assumed to be the source type. This will be fixed
separately by creating intermediate merges.
2020-01-15 18:33:15 -05:00
Matt Arsenault 936483fb7d GlobalISel: Implement lower for G_BITCAST
Bitcast only really applies between scalars and vectors. Implement as
an unmerge and remerge. The test needs to tolerate failure since one
of the unmerges currently fails to legalize.
2020-01-15 08:58:58 -05:00
Matt Arsenault 91715617ad GlobalISel: Fix narrowScalar for G_ANYEXT results
This is nearly the same as G_ZEXT.
2020-01-15 08:58:57 -05:00
Eli Friedman e68e4cbcc5 [GlobalISel] Change representation of shuffle masks in MachineOperand.
We're planning to remove the shufflemask operand from ShuffleVectorInst
(D72467); fix GlobalISel so it doesn't depend on that Constant.

The change to prelegalizercombiner-shuffle-vector.mir happens because
the input contains a literal "-1" in the mask (so the parser/verifier
weren't really handling it properly). We now treat it as equivalent to
"undef" in all contexts.

Differential Revision: https://reviews.llvm.org/D72663
2020-01-13 16:55:41 -08:00
Matt Arsenault 0ea3c7291f GlobalISel: Handle llvm.read_register
Compared to the attempt in bdcc6d3d26,
this uses intermediate generic instructions.
2020-01-09 17:37:52 -05:00
Matt Arsenault 595ac8c46e GlobalISel: Move getLLTForMVT/getMVTForLLT
As an intermediate step, some TLI functions can be converted to using
LLT instead of MVT. Move this somewhere out of GlobalISel so DAG
functions can use these.
2020-01-09 16:32:51 -05:00
Matt Arsenault fba1fbb9c7 GlobalISel: Don't assert on MoreElements creating vectors
If the original type was a scalar, it should be valid to add elements
to turn it into a vector.

Tests included with following legalization change.
2020-01-09 16:29:44 -05:00
Amara Emerson b6598bcf4b [AArch64][GlobalISel] Fold a chain of two G_PTR_ADDs of constant offsets.
E.g.
%addr1 = G_PTR_ADD %base, G_CONSTANT 20
%addr2 = G_PTR_ADD %addr1, G_CONSTANT 8
  -->
%addr2 = G_PTR_ADD %base, G_CONSTANT 28

Differential Revision: https://reviews.llvm.org/D72351
2020-01-07 14:12:42 -08:00
Matt Arsenault f3de8ab5cc GlobalISel: Implement lower for G_INTRINSIC_ROUND
Mostly copied from AMDGPU lowering implementation, except used
G_SITOFP instead of directly creating a select on -1.0, 0.0.
2020-01-06 18:26:42 -05:00
Matt Arsenault 1060b9e23b GlobalISel: Correct result type for G_FCMP in lowerFPTOUI
Using the final result type doesn't make any sense. Use the natural
default boolean type for the select condition.
2020-01-06 17:21:51 -05:00
Matt Arsenault 0b093f0212 GlobalISel: Start adding computeNumSignBits to GISelKnownBits 2020-01-06 17:21:51 -05:00
Matt Arsenault d12f2a2998 GlobalISel: Scalarize all division operations
This only handled G_SDIV, but they all are trivially scalarizable.

Also define placeholder AMDGPU division legalizer rules.
2020-01-04 13:47:10 -05:00
Matt Arsenault 1f950ced50 GlobalISel: Define G_READCYCLECOUNTER 2020-01-04 13:10:19 -05:00
Matt Arsenault 21309eafde GlobalISel: Add type argument to getRegBankFromRegClass
AMDGPU can't unambiguously go back from the selected instruction
register class to the register bank without knowing if this was used
in a boolean context.
2020-01-03 16:25:10 -05:00
Reid Kleckner 9c2b72821b Move tail call disabling code to target independent code
When the "disable-tail-calls" attribute was added, checks were added for
it in various backends. Now this code has proliferated, and it is
something the target is responsible for checking. Move that
responsibility back to the ISels (fast, global, and SD).

There's no major functionality change, except for targets that never
implemented this check.

This LLVM attribute was originally added in
d9699bc7bd (2015).

Reviewers: echristo, MaskRay

Differential Revision: https://reviews.llvm.org/D72118
2020-01-03 11:27:41 -08:00
Petar Avramovic 98f72a5107 [MIPS GlobalISel] Select bitreverse. Recommit
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang genrates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.

Recommit notes:
Introduce temporary variables in order to make sure
instructions get inserted into MachineFunction in same order
regardless of compiler used to build llvm.

Differential Revision: https://reviews.llvm.org/D71363
2019-12-30 18:06:29 +01:00
Matt Arsenault 9fd31fdbd3 GlobalISel: moreElementsVector for FP min/max 2019-12-30 10:39:53 -05:00
Dmitri Gribenko 32cc14100e Revert "[MIPS GlobalISel] Select bitreverse"
This reverts commit dbc136e0fe.
It broke buildbots:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066
2019-12-30 14:29:47 +01:00
Petar Avramovic dbc136e0fe [MIPS GlobalISel] Select bitreverse
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang genrates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.

Differential Revision: https://reviews.llvm.org/D71363
2019-12-30 11:26:45 +01:00
Petar Avramovic 94a24e7a40 [MIPS GlobalISel] Select bswap
G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang genrates
these intrinsics from __builtin_bswap32 and __builtin_bswap64.
Add lower and narrowscalar for G_BSWAP.
Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later.

Differential Revision: https://reviews.llvm.org/D71362
2019-12-30 11:13:22 +01:00
Matt Arsenault 0d47399167 GlobalISel: Update syntax in debug printing
Physical register names now start with $, not %
2019-12-24 10:37:36 -05:00
Matt Arsenault 9b61641564 GlobalISel: Fix naming variables "brank" instead of "bank" 2019-12-24 10:36:54 -05:00
Martin Storsjö 5a751e747d [AArch64] [Windows] Use COFF stubs for calls to extern_weak functions
As the extern_weak target might be missing, resolving to the absolute
address zero, we can't use the normal direct PC-relative branch
instructions (as that would result in relocations out of range).

Improve the classifyGlobalFunctionReference method to set
MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in
AArch64TargetLowering::LowerCall to use the return value from
classifyGlobalFunctionReference for these cases.

Add code in both AArch64FastISel and GlobalISel/IRTranslator to
bail out for function calls to extern weak functions on windows,
to let SelectionDAG handle them.

This matches what was done for X86 in 6bf108d77a.

Differential Revision: https://reviews.llvm.org/D71721
2019-12-23 12:13:49 +02:00
Daniel Sanders c3cb089a87 [gicombiner] Import tryCombineIndexedLoadStore()
Summary:
Now that arbitrary data is supported, import tryCombineIndexedLoadStore()

Depends on D69147

Reviewers: bogner, volkan

Reviewed By: volkan

Subscribers: hiraditya, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69151
2019-12-18 14:41:38 +00:00
Roman Tereshin 8731799fc6 [Legalizer] Making artifact combining order-independent
Legalization algorithm is complicated by two facts:
1) While regular instructions should be possible to legalize in
   an isolated, per-instruction, context-free manner, legalization
   artifacts can only be eliminated in pairs, which could be deeply, and
   ultimately arbitrary nested: { [ () ] }, where which paranthesis kind
   depicts an artifact kind, like extend, unmerge, etc. Such structure
   can only be fully eliminated by simple local combines if they are
   attempted in a particular order (inside out), or alternatively by
   repeated scans each eliminating only one innermost pair, resulting in
   O(n^2) complexity.
2) Some artifacts might in fact be regular instructions that could (and
   sometimes should) be legalized by the target-specific rules. Which
   means failure to eliminate all artifacts on the first iteration is
   not a failure, they need to be tried as instructions, which may
   produce more artifacts, including the ones that are in fact regular
   instructions, resulting in a non-constant number of iterations
   required to finish the process.

I trust the recently introduced termination condition (no new artifacts
were created during as-a-regular-instruction-retrial of artifacts not
eliminated on the previous iteration) to be efficient in providing
termination, but only performing the legalization in full if and only if
at each step such chains of artifacts are successfully eliminated in
full as well.

Which is currently not guaranteed, as the artifact combines are applied
only once and in an arbitrary order that has to do with the order of
creation or insertion of artifacts into their worklist, which is a no
particular order.

In this patch I make a small change to the artifact combiner, making it
to re-insert into the worklist immediate (modulo a look-through copies)
artifact users of each vreg that changes its definition due to an
artifact combine.

Here the first scan through the artifacts worklist, while not
being done in any guaranteed order, only needs to find the innermost
pair(s) of artifacts that could be immediately combined out. After that
the process follows def-use chains, making them shorter at each step, thus
combining everything that can be combined in O(n) time.

Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders

Reviewed By: aditya_nandakumar, paquette

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71448
2019-12-13 15:45:18 -08:00
Roman Tereshin 18bf9670aa [Legalizer] Refactoring out legalizeMachineFunction
and introducing new unittests/CodeGen/GlobalISel/LegalizerTest.cpp
relying on it to unit test the entire legalizer algorithm (including the
top-level main loop).

See also https://reviews.llvm.org/D71448
2019-12-13 15:45:18 -08:00
Roman Tereshin 8207c81597 [Legalizer] More detailed debugging printing in main loop 2019-12-13 15:45:18 -08:00
Kiran Chandramohan 965ed1e974 [AArch64] Fix issues with large arrays on stack
Summary:
This patch fixes a few issues when large arrays are allocated on the
stack. Currently, clang has inconsistent behaviour, for debug builds
there is an assertion failure when the array size on stack is around 2GB
but there is no assertion when the stack is around 8GB. For release
builds there is no assertion, the compilation succeeds but generates
incorrect code. The incorrect code generated is due to using
int/unsigned int instead of their 64-bit counterparts. This patch,
1) Removes the assertion in frame legality check.
2) Converts int/unsigned int in some places to the 64-bit variants. This
helps in generating correct code and removes the inconsistent behaviour.
3) Adds a test which runs without optimisations.

Reviewers: sdesmalen, efriedma, fhahn, aemerson

Reviewed By: efriedma

Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya,
llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70496
2019-12-10 11:44:41 +00:00
Volkan Keles bfa3d260b8 [GlobalISel] Localizer: Allow targets not to run the pass conditionally
Summary:
Previously, it was not possible to skip running the localizer pass
conditionally. This patch adds an input function to the pass which
decides if the pass should run on the given MachineFunction or not.

No test case as there is no upstream target needs this functionality.

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71038
2019-12-05 11:09:50 -08:00
Amara Emerson 28f5ad5801 [GlobalISel] Fix compiler crash lowering G_LOAD in AArch64.
Patch by Daniel Rodríguez Troitiño.

Differential Revision: https://reviews.llvm.org/D70794
2019-12-04 17:04:54 -08:00
Aditya Nandakumar 6da7dbb806 [GlobalISel]: Allow targets to override how to widen constants during legalization
https://reviews.llvm.org/D70922

This adds a hook to allow targets to define exactly what extension
operation should be performed for widening constants. This handles cases
like widening i1 true which would end up becoming -1 which affects code
quality during combines.
Additionally, in order to stay consistent with how DAG is promoting
constants, we now signextend for byte sized types and zero extend
otherwise (by default). Targets can of course override this if
necessary.
2019-12-03 10:41:10 -08:00
Volkan Keles 3d02fa6da7 [GlobalISel] CombinerHelper: Fix a bug in matchCombineCopy
Summary:
When combining COPY instructions, we were replacing the destination registers
with the source register without checking register constraints. This patch adds
a simple logic to check if the constraints match before replacing registers.

Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic

Reviewed By: aditya_nandakumar

Subscribers: rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70616
2019-12-02 12:05:09 -08:00
Tom Stellard ab411801b8 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Matt Arsenault b696b9dba7 DAG: Add function context to isFMAFasterThanFMulAndFAdd
AMDGPU needs to know the FP mode for the function to answer this
correctly when this is removed from the subtarget.

AArch64 had to make this more complicated by using this from an IR
hook, so add an IR typed overload.
2019-11-19 19:25:26 +05:30
Quentin Colombet 98ceac4981 [GISel][CombinerHelper] Use uses() instead of operands() when traversing use operands.
NFC
2019-11-15 13:54:33 -08:00
Quentin Colombet 304abde077 [GISel][CombinerHelper] Add support for scalar type for the result of shuffle vector
LLVM IR of 1-element vectors get lower into scalar in GISel. As a
result, shuffle vector may also produce a scalar.

This patch teaches the shuffle combiner how to deal with scalars when
they are in the destination type of a shuffle vector.

For now, we just support the easy case where this can be lowered to
a plain copy. For other cases, we leave the shuffle vector as is.

This type of IR are seen in O0 pipelines. E.g., as produced with
SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c.

rdar://problem/57198904
2019-11-15 13:54:33 -08:00
Matt Arsenault bc276c6379 GlobalISel: Lower s1 source G_SITOFP/G_UITOFP 2019-11-15 13:37:20 +05:30
Daniel Sanders b2839c442e [globalisel][irtanslator] The IRTranslator should preserve TBAA information 2019-11-14 12:11:27 -08:00
Reid Kleckner 05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Daniel Sanders e74c5b9661 [globalisel] Rename G_GEP to G_PTR_ADD
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD

Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm

Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69734
2019-11-05 10:31:17 -08:00
Hiroshi Yamauchi 0d987e411a [PGO][PGSO] TargetLowering/TargetTransformationInfo/SwitchLoweringUtils part.
Summary:
(Split of off D67120)

TargetLowering/TargetTransformationInfo/SwitchLoweringUtils changes for profile
guided size optimization.

Reviewers: davidxl

Subscribers: eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69580
2019-10-31 13:22:56 -07:00
Quentin Colombet f0eeb3c7a7 [GISel][CombinerHelper] Combine shuffle_vector scalar to build_vector
Teach the combiner helper how to replace shuffle_vector of scalars
into build_vector.
I am not particularly happy about having to add this combine, but we
currently get those from <1 x iN> from the IR.

Bonus: This fixes an assert in the shuffle_vector combines since before
this patch, we were expecting vector types.
2019-10-30 18:20:37 -07:00
Greg Bedwell b1c4b4d5cb Fix a spelling mistake in a comment. NFC 2019-10-29 12:19:52 +00:00
Andrew Paverd d157a9bc8b Add Windows Control Flow Guard checks (/guard:cf).
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.

Reviewers: thakis, rnk, theraven, pcc

Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65761
2019-10-28 15:19:39 +00:00
Matt Arsenault 1a276d1e8c GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT 2019-10-25 13:55:07 -07:00
Austin Kerbow c35b358b74 AMDGPU/GlobalISel: Legalize FDIV16
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69347
2019-10-25 11:07:17 -07:00
Quentin Colombet 6f0ae81512 [GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectors
Teach the CombinerHelper how to turn shuffle_vectors, that
concatenate vectors, into concat_vectors and add this combine
to the AArch64 pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D69149

llvm-svn: 375452
2019-10-21 20:39:58 +00:00
Guillaume Chatelet 5df90cd71c [Alignment][NFC] TargetCallingConv::setByValAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69248

llvm-svn: 375410
2019-10-21 12:05:33 +00:00
Guillaume Chatelet bac5f6bd21 [Alignment][NFC] TargetCallingConv::setOrigAlign and TargetLowering::getABIAlignmentForCallingConv
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69243

llvm-svn: 375407
2019-10-21 11:01:55 +00:00
Reid Kleckner 1d7b41361f Prune two MachineInstr.h includes, fix up deps
MachineInstr.h included AliasAnalysis.h, which includes a world of IR
constructs mostly unneeded in CodeGen. Prune it. Same for
DebugInfoMetadata.h.

Noticed with -ftime-trace.

llvm-svn: 375311
2019-10-19 00:22:07 +00:00
Daniel Sanders 149a020425 Fix unused variable in r375066
llvm-svn: 375070
2019-10-17 01:21:40 +00:00
Daniel Sanders 329e748c8c [gicombiner] Add the run-time rule disable option
Summary:
Each generated helper can be configured to generate an option that disables
rules in that helper. This can be used to bisect rulesets.

The disable bits are stored in a SparseVector as this is very cheap for the
common case where nothing is disabled. It gets more expensive the more rules
are disabled but you're generally doing that for debug purposes where
performance is less of a concern.

Depends on D68426

Reviewers: volkan, bogner

Reviewed By: volkan

Subscribers: hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68438

llvm-svn: 375067
2019-10-17 00:37:04 +00:00
Quentin Colombet c319afc903 [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector
Teach the combiner helper how to flatten concat_vectors of build_vectors
into a build_vector.

Add this combine as part of AArch64 pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D69071

llvm-svn: 375066
2019-10-17 00:34:32 +00:00
Daniel Sanders ec5208fd65 [gicombiner] Hoist pure C++ combine into the tablegen definition
Summary:
This is just moving the existing C++ code around and will be NFC w.r.t
AArch64. Renamed 'CombineBr' to something more descriptive
('ElideByByInvertingCond') at the same time.

The remaining combines in AArch64PreLegalizeCombiner require features that
aren't implemented at this point and will be hoisted as they are added.

Depends on D68424

Reviewers: bogner, volkan

Subscribers: kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68426

llvm-svn: 375057
2019-10-16 23:53:35 +00:00
Matt Arsenault 34ed76e180 GlobalISel: Implement lower for G_SADDO/G_SSUBO
Port directly from SelectionDAG, minus the path using
ISD::SADDSAT/ISD::SSUBSAT.

llvm-svn: 375042
2019-10-16 20:46:32 +00:00
Simon Pilgrim e2163f96ab CombinerHelper - silence dead assignment warnings. NFCI.
Copy the NewAlignment value to Alignment first and then use that to update the stack frame object alignments.

llvm-svn: 375019
2019-10-16 17:21:50 +00:00
Jeremy Morse ed29dbaafa [DebugInfo] Remove some users of DBG_VALUEs IsIndirect field
This patch kills off a significant user of the "IsIndirect" field of
DBG_VALUE machine insts. Brought up in in PR41675, IsIndirect is
techncally redundant as it can be expressed by the DIExpression of a
DBG_VALUE inst, and it isn't helpful to have two ways of expressing
things.

Rather than setting IsIndirect, have DBG_VALUE creators add an extra deref
to the insts DIExpression. There should now be no appearences of
IsIndirect=True from isel down to LiveDebugVariables / VirtRegRewriter,
which is ensured by an assertion in LDVImpl::handleDebugValue. This means
we also get to delete the IsIndirect handling in LiveDebugVariables. Tests
can be upgraded by for example swapping the following IsIndirect=True
DBG_VALUE:

  DBG_VALUE $somereg, 0, !123, !DIExpression(DW_OP_foo)

With one where the indirection is in the DIExpression, by _appending_
a deref:

  DBG_VALUE $somereg, $noreg, !123, !DIExpression(DW_OP_foo, DW_OP_deref)

Which both mean the same thing. 

Most of the test changes in this patch are updates of that form; also some
changes in how the textual assembly printer handles these insts.

Differential Revision: https://reviews.llvm.org/D68945

llvm-svn: 374877
2019-10-15 10:46:24 +00:00
Joerg Sonnenberger 9681ea9560 Reapply r374743 with a fix for the ocaml binding
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374784
2019-10-14 16:15:14 +00:00
Dmitri Gribenko 1a21f98ac3 Revert "Add a pass to lower is.constant and objectsize intrinsics"
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218

llvm-svn: 374768
2019-10-14 12:22:48 +00:00
Joerg Sonnenberger e4300c392d Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374743
2019-10-13 23:00:15 +00:00
Simon Pilgrim 944a051ebb IRTranslator - silence static analyzer null dereference warnings. NFCI.
The CmpInst::getType() calls can be replaced by just using User::getType() that it was dyn_cast from, and we then need to assert that any default predicate cases came from the CmpInst.

llvm-svn: 374716
2019-10-13 11:29:35 +00:00
Quentin Colombet 9c36ec5941 [GISel][CallLowering] Enable vector support in argument lowering
The exciting code is actually already enough to handle the splitting
of vector arguments but we were lacking a test case.

This commit adds a test case for vector argument lowering involving
splitting and enable the related support in call lowering.

llvm-svn: 374589
2019-10-11 20:22:57 +00:00
Quentin Colombet 7720f11498 [MachineIRBuilder] Fix an assertion failure with buildMerge
Teach buildMerge how to deal with scalar to vector kind of requests.

Prior to this patch, buildMerge would issue either a G_MERGE_VALUES
when all the vregs are scalars or a G_CONCAT_VECTORS when the destination
vreg is a vector.
G_CONCAT_VECTORS was actually not the proper instruction when the source
vregs were scalars and the compiler would assert that the sources must
be vectors. Instead we want is to issue a G_BUILD_VECTOR when we are
in this situation.

This patch fixes that.

llvm-svn: 374588
2019-10-11 20:22:47 +00:00
Marcello Maggioni a064edf55e [GISel] Simplifying return from else in function. NFC
Forgot to integrate this little change in previous commit

llvm-svn: 374463
2019-10-10 21:51:30 +00:00
Marcello Maggioni 0112123eea [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values.
In GISel we have both G_CONSTANT and G_FCONSTANT, but because
in GISel we don't really have a concept of Float vs Int value
the only difference between the two is where the data originates
from.

What both G_CONSTANT and G_FCONSTANT return is just a bag of bits
with the constant representation in it.

By making getConstantVRegVal() return G_FCONSTANTs bit representation
as well we allow ConstantFold and other things to operate with
G_FCONSTANT.

Adding tests that show ConstantFolding to work on mixed G_CONSTANT
and G_FCONSTANT sources.

Differential Revision: https://reviews.llvm.org/D68739

llvm-svn: 374458
2019-10-10 21:46:26 +00:00
Guillaume Chatelet ff054b9e32 [Alignment][NFC] Use llv::Align in GISelKnownBits
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68786

llvm-svn: 374369
2019-10-10 15:38:22 +00:00
Matt Arsenault 3cd3959fe2 GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR
Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR.

llvm-svn: 374252
2019-10-09 22:44:43 +00:00
Matt Arsenault 4bcdcad91b GlobalISel: Partially implement lower for G_INSERT
llvm-svn: 373946
2019-10-07 19:13:27 +00:00
Matt Arsenault 27269054d2 GlobalISel: Add target pre-isel instructions
Allows targets to introduce regbankselectable
pseudo-instructions. Currently the closet feature to this is an
intrinsic. However this requires creating a public intrinsic
declaration. This litters the public intrinsic namespace with
operations we don't necessarily want to expose to IR producers, and
would rather leave as private to the backend.

Use a new instruction bit. A previous attempt tried to keep using enum
value ranges, but it turned into a mess.

llvm-svn: 373937
2019-10-07 18:43:29 +00:00
Jordan Rose fdaa742174 Second attempt to add iterator_range::empty()
Doing this makes MSVC complain that `empty(someRange)` could refer to
either C++17's std::empty or LLVM's llvm::empty, which previously we
avoided via SFINAE because std::empty is defined in terms of an empty
member rather than begin and end. So, switch callers over to the new
method as it is added.

https://reviews.llvm.org/D68439

llvm-svn: 373935
2019-10-07 18:14:24 +00:00
Matt Arsenault a5b9c75674 GlobalISel: Partially implement lower for G_EXTRACT
Turn into shift and truncate. Doesn't yet handle pointers.

llvm-svn: 373838
2019-10-06 01:37:35 +00:00
Dmitri Gribenko 827a7fab78 Revert "GlobalISel: Handle llvm.read_register"
This reverts commit r373294. It broke Clang's
CodeGen/arm64-microsoft-status-reg.cpp:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18483

llvm-svn: 373310
2019-10-01 08:24:01 +00:00
Matt Arsenault bdcc6d3d26 GlobalISel: Handle llvm.read_register
SelectionDAG has a bunch of machinery to defer this to selection time
for some reason. Just directly emit a copy during IRTranslator. The
x86 usage does somewhat questionably check hasFP, which could depend
on the whole function being at minimum translated.

This does lose the convergent bit if the callsite had it, which may be
a problem. We also lose that in general for intrinsics, which may also
be a problem.

llvm-svn: 373294
2019-10-01 02:07:16 +00:00
Matt Arsenault ed85b0cee6 GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sources
Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU.

llvm-svn: 373287
2019-10-01 01:06:48 +00:00
Daniel Sanders cbe13a1461 [globalisel][knownbits] Allow targets to call GISelKnownBits::computeKnownBitsImpl()
Summary:
It seems we missed that the target hook can't query the known-bits for the
inputs to a target instruction. Fix that oversight

Reviewers: aditya_nandakumar

Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67380

llvm-svn: 373264
2019-09-30 20:55:53 +00:00
Jessica Paquette b1c1095fdc [AArch64][GlobalISel] Support lowering variadic musttail calls
This adds support for lowering variadic musttail calls. To do this, we have
to...

- Detect a musttail call in a variadic function before attempting to lower the
  call's formal arguments. This is done in the IRTranslator.
- Compute forwarded registers in `lowerFormalArguments`, and add copies for
  those registers.
- Restore the forwarded registers in `lowerTailCall`.

Because there doesn't seem to be any nice way to wrap these up into the outgoing
argument handler, the restore code in `lowerTailCall` is done separately.

Also, irritatingly, you have to make sure that the registers don't overlap with
any passed parameters. Otherwise, the scheduler doesn't know what to do with the
extra copies and asserts.

Add call-translator-variadic-musttail.ll to test this. This is pretty much the
same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to
base this off of, but the idea is the same.

Differential Revision: https://reviews.llvm.org/D68043

llvm-svn: 373226
2019-09-30 16:49:13 +00:00
Amara Emerson 509a4947c9 Add an operand to memory intrinsics to denote the "tail" marker.
We need to propagate this information from the IR in order to be able to safely
do tail call optimizations on the intrinsics during legalization. Assuming
it's safe to do tail call opt without checking for the marker isn't safe because
the mem libcall may use allocas from the caller.

This adds an extra immediate operand to the end of the intrinsics and fixes the
legalizer to handle it.

Differential Revision: https://reviews.llvm.org/D68151

llvm-svn: 373140
2019-09-28 05:33:21 +00:00
Guillaume Chatelet 18f805a7ea [Alignment][NFC] Remove unneeded llvm:: scoping on Align types
llvm-svn: 373081
2019-09-27 12:54:21 +00:00
Jessica Paquette 8535a8672e [AArch64][GlobalISel] Choose CCAssignFns per-argument for tail call lowering
When checking for tail call eligibility, we should use the correct CCAssignFn
for each argument, rather than just checking if the caller/callee is varargs or
not.

This is important for tail call lowering with varargs. If we don't check it,
then basically any varargs callee with parameters cannot be tail called on
Darwin, for one thing. If the parameters are all guaranteed to be in registers,
this should be entirely safe.

On top of that, not checking for this could potentially make it so that we have
the wrong stack offsets when checking for tail call eligibility.

Also refactor some of the stuff for CCAssignFnForCall and pull it out into a
helper function.

Update call-translator-tail-call.ll to show that we can now correctly tail call
on Darwin. Also add two extra tail call checks. The first verifies that we still
respect the caller's stack size, and the second verifies that we still don't
tail call when a varargs function has a memory argument.

Differential Revision: https://reviews.llvm.org/D67939

llvm-svn: 372897
2019-09-25 16:45:35 +00:00
Amara Emerson adec1209e6 [GlobalISel][IRTranslator] Fix switch table lowering to use signed LE not unsigned.
We were miscompiling switch value comparisons with the wrong signedness, which
shows up when we have things like switch case values with i1 types, which end up
being legalized incorrectly.

Fixes PR43383

llvm-svn: 372675
2019-09-24 00:09:23 +00:00
Guillaume Chatelet 1ae7905fc8 [Alignment][NFC] DataLayout migration to llvm::Align
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67914

llvm-svn: 372596
2019-09-23 12:41:36 +00:00
Simon Pilgrim f6f6c6ca3b Localizer - fix "variable used but never read" analyzer warning. NFCI.
Simplify the code by separating the modification of the Changed variable from returning it.

llvm-svn: 372583
2019-09-23 11:38:10 +00:00
Amara Emerson 7ac1039957 [GlobalISel] Defer setting HasCalls on MachineFrameInfo to selection time.
We currently always set the HasCalls on MFI during translation and legalization if
we're handling a call or legalizing to a libcall. However, if that call is later
optimized to a tail call then we don't need the flag. The flag being set to true
causes frame lowering to always save and restore FP/LR, which adds unnecessary code.

This change does the same thing as SelectionDAG and ports over some code that scans
instructions after selection, using TargetInstrInfo to determine if target opcodes
are known calls.

Code size geomean improvements on CTMark:
 -O0 : 0.1%
 -Os : 0.3%

Differential Revision: https://reviews.llvm.org/D67868

llvm-svn: 372443
2019-09-20 23:52:07 +00:00
Matt Arsenault 3ecab8e455 Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)

This was missing one switch to getTargetConstant in an untested case.

llvm-svn: 372338
2019-09-19 16:26:14 +00:00
Hans Wennborg 13bdae8541 Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"
This broke the Chromium build, causing it to fail with e.g.

  fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>

See llvm-commits thread of r372285 for details.

This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.

> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.

llvm-svn: 372314
2019-09-19 12:33:07 +00:00
Matt Arsenault d8399d12cd GlobalISel: Don't materialize immarg arguments to intrinsics
Encode them directly as an imm argument to G_INTRINSIC*.

Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.

This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.

SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.

Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.

The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.

This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.

llvm-svn: 372285
2019-09-19 01:33:14 +00:00
Amara Emerson 9d64721ca5 [GlobalISel] Partially revert r371901.
r371901 was overeager and widenScalarDst() and the like in the legalizer
attempt to increment the insert point given in order to add new instructions
after the currently legalizing inst. In cases where the insertion point is not
exactly the current instruction, then callers need to de-compensate for the
behaviour by decrementing the insertion iterator before calling them. It's not
a nice state of affairs, for now just undo the problematic parts of the change.

llvm-svn: 372050
2019-09-16 23:46:03 +00:00
Simon Pilgrim a8a4953fdf [GlobalISel] findGISelOptimalMemOpLowering - remove dead initalization. NFCI.
Fixes static analyzer warning that "Value stored to 'NewTySize' during its initialization is never read".

llvm-svn: 371937
2019-09-15 16:56:06 +00:00
Amara Emerson 02bcc86b08 [GlobalISel] Fix insertion point of new instructions to be after PHIs.
For some reason we sometimes insert new instructions one instruction before
the first non-PHI when legalizing. This can result in having non-PHI
instructions before PHIs, which mean that PHI elimination doesn't catch them.

Differential Revision: https://reviews.llvm.org/D67570

llvm-svn: 371901
2019-09-13 21:49:24 +00:00
Jessica Paquette 727328ab63 [AArch64][GlobalISel] Tail call memory intrinsics
Because memory intrinsics are handled differently than other calls, we need to
check them for tail call eligiblity in the legalizer. This allows us to still
inline them when it's beneficial to do so, but also tail call when possible.

This adds simple tail calling support for when the intrinsic is followed by a
return.

It ports the attribute checks from `TargetLowering::isInTailCallPosition` into
a similarly-named function in LegalizerHelper.cpp. The target-specific
`isUsedByReturnOnly` hook is not ported here.

Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call
memory intrinsics.

Update legalize-memcpy-et-al.mir to have a case where we don't tail call.

Differential Revision: https://reviews.llvm.org/D67566

llvm-svn: 371893
2019-09-13 20:25:58 +00:00
Matt Arsenault 4d33918034 AMDGPU/GlobalISel: Legalize G_FMAD
Unlike SelectionDAG, treat this as a normally legalizable operation.
In SelectionDAG this is supposed to only ever formed if it's legal,
but I've found that to be restricting. For AMDGPU this is contextually
legal depending on whether denormal flushing is allowed in the use
function.

Technically we currently treat the denormal mode as a subtarget
feature, so custom lowering could be avoided. However I consider this
to be a defect, and this should be contextually dependent on the
controllable rounding mode of the parent function.

llvm-svn: 371800
2019-09-13 00:44:35 +00:00
Jessica Paquette a42070a6aa [AArch64][GlobalISel] Support sibling calls with outgoing arguments
This adds support for lowering sibling calls with outgoing arguments.

e.g

```
define void @foo(i32 %a)
```

Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`.
The only thing that is missing is a full port of
`TargetLowering::parametersInCSRMatch`. So, if we're using swiftself,
we'll never tail call.

- Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used
  for both outgoing and incoming arguments
- Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for
  stack arguments.
- Teach `lowerFormalArguments` to set the bytes in the caller's stack argument
  area. This is used later to check if the tail call's parameters will fit on
  the caller's stack.
- Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on
  the callee's outgoing arguments.

For testing:

- Update call-translator-tail-call to verify that we can now tail call with
  outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the
  size of the caller's stack
- Remove GISel-specific check lines from speculation-hardening.ll, since GISel
  now tail calls like the other selectors
- Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that
  test now
- Add a GISel test line to tailcall_misched_graph.ll since we tail call there
  now. Add specific check lines for GISel, since the debug output from the
  machine-scheduler differs with GlobalISel. The dependency still holds, but
  the output comes out in a different order.

Differential Revision: https://reviews.llvm.org/D67471

llvm-svn: 371780
2019-09-12 22:10:36 +00:00
Amara Emerson 55d86f04c7 [AArch64][GlobalISel] Fall back on attempts to allocate split types on the stack.
First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually
make a difference for us in CallLowering but fix that anyway to silence the assert.

The bigger issue was that after fixing the assert we were generating invalid MIR
because the merging/unmerging of values split across multiple registers wasn't
also implemented for memory locs. This happens when we run out of registers and
have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but
for now just fall back.

llvm-svn: 371693
2019-09-11 23:53:23 +00:00
Jessica Paquette 469d42fcf6 [GlobalISel] When a tail call is emitted in a block, stop translating it
This fixes a crash in tail call translation caused by assume and lifetime_end
intrinsics.

It's possible to have instructions other than a return after a tail call which
will still have `Analysis::isInTailCallPosition` return true. (Namely,
lifetime_end and assume intrinsics.)

If we emit a tail call, we should stop translating instructions in the block.
Otherwise, we can end up emitting an extra return, or dead instructions in
general. This makes the verifier unhappy, and is generally unfortunate for
codegen.

This also removes the code from AArch64CallLowering that checks if we have a
tail call when lowering a return. This is covered by the new code now.

Also update call-translator-tail-call.ll to show that we now properly tail call
in the presence of lifetime_end and assume.

Differential Revision: https://reviews.llvm.org/D67415

llvm-svn: 371572
2019-09-10 23:34:45 +00:00
Jessica Paquette 2af5b193d5 [AArch64][GlobalISel] Support sibling calls with mismatched calling conventions
Add support for sibcalling calls whose calling convention differs from the
caller's.

- Port over `CCState::resultsCombatible` from CallingConvLower.cpp into
  CallLowering. This is used to verify that the way the caller and callee CC
  handle incoming arguments matches up.

- Add `CallLowering::analyzeCallResult`. This is basically a port of
  `CCState::AnalyzeCallResult`, but using `ArgInfo` rather than `ISD::InputArg`.

- Add `AArch64CallLowering::doCallerAndCalleePassArgsTheSameWay`. This checks
  that the calling conventions are compatible, and that the caller and callee
  preserve the same registers.

For testing:

- Update call-translator-tail-call.ll to show that we can now handle this.

- Add a GISel line to tailcall-ccmismatch.ll to show that we will not tail call
  when the regmasks don't line up.

Differential Revision: https://reviews.llvm.org/D67361

llvm-svn: 371570
2019-09-10 23:25:12 +00:00
Aditya Nandakumar 5112b71126 [GlobalISel]: Fix a bug where we could dereference None
getConstantVRegVal returns None when dealing with constants > 64 bits.
Don't assume we always have a value in GISelKnownBits.

llvm-svn: 371465
2019-09-09 22:51:41 +00:00
Tim Northover 06d93e0a25 GlobalISel: fix unused warnings in release builds.
llvm-svn: 371385
2019-09-09 10:36:58 +00:00
Tim Northover 36147adc0b GlobalISel: add combiner to form indexed loads.
Loosely based on DAGCombiner version, but this part is slightly simpler in
GlobalIsel because all address calculation is performed by G_GEP. That makes
the inc/dec distinction moot so there's just pre/post to think about.

No targets can handle it yet so testing is via a special flag that overrides
target hooks.

llvm-svn: 371384
2019-09-09 10:04:23 +00:00
Matt Arsenault cf10372119 GlobalISel: Add G_FMAD instruction
llvm-svn: 371254
2019-09-06 20:49:10 +00:00
Daniel Sanders f803237926 [globalisel][knownbits] Account for missing type constraints
Now that we look through copies, it's possible to visit registers that
have a register class constraint but not a type constraint. Avoid looking
through copies when this occurs as the SrcReg won't be able to determine
it's bit width or any known bits.

Along the same lines, if the initial query is on a register that doesn't
have a type constraint then the result is a default-constructed KnownBits,
that is, a 1-bit fully-unknown value.

llvm-svn: 371116
2019-09-05 20:26:02 +00:00
Jessica Paquette 20e8667098 Recommit "[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls"
Recommit basic sibling call lowering (https://reviews.llvm.org/D67189)

The issue was that if you have a return type other than void, call lowering
will emit COPYs to get the return value after the call.

Disallow sibling calls other than ones that return void for now. Also
proactively disable swifterror tail calls for now, since there's a similar issue
with COPYs there.

Update call-translator-tail-call.ll to include test cases for each of these
things.

llvm-svn: 371114
2019-09-05 20:18:34 +00:00
Simon Pilgrim 071287c5a9 Revert rL370996 from llvm/trunk: [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)

At this point, it is very restricted. It does not handle

- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.

This patch adds

- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
   to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
   above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
   AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.

Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.

For testing, this patch

- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
  which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
  as expected.

Differential Revision: https://reviews.llvm.org/D67189
........
This fails on EXPENSIVE_CHECKS builds due to a -verify-machineinstrs test failure in CodeGen/AArch64/dllimport.ll

llvm-svn: 371051
2019-09-05 10:38:39 +00:00
Jessica Paquette b78324fc40 [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)

At this point, it is very restricted. It does not handle

- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.

This patch adds

- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
   to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
   above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
   AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.

Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.

For testing, this patch

- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
  which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
  as expected.

Differential Revision: https://reviews.llvm.org/D67189

llvm-svn: 370996
2019-09-04 22:54:52 +00:00
Matt Arsenault 5ff310e298 GlobalISel: Add basic legalization for G_BITREVERSE
llvm-svn: 370979
2019-09-04 20:46:15 +00:00
Daniel Sanders b276a9a51e [globalisel] Support trivial COPY in GISelKnownBits
Summary: Allow GISelKnownBits to look through the trivial case of TargetOpcode::COPY

Reviewers: aditya_nandakumar

Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67131

llvm-svn: 370955
2019-09-04 18:59:43 +00:00
Matt Arsenault 70becc20fa GlobalISel: Add G_BITREVERSE
This is the first failing pattern for AMDGPU and is trivial to handle.

llvm-svn: 370927
2019-09-04 17:06:53 +00:00
Amara Emerson 5d5150f0b4 [GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type combination.
Similar to the issue with G_ZEXT that was fixed earlier, this is a quick
to fall back if the source type is not exactly half of the dest type.

Fixes the clang-cmake-aarch64-lld bot build.

llvm-svn: 370847
2019-09-04 07:58:45 +00:00
Amara Emerson 2a2c25ba48 [AArch64][GlobalISel] Legalize 128 bit divisions to libcalls.
Now that we have the infrastructure to support s128 types as parameters
we can expand these to libcalls.

Differential Revision: https://reviews.llvm.org/D66185

llvm-svn: 370823
2019-09-03 21:42:32 +00:00
Amara Emerson fbaf425b79 [GlobalISel][CallLowering] Add support for splitting types according to calling conventions.
On AArch64, s128 types have to be split into s64 GPRs when passed as arguments.
This change adds the generic support in call lowering for dealing with multiple
registers, for incoming and outgoing args.

Support for splitting for return types not yet implemented.

Differential Revision: https://reviews.llvm.org/D66180

llvm-svn: 370822
2019-09-03 21:42:28 +00:00
Amara Emerson 453ef4e376 [AArch64][GlobalISel] Fix zext narrowScalar to use the right type when creating
the merges.

Fixes PR43171.

llvm-svn: 370627
2019-09-02 08:18:55 +00:00
Matt Arsenault 466ec2d552 GlobalISel: Fix missing pass dependency
llvm-svn: 370496
2019-08-30 17:41:58 +00:00
Petar Avramovic 6412b56513 [MIPS GlobalISel] Lower fptoui
Add lower for G_FPTOUI. Algorithm is similar to the SDAG version
in TargetLowering::expandFP_TO_UINT.
Lower G_FPTOUI for MIPS32.

Differential Revision: https://reviews.llvm.org/D66929

llvm-svn: 370431
2019-08-30 05:44:02 +00:00
Matt Arsenault 093ebf9275 GlobalISel: Don't compute known bits for non-integral GEP
llvm-svn: 370392
2019-08-29 17:55:05 +00:00
Matt Arsenault b2b9a23758 GlobalISel: Add maskedValueIsZero and signBitIsZero to known bits
I dropped the DemandedElts since it seems to be missing from some of
the new interfaces, but not others.

llvm-svn: 370389
2019-08-29 17:24:36 +00:00
Matt Arsenault caff0a88dd GlobalISel: Add known bits to InstructionSelector
AMDGPU uses this for some addressing mode selection patterns. The
analysis run itself doesn't do anything so it seems easier to just
always require this than adding a way to opt in.

llvm-svn: 370388
2019-08-29 17:24:32 +00:00
Jessica Paquette af0bd41e06 [AArch64][GlobalISel] Fall back when translating musttail calls
These are currently translated as normal functions calls in AArch64.

Until we have proper tail call lowering, we shouldn't translate these.

Differential Revision: https://reviews.llvm.org/D66842

llvm-svn: 370225
2019-08-28 16:19:01 +00:00
Amara Emerson e20b91c265 [GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC.
This change moves the actual stack pointer manipulation into the legalizer,
available to targets via lower(). The codegen is slightly different because
we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than
adding a negative amount via G_GEP.

Differential Revision: https://reviews.llvm.org/D66678

llvm-svn: 370104
2019-08-27 19:54:27 +00:00
Petar Avramovic a393238422 [GlobalISel] Factor narrowScalar for G_ASHR and G_LSHR. NFC
Main difference is in the way Hi for Long shift (HiL) is made.
G_LSHR fills HiL with zeros, while G_ASHR fills HiL with sign bit value.

Differential Revision: https://reviews.llvm.org/D66589

llvm-svn: 370064
2019-08-27 14:33:05 +00:00
Petar Avramovic d568ed40e0 [GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAG
Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS
to match names of surrounding variables.

Differential Revision: https://reviews.llvm.org/D66587

llvm-svn: 370062
2019-08-27 14:22:32 +00:00
Volkan Keles 277631e3b8 [GlobalISel] Legalizer: Retry combining illegal artifacts as long as there new artifacts
Summary:
Currently, Legalizer aborts if it’s unable to legalize artifacts. However, it’s
possible to combine them after processing the rest of the instruction because
the legalization is likely to generate more artifacts that allow ArtifactCombiner
to combine away them.

Instead, move illegal artifacts to another list called RetryList and wait until all of the
instruction in InstList are legalized. After that, check if there is any new artifacts and
try to combine them again if that’s the case. If not, abort. The idea is similar to D59339,
but the approach is a bit different.

This patch fixes the issue described above, but the legalizer still may be unable to handle
some cases depending on when to legalize artifacts. So, in the long run, we probably need
a different legalization strategy that handles this dependency in a better way.

Reviewers: dsanders, aditya_nandakumar, qcolombet, arsenm, aemerson, paquette

Reviewed By: dsanders

Subscribers: jvesely, wdng, nhaehnle, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65894

llvm-svn: 369805
2019-08-23 20:30:35 +00:00
Matt Arsenault fba82858f2 GlobalISel: Don't create G_UADDE with constant false carry in
The x86 tests are now broken (in paticular add-scalar.ll now hits the
DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a
custom lowering for this, so that will need to be implemented.

llvm-svn: 369673
2019-08-22 17:29:17 +00:00
Matt Arsenault 954a012b4c GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources
This is necessary for handling <3 x s16> on AMDGPU, assuming this
should be handled as 2 separate legalization actions. The alternative
would be for fewerElementsVector to handle 3->2.

llvm-svn: 369547
2019-08-21 16:59:10 +00:00
Petar Avramovic 5b4c5c2c54 [MIPS GlobalISel] NarrowScalar G_TRUNC
Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source.
NarrowScalar G_TRUNC to s32 for MIPS32.

Differential Revision: https://reviews.llvm.org/D66202

llvm-svn: 369509
2019-08-21 09:26:39 +00:00
Amara Emerson 56606a4db3 [AArch64][GlobalISel] Add support for narrowScalar of G_ZEXT
We do this by merging the source with the high bits set to 0.

Differential Revision: https://reviews.llvm.org/D66181

llvm-svn: 369480
2019-08-21 00:12:37 +00:00
Aditya Nandakumar 08bd080872 [GlobalISel] Handle multiple registers in dbg.value intrinsic
https://reviews.llvm.org/D66077

The value passed into dbg.value may relate to multiple registers,
each of which need a DBG_VALUE.

This fix calls MIRBuilder.buildDirectDbgValue for each register.

Without this, IR passed in from flang-compiler/flang may fail an
assertion in getOrCreateVReg.

Patch by : peterwaller-arm.

llvm-svn: 369403
2019-08-20 16:28:37 +00:00
Amara Emerson c809230a69 [AArch64][GlobalISel] Lower G_SHUFFLE_VECTOR with 1 elt src and 1 elt mask.
Again, it's weird that these are allowed. Since lowering support was added in
r368709 we started crashing on compiling the neon intrinsics test in the test
suite. This fixes the lowering to fold the 1 elt src/mask case into copies.

llvm-svn: 369135
2019-08-16 18:06:53 +00:00
Volkan Keles 0ae6006bee [GlobalISel] CSEMIRBuilder: Add support for G_GEP
Summary:
This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors
`translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types.

Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson

Reviewed By: aditya_nandakumar

Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66316

llvm-svn: 369070
2019-08-15 23:45:45 +00:00
Daniel Sanders 0c47611131 Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041
2019-08-15 19:22:08 +00:00
Jonas Devlieghere 0eaee545ee [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Aditya Nandakumar c65ac865c3 [GlobalISel]: Fix lowering of G_Shuffle_vector where we pick up the wrong source index
https://reviews.llvm.org/D66182

llvm-svn: 368781
2019-08-14 01:23:33 +00:00
Aditya Nandakumar 615eee6402 [GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sources
https://reviews.llvm.org/D66171

llvm-svn: 368753
2019-08-13 21:49:11 +00:00
Matt Arsenault 28215caa60 GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES
Odd sized vectors aren't handled yet.

llvm-svn: 368713
2019-08-13 16:26:28 +00:00
Matt Arsenault 690645bda0 GlobalISel: Implement lower for G_SHUFFLE_VECTOR
llvm-svn: 368709
2019-08-13 16:09:07 +00:00
Matt Arsenault 5af9cf042f GlobalISel: Change representation of shuffle masks
Currently shufflemasks get emitted as any other constant, and you end
up with a bunch of virtual registers of G_CONSTANT with a
G_BUILD_VECTOR. The AArch64 selector then asserts on anything that
doesn't fit this pattern. This isn't an ideal representation, and
should avoid legalization and have fewer opportunities for a
representational error.

Rather than invent a new shuffle mask operand type, similar to what
ShuffleVectorSDNode does, just track the original IR Constant mask
operand. I don't completely like the idea of adding another link to
the IR, but MIR is already quite dependent on IR constants already,
and this will allow sharing the shuffle mask utility functions with
the IR.

llvm-svn: 368704
2019-08-13 15:34:38 +00:00
Amara Emerson e14c91b71a [GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained.
Currently we can't keep any state in the selector object that we get from
subtarget. As a result we have to plumb through all our variables through
multiple functions. This change makes it non-const and adds a virtual init()
method to allow further state to be captured for each target.

AArch64 makes use of this in this patch to cache a call to hasFnAttribute()
which is expensive to call, and is used on each selection of G_BRCOND.

Differential Revision: https://reviews.llvm.org/D65984

llvm-svn: 368652
2019-08-13 06:26:59 +00:00
Aditya Nandakumar 70fdfed45f [GlobalISel]: Add KnownBits for G_XOR
https://reviews.llvm.org/D66119

llvm-svn: 368648
2019-08-13 04:32:33 +00:00
Aditya Nandakumar 55371e697c [GISel]: Fix a bug in KnownBits where we should have been using SizeInBits
https://reviews.llvm.org/D66039

We were using getIndexSize instead of getIndexSizeInBits().
Added test case for G_PTRTOINT and G_INTTOPTR.

llvm-svn: 368618
2019-08-12 21:28:12 +00:00
Daniel Sanders e9a57c2b23 [globalisel] Add G_SEXT_INREG
Summary:
Targets often have instructions that can sign-extend certain cases faster
than the equivalent shift-left/arithmetic-shift-right. Such cases can be
identified by matching a shift-left/shift-right pair but there are some
issues with this in the context of combines. For example, suppose you can
sign-extend 8-bit up to 32-bit with a target extend instruction.
  %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
  %2:_(s32) = G_ASHR %1:_(s32), i32 24
  %3:_(s32) = G_ASHR %2:_(s32), i32 1
would reasonably combine to:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 25
which no longer matches the special case. If your shifts and extend are
equal cost, this would break even as a pair of shifts but if your shift is
more expensive than the extend then it's cheaper as:
  %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
  %3:_(s32) = G_ASHR %2:_(s32), i32 1
It's possible to match the shift-pair in ISel and emit an extend and ashr.
However, this is far from the only way to break this shift pair and make
it hard to match the extends. Another example is that with the right
known-zeros, this:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 24
  %3:_(s32) = G_MUL %2:_(s32), i32 2
can become:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 23

All upstream targets have been configured to lower it to the current
G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
handle their faster cases.

To follow-up: Provide a way to legalize based on the constant. At the
moment, I'm thinking that the best way to achieve this is to provide the
MI in LegalityQuery but that opens the door to breaking core principles
of the legalizer (legality is not context sensitive). That said, it's
worth noting that looking at other instructions and acting on that
information doesn't violate this principle in itself. It's only a
violation if, at the end of legalization, a pass that checks legality
without being able to see the context would say an instruction might not be
legal. That's a fairly subtle distinction so to give a concrete example,
saying %2 in:
  %1 = G_CONSTANT 16
  %2 = G_SEXT_INREG %0, %1
is legal is in violation of that principle if the legality of %2 depends
on %1 being constant and/or being 16. However, legalizing to either:
  %2 = G_SEXT_INREG %0, 16
or:
  %1 = G_CONSTANT 16
  %2:_(s32) = G_SHL %0, %1
  %3:_(s32) = G_ASHR %2, %1
depending on whether %1 is constant and 16 does not violate that principle
since both outputs are genuinely legal.

Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm

Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61289

llvm-svn: 368487
2019-08-09 21:11:20 +00:00
Tim Northover e1a5f668b3 GlobalISel: pack various parameters for lowerCall into a struct.
I've now needed to add an extra parameter to this call twice recently. Not only
is the signature getting extremely unwieldy, but just updating all of the
callsites and implementations is a pain. Putting the parameters in a struct
sidesteps both issues.

llvm-svn: 368408
2019-08-09 08:26:38 +00:00
Tim Northover 3c10f346dc GlobalISel: factor common code from translateCall and translateInvoke. NFC.
llvm-svn: 368166
2019-08-07 12:43:53 +00:00
Aditya Nandakumar 6bbfde5c48 [GISel]: Fix trivial build breakage
llvm-svn: 368067
2019-08-06 17:53:04 +00:00
Aditya Nandakumar c8ac029d0a [GISel]: Add GISelKnownBits analysis
https://reviews.llvm.org/D65698

This adds a KnownBits analysis pass for GISel. This was done as a
pass (compared to static functions) so that we can add other features
such as caching queries(within a pass and across passes) in the future.
This patch only adds the basic pass boiler plate, and implements a lazy
non caching knownbits implementation (ported from SelectionDAG). I've
also hooked up the AArch64PreLegalizerCombiner pass to use this - there
should be no compile time regression as the analysis is lazy.

llvm-svn: 368065
2019-08-06 17:18:29 +00:00
Amara Emerson bc1172df14 [GlobalISel][CallLowering] Rename isArgumentHandler() -> isIncomingArgumentHandler()
Previous name and comment incorrectly implied it was just for formal arg handlers,
which is not true.

llvm-svn: 367945
2019-08-05 23:05:28 +00:00
Amara Emerson 85e5e28ab4 [AArch64][GlobalISel] Inline tiny memcpy et al at -O0.
FastISel already does this since the initial arm64 port was upstreamed, so
it seems there are no issues with doing this at -O0 for very small memcpys.

Gives a 0.2% geomean code size improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D65758

llvm-svn: 367919
2019-08-05 20:02:52 +00:00
Guillaume Chatelet c97a3d15d2 [LLVM][Alignment] Introduce Alignment Type
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jfb, jakehehrlich

Reviewed By: jfb

Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65514

llvm-svn: 367828
2019-08-05 11:02:05 +00:00
Guillaume Chatelet 65e4b47aad [LLVM][Alignment] Introduce Alignment Type in DataLayout
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jfb, jakehehrlich

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65521

Make getFunctionPtrAlign() return MaybeAlign

llvm-svn: 367817
2019-08-05 09:00:43 +00:00
Amara Emerson c835164a47 Re-commit "[GlobalISel] Add legalization support for non-power-2 loads and stores""
This is an old commit that exposed a bug in the GISel importer, which caused
non-truncating stores to be selected for truncating store patterns. Now that's
been fixed in r367737 this can go back in.

llvm-svn: 367739
2019-08-02 23:44:24 +00:00
Tim Northover 522fb7eedc GlobalISel: support swiftself attribute
llvm-svn: 367683
2019-08-02 14:09:49 +00:00
Daniel Sanders 2bea69bf65 Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC
llvm-svn: 367633
2019-08-01 23:27:28 +00:00
Matt Arsenault d9d30a408e GlobalISel: Lower scalarizing unmerge of a vector to shifts
AMDGPU sometimes has legal s16 and <2 x s16> operations, but all
registers are really 32-bit. An unmerge destination really should ben
widened to a 32-bit register. If widening a scalarizing vector with a
target size that matches the vector size, bitcast to integer and
extract the relevant bits with shifts.

I'm not sure if this is the right place for this. This could arguably
be part of widenScalar for the result. I also have a growing feeling
that we're missing a bitcast legalize action.

llvm-svn: 367604
2019-08-01 19:10:05 +00:00
Matt Arsenault 5faa533e47 GlobalISel: Fix widenScalar for G_MERGE_VALUES to pointer
AMDGPU testcase isn't broken now, but will be in a future patch
without this.

llvm-svn: 367591
2019-08-01 18:13:16 +00:00
Matt Arsenault 7bedceb5b2 GlobalISel: moreElementsVector for G_LOAD/G_STORE
AMDGPU change and test is a placeholder until a future patch with
complete handling.

llvm-svn: 367503
2019-08-01 01:44:22 +00:00
Mark Lacey 7b8d3eb9e2 [GISel] Pass MD_callees metadata down in call lowering.
Summary:
This will make it possible to improve IPRA by taking into account
register usage in indirect calls.

NFC yet; this is just laying the groundwork to start building
up patches to take advantage of the information for improved register
allocation.

Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette

Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65488

llvm-svn: 367476
2019-07-31 20:34:02 +00:00
Matt Arsenault 9cf980d4a7 GlobalISel: Add G_ATOMICRMW_{FADD|FSUB}
llvm-svn: 367369
2019-07-30 23:56:30 +00:00
Austin Kerbow c99f62e313 [AMDGPU/GlobalISel] Add llvm.amdgcn.fdiv.fast legalization.
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64966

llvm-svn: 367344
2019-07-30 18:49:16 +00:00
Amara Emerson 7bc4fad0fb [AArch64][GlobalISel] Implement narrowing of G_SEXT.
We need this to narrow a sext to s128.

Differential Revision: https://reviews.llvm.org/D65357

llvm-svn: 367164
2019-07-26 23:46:38 +00:00
Amara Emerson 13af1ed8e3 [GlobalISel] Support for inlining memcpy, memset and memmove calls.
This introduces a new family of combiner helper routines that re-use the
target specific cost model from SelectionDAG, and generate inline implementations
of the memcpy family of intrinsics.

The combines are only enabled at optimization levels higher than -O0, and give
very substantial performance improvements.

Differential Revision: https://reviews.llvm.org/D65167

llvm-svn: 366951
2019-07-24 22:17:31 +00:00
Amara Emerson a1997ce2e5 [AArch64][GlobalISel] Fix a crash during s128 G_ICMP legalization due to r366317.
r366317 added a legalization for s128 G_ICMP narrow scalar which tried to hard
code the result type of the new legalized G_SELECT. Change this to instead use
type of the original G_ICMP result and allow the target to legalize it if necessary
later.

llvm-svn: 366943
2019-07-24 20:46:42 +00:00
Aditya Nandakumar d7504a1569 [GISel]: Attach missing range metadata while translating G_LOADs
https://reviews.llvm.org/D65048

Attach range information to G_LOAD when only defining one register.

reviewed by: arsenm

llvm-svn: 366656
2019-07-21 14:07:54 +00:00
Amara Emerson cf12c7815f [GlobalISel] Translate calls to memcpy et al to G_INTRINSIC_W_SIDE_EFFECTs and legalize later.
I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't
do that unless we delay lowering to actual function calls. This patch changes
the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and
then have each target specify that using the new custom legalizer for intrinsics
hook that they want it expanded it a libcall.

Differential Revision: https://reviews.llvm.org/D64895

llvm-svn: 366516
2019-07-19 00:24:45 +00:00
Matt Arsenault 0966dd0d69 GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sources
Extract the sources to the GCD of the original size and target size,
padding with implicit_def as necessary.

Also fix the case where the requested source type is wider than the
original result type. This was ignoring the type, and just using the
destination. Do the operation in the requested type and truncate back.

llvm-svn: 366367
2019-07-17 20:22:44 +00:00
Matt Arsenault 914a59cad8 GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUES
Use an anyext to the requested type for the leftover operand to
produce a slightly wider type, and then truncate the final merge.

I have another implementation almost ready which handles arbitrary
widens, but I think it produces worse code in this example (which I
think is 90% due to not folding redundant copies or folding out
implicit_def users), so I wanted to add this as a baseline first.

llvm-svn: 366366
2019-07-17 20:22:38 +00:00
Petar Avramovic 1e62635d05 [MIPS GlobalISel] ClampScalar and select pointer G_ICMP
Add narrowScalar to half of original size for G_ICMP.
ClampScalar G_ICMP's operands 2 and 3 to to s32.
Select G_ICMP for pointers for MIPS32. Pointer compare is same
as for integers, it is enough to declare them as legal type.

Differential Revision: https://reviews.llvm.org/D64856

llvm-svn: 366317
2019-07-17 12:08:01 +00:00
Matt Arsenault 1c3f4ec7fc GlobalISel: Add overload of handleAssignments with CCState
AMDGPU needs to allocate special argument registers separately from
the user function argument list, so needs direct control over the
CCState.

The ArgLocs argument is only really necessary because CCState doesn't
allow access to it.

llvm-svn: 366279
2019-07-16 22:41:34 +00:00
Matt Arsenault 434d664095 GlobalISel: Implement narrowScalar for vector extract/insert indexes
llvm-svn: 366113
2019-07-15 19:37:34 +00:00
Fangrui Song b251cc0d91 Delete dead stores
llvm-svn: 365903
2019-07-12 14:58:15 +00:00
Matt Arsenault 7e71902b79 GlobalISel: Use Register
llvm-svn: 365780
2019-07-11 14:18:19 +00:00
Amara Emerson 7a4d2df04a [AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and unknown values.
Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes
obstruct attempts to find constant source values. These usually come about when
try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough
about this operation allows the CBZ/CBNZ optimization to catch more cases.

This change also improves the case where we can't find a constant source at all.
Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit
a cmp and conditional branch, saving an instruction.

The cumulative code size improvement of this change plus D64354 is 5.5% geomean
on arm64 CTMark -O0.

Differential Revision: https://reviews.llvm.org/D64377

llvm-svn: 365690
2019-07-10 19:21:43 +00:00
Matt Arsenault 6ce1b4fec5 GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM
llvm-svn: 365658
2019-07-10 16:31:19 +00:00
Matt Arsenault e595a2c964 GlobalISel: Define the full family of FP min/max instructions
llvm-svn: 365657
2019-07-10 16:31:15 +00:00
Matt Arsenault b1843e130a GlobalISel: Implement lower for G_FCOPYSIGN
In SelectionDAG AMDGPU treated these as legal, but this was mostly
because the bitcasts required for FP types were painful. Theoretically
the bitpattern should eventually match to bfi, so don't bother trying
to get the patterns to import.

llvm-svn: 365583
2019-07-09 23:34:29 +00:00
Matt Arsenault 14a4495155 GlobalISel: Combine unmerge of merge with intermediate cast
This eliminates some illegal intermediate vectors when operations are
scalarized.

llvm-svn: 365566
2019-07-09 22:19:13 +00:00
Amara Emerson 6616e269a6 [AArch64][GlobalISel] Optimize conditional branches followed by unconditional branches
If we have an icmp->brcond->br sequence where the brcond just branches to the
next block jumping over the br, while the br takes the false edge, then we can
modify the conditional branch to jump to the br's target while inverting the
condition of the incoming icmp. This means we can eliminate the br as an
unconditional branch to the fallthrough block.

Differential Revision: https://reviews.llvm.org/D64354

llvm-svn: 365510
2019-07-09 16:05:59 +00:00
Petar Avramovic be20e36107 [MIPS GlobalISel] Register bank select for G_PHI. Select i64 phi
Select gprb or fprb when def/use register operand of G_PHI is
used/defined by either:
 copy to/from physical register or
 instruction with only one mapping available for that use/def operand.

Integer s64 phi is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.

Differential Revision: https://reviews.llvm.org/D64351

llvm-svn: 365494
2019-07-09 14:36:17 +00:00
Matt Arsenault 079f77b590 GlobalISel: Convert some build functions to using SrcOp/DstOp
llvm-svn: 365343
2019-07-08 16:27:47 +00:00
Matt Arsenault bd791b57f8 GlobalISel: widenScalar for G_BUILD_VECTOR
llvm-svn: 365320
2019-07-08 13:48:06 +00:00
Matt Arsenault 43cbca50e4 GlobalISel: Fix widenScalar for pointer typed G_MERGE_VALUES
llvm-svn: 365093
2019-07-03 23:08:06 +00:00
Matt Arsenault ce690544a6 GlobalISel: Add G_FENCE
The pattern importer is for some reason emitting checks for G_CONSTANT
for the immediate operands.

llvm-svn: 364926
2019-07-02 14:16:39 +00:00
Matt Arsenault c9f14f29f5 GlobalISel: Try to widen merges with other merges
If the requested source type an be used as a merge source type, create
a merge of merges. This avoids creating large, illegal extensions and
bit-ops directly to the result type.

llvm-svn: 364841
2019-07-01 19:36:10 +00:00
Aditya Nandakumar 1023a2eca3 [GlobalISel]: Allow backends to custom legalize Intrinsics
https://reviews.llvm.org/D31359

Add a hook "legalizeInstrinsic" to allow backends to override this
and custom lower/legalize intrinsics.

llvm-svn: 364821
2019-07-01 17:53:50 +00:00
Matt Arsenault 6f74f55750 GlobalISel: Implement lower for min/max
llvm-svn: 364816
2019-07-01 17:18:03 +00:00
Diana Picus 2ba16011c1 Fixup r364512
Fix stack-use-after-scope errors from r364512. One instance was already
fixed in r364611 - this patch simplifies that fix and addresses one more
instance of similar code.

Discussed in: https://reviews.llvm.org/D63905

llvm-svn: 364778
2019-07-01 15:07:38 +00:00
Fangrui Song 78ee2fbf98 Cleanup: llvm::bsearch -> llvm::partition_point after r364719
llvm-svn: 364720
2019-06-30 11:19:56 +00:00
Matt Arsenault 3018d1845b GlobalISel: Use Register
llvm-svn: 364618
2019-06-28 01:47:44 +00:00
Matt Arsenault 5e66db6b8c GlobalISel: Convert rest of MachineIRBuilder to using Register
llvm-svn: 364615
2019-06-28 01:16:41 +00:00
Amara Emerson ecb7ac35f9 [GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when optimizations are used.
The new switch lowering code that tries to generate jump tables and range checks
were tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to
town on trying to generate optimized lowerings, e.g. multiple jump tables, range
checks etc. This exposed bugs in the way PHI nodes are handled because the CFG
looks even stranger after all of this is done.

llvm-svn: 364613
2019-06-27 23:56:34 +00:00
Rumeet Dhindsa ddc2804e1a Fix ASAN error caused by commit r364512.
This patch intends to fix ASAN stack-use-after-scope error.
This is at least a short-term fix to unbreak LLVM's mainline.

Differential Revision: https://reviews.llvm.org/D63905

llvm-svn: 364611
2019-06-27 23:37:04 +00:00
Diana Picus 74a50a723b [GlobalISel] Remove [un]packRegs from IRTranslator
Remove the last use of packRegs from IRTranslator and delete
pack/unpackRegs. This introduces a fallback to DAGISel for intrinsics
with aggregate arguments, since we don't have a testcase for them so
it's hard to tell how we'd want to handle them.

Discussed in https://reviews.llvm.org/D63551

llvm-svn: 364514
2019-06-27 09:49:07 +00:00
Diana Picus 43fb5ae50c [GlobalISel] Accept multiple vregs for lowerCall's args
Change the interface of CallLowering::lowerCall to accept several
virtual registers for each argument, instead of just one.  This is a
follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.

With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

NFCI for AMDGPU, Mips and X86.

Differential Revision: https://reviews.llvm.org/D63551

llvm-svn: 364512
2019-06-27 09:18:03 +00:00
Diana Picus 8138996128 [GlobalISel] Accept multiple vregs for lowerCall's result
Change the interface of CallLowering::lowerCall to accept several
virtual registers for the call result, instead of just one.  This is a
follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.

With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

NFCI for AMDGPU, Mips and X86.

Differential Revision: https://reviews.llvm.org/D63550

llvm-svn: 364511
2019-06-27 09:15:53 +00:00
Diana Picus c3dbe23977 [GlobalISel] Accept multiple vregs in lowerFormalArgs
Change the interface of CallLowering::lowerFormalArguments to accept
several virtual registers for each formal argument, instead of just one.
This is a follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660. lowerCall
will be refactored in the same way in follow-up patches.

With this change, we forward the virtual registers generated for
aggregates to CallLowering. Therefore, the target can decide itself
whether it wants to handle them as separate pieces or use one big
register. We also copy the pack/unpackRegs helpers to CallLowering to
facilitate this.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was
put into a s64 instead of a p0. Added a test-case which illustrates the
problem more clearly (it crashes without this patch) and fixed the
existing test-case to expect p0.

AMDGPU has been updated to unpack into the virtual registers for
kernels. I think the other code paths fall back for aggregates, so this
should be NFC.

Mips doesn't support aggregates yet, so it's also NFC.

x86 seems to have code for dealing with aggregates, but I couldn't find
the tests for it, so I just added a fallback to DAGISel if we get more
than one virtual register for an argument.

Differential Revision: https://reviews.llvm.org/D63549

llvm-svn: 364510
2019-06-27 08:54:17 +00:00
Diana Picus 69ce1c1319 [GlobalISel] Allow multiple VRegs in ArgInfo. NFC
Allow CallLowering::ArgInfo to contain more than one virtual register.
This is useful when passes split aggregates into several virtual
registers, but need to also provide information about the original type
to the call lowering. Used in follow-up patches.

Differential Revision: https://reviews.llvm.org/D63548

llvm-svn: 364509
2019-06-27 08:50:53 +00:00
Matt Arsenault faeaedf8e9 GlobalISel: Remove unsigned variant of SrcOp
Force using Register.

One downside is the generated register enums require explicit
conversion.

llvm-svn: 364194
2019-06-24 16:16:12 +00:00
Matt Arsenault e3a676e9ad CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

llvm-svn: 364191
2019-06-24 15:50:29 +00:00
Fangrui Song 43e14390b0 Make GlobalISel depend on SelectionDAG after D63169
GlobalISel/IRTranslator.cpp now references SelectionDAG/FunctionLoweringInfo.cpp.
This fixes a link error in -DBUILD_SHARED_LIBS=on builds:

    ld.lld: error: undefined symbol: llvm::FunctionLoweringInfo::clear()
    >>> referenced by IRTranslator.cpp:2198 (../lib/CodeGen/GlobalISel/IRTranslator.cpp:2198)
    >>>               lib/CodeGen/GlobalISel/CMakeFiles/LLVMGlobalISel.dir/IRTranslator.cpp.o:(llvm::IRTranslator::finalizeFunction())

llvm-svn: 364124
2019-06-22 01:30:17 +00:00
Amara Emerson fe4625fb24 [GlobalISel][IRTranslator] Change switch table translation to generate jump tables and range checks.
This change makes use of the newly refactored SwitchLoweringUtils code from
SelectionDAG to in order to generate jump tables and range checks where appropriate.

Much of this code is ported from SDAG with some modifications. We generate
G_JUMP_TABLE and G_BRJT instructions when JT opportunities are found. This means
that targets which previously relied on the naive one MBB per case stmt
translation will now start falling back until they add support for the new opcodes.

For range checks, we don't generate any previously unused operations. This
just recognizes contiguous ranges of case values and generates a single block per
range. Single case value blocks are just a special case of ranges so we get that
support almost for free.

There are still some optimizations missing that I haven't ported over, and
bit-tests are also unimplemented. This patch series is already complex enough.

Actual arm64 support for selection of jump tables is coming in a later patch.

Differential Revision: https://reviews.llvm.org/D63169

llvm-svn: 364085
2019-06-21 18:10:38 +00:00
Amara Emerson bc0d08e0ee [GlobalISel][Localizer] Allow localization of G_INTTOPTR and chains of instructions.
G_INTTOPTR can prevent the localizer from moving G_CONSTANTs, but since it's
essentially a side effect free cast instruction we can remat both instructions.
This patch changes the localizer to enable localization of the chains by
iterating over the entry block instructions in reverse order. That way, uses will
localized first, and then the defs are free to be localized as well.

This also changes the previous SmallPtrSet of localized instructions to use a
SetVector instead. We're dealing with pointers and need deterministic iteration
order.

Overall, this change improves ARM64 -O0 CTMark code size by around 0.7% geomean.

Differential Revision: https://reviews.llvm.org/D63630

llvm-svn: 364001
2019-06-21 00:36:19 +00:00
Petar Avramovic 153bd24eda [MIPS GlobalISel] Select integer to floating point conversions
Select G_SITOFP and G_UITOFP for MIPS32.

Differential Revision: https://reviews.llvm.org/D63542

llvm-svn: 363912
2019-06-20 09:05:02 +00:00
Petar Avramovic 4b4dae1c76 [MIPS GlobalISel] Select floating point to integer conversions
Select G_FPTOSI and G_FPTOUI for MIPS32.

Differential Revision: https://reviews.llvm.org/D63541

llvm-svn: 363911
2019-06-20 08:52:53 +00:00
Amara Emerson d11ea2c8c5 [GlobalISel][Localizer] Remove redundant set lookup.
After changing the algorithm to only process the entry block we never revisit
a processed instruction.

llvm-svn: 363745
2019-06-18 22:08:40 +00:00
Tom Stellard 1f7f64665c GlobalISel: Remove redundant pass initialization
Summary:
All the GlobalISel passes are initialized when the target calls
initializeGlobalISel(), so we don't need to call the initializers
from the pass constructors.

Reviewers: qcolombet, t.p.northover, paquette, dsanders, aemerson, aditya_nandakumar

Reviewed By: aemerson

Subscribers: rovka, kristof.beyls, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63235

llvm-svn: 363642
2019-06-18 02:05:06 +00:00
Matt Arsenault 5a321b899e GlobalISel: Use the original flags when lowering fneg to fsub
This was ignoring the flag on fneg, and using the source instruction's
flags. Also fixes tests missing from r358702.

Note the expansion itself isn't correct without nnan, but that should
be fixed separately.

llvm-svn: 363637
2019-06-17 23:48:43 +00:00
Amara Emerson 146882242f [GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra block.
Inter-block localization is the same as what currently happens, except now it
only runs on the entry block because that's where the problematic constants with
long live ranges come from.

The second phase is a new intra-block localization phase which attempts to
re-sink the already localized instructions further right before one of the
multiple uses.

One additional change is to also localize G_GLOBAL_VALUE as they're constants
too. However, on some targets like arm64 it takes multiple instructions to
materialize the value, so some additional heuristics with a TTI hook have been
introduced attempt to prevent code size regressions when localizing these.

Overall, these changes improve CTMark code size on arm64 by 1.2%.

Full code size results:

Program                                         baseline       new       diff
------------------------------------------------------------------------------
 test-suite...-typeset/consumer-typeset.test    1249984      1217216     -2.6%
 test-suite...:: CTMark/ClamAV/clamscan.test    1264928      1232152     -2.6%
 test-suite :: CTMark/SPASS/SPASS.test          1394092      1361316     -2.4%
 test-suite...Mark/mafft/pairlocalalign.test    731320       714928      -2.2%
 test-suite :: CTMark/lencod/lencod.test        1340592      1324200     -1.2%
 test-suite :: CTMark/kimwitu++/kc.test         3853512      3820420     -0.9%
 test-suite :: CTMark/Bullet/bullet.test        3406036      3389652     -0.5%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    8017000      8016992     -0.0%
 test-suite...TMark/7zip/7zip-benchmark.test    2856588      2856588      0.0%
 test-suite...:: CTMark/sqlite3/sqlite3.test    765704       765704       0.0%
 Geomean difference                                                      -1.2%

Differential Revision: https://reviews.llvm.org/D63303

llvm-svn: 363632
2019-06-17 23:20:29 +00:00
Michael Berg f9bff2a55e Propagate fmf in IRTranslate for fneg
Summary: This case is related to D63405 in that we need to be propagating FMF on negates.

Reviewers: volkan, spatel, arsenm

Reviewed By: arsenm

Subscribers: wdng, javed.absar

Differential Revision: https://reviews.llvm.org/D63458

llvm-svn: 363631
2019-06-17 23:19:40 +00:00
Daniel Sanders 184c8ee920 [globalisel] Fix iterator invalidation in the extload combines
Summary:
Change the way we deal with iterator invalidation in the extload combines as it
was still possible to neglect to visit a use. Even worse, it happened in the
in-tree test cases and the checks weren't good enough to detect it.

We now take a cheap copy of the use list before iterating over it. This
prevents iterator invalidation from occurring and has the nice side effect
of making the existing schedule-for-erase/schedule-for-insert mechanism
moot.

Reviewers: aditya_nandakumar

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, javed.absar, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61813

llvm-svn: 363616
2019-06-17 20:56:31 +00:00
Matt Arsenault 3e140066bc GlobalISel: Ignore callsite attributes when picking intrinsic type
A target intrinsic may be defined as possibly reading memory, but the
call site may have additional knowledge that it doesn't read
memory. The intrinsic lowering will expect the pessimistic assumption
of the intrinsic definition, so the chain should still be used.

I fixed the same bug in SelectionDAG in r287593.

llvm-svn: 363580
2019-06-17 17:01:35 +00:00
Matt Arsenault 9487278010 Reapply "GlobalISel: Avoid producing Illegal copies in RegBankSelect"
This reapplies r363410, avoiding null dereference if there is no
AltRegBank.

llvm-svn: 363478
2019-06-15 00:33:26 +00:00
Mitch Phillips 0d44f129bb Revert "GlobalISel: Avoid producing Illegal copies in RegBankSelect"
This patch breaks UBSan build bots. See
https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild for
a guide as to how to reproduce the error.

This reverts commit c2864c0de0.
This reverts rL363410.

llvm-svn: 363476
2019-06-14 23:45:34 +00:00
Amara Emerson f79d3bc724 [GlobalISel] Add a G_BRJT opcode.
This is a branch opcode that takes a jump table pointer, jump table index and an
index into the table to do an indirect branch.

We pass both the table pointer and JTI to allow targets like ARM64 to more
easily use the existing jump table compression optimization without having to
walk up the block to find a paired G_JUMP_TABLE.

Differential Revision: https://reviews.llvm.org/D63159

llvm-svn: 363434
2019-06-14 17:55:48 +00:00
Matt Arsenault c2864c0de0 GlobalISel: Avoid producing Illegal copies in RegBankSelect
Avoid producing illegal register bank copies for reg_sequence and
phi. The default implementation assumes it is possible to pick any
operand's bank and use that for the result, introducing a copy for
operands with a different bank. This does not check for illegal
copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR
operand requires the result to be a VGPR.

The changes in getInstrMappingImpl aren't strictly necessary, since
AMDGPU now just bypasses this for reg_sequence/phi. This could be
replaced with an assert in case other targets run into this. It is
currently responsible for producing the error for unsatisfiable
copies, but this will be better served with a verifier check.

For phis, for now assume any undetermined operands must be
VGPRs. Eventually, this needs to be able to defer mapping these
operations. This also does not yet have a way to check for whether the
block is in a divergent region.

llvm-svn: 363410
2019-06-14 15:22:25 +00:00
Matt Arsenault 731a81598e RegBankSelect: Remove checks for invalid mappings
Avoid a check for valid and a set of redundant asserts. The place
InstructionMapping is constructed asserts all of the default fields
are passed anyway for an invalid mapping, so don't overcomplicate
this.

llvm-svn: 363391
2019-06-14 13:42:40 +00:00
Amara Emerson fb0a40f064 [GlobalISel][IRTranslator] Add debug loc with line 0 to constants emitted into the entry block.
Constants, including G_GLOBAL_VALUE, are all emitted into the entry block which
lets us use the vreg def assuming it dominates all other users. However, it can
cause jumpy debug behaviour since the DebugLoc attached to these MIs are from
a user instruction that could be in a different block.

Fixes PR40887.

Differential Revision: https://reviews.llvm.org/D63286

llvm-svn: 363331
2019-06-13 22:15:35 +00:00
Amara Emerson d133c15925 [GlobalISel] Add a G_JUMP_TABLE opcode.
This opcode generates a pointer to the address of the jump table
specified by the source operand, which is a jump table index.

It will be used in conjunction with an upcoming G_BRJT opcode to support
jump table codegen with GlobalISel.

Differential Revision: https://reviews.llvm.org/D63111

llvm-svn: 363096
2019-06-11 19:58:06 +00:00
Jessica Paquette b22954384e [GlobalISel] Translate memset/memmove/memcpy from undef ptrs into nops
If the source is undef, then just don't do anything.

This matches SelectionDAG's behaviour in SelectionDAG.cpp.

Also add a test showing that we do the right thing here.
(irtranslator-memfunc-undef.ll)

Differential Revision: https://reviews.llvm.org/D63095

llvm-svn: 362989
2019-06-10 21:53:56 +00:00
Volkan Keles 97204a6788 [GlobalISel] IRTranslator: Translate the intrinsics ignored by CodeGen
Summary:
Translate `llvm.assume`, `llvm.var.annotation` and `llvm.sideeffect` to nothing
as they have no effect on CodeGen.

Reviewers: qcolombet, aditya_nandakumar, dsanders, paquette, aemerson, arsenm

Reviewed By: arsenm

Subscribers: hiraditya, wdng, rovka, kristof.beyls, javed.absar, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63022

llvm-svn: 362834
2019-06-07 20:19:27 +00:00
Petar Avramovic faaa2b5d21 [MIPS GlobalISel] Select floor and ceil
Select G_FFLOOR and G_FCEIL for MIPS32.

Differential Revision: https://reviews.llvm.org/D62901

llvm-svn: 362688
2019-06-06 09:02:24 +00:00
Ulrich Weigand 6c5d5ce551 Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point
intrinsics are currently never passed to the target back-end, which makes
it impossible to handle them correctly (e.g. mark instructions are depending
on a floating-point status and control register, or mark instructions as
possibly trapping).

This patch allows the target to use setOperationAction to switch the action
on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code
will stop converting the STRICT nodes to regular floating-point nodes, but
instead pass the STRICT nodes to the target using normal SelectionDAG
matching rules.

To avoid having the back-end duplicate all the floating-point instruction
patterns to handle both strict and non-strict variants, we make the MI
codegen explicitly aware of the floating-point exceptions by introducing
two new concepts:

- A new MCID flag "mayRaiseFPException" that the target should set on any
  instruction that possibly can raise FP exception according to the
  architecture definition.
- A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI
  instruction resulting from expansion of any constrained FP intrinsic.

Any MI instruction that is *both* marked as mayRaiseFPException *and*
FPExcept then needs to be considered as raising exceptions by MI-level
codegen (e.g. scheduling).

Setting those two new flags is straightforward. The mayRaiseFPException
flag is simply set via TableGen by marking all relevant instruction
patterns in the .td files.

The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes
in the SelectionDAG, and gets inherited in the MachineSDNode nodes created
from it during instruction selection. The flag is then transfered to an
MIFlag when creating the MI from the MachineSDNode. This is handled just
like fast-math flags like no-nans are handled today.

This patch includes both common code changes required to implement the
new features, and the SystemZ implementation.

Reviewed By: andrew.w.kaylor

Differential Revision: https://reviews.llvm.org/D55506

llvm-svn: 362663
2019-06-05 22:33:10 +00:00
Tim Northover b7141207a4 Reapply: IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe
how many bytes a 'byval' parameter should occupy on the stack. This adds
a (for now) optional extra type parameter.

If present, the type must match the pointee type of the argument.

The original commit did not remap byval types when linking modules, which broke
LTO. This version fixes that.

Note to front-end maintainers: if this causes test failures, it's probably
because the "byval" attribute is printed after attributes without any parameter
after this change.

llvm-svn: 362128
2019-05-30 18:48:23 +00:00
Tim Northover 71ee3d0237 Revert "IR: add optional type to 'byval' function parameters"
The IRLinker doesn't delve into the new byval attribute when mapping types, and
this breaks LTO.

llvm-svn: 362029
2019-05-29 20:46:38 +00:00
Tim Northover 6e07f16fae IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe
how many bytes a 'byval' parameter should occupy on the stack. This adds
a (for now) optional extra type parameter.

If present, the type must match the pointee type of the argument.

Note to front-end maintainers: if this causes test failures, it's probably
because the "byval" attribute is printed after attributes without any parameter
after this change.

llvm-svn: 362012
2019-05-29 19:12:48 +00:00
Pengfei Wang 72e3f9662b Revert "[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to"
This reverts commit c1b3716614bc0a107e6f41a7d3d503baefad8a5b.

llvm-svn: 361918
2019-05-29 02:49:59 +00:00
Pengfei Wang 818c652643 [X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to
avoid static check fail

RegClassOrBank is an object of RegClassOrRegBank, which is defined as
using llvm::RegClassOrRegBank = typedef PointerUnion<const
TargetRegisterClass *, const RegisterBank *>
so control flow can not get here. Use ""llvm_unreachable" here to avoid
"null pointer" confusion.

Patch by Shengchen Kan (skan)

Differential Revision: https://reviews.llvm.org/D62006

Signed-off-by: pengfei <pengfei.wang@intel.com>
llvm-svn: 361912
2019-05-29 02:20:37 +00:00
Tim Northover 3b2157aeed GlobalISel: support swifterror attribute on AArch64.
swifterror marks an argument as a register pretending to be a pointer, so we
need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the
infrastructure can be reused from the DAG world.

llvm-svn: 361608
2019-05-24 08:40:13 +00:00
Matt Arsenault 0f3ba44b57 AMDGPU/GlobalISel: Legality for integer min/max
llvm-svn: 361519
2019-05-23 17:58:48 +00:00
Matt Arsenault 02b5ca8cd1 GlobalISel: Implement lower for S64->S32 [SU]ITOFP
This is ported from the custom AMDGPU DAG implementation. I think this
is a better default expansion than what the DAG currently uses, at
least if the target has CTLZ.

This implements the signed version in terms of the unsigned
conversion, which is implemented with bit operations. SelectionDAG has
several other implementations that should eventually be ported
depending on what instructions are legal.

llvm-svn: 361081
2019-05-17 23:05:13 +00:00
Matt Arsenault f3cedf4823 GlobalISel: Define integer min/max instructions
Doesn't attempt to emit them for anything yet, but some legalizations
I want to port use them.

llvm-svn: 361061
2019-05-17 18:36:31 +00:00
Matt Arsenault 1448f5689e AMDGPU/GlobalISel: Legalize G_FCOPYSIGN
llvm-svn: 361025
2019-05-17 12:19:52 +00:00
Fangrui Song ec6dc3089e [GlobalISel] Fix -Wsign-compare on 32-bit -DLLVM_ENABLE_ASSERTIONS=on builds
llvm-svn: 360989
2019-05-17 05:53:39 +00:00
Matt Arsenault 27ac8408f6 GlobalISel: Add DstOp version of buildIntrinsic
llvm-svn: 360879
2019-05-16 12:22:56 +00:00
Matt Arsenault 11be78bc7a GlobalISel: Add buildFConstant for APFloat
llvm-svn: 360853
2019-05-16 04:09:06 +00:00
Matt Arsenault 012ecbbbba GlobalISel: Fix indentation
llvm-svn: 360851
2019-05-16 04:08:46 +00:00
Matt Arsenault 55146d3139 GlobalISel: Add G_FCOPYSIGN
llvm-svn: 360850
2019-05-16 04:08:39 +00:00
Diana Picus a568222ddd [IRTranslator] Don't hardcode GEP index type
When breaking up loads and stores of aggregates, the IRTranslator uses
LLT::scalar(64) for the index type of the G_GEP instructions that
compute the addresses. This is unnecessarily large for 32-bit targets.
Use the int ptr type provided by the DataLayout instead.

Note that we're already doing the right thing when translating
getelementptr instructions from the IR. This is just an oversight when
generating new ones while translating loads/stores.

Both x86 and AArch64 already have tests confirming that the old
behaviour is preserved for 64-bit targets.

Differential Revision: https://reviews.llvm.org/D61852

llvm-svn: 360656
2019-05-14 09:25:17 +00:00
Quentin Colombet c9256cc6ba [IRTranslator] Use the alloc size instead of the store size when translating allocas
We use to incorrectly use the store size instead of the alloc size when
creating the stack slot for allocas.
On aarch64 this can be demonstrated by allocating weirdly sized types.

For instance, in the added test case, we use an alloca for i19. We used
to allocate a slot of size 24-bit (19 rounded up to the next byte),
whereas we really want to use a full 32-bit slot for this type.

llvm-svn: 359856
2019-05-03 01:23:56 +00:00
Daniel Sanders 8f079844d0 [globalisel] Improve Legalizer debug output
* LegalizeAction should be printed by name rather than number
* Newly created instructions are incomplete at the point the observer first sees
  them. They are therefore recorded in a small vector and printed just before
  the legalizer moves on to another instruction. By this point, the instruction
  must be complete.

llvm-svn: 359481
2019-04-29 18:45:59 +00:00
Marcello Maggioni c596584f67 [GlobalISel] Fix inserting copies in the right position for reg definitions
When constrainRegClass is called if the constraining happens on a use the COPY
needs to be inserted before the instruction that contains the MachineOperand,
but if we are constraining a definition it actually needs to be added
after the instruction. In addition, the COPY needs to have its operands
flipped (in the use case we are copying from the old unconstrained register
to the new constrained register, while in the definition case we are copying
from the new constrained register that the instruction defines to the old
unconstrained register).

llvm-svn: 359282
2019-04-26 07:21:56 +00:00
Jessica Paquette ba55767f51 [GlobalISel][AArch64] Legalize G_FNEARBYINT
Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc.

Since the importer allows us to automatically select this after legalization,
also add tests for selection etc. Also update arm64-vfloatintrinsics.ll.

llvm-svn: 359204
2019-04-25 16:44:40 +00:00
Jessica Paquette bd7ac30b15 [GlobalISel] Add IRTranslator support for G_FNEARBYINT
Translate llvm.nearbyint into G_FNEARBYINT as a simple intrinsic. Update
arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D60922

llvm-svn: 359203
2019-04-25 16:39:28 +00:00
Bjorn Pettersson 71e8c6f20f Add "const" in GetUnderlyingObjects. NFC
Summary:
Both the input Value pointer and the returned Value
pointers in GetUnderlyingObjects are now declared as
const.

It turned out that all current (in-tree) uses of
GetUnderlyingObjects were trivial to update, being
satisfied with have those Value pointers declared
as const. Actually, in the past several of the users
had to use const_cast, just because of ValueTracking
not providing a version of GetUnderlyingObjects with
"const" Value pointers. With this patch we get rid
of those const casts.

Reviewers: hfinkel, materi, jkorous

Reviewed By: jkorous

Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61038

llvm-svn: 359072
2019-04-24 06:55:50 +00:00
Jessica Paquette 3cc6d1f542 [AArch64][GlobalISel] Legalize G_INTRINSIC_ROUND
Add it to the same rule as G_FCEIL etc. Add a legalizer test, and add a missing
switch case to AArch64LegalizerInfo.cpp.

llvm-svn: 359033
2019-04-23 21:11:57 +00:00
Jessica Paquette 56342642a0 [AArch64][GlobalISel] Legalize G_INTRINSIC_TRUNC
Same patch as G_FCEIL etc.

Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct
rule in AArch64LegalizerInfo.cpp, and add a test.

llvm-svn: 359021
2019-04-23 18:20:44 +00:00
Matt Arsenault 8f624abc1d GlobalISel: Legalize scalar G_EXTRACT sources
llvm-svn: 358892
2019-04-22 15:10:42 +00:00
Amara Emerson 4286652556 Revert r358800. Breaks Obsequi from the test suite.
The last attempt fixed gcc and consumer-typeset, but Obsequi seems to fail with
a different issue.

llvm-svn: 358829
2019-04-20 21:25:00 +00:00
Amara Emerson eac69e9377 Revert "Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores""
We were shifting the wrong component of a split load when trying to combine them
back into a single value.

llvm-svn: 358800
2019-04-19 23:54:44 +00:00
Jessica Paquette d5c69e0836 [GlobalISel][AArch64] Legalize + select G_FRINT
Exactly the same as G_FCEIL, G_FABS, etc.

Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc.

Differential Revision: https://reviews.llvm.org/D60895

llvm-svn: 358799
2019-04-19 23:41:52 +00:00
Jessica Paquette ad69af3e95 [GlobalISel] Add IRTranslator support for G_FRINT
Add it as a simple intrinsic, update arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D60893

llvm-svn: 358787
2019-04-19 21:46:12 +00:00
Amara Emerson 36c5baef49 Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores"
This introduces some runtime failures which I'll need to investigate further.

llvm-svn: 358771
2019-04-19 17:42:13 +00:00
Jessica Paquette dfd87f6fa1 [GlobalISel][AArch64] Legalize vector G_FPOW
This instruction is legalized in the same way as G_FSIN, G_FCOS, G_FLOG10, etc.

Update legalize-pow.mir and arm64-vfloatintrinsics.ll to reflect the change.

Differential Revision: https://reviews.llvm.org/D60218

llvm-svn: 358764
2019-04-19 16:28:08 +00:00
Michael Berg d573aa0156 [NFC] FMF propagation for GlobalIsel
llvm-svn: 358702
2019-04-18 18:48:57 +00:00
Aditya Nandakumar 9266337656 [GISel]:IRTranslator: Prefer a buidInstr form that allows CSE of cast instructions
https://reviews.llvm.org/D60844

Use the style of buildInstr that allows CSEing.

llvm-svn: 358637
2019-04-18 02:19:29 +00:00
Amara Emerson d51adf0568 Add a getSizeInBits() accessor to MachineMemOperand. NFC.
Cleans up a bunch of places where we do getSize() * 8.

Differential Revision: https://reviews.llvm.org/D60799

llvm-svn: 358617
2019-04-17 22:21:05 +00:00
Amara Emerson daf6e66ac5 [GlobalISel] Add legalization support for non-power-2 loads and stores
Legalize things like i24 load/store by splitting them into smaller power of 2 operations.

This matches how SelectionDAG handles these operations.

Differential Revision: https://reviews.llvm.org/D59971

llvm-svn: 358613
2019-04-17 21:30:07 +00:00
Fangrui Song c82e92bca8 Change some llvm::{lower,upper}_bound to llvm::bsearch. NFC
llvm-svn: 358564
2019-04-17 07:58:05 +00:00
Amara Emerson 02a90ea73d [AArch64][GlobalISel] Don't do extending loads combine for non-pow-2 types.
Since non-pow-2 types are going to get split up into multiple loads anyway,
don't do the [SZ]EXTLOAD combine for those and save us trouble later in
legalization.

llvm-svn: 358458
2019-04-15 22:34:08 +00:00
Amara Emerson 946b1246d6 [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only.
Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't
be degraded.

This change also improves the IRTranslator so that in most places, but not all,
it creates constants using the MIRBuilder directly instead of first creating a
new destination vreg and then creating a constant. By doing this, the
buildConstant() method can just return the vreg of an existing G_CONSTANT
instead of having to create a COPY from it.

I measured a 0.2% improvement in compile time and a 0.9% improvement in code
size at -O0 ARM64.

Compile time:
Program                                        base   cse    diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test     9.04   9.12  0.8%
test-suite...Mark/mafft/pairlocalalign.test     2.68   2.66 -0.7%
test-suite...-typeset/consumer-typeset.test     5.53   5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test         5.30   5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test        25.82  25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test     6.92   6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test    34.24  34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test           6.25   6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test     1.66   1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test         13.61  13.60 -0.0%
Geomean difference                                          -0.2%

Code size:
Program                                        base     cse      diff
test-suite...-typeset/consumer-typeset.test    1315632  1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test    1313892  1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test        1439504  1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test    2936980  2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test        3478276  3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test    8082868  8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test         3870380  3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test          1434904  1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test    764528   764528   0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test    782092   782092   0.0%
Geomean difference                                              -0.9%

Differential Revision: https://reviews.llvm.org/D60580

llvm-svn: 358369
2019-04-15 05:04:20 +00:00
Amara Emerson d189680baa [GlobalISel] Introduce a CSEConfigBase class to allow targets to define their own CSE configs.
Because CodeGen can't depend on GlobalISel, we need a way to encapsulate the CSE
configs that can be passed between TargetPassConfig and the targets' custom
pass configs. This CSEConfigBase allows targets to create custom CSE configs
which is then used by the GISel passes for the CSEMIRBuilder.

This support will be used in a follow up commit to allow constant-only CSE for
-O0 compiles in D60580.

llvm-svn: 358368
2019-04-15 04:53:46 +00:00
Amara Emerson 93e58d2396 [AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and fix a crash.
This enables the simple copy combine that already exists in the CombinerHelper.
However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a
set of MIs to process, and so would end up causing a crash when deleted MIs were
being added to the combiner worklist again.

Differential Revision: https://reviews.llvm.org/D60579

llvm-svn: 358318
2019-04-13 00:33:25 +00:00
Amara Emerson bdb5e4e4ca [GlobalISel] Fix a crash when handling an invalid MVT during call lowering.
This crash was introduced in r358032 as we try to construct an EVT from an MVT
in order to find the register type for the calling conv. Fall back instead of
trying to do this with an invalid MVT coming from i256.

llvm-svn: 358314
2019-04-12 22:05:46 +00:00
Fangrui Song fb79ff6ab5 Use llvm::upper_bound. NFC
llvm-svn: 358277
2019-04-12 11:31:16 +00:00
Fangrui Song cecc435250 Use llvm::lower_bound. NFC
This reapplies rL358161. That commit inadvertently reverted an exegesis file to an old version.

llvm-svn: 358246
2019-04-12 02:02:06 +00:00
Ali Tamur 7822b46188 Revert "Use llvm::lower_bound. NFC"
This reverts commit rL358161.

This patch have broken the test:
llvm/test/tools/llvm-exegesis/X86/uops-CMOV16rm-noreg.s

llvm-svn: 358199
2019-04-11 17:35:20 +00:00
Fangrui Song 71cce580b9 Use llvm::lower_bound. NFC
llvm-svn: 358161
2019-04-11 10:25:41 +00:00
Amara Emerson ae878dab03 [AArch64][GlobalISel] Scalarize vector SDIV.
llvm-svn: 358142
2019-04-10 23:06:08 +00:00
Matt Arsenault 2064e45ce3 GlobalISel: Move computeValueLLTs
Call lowering should use this directly instead of going through the
EVT version, but more work is needed to deal with this (mostly the
passing of the IR type pointer instead of the relevant properties in
ArgInfo).

llvm-svn: 358111
2019-04-10 17:27:56 +00:00
Matt Arsenault 0aab99902b GlobalISel: Fix invoke lowering creating invalid type registers
Unlike the call handling, this wasn't checking for void results and
creating a register with the invalid LLT

llvm-svn: 358110
2019-04-10 17:27:55 +00:00
Matt Arsenault 7187272b2b GlobalISel: Support legalizing G_CONSTANT with irregular breakdown
llvm-svn: 358109
2019-04-10 17:27:53 +00:00
Matt Arsenault 9e0eeba569 GlobalISel: Handle odd breakdowns for bit ops
llvm-svn: 358105
2019-04-10 17:07:56 +00:00
Amara Emerson 2b523f8162 [GlobalISel][AArch64] Allow CallLowering to handle types which are normally
required to be passed as different register types. E.g. <2 x i16> may need to
be passed as a larger <2 x i32> type, so formal arg lowering needs to be able
truncate it back. Likewise, when dealing with returns of these types, they need
to be widened in the appropriate way back.

Differential Revision: https://reviews.llvm.org/D60425

llvm-svn: 358032
2019-04-09 21:22:33 +00:00
Matt Arsenault 106429b4e4 GlobalISel: Add another overload of buildUnmerge
It's annoying to have to create an array of the result type,
particularly when you don't care about the size of the value.

llvm-svn: 357763
2019-04-05 14:03:07 +00:00
Evandro Menezes 85bd3978ae [IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

llvm-svn: 357731
2019-04-04 22:40:06 +00:00
Evandro Menezes 7c711ccf36 [IR] Create new method in `Function` class (NFC)
Create method `optForNone()` testing for the function level equivalent of
`-O0` and refactor appropriately.

Differential revision: https://reviews.llvm.org/D59852

llvm-svn: 357638
2019-04-03 21:27:03 +00:00
Jessica Paquette e794121cd0 [AArch64][GlobalISel] Legalize G_FEXP2
Same as G_EXP. Add a test, and update legalizer-info-validation.mir and
f16-instructions.ll.

Differential Revision: https://reviews.llvm.org/D60165

llvm-svn: 357605
2019-04-03 16:58:32 +00:00
Jessica Paquette ed23352379 [GlobalISel] Add IRTranslator support for llvm.stacksave and llvm.stackrestore
Also update arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D60140

llvm-svn: 357538
2019-04-02 22:46:31 +00:00
Amara Emerson 381188f1f3 [GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead instructions.
The artifact combiners push instructions which have been marked for deletion
onto an list for the legalizer to deal with on return. However, for trunc(ext)
combines the combiner routine recursively calls itself. When it does this the
dead instructions list may not be empty, and the other combiners don't expect
to be dealing with essentially invalid MIR (multiple vreg defs etc).

This change fixes it by ensuring that the dead instructions are processed on
entry into tryCombineInstruction.

As a result, this fix exposed a few places in tests where G_TRUNC instructions
were not being deleted even though they were dead.

Differential Revision: https://reviews.llvm.org/D59892

llvm-svn: 357101
2019-03-27 17:47:42 +00:00
Matt Arsenault b34afa311d GlobalISel: Fix RegBankSelect for REG_SEQUENCE
The AArch64 test was broken since the result register already had a
set register class, so this test was a no-op. The mapping verify call
would fail because the result size is not the same as the inputs like
in a copy or phi.

The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR
copies which need much more work to handle correctly (same for phis),
but add them as a baseline.

llvm-svn: 356713
2019-03-21 20:45:36 +00:00
Amara Emerson a140276a1e [GlobalISel] Include missing change from r356396
Forgot to add a change to relax some asserts in r356396.

llvm-svn: 356411
2019-03-18 21:29:21 +00:00
Amara Emerson 8627178d46 Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy()
After review comments, it was preferred to not teach MachineIRBuilder about
non-generic instructions beyond using buildInstr().

For AArch64 I've changed the buildCopy() calls to buildInstr() + a
separate addReg() call.

This also relaxes the MachineIRBuilder's COPY checking more because it may
not always have a SrcOp given to it.

llvm-svn: 356396
2019-03-18 19:20:10 +00:00
Amara Emerson 7097e83dab [GlobalISel] Make isel verification checks of vregs run under NDEBUG only.
llvm-svn: 356309
2019-03-16 01:02:10 +00:00
Amara Emerson 3739a20875 [GlobalISel] Allow MachineIRBuilder to build subregister copies.
This relaxes some asserts about sizes, and adds an optional subreg parameter
to buildCopy().

Also update AArch64 instruction selector to use this in places where we
previously used MachineInstrBuilder manually.

Differential Revision: https://reviews.llvm.org/D59434

llvm-svn: 356304
2019-03-15 21:59:50 +00:00
Matt Arsenault 133716929c GlobalISel: Use multiple returns for intrinsic structs
This is consistent with what SelectionDAG does and is much easier to
work with than the extract sequence with an artificial wide register.

For the AMDGPU control flow intrinsics, this was producing an s128 for
the i64, i1 tuple return. Any legalization that should apply to a real
s128 value would badly obscure the direct values that need to be seen.

llvm-svn: 356147
2019-03-14 14:18:56 +00:00
Quentin Colombet e77e5f44b8 [GlobalISel][Utils] Add a getConstantVRegVal variant that looks through instrs
getConstantVRegVal used to only look for G_CONSTANT when looking at
unboxing the value of a vreg. However, constants are sometimes not
directly used and are hidden behind trunc, s|zext or copy chain of
computation.

In particular this may be introduced by the legalization process that
doesn't want to simplify these patterns because it can lead to infine
loop when legalizing a constant.

To circumvent that problem, add a new variant of getConstantVRegVal,
named getConstantVRegValWithLookThrough, that allow to look through
extensions.

Differential Revision: https://reviews.llvm.org/D59227

llvm-svn: 356116
2019-03-14 01:37:13 +00:00
Jessica Paquette 42d16501e6 [GlobalISel][AArch64] Always fall back on aarch64.neon.addp.*
Overloaded intrinsics aren't necessarily safe for instruction selection. One
such intrinsic is aarch64.neon.addp.*.

This is a temporary workaround to ensure that we always fall back on that
intrinsic. Eventually this will be replaced with a proper solution.

https://bugs.llvm.org/show_bug.cgi?id=40968

Differential Revision: https://reviews.llvm.org/D59062

llvm-svn: 355865
2019-03-11 20:51:17 +00:00
Benjamin Kramer 6ff32e143a [MIPS GlobalISel] Silence uninitialized variable warning
The control flow here cannot ever use the uninitialized value, but it's
too hard for the compiler to figure that out. Clang warns:

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: error: variable 'CarrySum' is used uninitialized whenever 'for' loop exits because its condition is false [-Werror,-Wsometimes-uninitialized]
      for (unsigned i = 2; i < Factors.size(); ++i)
                           ^~~~~~~~~~~~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2604:26: note: uninitialized use occurs here
    CarrySumPrevDstIdx = CarrySum;
                         ^~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: note: remove the condition if it is always true
      for (unsigned i = 2; i < Factors.size(); ++i)
                           ^~~~~~~~~~~~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2583:22: note: initialize the variable 'CarrySum' to silence this warning
    unsigned CarrySum;
                     ^
                      = 0

llvm-svn: 355818
2019-03-11 10:39:15 +00:00
Petar Avramovic 5229f47f9f [MIPS GlobalISel] NarrowScalar G_UMULH
NarrowScalar G_UMULH in LegalizerHelper 
using multiplyRegisters helper function.
NarrowScalar G_UMULH for MIPS32.

Differential Revision: https://reviews.llvm.org/D58825

llvm-svn: 355815
2019-03-11 10:08:44 +00:00
Petar Avramovic 0b17e59b5c [MIPS GlobalISel] NarrowScalar G_MUL
Narrow Scalar G_MUL for MIPS32.
Revisit NarrowScalar implementation in LegalizerHelper.
Introduce new helper function multiplyRegisters.
It performs generic multiplication of values held in multiple registers.
Generated instructions use only types NarrowTy and i1.
Destination can be same or two times size of the source.

Differential Revision: https://reviews.llvm.org/D58824

llvm-svn: 355814
2019-03-11 10:00:17 +00:00
Matt Arsenault d3093c2f1f GlobalISel: Implement fewerElementsVector for phi
llvm-svn: 355048
2019-02-28 00:16:32 +00:00
Matt Arsenault 72bcf15dbf GlobalISel: Implement moreElementsVector for phi
llvm-svn: 355047
2019-02-28 00:01:05 +00:00
Petar Avramovic bd39569913 [MIPS GlobalISel] Select G_UADDO
Lower G_UADDO.
Legalize G_UADDO for MIPS32

Differential Revision: https://reviews.llvm.org/D58671

llvm-svn: 354900
2019-02-26 17:22:42 +00:00
Matt Arsenault 752579736e RegBankSelect: Handle slightly more complex value mappings
Try to use concat_vectors. Also remove unnecessary assert on
pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers
for AMDGPU.

llvm-svn: 354828
2019-02-25 22:24:13 +00:00
Matt Arsenault 8df2f3dab2 RegBankSelect: Allow targets to introduce control flow for mapping
For AMDGPU, if an operand requires an SGPR but is only available as a
VGPR, a loop needs to be introduced to execute the instruction with
each unique combination of values across all lanes. The rest of the
instructions in the block will be moved to a new block following the
loop. Check if the next instruction's parent changed, and update the
iterators and insertion block if this happened.

Tests will be included in a future patch.

llvm-svn: 354591
2019-02-21 15:48:13 +00:00
Matt Arsenault 75e30c4d5d GlobalISel: Fix fewerElementsVector for ctlz with different result type
Also complete the set of related operations.

llvm-svn: 354480
2019-02-20 16:42:52 +00:00
Matt Arsenault c4d07554e4 GlobalISel: Implement moreElementsVector for g_insert results
llvm-svn: 354477
2019-02-20 16:11:22 +00:00
Matt Arsenault b4c95b338b GlobalISel: Implement moreElementsVector for select
llvm-svn: 354354
2019-02-19 17:03:09 +00:00
Matt Arsenault 4d88427a58 GlobalISel: Implement moreElementsVector for G_EXTRACT source
llvm-svn: 354348
2019-02-19 16:44:22 +00:00
Matt Arsenault 26b7e859ef GlobalISel: Implement moreElementsVector for bit ops
llvm-svn: 354345
2019-02-19 16:30:19 +00:00
Jessica Paquette b53e0f4b81 [GlobalISel][AArch64] Legalize + select some llvm.ctlz.* intrinsics
Legalize/select llvm.ctlz.*

Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and
arm64-vclz.ll to show that we perform valid transformations in optimized builds,
and document where GISel can improve.

Differential Revision: https://reviews.llvm.org/D58155

llvm-svn: 354299
2019-02-18 23:33:24 +00:00
Matt Arsenault fbe92a53d0 GlobalISel: Implement widenScalar for g_extract scalar results
llvm-svn: 354293
2019-02-18 22:39:27 +00:00
Matt Arsenault e84bdce609 GlobalISel: Make buildExtract use DstOp/SrcOp
llvm-svn: 354292
2019-02-18 22:39:22 +00:00
Matt Arsenault debaf4bd31 GlobalISel: Fix double count of offset for irregular vector breakdowns
Fixes cases with odd vectors that break into multiple requested size
pieces.

llvm-svn: 354280
2019-02-18 17:01:09 +00:00
Aditya Nandakumar 0e362ec19a [GISel][NFC]: Add methods to speed up insertion into GISelWorklist
https://reviews.llvm.org/D58073

Speed up insertion during the initial populating phase into the
GISelWorkList by deferring repeatedly resizing the DenseMap.
This results in ~10% improvement in the combiner passes, and
~3% speedup in the Legalizer.

reviewed by: aemerson.

llvm-svn: 354093
2019-02-15 01:37:54 +00:00
Matt Arsenault 530d05e94a GlobalISel: Add alignment to LegalityQuery MMOs
This allows targets to specify the minimum alignment required for the
load/store.

llvm-svn: 354071
2019-02-14 22:41:09 +00:00
Petar Avramovic 5d9b8eed85 [MIPS GlobalISel] Select branch instructions
Select G_BR and G_BRCOND for MIPS32.
Unconditional branch G_BR does not have register operand,
for that reason we only add tests.
Since conditional branch G_BRCOND compares register to zero on MIPS32,
explicit extension must be performed on i1 condition in order to set
high bits to appropriate value.

Differential Revision: https://reviews.llvm.org/D58182

llvm-svn: 354022
2019-02-14 11:39:53 +00:00
Daniel Sanders dfa0f556bf [globalisel][combine] Split existing rules into a match and apply step
Summary:
The declarative tablegen definitions split rules into match and apply steps.
Prepare for that by doing the same in the C++ implementations. This aids
some of the migration effort while the tablegen version is incomplete.

Reviewers: bogner, volkan, aditya_nandakumar, paquette, aemerson

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, Petar.Avramovic, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58150

llvm-svn: 353996
2019-02-14 00:15:28 +00:00
Jessica Paquette acbb7ca26c [GlobalISel][NFC] Gardening: Make translateSimpleUnaryIntrinsic general
Instead of only having this code work for unary intrinsics, have it work for
an arbitrary number of parameters.

Factor out the cases that fall under this (fma, pow).

This makes it a bit easier to add more intrinsics which don't require any
special work.

Differential Revision: https://reviews.llvm.org/D58079

llvm-svn: 353863
2019-02-12 17:38:34 +00:00
Jessica Paquette 0e71e73faa [GlobalISel][AArch64] Select llvm.bswap* for non-vector types
This teaches the IRTranslator to emit G_BSWAP when it runs into
Intrinsic::bswap. This allows us to select G_BSWAP for non-vector types in
AArch64.

Add a select-bswap.mir test, and add global isel checks to a couple existing
tests in test/CodeGen/AArch64.

This doesn't handle every bswap case, since some of these rely on known bits
stuff. This just lets us handle the naive case.

Differential Revision: https://reviews.llvm.org/D58081

llvm-svn: 353861
2019-02-12 17:28:17 +00:00
Matt Arsenault 996c66620e GlobalISel: Use default rounding mode when extending fconstant
I don't think this matters since the values should all be exactly
representable.

llvm-svn: 353844
2019-02-12 14:54:54 +00:00
Matt Arsenault 1cf713664d GlobalISel: Move some more legalize cases into functions
llvm-svn: 353843
2019-02-12 14:54:52 +00:00
Matt Arsenault 18ec382698 GlobalISel: Implement moreElementsVector for implicit_def
llvm-svn: 353754
2019-02-11 22:00:39 +00:00
Matt Arsenault 68fc38ce80 GlobalISel: Fix not calling the observer when legalizing G_EXTRACT
llvm-svn: 353750
2019-02-11 21:33:54 +00:00
Daniel Sanders 24e0af6906 [globalisel] Correct string emitted by GISelChangeObserver::erasingInstr()
The API indicates that the MI is about to be erased rather than it has been erased.

llvm-svn: 353746
2019-02-11 20:45:19 +00:00
Jessica Paquette ebdb021031 [GlobalISel][AArch64] Select G_FFLOOR
This teaches the legalizer about G_FFLOOR, and lets us select G_FFLOOR in
AArch64.

It updates the existing floating point tests, and adds a select-floor.mir test.

Differential Revision: https://reviews.llvm.org/D57486

llvm-svn: 353722
2019-02-11 17:22:58 +00:00
Jessica Paquette f472f31876 Recommit "[GlobalISel] Add IRTranslator support for G_FFLOOR"
After the changes introduced in r353586, this instruction doesn't cause any
issues for any backend.

Original review: https://reviews.llvm.org/D57485

llvm-svn: 353720
2019-02-11 17:16:32 +00:00
Matt Arsenault 9dba67f431 GlobalISel: Add G_FCANONICALIZE instruction
llvm-svn: 353719
2019-02-11 17:05:20 +00:00
Francis Visoiu Mistrih 8bc57953b7 Re-apply r353553 "[GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilder"
With a fix after r353563 that adds some more opcodes.

llvm-svn: 353579
2019-02-08 23:34:11 +00:00
Francis Visoiu Mistrih decba8aa06 Revert r353553 "[GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilder"
This reverts commit r353553.

This breaks CodeGen/AArch64/GlobalISel/legalize-ext-csedebug-output.mir:

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/57963/console

llvm-svn: 353575
2019-02-08 22:49:43 +00:00
Craig Topper 784929d045 Implementation of asm-goto support in LLVM
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html

This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.

This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.

There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.

Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii

Differential Revision: https://reviews.llvm.org/D53765

llvm-svn: 353563
2019-02-08 20:48:56 +00:00
Aditya Nandakumar 01e818a97d [GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilder
https://reviews.llvm.org/D57932

Add some logging + tests to make sure CSEInfo prints debug output.

reviewed by: arsenm

llvm-svn: 353553
2019-02-08 19:41:13 +00:00
Petar Avramovic c98b26d326 [MIPS GlobalISel] Select any extending load and truncating store
Make behavior of G_LOAD in widenScalar same as for G_ZEXTLOAD and
G_SEXTLOAD. That is perform widenScalarDst to size given by the target
and avoid additional checks in common code. Targets can reorder or add
additional rules in LegalizeRuleSet for the opcode to achieve desired
behavior.

Select extending load that does not have specified type of extension
into zero extending load.

Select truncating store that stores number of bytes indicated by size
in MachineMemoperand.

Differential Revision: https://reviews.llvm.org/D57454

llvm-svn: 353520
2019-02-08 14:27:23 +00:00
Matt Arsenault a8b4339c2f AMDGPU/GlobalISel: Legalize addrspacecast
Use a placeholder constant for now on targets
that need the load from the queue ptr.

llvm-svn: 353497
2019-02-08 02:40:47 +00:00
Matt Arsenault e98cab11d7 GlobalISel: Try to fix bot failures
Don't rely on order of evaluation of function arguments.

llvm-svn: 353460
2019-02-07 20:44:08 +00:00
Matt Arsenault fbec8fe93b GlobalISel: Implement narrowScalar for shift main type
This is pretty much directly ported from SelectionDAG. Doesn't include
the shift by non-constant but known bits version, since there isn't a
globalisel version of computeKnownBits yet.

This shows a disadvantage of targets not specifically which type
should be used for the shift amount. If type 0 is legalized before
type 1, the operations on the shift amount type use the wider type
(which are also less likely to legalize). This can be avoided by
targets specifying legalization actions on type 1 earlier than for
type 0.

llvm-svn: 353455
2019-02-07 19:37:44 +00:00
Matt Arsenault c83b82363c GlobalISel: Implement fewerElementsVector for shifts
Introduce a new function which handles instructions with multiple type
indices, but have the same number of vector elements.

Also legalize v2s16 shifts when applicable.

llvm-svn: 353432
2019-02-07 17:38:00 +00:00
Matt Arsenault 91be65be65 GlobalISel: Try to make legalize rules more useful for vectors
Mostly keep the existing functions on scalars, but add versions which
also operate based on the vector element size.

llvm-svn: 353430
2019-02-07 17:25:51 +00:00
Michael Berg f0d81a31b6 Move IR flag handling directly into builder calls for cases translated from Instructions in GlobalIsel
Reviewers: aditya_nandakumar, volkan

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, volkan, Petar.Avramovic

Differential Revision: https://reviews.llvm.org/D57630

llvm-svn: 353336
2019-02-06 19:57:06 +00:00
Jessica Paquette e288c526f1 [GlobalISel][NFC] Gardening: Factor out code for simple unary intrinsics
There was a lot of repeated code wrt unary math intrinsics in
translateKnownIntrinsic. This factors out the repeated MIRBuilder code into
two functions: translateSimpleUnaryIntrinsic and getSimpleUnaryIntrinsicOpcode.

This simplifies adding simple unary intrinsics, since after this, all you have
to do is add the mapping to SimpleUnaryIntrinsicOpcodes.

Differential Revision: https://reviews.llvm.org/D57774

llvm-svn: 353316
2019-02-06 17:25:54 +00:00
Matt Arsenault 7f09fd6b04 GlobalISel: Consolidate load/store legalization
The fewerElementsVectors implementation for load/stores
handles the scalar reduction case just as well, so drop
the redundant code in narrowScalar. This also introduces
support for narrowing irregular size breakdowns for
scalars.

llvm-svn: 353125
2019-02-05 00:26:12 +00:00
Matt Arsenault 81511e5428 GlobalISel: Implement narrowScalar for select
Don't handle vector conditions.

I think this can be merged in the future with
fewerElementsVectorSelect, although this becomes slightly tricky with
a vector condition.

llvm-svn: 353122
2019-02-05 00:13:44 +00:00
Matt Arsenault 24f14993e8 GlobalISel: Combine g_extract with g_merge_values
Try to use the underlying source registers.

This enables legalization in more cases where some irregular
operations are widened and others narrowed.

This seems to make the test_combines_2 AArch64 test worse, since the
MERGE_VALUES has multiple uses. Since this should be required for
legalization, a hasOneUse check is probably inappropriate (or maybe
should only be used if the merge is legal?).

llvm-svn: 353121
2019-02-04 23:41:59 +00:00
Matt Arsenault 3d6a49b0b9 GlobalISel: Fix not calling observer when legalizing bitcount ops
This was hiding bugs from never legalizing the source type.

llvm-svn: 353102
2019-02-04 22:26:33 +00:00
Matt Arsenault 8121ec26c0 GlobalISel: Fix CSE handling of buildConstant
This fixes two problems with CSE done in buildConstant. First, this
would hit an assert when used with a vector result type. Solve this by
allowing CSE on the vector elements, but not on the result vector for
now.

Second, this was also performing the CSE based on the input
ConstantInt pointer. The underlying buildConstant could potentially
convert the constant depending on the result type, giving in a
different ConstantInt*. Stop allowing the APInt and ConstantInt forms
from automatically casting to the result type to avoid any similar
problems in the future.

llvm-svn: 353077
2019-02-04 19:15:50 +00:00
Matt Arsenault 0723828675 GlobalISel: Fix moreElementsToNextPow2
This was completely broken. The condition was inverted, and changed
the element type for vectors of pointers.

Fixes bug 40592.

llvm-svn: 353069
2019-02-04 18:42:24 +00:00
Jessica Paquette 834bded9d6 Revert "[GlobalISel] Add IRTranslator support for G_FFLOOR"
This reverts commit 8bbd570fd5205a04d88d2e5513a6e4adbd028039.

Apparently adding ffloor breaks AMDGPU somehow, so I need to back this out
while I look into it.

llvm-svn: 353064
2019-02-04 17:32:43 +00:00
Jessica Paquette 73158e7201 [GlobalISel] Add IRTranslator support for G_FFLOOR
Follow-up to https://reviews.llvm.org/D57484

Adds G_FFLOOR to translateKnownIntrinsic and update arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D57485

llvm-svn: 353058
2019-02-04 17:15:34 +00:00
Matt Arsenault 56edf3f344 GlobalISel: Fix formatting of debug output
There was a missing space before the instruction name, and the newline
is redundant since MI::print by default adds one.

llvm-svn: 353046
2019-02-04 14:05:33 +00:00
Matt Arsenault 888aa5dedd GlobalISel: Implement widenScalar for G_UNMERGE_VALUES
For the scalar case only.

Also move the similar G_MERGE_VALUES handling to a separate function
and cleanup to make them look more similar.

llvm-svn: 352979
2019-02-03 00:07:33 +00:00
Matt Arsenault 0e5d856eb8 GlobalISel: Implement widenScalar for G_EXTRACT vector sources
Handle the basic element extract case.

llvm-svn: 352978
2019-02-02 23:56:00 +00:00
Matt Arsenault cbaada6bc1 GlobalISel: Legalization for inttoptr/ptrtoint
llvm-svn: 352973
2019-02-02 23:29:55 +00:00
Matt Arsenault 50d6579bac GlobalISel: Fix MMO creation with non-power-of-2 mem size
It should probably just be mandatory for getTgtMemIntrinsic to return
the alignment.

llvm-svn: 352817
2019-01-31 23:41:23 +00:00
Matt Arsenault c7bce739ad GlobalISel: Handle odd splits in fewerElementsVector for load/store
llvm-svn: 352720
2019-01-31 02:46:05 +00:00
Matt Arsenault d1bfc8d0c3 GlobalISel: Implement narrowScalar for bswap
llvm-svn: 352719
2019-01-31 02:34:03 +00:00
Matt Arsenault cf4db733d8 GlobalISel: Don't call changingInstruction before giving up
llvm-svn: 352718
2019-01-31 02:22:39 +00:00
Matt Arsenault d5684f76e0 GlobalISel: Allow bitcount ops to have different result type
For AMDGPU the result is always 32-bit for 64-bit inputs.

llvm-svn: 352717
2019-01-31 02:09:57 +00:00
Matt Arsenault 8db2001d52 GlobalISel: Use helper function for MMO splitting
Also fix an alignment bug getMachineMemOperand. If the
tracked value is null, the offset isn't tracked so the
base alignment needs to be reduced.

llvm-svn: 352716
2019-01-31 01:49:58 +00:00
Matt Arsenault 2a64598ef2 GlobalISel: Fix creating MMOs with align 0
llvm-svn: 352712
2019-01-31 01:38:47 +00:00
Jessica Paquette 84bedac7e9 [GlobalISel][AArch64] Select G_FEXP
This teaches the legalizer to handle G_FEXP in AArch64. As a result, it also
allows us to select G_FEXP.

It...

- Updates the legalizer-info tests
- Adds a test for legalizing exp
- Updates the existing fp tests to show that we can now select G_FEXP

https://reviews.llvm.org/D57483

llvm-svn: 352692
2019-01-30 23:46:15 +00:00
Amara Emerson 13311e5274 [GlobalISel][LegalizerHelper] Add some missing MI change observer calls.
No test as it's a preventative fix.

llvm-svn: 352691
2019-01-30 23:42:46 +00:00
Jessica Paquette 0154bd1385 [GlobalISel][AArch64] Add instruction selection support for @llvm.log2
This teaches GlobalISel to emit a RTLib call for @llvm.log2 when it encounters
it.

It updates the existing floating point tests to show that we don't fall back on
the intrinsic, and select the correct instructions. It also adds a legalizer
test for G_FLOG2.

https://reviews.llvm.org/D57357

llvm-svn: 352673
2019-01-30 21:16:04 +00:00
Jessica Paquette 22457f8e9b [GlobalISel][AArch64] Add instruction selection support for @llvm.sqrt
This teaches the legalizer about G_FSQRT in AArch64. Also adds a legalizer
test for G_FSQRT, a selection test for it, and updates existing floating point
tests.

https://reviews.llvm.org/D57361

llvm-svn: 352671
2019-01-30 21:03:52 +00:00
Jessica Paquette b147e7d853 [GlobalISel] Add IRTranslator support for @llvm.sqrt -> G_FSQRT
Follow-up commit to https://reviews.llvm.org/D57359. (r352668)

This adds IRTranslator support for recognising a @llvm.sqrt intrinsic and
translating it into a G_FSQRT.

https://reviews.llvm.org/D57360

llvm-svn: 352670
2019-01-30 20:58:14 +00:00
Matt Arsenault dc8258c4aa GlobalISel: Add assert that legalize mutation makes sense
I've repeatedly encountered bugs resulting from custom legalize
mutations returning nonsense legalize results, such as increasing the
number of elements for FewerElements. Add an assert function to make
sure the type to mutate to is consistent with the legalize action.

llvm-svn: 352636
2019-01-30 17:52:23 +00:00
Matt Arsenault dc6c78596b GlobalISel: Implement fewerElementsVector for select
llvm-svn: 352601
2019-01-30 04:19:31 +00:00
Matt Arsenault 6d8e1b456a GlobalISel: Use appropriate extension for legalizing select conditions
llvm-svn: 352597
2019-01-30 02:57:43 +00:00
Matt Arsenault 045bc9a4a6 GlobalISel: Support narrowScalar for uneven loads
llvm-svn: 352594
2019-01-30 02:35:38 +00:00
Matt Arsenault ccefbbd0f0 GlobalISel: Handle some odd splits in fewerElementsVector
Also add some quick hacks to AMDGPU legality for the tests.

llvm-svn: 352591
2019-01-30 02:22:13 +00:00
Matt Arsenault 92c5001136 GlobalISel: Handle more cases for widenScalar for G_STORE
llvm-svn: 352585
2019-01-30 02:04:31 +00:00
Matt Arsenault 3de9a96174 GlobalISel: Fix unused variable warning in release builds
llvm-svn: 352565
2019-01-29 23:38:42 +00:00
Matt Arsenault d8d193d5e2 GlobalISel: Partially implement widenScalar for MERGE_VALUES
llvm-svn: 352560
2019-01-29 23:17:35 +00:00
Matt Arsenault 18619afe1d GlobalISel: Fix narrowScalar for load/store with different mem size
This was ignoring the memory size, and producing multiple loads/stores
if the operand size was different from the memory size.

I assume this is the intent of not having an explicit G_ANYEXTLOAD
(although I think that would probably be better).

llvm-svn: 352523
2019-01-29 18:13:02 +00:00
Jessica Paquette 2d73ecd0a3 [GlobalISel][AArch64] Add legalization for G_FLOG
This adds support for legalizing G_FLOG into a RTLib call.

It adds a legalizer test, and updates the existing floating point tests.

https://reviews.llvm.org/D57347

llvm-svn: 352429
2019-01-28 21:27:23 +00:00
Jessica Paquette c49428a97d [GlobalISel][AArch64] Add instruction selection support for @llvm.log10
This adds instruction selection support for @llvm.log10 in AArch64. It teaches
GISel to lower it to a library call, updates the relevant tests, and adds a
legalizer test for log10.

https://reviews.llvm.org/D57341

llvm-svn: 352418
2019-01-28 19:53:14 +00:00
Jessica Paquette 2e35dc5185 [GlobalISel] Add ISel support for @llvm.lifetime.start and @llvm.lifetime.end
This adds ISel support for lifetime markers in opt levels above O0.

It also updates the arm64-irtranslator test, and updates some AArch64 tests that
use them for added coverage.

It also adds a testcase taken from the X86 codegen tests which verified a bug
caused by lifetime markers + stack colouring in the past. This is intended to
make sure that GISel doesn't re-introduce the bug.

(This is basically a straight copy from what SelectionDAG does in
SelectionDAGBuilder.cpp)

https://reviews.llvm.org/D57187

llvm-svn: 352410
2019-01-28 19:22:29 +00:00
Jessica Paquette 7db82d7257 [GlobalISel][AArch64] Add instruction selection support for G_FCOS and G_FSIN
This contains all of the legalizer changes from D57197 necessary to select
G_FCOS and G_FSIN. It also updates several existing IR tests in
test/CodeGen/AArch64 that verify that we correctly lower the G_FCOS and G_FSIN
instructions.

https://reviews.llvm.org/D57197
3/3

llvm-svn: 352402
2019-01-28 18:34:18 +00:00
Jessica Paquette 296f19b3d9 [GlobalISel][AArch64] Add IRTranslator support for G_FCOS and G_FSIN
This adds IRTranslator support for the G_FCOS and G_FSIN generic instructions.

https://reviews.llvm.org/D57197
2/3

llvm-svn: 352401
2019-01-28 18:34:17 +00:00
Petar Avramovic 7cecadb9af [MIPS GlobalISel] Select sub
Lower G_USUBO and G_USUBE. Add narrowScalar for G_SUB.
Legalize and select G_SUB for MIPS 32.

Differential Revision: https://reviews.llvm.org/D53416

llvm-svn: 352351
2019-01-28 12:10:17 +00:00
Matt Arsenault cfca2a7adf GlobalISel: Don't reduce elements for atomic load/store
This is invalid for the same reason as in the narrowScalar handling
for load.

llvm-svn: 352334
2019-01-27 22:36:24 +00:00
Matt Arsenault 816c9b3e25 GlobalISel: Factor fewerElementVectors into separate functions
llvm-svn: 352332
2019-01-27 21:53:09 +00:00
Amara Emerson bf43004ff1 [AArch64][GlobalISel] Fix the G_EXTLOAD combiner creating non-extending illegal instructions.
This fixes loads like 's1 = load %p (load 1 from %p)' being combined with an
extend into an illegal 's8 = g_extload %p (load 1 from %p)' which doesn't do any
extension, by avoiding touching those < s8 size loads.

This bug was uncovered by a verifier update r351584, which I reverted it to keep
the bots green.

llvm-svn: 352311
2019-01-27 10:56:20 +00:00
Matt Arsenault 590c67507a GlobalISel: Fix typo in assert messages
llvm-svn: 352301
2019-01-27 00:53:54 +00:00
Matt Arsenault 211e89d4dd GlobalISel: Implement narrowScalar for mul
llvm-svn: 352300
2019-01-27 00:52:51 +00:00
Matt Arsenault 2e5f900849 GlobalISel: fewerElementsVector for intrinsic_trunc/intrinsic_round
llvm-svn: 352298
2019-01-27 00:12:21 +00:00
Amara Emerson 203760ab9c [GlobalISel][IRTranslator] Fix crash on translation of fneg.
When the fneg IR instruction was added the code to do translation wasn't
tested, and tried to get an invalid operand.

llvm-svn: 352296
2019-01-26 23:47:09 +00:00
Matt Arsenault 26a6c74fbe AMDGPU/GlobalISel: Legalize more bit ops
llvm-svn: 352295
2019-01-26 23:47:07 +00:00
Simon Pilgrim cdf58092e4 Fix gcc -Wparentheses warning. NFCI.
llvm-svn: 352191
2019-01-25 11:34:58 +00:00
Matt Arsenault 3e08b772b3 AMDGPU/GlobalISel: Scalarize add/sub
llvm-svn: 352167
2019-01-25 04:53:57 +00:00
Matt Arsenault e6cebd0d69 GlobalISel: fewerElementsVector for more cast types
llvm-svn: 352166
2019-01-25 04:37:33 +00:00
Matt Arsenault 95fd95cfe0 GlobalISel: fewerElementsVector for a few more trivial ops
llvm-svn: 352165
2019-01-25 04:03:38 +00:00
Matt Arsenault 5d622fbcc1 AMDGPU/GlobalISel: Legalize smulh/umulh and scalarize mul
llvm-svn: 352162
2019-01-25 03:23:04 +00:00
Matt Arsenault 1b1e685f10 GlobalISel: Support fewerElementsVector for icmp/fcmp
Also legalize 64-bit compares for AMDGPU

llvm-svn: 352157
2019-01-25 02:59:34 +00:00
Matt Arsenault ca676343a9 GlobalISel: Implement fewerElementsVector for extensions
llvm-svn: 352155
2019-01-25 02:36:32 +00:00
Matt Arsenault 990f507704 GlobalISel: Add convenience mutatations to scalarize
llvm-svn: 352143
2019-01-25 00:51:00 +00:00
Matt Arsenault 6bab7ab11e RegBankSelect: Fix use after free in r352123
llvm-svn: 352130
2019-01-24 23:42:01 +00:00
Aditya Nandakumar 3ba0d94bce [GISel]: Change how CSE is enabled by default for each pass
https://reviews.llvm.org/D57178

Now add a hook in TargetPassConfig to query if CSE needs to be
enabled. By default this hook returns false only for O0 opt level but
this can be overridden by the target.
As a consequence of the default of enabled for non O0, a few tests
needed to be updated to not use CSE (by passing in -O0) to the run
line.

reviewed by: arsenm

llvm-svn: 352126
2019-01-24 23:11:25 +00:00
Matt Arsenault baa5d2e69c RegBankSelect: Support some more complex part mappings
llvm-svn: 352123
2019-01-24 22:47:04 +00:00
Jessica Paquette 245047dfe8 [GlobalISel][AArch64] Add isel support for FP16 vector @llvm.ceil
This patch adds support for vector @llvm.ceil intrinsics when full 16 bit
floating point support isn't available.

To do this, this patch...

- Implements basic isel for G_UNMERGE_VALUES
- Teaches the legalizer about 16 bit floats
- Teaches AArch64RegisterBankInfo to respect floating point registers on
  G_BUILD_VECTOR and G_UNMERGE_VALUES
- Teaches selectCopy about 16-bit floating point vectors

It also adds

- A legalizer test for the 16-bit vector ceil which verifies that we create a
  G_UNMERGE_VALUES and G_BUILD_VECTOR when full fp16 isn't supported
- An instruction selection test which makes sure we lower to G_FCEIL when
  full fp16 is supported
- A test for selecting G_UNMERGE_VALUES

And also updates arm64-vfloatintrinsics.ll to show that the new ceiling types
work as expected.

https://reviews.llvm.org/D56682

llvm-svn: 352113
2019-01-24 22:00:41 +00:00
Matt Arsenault 30989e492b GlobalISel: Allow shift amount to be a different type
For AMDGPU the shift amount is never 64-bit, and
this needs to use a 32-bit shift.

X86 uses i8, but seemed to be hacking around this before.

llvm-svn: 351882
2019-01-22 21:42:11 +00:00
Matt Arsenault 52133812f6 GlobalISel: Make buildConstant handle vectors
Produce a splat build_vector similar to how
SelectionDAG::getConstant does.

llvm-svn: 351880
2019-01-22 21:31:02 +00:00
Matt Arsenault 6378629609 GlobalISel: Implement widen for extract_vector_elt elt type
llvm-svn: 351871
2019-01-22 20:38:15 +00:00
Matt Arsenault aebb2ee036 GlobalISel: Implement fewerElementsVector for basic FP ops
llvm-svn: 351866
2019-01-22 20:14:29 +00:00
Matt Arsenault 6614f852b6 GlobalISel: Support narrowing zextload/sextload
llvm-svn: 351856
2019-01-22 19:02:10 +00:00
Matt Arsenault a7cd83bc88 GlobalISel: Disallow vectors for G_CONSTANT/G_FCONSTANT
llvm-svn: 351853
2019-01-22 18:53:41 +00:00
Matt Arsenault a5195829f6 GlobalISel: Add isPointer legality predicates
llvm-svn: 351699
2019-01-20 19:45:14 +00:00
Matt Arsenault 745fd9f547 GlobalISel: Implement widenScalar for basic FP ops
llvm-svn: 351696
2019-01-20 19:10:31 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Mandeep Singh Grang f0e99779ae [GlobalISel] Change to range-based invocation of llvm::sort
llvm-svn: 351574
2019-01-18 18:53:48 +00:00
Aditya Nandakumar 500e3ead9f [GISel]: Add support for CSEing continuously during GISel passes.
https://reviews.llvm.org/D52803

This patch adds support to continuously CSE instructions during
each of the GISel passes. It consists of a GISelCSEInfo analysis pass
that can be used by the CSEMIRBuilder.

llvm-svn: 351283
2019-01-16 00:40:37 +00:00
Benjamin Kramer b17d2136ea Give helper classes/functions local linkage. NFC.
llvm-svn: 351016
2019-01-12 18:36:22 +00:00
Matt Arsenault 3dddb163dd GlobalISel: Implement fewerElements for implicit_def
llvm-svn: 350697
2019-01-09 07:51:52 +00:00
Matt Arsenault befee402ff GlobalISel: Implement widenScalar for implicit_def
llvm-svn: 350695
2019-01-09 07:34:14 +00:00
Benjamin Kramer a480523ce9 [GlobalISel] Fix unused variable warning in Release builds.
llvm-svn: 350618
2019-01-08 12:54:26 +00:00
Matt Arsenault 376f2ef2f0 Fix typos
llvm-svn: 350597
2019-01-08 01:25:47 +00:00
Matt Arsenault adc40baa29 RegBankSelect: Fix copy insertion point for terminators
If a copy was needed to handle the condition of brcond, it was being
inserted before the defining instruction. Add tests for iterator edge
cases.

I find the existing code here suspect for the case where it's looking
for terminators that modify the register. It's going to insert a copy
in the middle of the terminators, which isn't allowed (it might be
necessary to have a COPY_terminator if anybody actually needs this).

Also legalize brcond for AMDGPU.

llvm-svn: 350595
2019-01-08 01:22:47 +00:00
Richard Trieu a87b70d1db Add vtable anchor to classes.
llvm-svn: 350142
2018-12-29 02:02:13 +00:00
Petar Avramovic 09dff33349 [MIPS GlobalISel] Select G_SELECT
Add widen scalar for type index 1 (i1 condition) for G_SELECT.
Select G_SELECT for pointer, s32(integer) and smaller low level
types on MIPS32.

Differential Revision: https://reviews.llvm.org/D56001

llvm-svn: 350063
2018-12-25 14:42:30 +00:00
Jessica Paquette 453ab1db5b [GlobalISel][AArch64] Add support for widening G_FCEIL
This adds support for widening G_FCEIL in LegalizerHelper and
AArch64LegalizerInfo. More specifically, it teaches the AArch64 legalizer to
widen G_FCEIL from a 16-bit float to a 32-bit float when the subtarget doesn't
support full FP 16.

This also updates AArch64/f16-instructions.ll to show that we perform the
correct transformation.

llvm-svn: 349927
2018-12-21 17:05:26 +00:00
Rhys Perry 972273d1d3 Fix test commit
Seems that was actually a eight space tab...

llvm-svn: 349690
2018-12-19 22:33:42 +00:00
Rhys Perry 111bf831de Test commit
Replace tab with 4 spaces.

llvm-svn: 349689
2018-12-19 22:26:51 +00:00
Jessica Paquette 3560e93dc1 [GlobalISel][AArch64] Add support for @llvm.ceil
This adds a G_FCEIL generic instruction and uses it in AArch64. This adds
selection for floating point ceil where it has a supported, dedicated
instruction. Other cases aren't handled here.

It updates the relevant gisel tests and adds a select-ceil test. It also adds a
check to arm64-vcvt.ll which ensures that we don't fall back when we run into
one of the relevant cases.

llvm-svn: 349664
2018-12-19 19:01:36 +00:00
Michael Berg c6a5245cf7 Add FMF management to common fp intrinsics in GlobalIsel
Summary: This the initial code change to facilitate managing FMF flags from Instructions to MI wrt Intrinsics in Global Isel.  Eventually the GlobalObserver interface will be added as well, where FMF additions can be tracked for the builder and CSE.

Reviewers: aditya_nandakumar, bogner

Reviewed By: bogner

Subscribers: rovka, kristof.beyls, javed.absar

Differential Revision: https://reviews.llvm.org/D55668

llvm-svn: 349514
2018-12-18 17:54:52 +00:00
Petar Avramovic 0a5e4eb776 [MIPS GlobalISel] Select G_SDIV, G_UDIV, G_SREM and G_UREM
Add support for s64 libcalls for G_SDIV, G_UDIV, G_SREM and G_UREM
and use integer type of correct size when creating arguments for
CLI.lowerCall.
Select G_SDIV, G_UDIV, G_SREM and G_UREM for types s8, s16, s32 and s64
on MIPS32.

Differential Revision: https://reviews.llvm.org/D55651

llvm-svn: 349499
2018-12-18 15:59:51 +00:00
Petar Avramovic 150fd430f6 [MIPS GlobalISel] ClampScalar G_AND G_OR and G_XOR
Add narrowScalar for G_AND and G_XOR.
Legalize G_AND G_OR and G_XOR for types other then s32 
with clampScalar on MIPS32.

Differential Revision: https://reviews.llvm.org/D55362

llvm-svn: 349475
2018-12-18 11:36:14 +00:00
Matt Arsenault 1ac38ba73f GlobalISel: Improve crash on invalid mapping
If NumBreakDowns is 0, BreakDown is null.
This trades a null dereference with an assert somewhere
else.

llvm-svn: 349464
2018-12-18 09:27:29 +00:00
Petar Avramovic b8276f2280 [MIPS GlobalISel] Lower G_UADDE and narrowScalar G_ADD
Lower G_UADDE and legalize G_ADD using narrowScalar on MIPS32.

Differential Revision: https://reviews.llvm.org/D54580

llvm-svn: 349346
2018-12-17 12:31:07 +00:00
Volkan Keles 574d737e06 [GlobalISel] LegalizerHelper: Implement fewerElementsVector for G_LOAD/G_STORE
Reviewers: aemerson, dsanders, bogner, paquette, aditya_nandakumar

Reviewed By: dsanders

Subscribers: rovka, kristof.beyls, javed.absar, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53728

llvm-svn: 349200
2018-12-14 22:11:20 +00:00
Daniel Sanders 629db5d8e5 [globalisel][combiner] Make the CombinerChangeObserver a MachineFunction::Delegate
Summary:
This allows us to register it with the MachineFunction delegate and be
notified automatically about erasure and creation of instructions. However,
we still need explicit notification for modifications such as those caused
by setReg() or replaceRegWith().

There is a catch with this though. The notification for creation is
delivered before any operands can be added. While appropriate for
scheduling combiner work. This is unfortunate for debug output since an
opcode by itself doesn't provide sufficient information on what happened.
As a result, the work list remembers the instructions (when debug output is
requested) and emits a more complete dump later.

Another nit is that the MachineFunction::Delegate provides const pointers
which is inconvenient since we want to use it to schedule future
modification. To resolve this GISelWorkList now has an optional pointer to
the MachineFunction which describes the scope of the work it is permitted
to schedule. If a given MachineInstr* is in this function then it is
permitted to schedule work to be performed on the MachineInstr's. An
alternative to this would be to remove the const from the
MachineFunction::Delegate interface, however delegates are not permitted
to modify the MachineInstr's they receive.

In addition to this, the observer has three interface changes.
* erasedInstr() is now erasingInstr() to indicate it is about to be erased
  but still exists at the moment.
* changingInstr() and changedInstr() have been added to report changes
  before and after they are made. This allows us to trace the changes
  in the debug output.
* As a convenience changingAllUsesOfReg() and
  finishedChangingAllUsesOfReg() will report changingInstr() and
  changedInstr() for each use of a given register. This is primarily useful
  for changes caused by MachineRegisterInfo::replaceRegWith()

With this in place, both combine rules have been updated to report their
changes to the observer.

Finally, make some cosmetic changes to the debug output and make Combiner
and CombinerHelp

Reviewers: aditya_nandakumar, bogner, volkan, rtereshin, javed.absar

Reviewed By: aditya_nandakumar

Subscribers: mgorny, rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D52947

llvm-svn: 349167
2018-12-14 17:50:14 +00:00
Daniel Sanders d001e0e0f4 [globalisel] Add GISelChangeObserver::changingInstr()
Summary:
In addition to knowing that an instruction is changed. It's also useful to
know when it's about to change. For example, it might print the instruction so
you can track the changes in a debug log, it might remove it from some queue
while it's being worked on, or it might want to change several instructions as
a single transaction and act on all the changes at once.

Added changingInstr() to all existing uses of changedInstr()

Reviewers: aditya_nandakumar

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D55623

llvm-svn: 348992
2018-12-12 23:48:13 +00:00
Daniel Sanders 91dfdd5734 [globalisel] Rename GISelChangeObserver's erasedInstr() to erasingInstr() and related nits. NFC
Summary:
There's little of interest that can be done to an already-erased instruction.
You can't inspect it, write it to a debug log, etc. It ought to be notification
that we're about to erase it. Rename the function to clarify the timing of the
event and reflect current usage.

Also fixed one case where we were trying to print an erased instruction.

Reviewers: aditya_nandakumar

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D55611

llvm-svn: 348976
2018-12-12 21:32:01 +00:00
Craig Topper 502865bddb [GISel] Add parentheses to an assert because gcc is mean.
llvm-svn: 348900
2018-12-11 22:07:06 +00:00
Aditya Nandakumar 853a667812 [GISel]: Add MachineIRBuilder support for passing in Flags while building
https://reviews.llvm.org/D55516

Add the ability to pass in flags to buildInstr calls. Currently no
validation is performed but that can be easily performed based on the
opcode (if necessary).

Reviewed by: paquette.

llvm-svn: 348893
2018-12-11 20:04:40 +00:00
Aditya Nandakumar cef44a2342 [GISel]: Refactor MachineIRBuilder to allow passing additional parameters to build Instrs
https://reviews.llvm.org/D55294

Previously MachineIRBuilder::buildInstr used to accept variadic
arguments for sources (which were either unsigned or
MachineInstrBuilder). While this worked well in common cases, it doesn't
allow us to build instructions that have multiple destinations.
Additionally passing in other optional parameters in the end (such as
flags) is not possible trivially. Also a trivial call such as

B.buildInstr(Opc, Reg1, Reg2, Reg3)
can be interpreted differently based on the opcode (2defs + 1 src for
unmerge vs 1 def + 2srcs).
This patch refactors the buildInstr to

buildInstr(Opc, ArrayRef<DstOps>, ArrayRef<SrcOps>)
where DstOps and SrcOps are typed unions that know how to add itself to
MachineInstrBuilder.
After this patch, most invocations would look like

B.buildInstr(Opc, {s32, DstReg}, {SrcRegs..., SrcMIBs..});
Now all the other calls (such as buildAdd, buildSub etc) forward to
buildInstr. It also makes it possible to build instructions with
multiple defs.
Additionally in a subsequent patch, we should make it possible to add
flags directly while building instructions.
Additionally, the main buildInstr method is now virtual and other
builders now only have to override buildInstr (for say constant
folding/cseing) is straightforward.

Also attached here (https://reviews.llvm.org/F7675680) is a clang-tidy
patch that should upgrade the API calls if necessary.

llvm-svn: 348815
2018-12-11 00:48:50 +00:00
Amara Emerson 5ec146046c [GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes.
This patch restricts the capability of G_MERGE_VALUES, and uses the new
G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places.

This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32>
and <2 x s64> vectors.

Differential Revisions: https://reviews.llvm.org/D53629

llvm-svn: 348788
2018-12-10 18:44:58 +00:00
Petr Pavlu 84e89ff06f [GlobalISel] Set stack protector index when translating Intrinsic::stackprotector
Record the stack protector index in MachineFrameInfo when translating
Intrinsic::stackprotector similarly as is done by SelectionDAG when
processing the same intrinsic.

Setting this index allows the Prologue/Epilogue Insertion to recognize
that the stack protection is enabled. The pass can then make sure that
the stack protector comes before local variables on the stack and
assigns potentially vulnerable objects first so they are close to the
stack protector slot.

Differential Revision: https://reviews.llvm.org/D55418

llvm-svn: 348761
2018-12-10 15:15:05 +00:00
Jessica Paquette cc4b6920b3 [GlobalISel] Add IR translation support for the @llvm.log10 intrinsic
This adds IR translation support for @llvm.log10 and updates relevant tests.

https://reviews.llvm.org/D55392

llvm-svn: 348657
2018-12-07 22:08:02 +00:00
Amara Emerson a0b15d8f3e [GlobalISel] Introduce G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC and G_CONCAT_VECTOR opcodes.
These opcodes are intended to subsume some of the capability of G_MERGE_VALUES,
as it was too powerful and thus complex to add deal with throughout the GISel
pipeline.

G_BUILD_VECTOR creates a vector value from a sequence of uniformly typed
scalar values. G_BUILD_VECTOR_TRUNC is a special opcode for handling scalar
operands which are larger than the destination vector element type, and
therefore does an implicit truncate.

G_CONCAT_VECTOR creates a vector by concatenating smaller, uniformly typed,
vectors together.

These will be used in a subsequent commit. This commit just adds the initial
infrastructure.

Differential Revision: https://reviews.llvm.org/D53594

llvm-svn: 348430
2018-12-05 23:53:30 +00:00
Aditya Nandakumar f75d4f329c [GISel]: Provide standard interface to observe changes in GISel passes
https://reviews.llvm.org/D54980

This provides a standard API across GISel passes to observe and notify
passes about changes (insertions/deletions/mutations) to MachineInstrs.
This patch also removes the recordInsertion method in MachineIRBuilder
and instead provides method to setObserver.

Reviewed by: vkeles.

llvm-svn: 348406
2018-12-05 20:14:52 +00:00
Diana Picus 0528e2cfb3 [ARM GlobalISel] Support G_CTLZ and G_CTLZ_ZERO_UNDEF
We can now select CLZ via the TableGen'erated code, so support G_CTLZ
and G_CTLZ_ZERO_UNDEF throughout the pipeline for types <= s32.

Legalizer:
If the CLZ instruction is available, use it for both G_CTLZ and
G_CTLZ_ZERO_UNDEF. Otherwise, use a libcall for G_CTLZ_ZERO_UNDEF and
lower G_CTLZ in terms of it.

In order to achieve this we need to add support to the LegalizerHelper
for the legalization of G_CTLZ_ZERO_UNDEF for s32 as a libcall (__clzsi2).

We also need to allow lowering of G_CTLZ in terms of G_CTLZ_ZERO_UNDEF
if that is supported as a libcall, as opposed to just if it is Legal or
Custom. Due to a minor refactoring of the helper function in charge of
this, we will also allow the same behaviour for G_CTTZ and G_CTPOP.
This is not going to be a problem in practice since we don't yet have
support for treating G_CTTZ and G_CTPOP as libcalls (not even in
DAGISel).

Reg bank select:
Map G_CTLZ to GPR. G_CTLZ_ZERO_UNDEF should not make it to this point.

Instruction select:
Nothing to do.

llvm-svn: 347545
2018-11-26 11:07:02 +00:00
Diana Picus 30887bf6c3 Fix typo in comment. NFC
llvm-svn: 347544
2018-11-26 11:06:53 +00:00
Fangrui Song 7570932977 Use llvm::copy. NFC
llvm-svn: 347126
2018-11-17 01:44:25 +00:00
Cameron McInally cbde0d9c7b [IR] Add a dedicated FNeg IR Instruction
The IEEE-754 Standard makes it clear that fneg(x) and
fsub(-0.0, x) are two different operations. The former is a bitwise
operation, while the latter is an arithmetic operation. This patch
creates a dedicated FNeg IR Instruction to model that behavior.

Differential Revision: https://reviews.llvm.org/D53877

llvm-svn: 346774
2018-11-13 18:15:47 +00:00
James Y Knight 72f76bf230 Add support for llvm.is.constant intrinsic (PR4898)
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.

Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.

Updated from patch initially by Janusz Sobczak.

Differential Revision: https://reviews.llvm.org/D4276

llvm-svn: 346322
2018-11-07 15:24:12 +00:00
Daniel Sanders 3b39040ad4 [globalisel][irtranslator] Verify that DILocations aren't lost in translation
Summary:
Also fix a couple bugs where DILocations are lost. EntryBuilder wasn't passing
on debug locations for PHI's, constants, GLOBAL_VALUE, etc.

Reviewers: aprantl, vsk, bogner, aditya_nandakumar, volkan, rtereshin, aemerson

Reviewed By: aemerson

Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53740

llvm-svn: 345743
2018-10-31 17:31:23 +00:00
Matthias Braun 9fd397b423 ADT/STLExtras: Introduce llvm::empty; NFC
This is modeled after C++17 std::empty().

Differential Revision: https://reviews.llvm.org/D53909

llvm-svn: 345679
2018-10-31 00:23:23 +00:00
Volkan Keles 60c6affcb0 [GlobalISel] LegalizerHelper: Fix the incorrect alignment when splitting loads/stores in narrowScalar
Reviewers: dsanders, bogner, jpaquette, aemerson, ab, paquette

Reviewed By: dsanders

Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53664

llvm-svn: 345292
2018-10-25 17:52:19 +00:00
Volkan Keles f87473fe1c [GISel] LegalizerInfo: Rename MemDesc::Size to SizeInBits to make the value clearer
Requested in D53679.

llvm-svn: 345288
2018-10-25 17:37:07 +00:00
Amara Emerson cbd86d8429 [GlobalISel] Use the target preferred type for G_EXTRACT_VECTOR_ELT index.
Allows for better imported pattern re-use.

llvm-svn: 345265
2018-10-25 14:04:54 +00:00
Aditya Nandakumar cd04e366d7 [GISel]: Allow PHIs to be DCEd
https://reviews.llvm.org/D53304

Currently dead phis are not cleaned up during DCE. This patch allows
dead PHI and G_PHI insts to be deleted.

Reviewed by: dsanders

llvm-svn: 344811
2018-10-19 20:11:52 +00:00
Jessica Paquette b328d95333 [GlobalIsel] Add llvm.invariant.start and llvm.invariant.end
Port over the implementation in SelectionDAGBuilder.cpp into the IRTranslator
and update the arm64-irtranslator test.

These were causing fallbacks in CTMark/Bullet (-Rpass-missed=gisel-select),
and this patch fixes that.

https://reviews.llvm.org/D52945

llvm-svn: 343885
2018-10-05 21:02:46 +00:00
Daniel Sanders a464ffd52c [globalisel][combine] When placing truncates, handle the case when the BB is empty
GlobalISel uses MIR with implicit fallthrough on each basic block. As a result,
getFirstNonPhi() can return end().

llvm-svn: 343829
2018-10-04 23:47:37 +00:00
Daniel Sanders ab358bfd09 [globalisel][combine] Fix a rare crash when encountering an instruction whose op0 isn't a reg
The simplest instance of this is an intrinsic with no results which will have the
intrinsic ID as operand 0.

Also fix some benign incorrectness when op0 is a reg but isn't a def that was
guarded against by checking for the extension opcodes.

llvm-svn: 343821
2018-10-04 21:44:32 +00:00
Daniel Sanders a05c7583c9 [globalisel][combine] Improve the truncate placement for the extending-loads combine
This brings the extending loads patch back to the original intent but minus the
PHI bug and with another small improvement to de-dupe truncates that are
inserted into the same block.

The truncates are sunk to their uses unless this would require inserting before a
phi in which case it sinks to the _beginning_ of the predecessor block for that
path (but no earlier than the def).

The reason for choosing the beginning of the predecessor is that it makes de-duping
multiple truncates in the same block simple, and optimized code is going to run a
scheduler at some point which will likely change the position anyway.

llvm-svn: 343804
2018-10-04 18:44:58 +00:00
Daniel Sanders fb9b99b26e [globalisel][combines] Don't sink G_TRUNC down to use if that use is a G_PHI
This fixes a problem where the register allocator fails to eliminate a PHI
because there's a non-PHI in the middle of the PHI instructions at the start
of a BB.

This G_TRUNC can be better placed but this at least fixes the correctness issue
quickly. I'll follow up with a patch to the verifier to catch this kind of bug
in future.

llvm-svn: 343693
2018-10-03 15:43:39 +00:00
Jonas Toth 602e3a640f [CodeGen] NFC fix pedantic warning from extra semicolon
llvm-svn: 343674
2018-10-03 10:59:19 +00:00
Daniel Sanders c973ad1878 Re-commit: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

The previous commit failed portions of the test-suite on GreenDragon due to
duplicate COPY instructions and iterator invalidation. Both issues have now
been fixed. To assist with this, a helper (cloneVirtualRegister) has been added
to MachineRegisterInfo that can be used to get another register that has the same
type and class/bank as an existing one.

llvm-svn: 343654
2018-10-03 02:12:17 +00:00
Daniel Sanders 33f42f97af Revert: r343521 and r343541: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
There's a strange assertion on two of the Green Dragon bots that goes away when
this is reverted. The assertion is in RegBankAlloc and if it is this commit then
-verify-machine-instrs should have caught it earlier in the pipeline.

llvm-svn: 343546
2018-10-01 22:32:08 +00:00
Reid Kleckner 7a6966ec27 Fix the Windows build in GlobalISel
Clang-cl was complaining about some sort of constexpr narrowing bug:

C:\src\llvm-project\llvm\lib\CodeGen\GlobalISel\CombinerHelper.cpp(136,31):  error: non-constant-expression cannot be narrowed from type 'llvm::TargetOpcode::(anonymous enum at C:\src\llvm-project\llvm\include\llvm/CodeGen/TargetOpcodes.h:22:1)' to 'unsigned int' in initializer list [-Wc++11-narrowing]
                              unsigned(MI.getOpcode()) == unsigned(TargetOpcode::G_LOAD)
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\src\llvm-project\llvm\lib\CodeGen\GlobalISel\CombinerHelper.cpp(136,31):  note: insert an explicit cast to silence this issue
                              unsigned(MI.getOpcode()) == unsigned(TargetOpcode::G_LOAD)
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              static_cast<unsigned int>(

llvm-svn: 343541
2018-10-01 21:39:39 +00:00
Daniel Sanders 9659bfda5a [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

llvm-svn: 343521
2018-10-01 18:56:47 +00:00
Aditya Nandakumar 1cbb057142 [GISel]: Remove an incorrect assert in CallLowering
https://reviews.llvm.org/D51147

Asserting if any extend of vectors should be up to the target's
legalizer/target specific code not in CallLowering.

reviewed by : dsanders.

llvm-svn: 343325
2018-09-28 15:08:49 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Heejin Ahn e41be38efd Unify landing pad information adding routines (NFC)
Summary:
We have `llvm::addLandingPadInfo` and `MachineFunction::addLandingPad`,
both of which add landing pad information to populate `LandingPadInfo`
but are called from different locations, which was confusing. This patch
unifies them with one `MachineFunction::addLandingPad` function, which
now has functionlities of both functions.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52428

llvm-svn: 343018
2018-09-25 19:56:44 +00:00
Michael Berg 894c39f770 Copy utilities updated and added for MI flags
Summary: This patch adds a GlobalIsel copy utility into MI for flags and updates the instruction emitter for the SDAG path.  Some tests show new behavior and I added one for GlobalIsel which mirrors an SDAG test for handling nsw/nuw.

Reviewers: spatel, wristow, arsenm

Reviewed By: arsenm

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D52006

llvm-svn: 342576
2018-09-19 18:52:08 +00:00
Josh Stone f446facab0 [GlobalISel] Lower dbg.declare into indirect DBG_VALUE
Summary:
D31439 changed the semantics of dbg.declare to take the address of a
variable as the first argument, making it indirect.  It specifically
updated FastISel for this change here:

https://reviews.llvm.org/D31439#change-WVArzi177jPl

GlobalISel needs to follow suit, or else it will be missing a level of
indirection in the generated debuginfo.  This problem was seen in a Rust
debuginfo test on aarch64, since GlobalISel is used at -O0 for aarch64.

https://github.com/rust-lang/rust/issues/49807
https://bugzilla.redhat.com/show_bug.cgi?id=1611597
https://bugzilla.redhat.com/show_bug.cgi?id=1625768

Reviewers: dblaikie, aprantl, t.p.northover, javed.absar, rnk

Reviewed By: rnk

Subscribers: #debug-info, rovka, kristof.beyls, JDevlieghere, llvm-commits, tstellar

Differential Revision: https://reviews.llvm.org/D51749

llvm-svn: 341969
2018-09-11 17:52:01 +00:00
Aditya Nandakumar 6d47a41520 [GISel]: Add legalization support for Widening UADDO/USUBO
https://reviews.llvm.org/D51384

Added code in LegalizerHelper to widen UADDO/USUBO along with unit
tests.

Reviewed by volkan.

llvm-svn: 340892
2018-08-29 03:17:08 +00:00
Aditya Nandakumar 6b4d343e13 [GISel]: Add missing opcodes for overflow intrinsics
https://reviews.llvm.org/D51197

Currently, IRTranslator (and GISel) seems to be arbitrarily picking
which overflow intrinsics get mapped into opcodes which either have a
carry as an input or not.
For intrinsics such as Intrinsic::uadd_with_overflow, translate it to an
opcode (G_UADDO) which doesn't have any carry inputs (similar to LLVM
IR).

This patch adds 4 missing opcodes for completeness - G_UADDO, G_USUBO,
G_SSUBE and G_SADDE.

llvm-svn: 340865
2018-08-28 18:54:10 +00:00
Chandler Carruth 96fc1de77d [IR] Begin removal of TerminatorInst by removing successor manipulation.
The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.

The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and updates coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.

Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.

This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html

There will be a series of much more mechanical changes following this
one to complete this move.

Differential Revision: https://reviews.llvm.org/D47467

llvm-svn: 340698
2018-08-26 08:41:15 +00:00
Aditya Nandakumar c106183518 [GISel]: Add legalization support for widening bit counting operations
https://reviews.llvm.org/D51053

Added legalization for WidenScalar of various bitcounting opcodes.

Reviewed by arsenm.

llvm-svn: 340429
2018-08-22 17:59:18 +00:00
Aditya Nandakumar c0333f7184 Revert "Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations""
This reverts commit d1341152d91398e9a882ba2ee924147ea2f9b589.

This patch originally made use of Nested MachineIRBuilder buildInstr
calls, and since order of argument processing is not well defined, the
instructions were built slightly in a different order (still correct).
I've removed the nested buildInstr calls to have a defined order now.

Patch was tested by Mikael.

llvm-svn: 340309
2018-08-21 17:30:31 +00:00
Aditya Nandakumar 2a08285cf3 Revert "Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics"
This reverts commit 7debc334e6421bb5251ef8f18e97166dfc7dd787.

I missed updating legalizer-info-validation.mir as I had assertions
turned off in my build and that specific test requires asserts. Fixed it
now.

llvm-svn: 340197
2018-08-20 18:43:19 +00:00
Reid Kleckner 918930adf9 Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations"
It causes LegalizerHelperTest.LowerBitCountingCTTZ1 to fail.

llvm-svn: 340186
2018-08-20 16:50:19 +00:00
Aditya Nandakumar 59b2485ba2 [GISel]: Add Legalization/lowering code for bit counting operations
https://reviews.llvm.org/D48847#inline-448257

Ported legalization expansions for CTLZ/CTTZ from DAG to GISel.

Reviewed by rtereshin.

llvm-svn: 340111
2018-08-18 00:01:54 +00:00
Hsiangkai Wang 2532ac880a [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 340039
2018-08-17 15:22:04 +00:00
Chandler Carruth b898b86f49 Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics
This is breaking ~all the bots.

llvm-svn: 339982
2018-08-17 04:47:16 +00:00
Aditya Nandakumar 973a557338 [GISel]: Add Opcodes for a few LLVM Intrinsics
https://reviews.llvm.org/D50401

Add opcodes for llvm.intrinsic.trunc, round, and update the IRTranslator
for the same.

Reviewed by: dsanders.

llvm-svn: 339977
2018-08-17 01:41:56 +00:00
Chandler Carruth c73c0307fe [MI] Change the array of `MachineMemOperand` pointers to be
a generically extensible collection of extra info attached to
a `MachineInstr`.

The primary change here is cleaning up the APIs used for setting and
manipulating the `MachineMemOperand` pointer arrays so chat we can
change how they are allocated.

Then we introduce an extra info object that using the trailing object
pattern to attach some number of MMOs but also other extra info. The
design of this is specifically so that this extra info has a fixed
necessary cost (the header tracking what extra info is included) and
everything else can be tail allocated. This pattern works especially
well with a `BumpPtrAllocator` which we use here.

I've also added the basic scaffolding for putting interesting pointers
into this, namely pre- and post-instruction symbols. These aren't used
anywhere yet, they're just there to ensure I've actually gotten the data
structure types correct. I'll flesh out support for these in
a subsequent patch (MIR dumping, parsing, the works).

Finally, I've included an optimization where we store any single pointer
inline in the `MachineInstr` to avoid the allocation overhead. This is
expected to be the overwhelmingly most common case and so should avoid
any memory usage growth due to slightly less clever / dense allocation
when dealing with >1 MMO. This did require several ergonomic
improvements to the `PointerSumType` to reasonably support the various
usage models.

This also has a side effect of freeing up 8 bits within the
`MachineInstr` which could be repurposed for something else.

The suggested direction here came largely from Hal Finkel. I hope it was
worth it. ;] It does hopefully clear a path for subsequent extensions
w/o nearly as much leg work. Lots of thanks to Reid and Justin for
careful reviews and ideas about how to do all of this.

Differential Revision: https://reviews.llvm.org/D50701

llvm-svn: 339940
2018-08-16 21:30:05 +00:00
Bruno Cardoso Lopes f446282aad Revert "[DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)"
This reverts commit cb8c5e417d55141f3f079a8a876e786f44308336 / r339676.

This causing a test to fail in http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/48406/

    LLVM :: DebugInfo/Generic/debug-label.ll

llvm-svn: 339700
2018-08-14 17:54:41 +00:00
Hsiangkai Wang ccae278938 [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 339676
2018-08-14 13:50:59 +00:00
Amara Emerson 30e61404a8 [GlobalISel][IRTranslator] Fix a bug in handling repeating struct types during argument lowering.
Differential Revision: https://reviews.llvm.org/D49442

llvm-svn: 339674
2018-08-14 12:04:25 +00:00
Aditya Nandakumar e07b3b737b [GISel]: Add Opcodes for CTLZ/CTTZ/CTPOP
https://reviews.llvm.org/D48600

Added IRTranslator support to translate these known intrinsics into GISel opcodes.

llvm-svn: 338944
2018-08-04 01:22:12 +00:00
Alexander Ivchenko 49168f6778 [GlobalISel] Rewrite CallLowering::lowerReturn to accept multiple VRegs per Value
This is logical continuation of https://reviews.llvm.org/D46018 (r332449)

Differential Revision: https://reviews.llvm.org/D49660

llvm-svn: 338685
2018-08-02 08:33:31 +00:00
Amara Emerson 6cdfe29d8e [GlobalISel][IRTranslator] Use RPO traversal when visiting blocks to translate.
Previously we were just visiting the blocks in the function in IR order, which
is rather arbitrary. Therefore we wouldn't always visit defs before uses, but
the translation code relies on this assumption in some places.

Only codegen change seen in tests is an elision of a redundant copy.

Fixes PR38396

llvm-svn: 338476
2018-08-01 02:17:42 +00:00
Vlad Tsyrklevich 48ed9acede Revert "[DebugInfo] Generate DWARF debug information for labels."
This reverts commits r338390 and r338398, they were causing LSan
failures on the ASan bot.

llvm-svn: 338408
2018-07-31 18:10:37 +00:00
Hsiangkai Wang cbc58ada99 [DebugInfo] Generate DWARF debug information for labels.
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 338390
2018-07-31 14:48:32 +00:00
Amara Emerson 6aff5a7810 [GlobalISel] Add a G_BLOCK_ADDR opcode to handle IR blockaddress constants.
Differential Revision: https://reviews.llvm.org/D49900

llvm-svn: 338335
2018-07-31 00:08:50 +00:00
Amara Emerson fdd089aa14 [GlobalISel] Fall back to SDISel for swifterror/swiftself attributes.
We don't currently support these, fall back until we do.

llvm-svn: 337994
2018-07-26 01:25:58 +00:00
Tom Stellard 179757ef05 [RegisterBankInfo] Ignore InstrMappings that create impossible to repair operands
Summary:
This is a follow-up to r303043.  In computeMapping(), we need to disqualify an
InstrMapping if it would be impossible to repair one of the registers in the
instruction to match the mapping.

This change is needed in order to be able to define an instruction
mapping for G_SELECT for the AMDGPU target and will be tested
by test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir

Reviewers: ab, qcolombet, t.p.northover, dsanders

Reviewed By: qcolombet

Subscribers: tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D49735

llvm-svn: 337882
2018-07-25 03:08:35 +00:00
Matthias Braun 90ad6835dd CodeGen: Remove pipeline dependencies on StackProtector; NFC
This re-applies r336929 with a fix to accomodate for the Mips target
scheduling multiple SelectionDAG instances into the pass pipeline.

PrologEpilogInserter and StackColoring depend on the StackProtector analysis
being alive from the point it is run until PEI, which requires that they are all
scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass
between StackProtector and PEI results in these passes being in separate
FunctionPassManagers and the StackProtector is not available for PEI.

PEI and StackColoring don't use much information from the StackProtector pass,
so transfering the required information to MachineFrameInfo is cleaner than
keeping the StackProtector pass around. This commit moves the SSP layout
information to MFI instead of keeping it in the pass.

This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587)
is a first draft of the pagerando implementation described in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.

Patch by Stephen Crane <sjc@immunant.com>

Differential Revision: https://reviews.llvm.org/D49256

llvm-svn: 336964
2018-07-13 00:08:38 +00:00
Matthias Braun f03f32d478 Revert "(HEAD -> master, origin/master, arcpatch-D37582) CodeGen: Remove pipeline dependencies on StackProtector; NFC"
This was triggering pass scheduling failures.

This reverts commit r336929.

llvm-svn: 336934
2018-07-12 19:27:01 +00:00
Matthias Braun 9436570cbd CodeGen: Remove pipeline dependencies on StackProtector; NFC
PrologEpilogInserter and StackColoring depend on the StackProtector analysis
being alive from the point it is run until PEI, which requires that they are all
scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass
between StackProtector and PEI results in these passes being in separate
FunctionPassManagers and the StackProtector is not available for PEI.

PEI and StackColoring don't use much information from the StackProtector pass,
so transfering the required information to MachineFrameInfo is cleaner than
keeping the StackProtector pass around. This commit moves the SSP layout
information to MFI instead of keeping it in the pass.

This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587)
is a first draft of the pagerando implementation described in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.

Patch by Stephen Crane <sjc@immunant.com>

Differential Revision: https://reviews.llvm.org/D49256

llvm-svn: 336929
2018-07-12 18:33:32 +00:00
Daniel Sanders 9481399c0f [globalisel][irtranslator] Add support for atomicrmw and (strong) cmpxchg
Summary:
This patch adds support for the atomicrmw instructions and the strong
cmpxchg instruction to the IRTranslator.

I've left out weak cmpxchg because LangRef.rst isn't entirely clear on what
difference it makes to the backend. As far as I can tell from the code, it
only matters to AtomicExpandPass which is run at the LLVM-IR level.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, javed.absar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D40092

llvm-svn: 336589
2018-07-09 19:33:40 +00:00
Daniel Sanders bdeb880d14 [globalisel][legalizer] Add AtomicOrdering to LegalityQuery and use it in AArch64
Now that we have the ability to legalize based on MMO's. Add support for
legalizing based on AtomicOrdering and use it to correct the legalization
of the atomic instructions.

Also extend all() to be a variadic template as this ruleset now requires
3 and 4 argument versions.

llvm-svn: 335767
2018-06-27 19:03:21 +00:00
Amara Emerson 5a3bb68e12 [AArch64][GlobalISel] Zero-extend s1 values when returning.
Before we were relying on the any extend of the s1 to s32, but
for AAPCS we need to zero-extend it to at least s8.

Fixes PR36719

Differential Revision: https://reviews.llvm.org/D47425

llvm-svn: 333747
2018-06-01 13:20:32 +00:00
Roman Tereshin 5952576de5 [GlobalISel][Legalizer] LegalizerInfo verifier: Making LegalizerInfo::verify(...) errors fatal
Reviewers: aemerson, qcolombet

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D46339

llvm-svn: 333619
2018-05-31 01:56:07 +00:00
Roman Tereshin 5404136d06 [GlobalISel][Legalizer] LegalizerInfo verifier: check rules cover type indices
This commit adds a simple verifier that tracks type indices being
touched by legalization rules' builders.

Every target will now have an opportunity to call
LegalizerInfo::verify(...) at the end of its derived LegalizerInfo's
constructor and check there are no obvious mistakes like checking only
first type for an opcode that has more than one type index and therefore
implicitly declaring any type for the second (and higher) type index
legal.

The check is only ran in assert builds and should have very minor
performance impact in assert builds and none in release builds.

This commit does not add LegalizerInfo::verify(...) calls to
target-specific legalizers, look for separate commits for that.

This commit also doesn't make the verification errors fatal, only
produces an error message, look for a later commit that does.

Reviewers: aemerson, qcolombet

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D46338

llvm-svn: 333576
2018-05-30 18:45:32 +00:00
Roman Tereshin 13229aff54 [GlobalISel] NFCI, Getting GlobalISel ~5% faster
by replacing DenseMap with IndexedMap for LLTs within MRI, as
benchmarked by cross-compiling sqlite3 amalgamation for AArch64
on x86 machine.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D46809

llvm-svn: 333125
2018-05-23 21:12:02 +00:00
Amara Emerson 0d6a26dffc [GlobalISel][IRTranslator] Split aggregates during IR translation.
We currently handle all aggregates by creating one large LLT, and letting the
legalizer deal with splitting them up. However using this approach means that
we can't support big endian code correctly.

This patch changes the way that the IRTranslator deals with aggregate values,
by splitting them up into their constituent element values. To do this, parts
of the translator need to be modified to deal with multiple VRegs for a single
Value.

A new Value to VReg mapper is introduced to help keep compile time under
control, currently there is no measurable impact on CTMark despite the extra
code being generated in some cases.

Patch is based on the original work of Tim Northover.

Differential Revision: https://reviews.llvm.org/D46018

llvm-svn: 332449
2018-05-16 10:32:02 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Roman Tereshin 6d26638c90 [GlobalISel][Legalizer] Widening the second src op of shifts bug fix
The second source operand of G_SHL, G_ASHR, and G_LSHR must preserve its
value as a (small) unsigned integer, therefore its incorrect to widen it
in any way but by zero extending it.

G_SHL was using G_ANYEXT and G_ASHR - G_SEXT (which is correct for their
destination and first source operands, but not the "number of bits to
shift" operand).

Generally, shifts aren't as similar to regular binary operations as it
might seem, for instance, they aren't commutative nor associative and
the second source operand usually requires a special treatment.

Reviewers: bogner, javed.absar, aivchenk, rovka

Reviewed By: bogner

Subscribers: igorb, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D46413

llvm-svn: 331926
2018-05-09 21:43:30 +00:00
Roman Tereshin d5fa9fde58 Reapplying r331819 [GlobalISel][Legalizer] More concise and faster widenScalar, NFC
The commit was a suspect for clang-cmake-aarch64-global-isel and
    clang-cmake-aarch64-quick bot failures, proved to be innocent.

llvm-svn: 331898
2018-05-09 17:28:18 +00:00
Daniel Sanders 618437459c Revert r331816 and r331820 - [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Reverting this to see if the clang-cmake-aarch64-global-isel and
clang-cmake-aarch64-quick bots are failing because of this commit.
We know it wasn't r331819.

llvm-svn: 331846
2018-05-09 05:00:17 +00:00
Roman Tereshin 27bba4495a Revert r331819 [GlobalISel][Legalizer] More concise and faster widenScalar, NFC
Reverting this to see if the clang-cmake-aarch64-global-isel and
clang-cmake-aarch64-quick bots are failing because of this commit

llvm-svn: 331839
2018-05-09 01:43:12 +00:00
Daniel Sanders ec17920da1 [globalisel] Correct r331816 to check the opcode before calling getOperand().
Fix a silly mistake in my pre-commit changes for r331816. It should check what
opcode the insn is before extracting the operands.

NFC at the moment since the caller already checked the opcode.

llvm-svn: 331820
2018-05-08 22:58:35 +00:00
Roman Tereshin 25cbfe680e [GlobalISel][Legalizer] More concise and faster widenScalar, NFC
Refactoring LegalizerHelper::widenScalar member function reducing its
size by approximately a factor of 2 and (hopefuly) making it more
straightforward and regular by introducing widenScalarSrc and
widenScalarDst helper methods.

The new widenScalar* methods mutate the instructions in place instead
of recreating them from scratch and removing the originals. The
compile time implications of this were measured on sqlite3
amalgamation, targeting AArch64 in -O0:

LegalizerHelper::widenScalar: > 25% faster
Legalizer::runOnMachineFunction: ~ 4.0 - 4.5% faster

Also adding MachineOperand::setCImm and refactoring out
MachineIRBuilder::recordInsertion methods to make the change possible.

Reviewers: aditya_nandakumar, bogner, javed.absar, t.p.northover, ab, dsanders, arsenm

Reviewed By: aditya_nandakumar

Subscribers: wdng, rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D46414

llvm-svn: 331819
2018-05-08 22:53:09 +00:00
Daniel Sanders d24dcdd1f7 [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Reviewed By: aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

llvm-svn: 331816
2018-05-08 22:26:39 +00:00
Tom Stellard abc9871d60 GlobalISel: Use a callback to compute constrained reg class for unallocatble registers
Summary:
constrainOperandRegClass() currently fails if it tries to constrain the
register class of an operand that is defeined with an unallocatable register
class.  This patch resolves this by adding a target callback to compute
register constriants in this case.

This is required by the AMDGPU because many of its instructions have source opreands
defined with the unallocatable register classe VS_32 which is a union of two allocatable
register classes VGPR_32 and SReg_32.

Reviewers: dsanders, aditya_nandakumar

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D45991

llvm-svn: 331485
2018-05-03 21:44:16 +00:00
Daniel Sanders 2de9d4ad5d Fix infinite loop after r331115
There are two separate fixes here:
* The lowering code for non-extending loads should report UnableToLegalize instead of emitting the same instruction.
* The target should not be requesting lowering of non-extending loads.

llvm-svn: 331201
2018-04-30 17:20:01 +00:00
Nico Weber 432a38838d IWYU for llvm-config.h in llvm, additions.
See r331124 for how I made a list of files missing the include.
I then ran this Python script:

    for f in open('filelist.txt'):
        f = f.strip()
        fl = open(f).readlines()

        found = False
        for i in xrange(len(fl)):
            p = '#include "llvm/'
            if not fl[i].startswith(p):
                continue
            if fl[i][len(p):] > 'Config':
                fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
                found = True
                break
        if not found:
            print 'not found', f
        else:
            open(f, 'w').write(''.join(fl))

and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.

No intended behavior change.

llvm-svn: 331184
2018-04-30 14:59:11 +00:00
Daniel Sanders 5eb9f581b6 [globalisel][legalizerinfo] Introduce dedicated extending loads and add lowerings for them
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
  registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
  extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
  improve optimization of extends and truncates, this legality requirement
  would spread without considerable care w.r.t when certain combines were
  permitted.
* The SelectionDAG importer required some ugly and fragile pattern
  rewriting to translate patterns into this style.

This patch begins changing the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.

This patch introduces the new generic instructions and new variation on
G_LOAD and adds lowering for them to convert back to the existing
representations.

Depends on D45466

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, aemerson, javed.absar

Reviewed By: aemerson

Subscribers: aemerson, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45540

llvm-svn: 331115
2018-04-28 18:14:50 +00:00
Daniel Sanders 4f246999d9 Attempt to fix remaining build failures after r331071 by changing the tuple to a struct
Some of the bots were failing in a different way to the others. These were
unable to compare tuples. Fix this by changing to a struct, thereby avoiding
the quirks of tuples.

llvm-svn: 331081
2018-04-27 21:03:27 +00:00
Daniel Sanders a05e8d3e68 Attempt to fix build failure after r331071 using std::make_tuple
llvm-svn: 331074
2018-04-27 20:17:44 +00:00
Daniel Sanders 27fe8a5011 [globalisel][legalizerinfo] Add support for legalization based on the MachineMemOperand
Summary:
Currently only the memory size is supported but others can be added as
needed.

narrowScalar for G_LOAD and G_STORE now correctly update the
MachineMemOperand and will refuse to legalize atomics since those need more
careful expansions to maintain atomicity.

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, aemerson, javed.absar

Reviewed By: aemerson

Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45466

llvm-svn: 331071
2018-04-27 19:48:53 +00:00
Roman Tereshin 38489ed416 [GlobalISel] Reporting rules covered as part of the InstructionSelect's debug-only printing
The main goal of this change is to make it much easier to track which
rules are actually covered by Testgen'erated regression tests.

Reviewers: aemerson, dsanders

Differential Revision: https://reviews.llvm.org/D46095

llvm-svn: 330988
2018-04-26 20:22:17 +00:00
Daniel Sanders 5281b02e84 [globalisel][legalizerinfo] Add support for the Lower action in getActionDefinitionsBuilder() and use it in AArch64.
Lower is slightly odd. It often doesn't change the type but the lowerings
do use the new type to decide what code to create. Treat it like a mutation
but provide convenience functions that re-use the existing type.

Re-uses the existing tests:
test/CodeGen/AArch64/GlobalISel/legalize-rem.mir
test/CodeGen/AArch64/GlobalISel//legalize-mul.mir
test/CodeGen/AArch64/GlobalISel//legalize-cmpxchg-with-success.mir

llvm-svn: 329623
2018-04-09 21:10:09 +00:00
Aditya Nandakumar b1c467dbe7 [GISel] Refactor MachineIRBuilder to allow transformations while
building.

https://reviews.llvm.org/D45067

This change attempts to do two things:
1) It separates out the state that is stored in the
MachineIRBuilder(InsertionPt, MF, MRI, InsertFunction etc) into a
separate object called MachineIRBuilderState.
2) Add the ability to constant fold operations while building instructions
(optionally). MachineIRBuilder is now refactored into a MachineIRBuilderBase
which contains lots of non foldable build methods and their implementation.
Instructions which can be constant folded/transformed are now in a class
called FoldableInstructionBuilder which uses CRTP to use the implementation
of the derived class for buildBinaryOps. Additionally buildInstr in the derived
class can be used to implement other kinds of transformations.

Also because of separation of state, given a MachineIRBuilder in an API,
if one wishes to use another MachineIRBuilder, a new one can be
constructed from the state locally. For eg,

void doFoo(MachineIRBuilder &B) {
  MyCustomBuilder CustomB(B.getState());
  // Use CustomB for building.
}

reviewed by : aemerson

llvm-svn: 329596
2018-04-09 17:30:56 +00:00
Mandeep Singh Grang e92f0cfe34 [CodeGen] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: bogner, rnk, MatzeB, RKSimon

Reviewed By: rnk

Subscribers: JDevlieghere, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45133

llvm-svn: 329435
2018-04-06 18:08:42 +00:00
Aditya Nandakumar b3297ef051 [GISel]: Fix incorrect IRTranslation while translating null pointer types
https://reviews.llvm.org/D44762

Currently IRTranslator produces
%vreg17<def>(p0) = G_CONSTANT 0;

instead we should build
%vreg16(s64) = G_CONSTANT 0
%vreg17(p0) = G_INTTOPTR %vreg16

reviewed by @aemerson.

llvm-svn: 328218
2018-03-22 17:31:38 +00:00
Aditya Nandakumar 91fc4e0949 [GISel]: Add helpers for easy building G_FCONSTANT along with matchers
Added helpers to build G_FCONSTANT, along with matching ConstantFP and
unit tests for the same.

Sample usage.

auto MIB = Builder.buildFConstant(s32, 0.5); // Build IEEESingle
For Matching the above

const ConstantFP* Tmp;
mi_match(DstReg, MRI, m_GFCst(Tmp));

https://reviews.llvm.org/D44128
reviewed by: volkan

llvm-svn: 327152
2018-03-09 17:31:51 +00:00
Volkan Keles 2bc42e90ed GlobalISel: IRTranslate llvm.fabs.* intrinsic
Summary:
Fabs is a common floating-point operation, especially for some expansions. This patch adds
a new generic opcode for llvm.fabs.* intrinsic in order to avoid building/matching this intrinsic.

Reviewers: qcolombet, aditya_nandakumar, dsanders, rovka

Reviewed By: aditya_nandakumar

Subscribers: kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D43864

llvm-svn: 326749
2018-03-05 22:31:55 +00:00
Roman Tereshin 2b94972eb9 [GlobalISel][AArch64] Adding -disable-gisel-legality-check CL option
Currently it's impossible to test InstructionSelect pass with MIR which
is considered illegal by the Legalizer in Assert builds. In early stages
of porting an existing backend from SelectionDAG ISel to GlobalISel,
however, we would have very basic CallLowering, Legalizer, and
RegBankSelect implementations, but rather functional Instruction Select
with quite a few patterns selectable due to the semi-automatic porting
process borrowing them from SelectionDAG ISel.

As we are trying to define legality as a property of being selectable by
the instruction selector, it would be nice to be able to easily check
what the selector can do in its current state w/o the legality check
provided by the Legalizer getting in the way.

It also seems beneficial to have a regression testing set up that would
not allow the selector to silently regress in its support of the MIR not
supported yet by the previous passes in the GlobalISel pipeline.

This commit adds -disable-gisel-legality-check command line option to
llc that disables those legality checks in RegBankSelect and
InstructionSelect passes.

It also adds quite a few MIR test cases for AArch64's Instruction
Selector. Every one of them would fail on the legality check at the
moment, but will select just fine if the check is disabled. Every test
MachineFunction is intended to exercise a specific selection rule and
that rule only, encoded in the MachineFunction's name by the rule's
number, ID, and index of its GIM_Try opcode in TableGen'erated
MatchTable (-optimize-match-table=false).

Reviewers: ab, dsanders, qcolombet, rovka

Reviewed By: bogner

Subscribers: kristof.beyls, volkan, aditya_nandakumar, aemerson,
rengolin, t.p.northover, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D42886

llvm-svn: 326396
2018-03-01 00:27:48 +00:00
Roman Tereshin 3054ecea3f [GlobalISel] Print/Parse FailedISel MachineFunction property
FailedISel MachineFunction property is part of the CodeGen pipeline
state as much as every other property, notably, Legalized,
RegBankSelected, and Selected. Let's make that part of the state also
serializable / de-serializable, so if GlobalISel aborts on some of the
functions of a large module, but not the others, it could be easily seen
and the state of the pipeline could be maintained through llc's
invocations with -stop-after / -start-after.

To make MIR printable and generally to not to break it too much too
soon, this patch also defers cleaning up the vreg -> LLT map until
ResetMachineFunctionPass.

To make MIR with FailedISel: true also machine verifiable, machine
verifier is changed so it treats a MIR-module as non-regbankselected and
non-selected if there is FailedISel property set.

Reviewers: qcolombet, ab

Reviewed By: dsanders

Subscribers: javed.absar, rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42877

llvm-svn: 326343
2018-02-28 17:55:45 +00:00
Aditya Nandakumar abf7594099 [GISel]: Print more fallback information when aborting
Currently when abort is enabled, we get a diagnostic saying "Fallback
path used .... " and the program terminates. To actually figure out what
the reason is, we need to run again with another verbose argument
"-pass-remarks-missed=gisel". Instead, when we are going to abort,
we might as well print expensive remarks.

https://reviews.llvm.org/D43796

llvm-svn: 326215
2018-02-27 18:04:23 +00:00
Aditya Nandakumar 599990530e [GISel]: Don't assert when constraining RegisterOperands which are uses.
Currently we assert that only non target specific opcodes can have
missing RegisterClass constraints in the MCDesc. The backend can have
instructions with register operands but don't have RegisterClass
constraints (say using unknown_class) in which case the instruction
defining the register will constrain it.
Change the assert to only fire if a def has no regclass.

https://reviews.llvm.org/D43409

llvm-svn: 326142
2018-02-26 22:56:21 +00:00
Martin Storsjo a63a5b993e [AArch64] Implement dynamic stack probing for windows
This makes sure that alloca() function calls properly probe the
stack as needed.

Differential Revision: https://reviews.llvm.org/D42356

llvm-svn: 325433
2018-02-17 14:26:32 +00:00
Volkan Keles 02bb1747a3 GlobalISel: Add templated functions and pattern matcher support for some more opcodes
Summary:
This patch adds templated functions to MachineIRBuilder for some opcodes
and adds pattern matcher support for G_AND and G_OR.

Reviewers: aditya_nandakumar

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D43309

llvm-svn: 325162
2018-02-14 19:58:36 +00:00
Daniel Sanders 7fc87360e9 [globalisel][legalizerinfo] Follow up on post-commit review comments after r323681
* Document most API's
* Delete a useless function call
* Fix a discrepancy between the single and multi-opcode variants of
  getActionDefinitions().
  The multi-opcode variant now requires that more than one opcode is requested.
  Previously it acted much like the single-opcode form but unnecessarily
  enforced the requirements of the multi-opcode form.

llvm-svn: 325067
2018-02-13 23:02:44 +00:00
Volkan Keles 9283763865 GlobalISel: IRTranslate llvm.fmuladd.* intrinsic
Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, bogner

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D43090

llvm-svn: 324971
2018-02-13 00:47:46 +00:00
Aditya Nandakumar 58eb183128 [GISel][NFC]: Move RegisterBankInfo::getSizeInBits into TargetRegisterInfo.
llvm-svn: 324125
2018-02-02 19:42:07 +00:00
Amara Emerson 4d19655a56 [GlobalISel][Legalizer] Relax a legalization loop detecting assert.
Legalizing vectors may keep the element type the same but change the number of
elements, the assert didn't take this into account.

llvm-svn: 324028
2018-02-01 23:10:57 +00:00
Amara Emerson cbc02c71a4 [GlobalISel] Fix assert failure when legalizing non-power-2 loads.
Until we support extending loads properly we're going to fall back for these.
We already handle stores in the same way, so this is just being consistent.

llvm-svn: 324001
2018-02-01 20:47:03 +00:00
Martin Storsjo cc981d285d [GlobalISel] Bail out on calls to dllimported functions
Differential Revision: https://reviews.llvm.org/D42568

llvm-svn: 323811
2018-01-30 19:50:58 +00:00
Diana Picus 517531e5a5 [ARM GlobalISel] Legalize G_SITOFP and G_UITOFP
Legal if we have hardware support, libcall otherwise.

Also add supporting code to the legalizer helper for libcalls.

llvm-svn: 323730
2018-01-30 09:15:17 +00:00
Diana Picus 4ed0ee7b5f [ARM GlobalISel] Legalize G_FPTOSI and G_FPTOUI
Legal if we have hardware support for floating point, libcalls
otherwise.

Also add the necessary support for libcalls in the legalizer helper.

llvm-svn: 323726
2018-01-30 07:54:52 +00:00
Daniel Sanders 08464524c3 [ARM][GISel] PR35965 Constrain RegClasses of nested instructions built from Dst Pattern
Summary:
Apparently, we missed on constraining register classes of VReg-operands of all the instructions
built from a destination pattern but the root (top-level) one. The issue exposed itself
while selecting G_FPTOSI for armv7: the corresponding pattern generates VTOSIZS wrapped
into COPY_TO_REGCLASS, so top-level COPY_TO_REGCLASS gets properly constrained,
while nested VTOSIZS (or rather its destination virtual register to be exact) does not.

Fixing this by issuing GIR_ConstrainSelectedInstOperands for every nested GIR_BuildMI.

https://bugs.llvm.org/show_bug.cgi?id=35965
rdar://problem/36886530

Patch by Roman Tereshin

Reviewers: dsanders, qcolombet, rovka, bogner, aditya_nandakumar, volkan

Reviewed By: dsanders, qcolombet, rovka

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42565

llvm-svn: 323692
2018-01-29 21:09:12 +00:00
Daniel Sanders 1cc575666f [globalisel][legalizer] Change identity() to changeTo() to clarify that it changes things. NFC
Prior to committing r323681, we decided to change pick() to identity() since
it wasn't clear from the name what pick() did. However, identity() isn't a very
good name either since it implies that no changes are made. For some reason,
naming it changeTo() didn't occur to me until just after the commit. This
should resolve the lack of clarity that pick() had while still implying that
it changes the MIR.

llvm-svn: 323689
2018-01-29 20:46:16 +00:00
Daniel Sanders 79cb839fcd [globalisel][legalizer] Adapt LegalizerInfo to support inter-type dependencies and other things.
Summary:
As discussed in D42244, we have difficulty describing the legality of some
operations. We're not able to specify relationships between types.
For example, declaring the following
  setAction({..., 0, s32}, Legal)
  setAction({..., 0, s64}, Legal)
  setAction({..., 1, s32}, Legal)
  setAction({..., 1, s64}, Legal)
currently declares these type combinations as legal:
  {s32, s32}
  {s64, s32}
  {s32, s64}
  {s64, s64}
but we currently have no means to say that, for example, {s64, s32} is
not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/
G_UNMERGE_VALUES have relationships between the types that are currently
described incorrectly.
    
Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics
differently to atomics. The necessary information is in the MMO but we have no
way to use this in the legalizer. Similarly, there is currently no way for the
register type and the memory type to differ so there is no way to cleanly
represent extending-load/truncating-store in a way that can't be broken by
optimizers (resulting in illegal MIR).

It's also difficult to control the legalization strategy. We've added support
for legalizing non-power of 2 types but there's still some hardcoded assumptions
about the strategy. The main one I've noticed is that type0 is always legalized
before type1 which is not a good strategy for `type0 = G_EXTRACT type1, ...` if
you need to widen the container. It will converge on the same result eventually
but it will take a much longer route when legalizing type0 than if you legalize
type1 first.

Lastly, the definition of legality and the legalization strategy is kept
separate which is not ideal. It's helpful to be able to look at a one piece of
code and see both what is legal and the method the legalizer will use to make
illegal MIR more legal.

This patch adds a layer onto the LegalizerInfo (to be removed when all targets
have been migrated) which resolves all these issues.

Here are the rules for shift and division:
  for (unsigned BinOp : {G_LSHR, G_ASHR, G_SDIV, G_UDIV})
    getActionDefinitions(BinOp)
        .legalFor({s32, s64})     // If type0 is s32/s64 then it's Legal
        .clampScalar(0, s32, s64) // If type0 is <s32 then WidenScalar to s32
                                  // If type0 is >s64 then NarrowScalar to s64
        .widenScalarToPow2(0)     // Round type0 scalars up to powers of 2
        .unsupported();           // Otherwise, it's unsupported
This describes everything needed to both define legality and describe how to
make illegal things legal.

Here's an example of a complex rule:
  getActionDefinitions(G_INSERT)
      .unsupportedIf([=](const LegalityQuery &Query) {
        // If type0 is smaller than type1 then it's unsupported
        return Query.Types[0].getSizeInBits() <= Query.Types[1].getSizeInBits();
      })
      .legalIf([=](const LegalityQuery &Query) {
        // If type0 is s32/s64/p0 and type1 is a power of 2 other than 2 or 4 then it's legal
        // We don't need to worry about large type1's because unsupportedIf caught that.
        const LLT &Ty0 = Query.Types[0];
        const LLT &Ty1 = Query.Types[1];
        if (Ty0 != s32 && Ty0 != s64 && Ty0 != p0)
          return false;
        return isPowerOf2_32(Ty1.getSizeInBits()) &&
               (Ty1.getSizeInBits() == 1 || Ty1.getSizeInBits() >= 8);
      })
      .clampScalar(0, s32, s64)
      .widenScalarToPow2(0)
      .maxScalarIf(typeInSet(0, {s32}), 1, s16) // If type0 is s32 and type1 is bigger than s16 then NarrowScalar type1 to s16
      .maxScalarIf(typeInSet(0, {s64}), 1, s32) // If type0 is s64 and type1 is bigger than s32 then NarrowScalar type1 to s32
      .widenScalarToPow2(1)                     // Round type1 scalars up to powers of 2
      .unsupported();
This uses a lambda to say that G_INSERT is unsupported when type0 is bigger than
type1 (in practice, this would be a default rule for G_INSERT). It also uses one
to describe the legal cases. This particular predicate is equivalent to:
  .legalFor({{s32, s1}, {s32, s8}, {s32, s16}, {s64, s1}, {s64, s8}, {s64, s16}, {s64, s32}})

In terms of performance, I saw a slight (~6%) performance improvement when
AArch64 was around 30% ported but it's pretty much break even right now.
I'm going to take a look at constexpr as a means to reduce the initialization
cost.

Future work:
* Make it possible for opcodes to share rulesets. There's no need for
  G_LSHR/G_ASHR/G_SDIV/G_UDIV to have separate rule and ruleset objects. There's
  no technical barrier to this, it just hasn't been done yet.
* Replace the type-index numbers with an enum to get .clampScalar(Type0, s32, s64)
* Better names for things like .maxScalarIf() (clampMaxScalar?) and the vector rules.
* Improve initialization cost using constexpr

Possible future work:
* It's possible to make these rulesets change the MIR directly instead of
  returning a description of how to change the MIR. This should remove a little
  overhead caused by parsing the description and routing to the right code, but
  the real motivation is that it removes the need for LegalizeAction::Custom.
  With Custom removed, there's no longer a requirement that Custom legalization
  change the opcode to something that's considered legal.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner

Reviewed By: bogner

Subscribers: hintonda, bogner, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42251

llvm-svn: 323681
2018-01-29 19:54:49 +00:00
Daniel Sanders 9ade5592d9 [globalisel] Make LegalizerInfo::LegalizeAction available outside of LegalizerInfo. NFC
Summary:
The improvements to the LegalizerInfo discussed in D42244 require that
LegalizerInfo::LegalizeAction be available for use in other classes. As such,
it needs to be moved out of LegalizerInfo. This has been done separately to the
next patch to minimize the noise in that patch.

llvm-svn: 323669
2018-01-29 17:37:29 +00:00
Amara Emerson 77a5c96560 [GlobalISel][Legalizer] Convert the FP constants to the right APFloat type for G_FCONSTANT.
We weren't converting the immediate ConstantFP during legalization, which caused
the wrong bit patterns to be emitted for half type FP constants.

Fixes PR36106.

llvm-svn: 323582
2018-01-27 07:07:20 +00:00
Aditya Nandakumar 81c81b6426 [GISel]: Implement GlobalISel combiner API.
https://reviews.llvm.org/D41373

The various components are

GICombinerHelper contains transformations that are common to all
targets. Targets can pick and choose which transformations (at
function/opcode granularity) each pass uses via configuring a
GICombinerInfo.

GICombiner contains some common code and it does the traversal,
driving of combines, worklist management and iterating until
convergence.

GICombinerInfo is an interface with a virtual method called combine.
The combiner info will allow targets to pick and choose (or
implement their own specific combines). CombineInfos can make
use of available combines in GICombineHelper to configure the
transformations for a particular pass. Currently this approach allows
cherry picking transformations from helpers (at function/opcode
granularity) and also allows early returning on specific
transformations. Targets also get to prioritize whether target specific
combines run before/after the opt-in generic combines. Ideally we would
like this part to be configured by both C++ and Tablegen. The
CombinerInfo also has a field which indicates how to deal with
IllegalOps (ie - should we allow to create them/or legalize them?).

A CombinerPass would configure a CombinerInfo, create the GICombiner
with the Info, and call
GICombiner::combineMachineInstrs(MachineFunction&).
This organization is very similar to the GISelLegalizer.

llvm-svn: 323392
2018-01-25 00:41:58 +00:00
Daniel Sanders 262ed0ecd7 [globalisel] Introduce LegalityQuery to better encapsulate the legalizer decisions. NFC.
Summary:
`getAction(const InstrAspect &) const` breaks encapsulation by exposing
the smaller components that are used to decide how to legalize an
instruction.

This is a problem because we need to change the implementation of
LegalizerInfo so that it's able to describe particular type combinations
rather than just cartesian products of types.

For example, declaring the following
  setAction({..., 0, s32}, Legal)
  setAction({..., 0, s64}, Legal)
  setAction({..., 1, s32}, Legal)
  setAction({..., 1, s64}, Legal)
currently declares these type combinations as legal:
  {s32, s32}
  {s64, s32}
  {s32, s64}
  {s64, s64}
but we currently have no means to say that, for example, {s64, s32} is
not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/
G_UNMERGE_VALUES has relationships between the types that are currently
described incorrectly.

Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics
differently to atomics. The necessary information is in the MMO but we have no
way to use this in the legalizer. Similarly, there is currently no way for the
register type and the memory type to differ so there is no way to cleanly
represent extending-load/truncating-store in a way that can't be broken by
optimizers (resulting in illegal MIR).

This patch introduces LegalityQuery which provides all the information
needed by the legalizer to make a decision on whether something is legal
and how to legalize it.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner

Reviewed By: bogner

Subscribers: bogner, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D42244

llvm-svn: 323342
2018-01-24 17:17:46 +00:00
Aditya Nandakumar f2aa2af24e [GISel]: Remove redundant copies at the end of ISel
https://reviews.llvm.org/D42402

A lot of these copies are useless (copies b/w VRegs having the same
regclass) and should be cleaned up.

llvm-svn: 323291
2018-01-24 01:35:26 +00:00
Aditya Nandakumar 18b3f9d384 [GISel] Make constrainSelectedInstRegOperands() available to the legalizer. NFC
https://reviews.llvm.org/D42149

llvm-svn: 322743
2018-01-17 19:31:33 +00:00
Diana Picus 65ed364fac [ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC
Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware
support, but only for conversions between float and double.

Also add the necessary boilerplate so that the LegalizerHelper can
introduce the required libcalls. This also works only for float and
double, but isn't too difficult to extend when the need arises.

llvm-svn: 322651
2018-01-17 13:34:10 +00:00
Diana Picus e74243d473 [ARM GlobalISel] Legalize G_FMA
For hard float with VFP4, it is legal. Otherwise, we use libcalls.

This needs a bit of support in the LegalizerHelper for soft float
because we didn't handle G_FMA libcalls yet. The support is trivial, as
the only difference between G_FMA and other libcalls that we already
handle is that it has 3 input operands rather than just 2.

llvm-svn: 322366
2018-01-12 11:30:45 +00:00
Aditya Nandakumar 5710c44eee [GISel]: Don't create G_MUL with 1 during translation of GEP
When element size is 1, it's just wasteful to create MUL with 1.
https://reviews.llvm.org/D41738

llvm-svn: 321857
2018-01-05 02:56:28 +00:00
Amara Emerson 9de62130fd [GlobalISel][Legalizer] Fix legalization of llvm.smul.with.overflow
Previously the code for handling G_SMULO didn't properly check for the signed
multiply overflow, instead treating it the same as the unsigned G_UMULO.

Fixes PR35800.

llvm-svn: 321690
2018-01-03 04:56:56 +00:00
Amara Emerson 913918cbef [AArch64][GlobalISel] Fix assert fail with unknown intrinsic.
A call may have an intrinsic name but not have a valid intrinsic ID,
for example with llvm.invariant.group.barrier. If so, treat it as a
normal call like FastISel does.

llvm-svn: 321662
2018-01-02 18:56:39 +00:00
Amara Emerson b6ddbef673 [GlobalISel][Legalizer] Fix crash when trying to lower G_FNEG of fp128 types.
This doesn't add legalizer support, just prevents crashing so that we
can gracefully fall back to SDAG.

Fixes PR35690.

llvm-svn: 321091
2017-12-19 17:21:35 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Matt Arsenault 7d7adf4f2e TLI: Allow using PSV for intrinsic mem operands
llvm-svn: 320756
2017-12-14 22:34:10 +00:00
Matt Arsenault 1117133687 DAG: Expose all MMO flags in getTgtMemIntrinsic
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.

On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.

llvm-svn: 320746
2017-12-14 21:39:51 +00:00
Michael Zolotukhin c468b648fd Remove redundant includes from lib/CodeGen.
llvm-svn: 320619
2017-12-13 21:30:47 +00:00
Amara Emerson df9b529d42 [GlobalISel] Disable GISel for big endian.
This is due to PR26161 needing to be resolved before we can fix
big endian bugs like PR35359. The work to split aggregates into smaller LLTs
instead of using one large scalar will take some time, so in the mean time
we'll fall back to SDAG.

Some ARM BE tests xfailed for now as a result.

Differential Revision: https://reviews.llvm.org/D40789

llvm-svn: 320388
2017-12-11 16:58:29 +00:00
Daniel Sanders 3c1c4c0ee0 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Daniel Sanders 04e4f47e93 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Volkan Keles a32ff00b00 GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES
Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls

Reviewed By: dsanders

Subscribers: rovka, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39823

llvm-svn: 319524
2017-12-01 08:19:10 +00:00
Daniel Sanders aef1dfc690 [aarch64][globalisel] Legalize G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_*
G_ATOMICRMW_* is generally legal on AArch64. The exception is G_ATOMICRMW_NAND.

G_ATOMIC_CMPXCHG_WITH_SUCCESS needs to be lowered to G_ATOMIC_CMPXCHG with an
external comparison.

Note that IRTranslator doesn't generate these instructions yet.

llvm-svn: 319466
2017-11-30 20:11:42 +00:00
Amara Emerson d78d65c2a4 [GlobalISel][IRTranslator] Fix crash during translation of zero sized loads/stores/args/returns.
This fixes PR35358.

rdar://35619533

Differential Revision: https://reviews.llvm.org/D40604

llvm-svn: 319465
2017-11-30 20:06:02 +00:00
Jonas Paulsson f0ff20f1f0 Use getStoreSize() in various places instead of 'BitSize >> 3'.
This is needed for cases when the memory access is not as big as the width of
the data type. For instance, storing i1 (1 bit) would be done in a byte (8
bits).

Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a
size of 0, which for instance makes alias analysis return NoAlias even when
it shouldn't.

There are no tests as this was done as a follow-up to the bugfix for the case
where this was discovered (r318824). This handles more similar cases.

Review: Björn Petterson
https://reviews.llvm.org/D40339

llvm-svn: 319173
2017-11-28 14:44:32 +00:00
Francis Visoiu Mistrih 9d419d3b0c [CodeGen] Rename functions PrintReg* to printReg*
LLVM Coding Standards:
  Function names should be verb phrases (as they represent actions), and
  command-like function should be imperative. The name should be camel
  case, and start with a lower case letter (e.g. openFile() or isFoo()).

Differential Revision: https://reviews.llvm.org/D40416

llvm-svn: 319168
2017-11-28 12:42:37 +00:00
Diana Picus c01f7f131b [ARM GlobalISel] Support G_FDIV for s32 and s64
TableGen already generates code for selecting a G_FDIV, so we only need
to add a test.

For the legalizer and reg bank select, we do the same thing as for the
other floating point binary operations: either mark as legal if we have
a FP unit or lower to a libcall, and map to the floating point
registers.

llvm-svn: 318915
2017-11-23 13:26:07 +00:00
Diana Picus 9faa09b21e [ARM GlobalISel] Support G_FMUL for s32 and s64
TableGen already generates code for selecting a G_FMUL, so we only need
to add a test for that part.

For the legalizer and reg bank select, we do the same thing as the other
floating point binary operators: either mark as legal if we have a FP
unit or lower to a libcall, and map to the floating point registers.

llvm-svn: 318910
2017-11-23 12:44:20 +00:00
Quentin Colombet fe538f7145 [RegisterBankInfo] Relax the assert of having matching type sizes on default mappings
Instead of asserting that the type sizes are exactly equal, we check
that the new size is big enough to contain the original type.
We have to relax this constrain because, right now, we sometimes
specify that things that are smaller than a storage type are legal
instead of widening everything to the size of a storage type.
E.g., we say that G_AND s16 is legal and we map that on GPR32.

This is something we may revisit in the future (either by changing
the legalization process or keeping track separately of the storage
size and the size of the type), but let us reflect the reality of
the situation for now.

llvm-svn: 318587
2017-11-18 04:28:58 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Daniel Sanders f76f315436 [globalisel][tablegen] Generate rule coverage and use it to identify untested rules
Summary:
This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.

This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.

Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler

Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
  step due to a lack of a portable 'cat' command. It should be the
  concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
  changes

Depends on D39742

Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka

Reviewed By: rovka

Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39747

llvm-svn: 318356
2017-11-16 00:46:35 +00:00
Aditya Nandakumar 954eea074b [GISel][NFC]: Move getOpcodeDef from the LegalizationArtifactCombiner into GlobalISel/Utils for use elsewhere
llvm-svn: 318350
2017-11-15 23:45:04 +00:00
Fangrui Song e73534464d NFC Remove default argument of DataLayout::getPointerABIAlignment
Differential Revision: https://reviews.llvm.org/D40005

llvm-svn: 318272
2017-11-15 06:17:32 +00:00
Aditya Nandakumar e6201c8724 [GISel]: Rework legalization algorithm for better elimination of
artifacts along with DCE

Legalization Artifacts are all those insts that are there to make the
type system happy. Currently, the target needs to say all combinations
of extends and truncs are legal and there's no way of verifying that
post legalization, we only have *truly* legal instructions. This patch
changes roughly the legalization algorithm to process all illegal insts
at one go, and then process all truncs/extends that were added to
satisfy the type constraints separately trying to combine trivial cases
until they converge. This has the added benefit that, the target
legalizerinfo can only say which truncs and extends are okay and the
artifact combiner would combine away other exts and truncs.

Updated legalization algorithm to roughly the following pseudo code.

WorkList Insts, Artifacts;
collect_all_insts_and_artifacts(Insts, Artifacts);

do {
  for (Inst in Insts)
         legalizeInstrStep(Inst, Insts, Artifacts);
  for (Artifact in Artifacts)
         tryCombineArtifact(Artifact, Insts, Artifacts);
} while(!Insts.empty());

Also, wrote a simple wrapper equivalent to SetVector, except for
erasing, it avoids moving all elements over by one and instead just
nulls them out.

llvm-svn: 318210
2017-11-14 22:42:19 +00:00
Daniel Sanders 7e52367398 [globalisel][tablegen] Import signextload and zeroextload.
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
correct situations where SelectionDAG and GlobalISel disagree on
representation. For example, it would rewrite:
  (sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>
to:
  (sext:i32 (load:i16 $ptr)<<unindexedload>>)

I'd have preferred to replace the fragments and have the expansion happen
naturally as part of PatFrag expansion but the type inferencing system can't
cope with loads of types narrower than those mentioned in register classes.
This is because the SDTCisInt's on the sext constrain both the result and
operand to the 'legal' integer types (where legal is defined as 'a register
class can contain the type') which immediately rules the narrower types out.
Several targets (those with only one legal integer type) would then go on to
crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
for the result of the extend.

Also, improve isObviouslySafeToFold() slightly to automatically return true for
neighbouring instructions. There can't be any re-ordering problems if
re-ordering isn't happenning. We'll need to improve it further to handle
sign/zero-extending loads when the extend and load aren't immediate neighbours
though.

llvm-svn: 317971
2017-11-11 03:23:44 +00:00
David Blaikie 3f833edc7c Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Kristof Beyls 78aa4b28a3 Mark intentional fall-through with LLVM_FALLTHROUGH.
... to silence gcc 7's default -Wimplicit-fallthrough.

llvm-svn: 317573
2017-11-07 13:31:52 +00:00
Kristof Beyls 178818ba20 Silence C4715 warning from MSVC (NFC).
The warning started triggering after r317560.
This commit silences it in the same way as previously done in a similar
situation, see
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140915/236088.html

llvm-svn: 317568
2017-11-07 11:54:00 +00:00
Kristof Beyls af9814a1fc [GlobalISel] Enable legalizing non-power-of-2 sized types.
This changes the interface of how targets describe how to legalize, see
the below description.

1. Interface for targets to describe how to legalize.

In GlobalISel, the API in the LegalizerInfo class is the main interface
for targets to specify which types are legal for which operations, and
what to do to turn illegal type/operation combinations into legal ones.

For each operation the type sizes that can be legalized without having
to change the size of the type are specified with a call to setAction.
This isn't different to how GlobalISel worked before. For example, for a
target that supports 32 and 64 bit adds natively:

  for (auto Ty : {s32, s64})
    setAction({G_ADD, 0, s32}, Legal);

or for a target that needs a library call for a 32 bit division:

  setAction({G_SDIV, s32}, Libcall);

The main conceptual change to the LegalizerInfo API, is in specifying
how to legalize the type sizes for which a change of size is needed. For
example, in the above example, how to specify how all types from i1 to
i8388607 (apart from s32 and s64 which are legal) need to be legalized
and expressed in terms of operations on the available legal sizes
(again, i32 and i64 in this case). Before, the implementation only
allowed specifying power-of-2-sized types (e.g. setAction({G_ADD, 0,
s128}, NarrowScalar).  A worse limitation was that if you'd wanted to
specify how to legalize all the sized types as allowed by the LLVM-IR
LangRef, i1 to i8388607, you'd have to call setAction 8388607-3 times
and probably would need a lot of memory to store all of these
specifications.

Instead, the legalization actions that need to change the size of the
type are specified now using a "SizeChangeStrategy".  For example:

   setLegalizeScalarToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerAndNarrowToLargest);

This example indicates that for type sizes for which there is a larger
size that can be legalized towards, do it by Widening the size.
For example, G_ADD on s17 will be legalized by first doing WidenScalar
to make it s32, after which it's legal.
The "NarrowToLargest" indicates what to do if there is no larger size
that can be legalized towards. E.g. G_ADD on s92 will be legalized by
doing NarrowScalar to s64.

Another example, taken from the ARM backend is:
   for (unsigned Op : {G_SDIV, G_UDIV}) {
     setLegalizeScalarToDifferentSizeStrategy(Op, 0,
         widenToLargerTypesUnsupportedOtherwise);
     if (ST.hasDivideInARMMode())
       setAction({Op, s32}, Legal);
     else
       setAction({Op, s32}, Libcall);
   }

For this example, G_SDIV on s8, on a target without a divide
instruction, would be legalized by first doing action (WidenScalar,
s32), followed by (Libcall, s32).

The same principle is also followed for when the number of vector lanes
on vector data types need to be changed, e.g.:

   setAction({G_ADD, LLT::vector(8, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(16, 8)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(8, 16)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(2, 32)}, LegalizerInfo::Legal);
   setAction({G_ADD, LLT::vector(4, 32)}, LegalizerInfo::Legal);
   setLegalizeVectorElementToDifferentSizeStrategy(
       G_ADD, 0, widenToLargerTypesUnsupportedOtherwise);

As currently implemented here, vector types are legalized by first
making the vector element size legal, followed by then making the number
of lanes legal. The strategy to follow in the first step is set by a
call to setLegalizeVectorElementToDifferentSizeStrategy, see example
above.  The strategy followed in the second step
"moreToWiderTypesAndLessToWidest" (see code for its definition),
indicating that vectors are widened to more elements so they map to
natively supported vector widths, or when there isn't a legal wider
vector, split the vector to map it to the widest vector supported.

Therefore, for the above specification, some example legalizations are:
  * getAction({G_ADD, LLT::vector(3, 3)})
    returns {WidenScalar, LLT::vector(3, 8)}
  * getAction({G_ADD, LLT::vector(3, 8)})
    then returns {MoreElements, LLT::vector(8, 8)}
  * getAction({G_ADD, LLT::vector(20, 8)})
    returns {FewerElements, LLT::vector(16, 8)}


2. Key implementation aspects.

How to legalize a specific (operation, type index, size) tuple is
represented by mapping intervals of integers representing a range of
size types to an action to take, e.g.:

       setScalarAction({G_ADD, LLT:scalar(1)},
                       {{1, WidenScalar},  // bit sizes [ 1, 31[
                        {32, Legal},       // bit sizes [32, 33[
                        {33, WidenScalar}, // bit sizes [33, 64[
                        {64, Legal},       // bit sizes [64, 65[
                        {65, NarrowScalar} // bit sizes [65, +inf[
                       });

Please note that most of the code to do the actual lowering of
non-power-of-2 sized types is currently missing, this is just trying to
make it possible for targets to specify what is legal, and how non-legal
types should be legalized.  Probably quite a bit of further work is
needed in the actual legalizing and the other passes in GlobalISel to
support non-power-of-2 sized types.

I hope the documentation in LegalizerInfo.h and the examples provided in the
various {Target}LegalizerInfo.cpp and LegalizerInfoTest.cpp explains well
enough how this is meant to be used.

This drops the need for LLT::{half,double}...Size().


Differential Revision: https://reviews.llvm.org/D30529

llvm-svn: 317560
2017-11-07 10:34:34 +00:00
David Blaikie 1be62f0327 Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379
2017-11-03 22:32:11 +00:00
Javed Absar 5cde1ccb29 [GlobalISel|ARM] : Allow legalizing G_FSUB
Adding support for VSUB.
Reviewed by: @rovka
Differential Revision: https://reviews.llvm.org/D39261

llvm-svn: 316902
2017-10-30 13:51:56 +00:00
Aditya Nandakumar d2a954d0ae Make the combiner check if shifts are legal before creating them
Summary: Make sure shifts are legal/specified by the legalizerinfo before creating it

Reviewers: qcolombet, dsanders, rovka, t.p.northover

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39264

llvm-svn: 316602
2017-10-25 18:49:18 +00:00
Daniel Sanders ea8711b88e Re-commit r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.

At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.

The previous commit failed on MSVC due to a failure to convert an
initializer_list to a std::vector. Hopefully, MSVC will accept this version.

Depends on D37457

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37458

llvm-svn: 315887
2017-10-16 03:36:29 +00:00
Daniel Sanders ce72d611af Revert r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
MSVC doesn't like one of the constructors.

llvm-svn: 315886
2017-10-16 02:15:39 +00:00
Daniel Sanders 6735ea86cd [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.

At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.

Depends on D37457

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37458

llvm-svn: 315885
2017-10-16 01:16:35 +00:00
Daniel Sanders df39cbae2f Re-commit r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.

This patch adds support for this in order to enable the import of load/store
patterns.

Depends on D37445

Hopefully fixed the ambiguous constructor that a large number of bots reported.

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37456

llvm-svn: 315869
2017-10-15 18:22:54 +00:00
Daniel Sanders bb082a36d3 Revert r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator
A large number of bots are failing on an ambiguous constructor call.

llvm-svn: 315866
2017-10-15 17:51:07 +00:00
Daniel Sanders b95b867dd8 [globalisel][tablegen] Import ComplexPattern when used as an operator
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.

This patch adds support for this in order to enable the import of load/store
patterns.

Depends on D37445

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37456

llvm-svn: 315863
2017-10-15 17:03:36 +00:00
Aaron Ballman 615eb47035 Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.
Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854
2017-10-15 14:32:27 +00:00
Quentin Colombet 4a6b75012d [RegisterBankInfo] Cache the getMinimalPhysRegClass information
TargetRegisterInfo::getMinimalPhysRegClass is actually pretty expensive
because it has to iterate over all the register classes.
Cache this information as we need and get it so that we limit its usage.
Right now, we heavily rely on it, because this is how we get the mapping
for vregs defined by copies from physreg (i.e., the one that are ABI
related).

Improve compile time by up to 10% for that pass.

NFC

llvm-svn: 315759
2017-10-13 21:16:15 +00:00
Quentin Colombet d58265ad55 [Legalizer] Use SmallSetVector instead of SetVector.
NFC

llvm-svn: 315758
2017-10-13 21:16:14 +00:00
Quentin Colombet b1a3bd1529 [LegalizerInfo] Don't evaluate end boundary every time through the loop
Match the LLVM coding standard for loop conditions.

NFC.

llvm-svn: 315757
2017-10-13 21:16:13 +00:00
Quentin Colombet 86220488b4 [Legalizer] Only allocate the SetVectors once per function.
Prior to this patch we used to create SetVectors in temporaries that
were created and destroyed for each instruction. Now, instead we create
and destroyed them only once, but clear the content for each
instruction.
This speeds up the pass by ~25%.

NFC.

llvm-svn: 315756
2017-10-13 21:16:05 +00:00
Don Hinton 3e0199f7eb [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00
Justin Bogner fdf9bf4f16 CodeGen: Minor cleanups to use MachineInstr::getMF. NFC
Since r315388 we have a shorter way to say this, so we'll replace
MI->getParent()->getParent() with MI->getMF() in a few places.

llvm-svn: 315390
2017-10-10 23:50:49 +00:00
Adam Nemet 0965da2055 Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*
Sync it up with the name of the class actually defined here.  This has been
bothering me for a while...

llvm-svn: 315249
2017-10-09 23:19:02 +00:00
Aditya Nandakumar c3bfc81a1f [GISel]: Fix generation of illegal COPYs during CallLowering
We end up creating COPY's that are either truncating/extending and this
should be illegal.

https://reviews.llvm.org/D37640

Patch for X86 and ARM by igorb, rovka

llvm-svn: 315240
2017-10-09 20:07:43 +00:00
Quentin Colombet c2f3cea608 [Legalizer] Add support for G_OR NarrowScalar.
Legalize bitwise OR:
 A = BinOp<Ty> B, C
into:
 B1, ..., BN = G_UNMERGE_VALUES B
 C1, ..., CN = G_UNMERGE_VALUES C
 A1 = BinOp<Ty/N> B1, C2
 ...
 AN = BinOp<Ty/N> BN, CN
 A = G_MERGE_VALUES A1, ..., AN

llvm-svn: 314760
2017-10-03 04:53:56 +00:00
Eugene Zelenko 4f81cdd818 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314559
2017-09-29 21:55:49 +00:00
Ahmed Bougacha d630a92b0e [GlobalISel] Only build expensive remarks if they're enabled. NFC.
r313390 taught 'allowExtraAnalysis' to check whether remarks are
enabled at all.  Use that to only do the expensive instruction printing
if they are.

llvm-svn: 313552
2017-09-18 18:50:09 +00:00
Aditya Nandakumar c6615f56f5 [GISel]: Add a clean up combiner during legalization.
Added a combiner which can clean up truncs/extends that are created in
order to make the types work during legalization.

Also moved the combineMerges to the LegalizeCombiner.

https://reviews.llvm.org/D36880

llvm-svn: 312158
2017-08-30 19:32:59 +00:00
David Blaikie 196f53b295 Fix unused-lambda-capture warning by using default capture-by-ref
Since the lambda isn't escaped (via a std::function or similar) it's
fine/better to use default capture-by-ref to provide semantics similar
to language-level nested scopes (if/for/while/etc).

llvm-svn: 311782
2017-08-25 16:46:07 +00:00
Matt Morehouse 355f6f7444 Fix buildbot breakage from r311763. Remove unused lambda capture.
llvm-svn: 311781
2017-08-25 16:19:26 +00:00
Aditya Nandakumar 892979effc [GISel]: Implement widenScalar for Legalizing G_PHI
https://reviews.llvm.org/D37018

llvm-svn: 311763
2017-08-25 04:57:27 +00:00
Aditya Nandakumar efd8a84cd5 [GISEl]: Translate phi into G_PHI
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.

https://reviews.llvm.org/D36990

llvm-svn: 311596
2017-08-23 20:45:48 +00:00
Quentin Colombet c7652e22d7 [GlobalISel] Remove a stall comment in CMake.
Thanks to Diana Picus <diana.picus@linaro.org> for noticing.

NFC

llvm-svn: 310114
2017-08-04 20:15:41 +00:00
Quentin Colombet 250e050a50 [GlobalISel] Make GlobalISel a non-optional library.
With this change, the GlobalISel library gets always built. In
particular, this is not possible to opt GlobalISel out of the build
using the LLVM_BUILD_GLOBAL_ISEL variable any more.

llvm-svn: 309990
2017-08-03 21:52:25 +00:00
Adrian Prantl aac78ce47e Use helper function instead of manually constructing DBG_VALUEs (NFC)
rdar://problem/33580047

llvm-svn: 309757
2017-08-01 22:37:35 +00:00
Aditya Nandakumar 02c602e18c [GISel]: Support Widening G_ICMP's destination operand.
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737

Reviewed by Tim

llvm-svn: 309579
2017-07-31 17:00:16 +00:00
Adrian Prantl d92ac5a259 Remove the unused DBG_VALUE offset parameter from GlobalISel (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309449
2017-07-28 22:46:20 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Tim Northover 071d77a51f GlobalISel: stop localizer putting constants before EH_LABELs
If the localizer pass puts one of its constants before the label that tells the
unwinder "jump here to handle your exception" then control-flow will skip it,
leaving uninitialized registers at runtime. That's bad.

llvm-svn: 308687
2017-07-20 22:58:26 +00:00
Diana Picus df4100b3d2 GlobalISel: Support G_(S|U)REM widening in LegalizerHelper
Treat widening G_SREM and G_UREM the same as G_SDIV and G_UDIV. This is
going to be used in the ARM backend (and that's when the test will come
too).

llvm-svn: 308278
2017-07-18 09:08:47 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Diana Picus d0104eaae8 [ARM] GlobalISel: Legalize G_FCMP for s32
This covers both hard and soft float.

Hard float is easy, since it's just Legal.

Soft float is more involved, because there are several different ways to
handle it based on the predicate: one and ueq need not only one, but two
libcalls to get a result. Furthermore, we have large differences between
the values returned by the AEABI and GNU functions.

AEABI functions return a nice 1 or 0 representing true and respectively
false. GNU functions generally return a value that needs to be compared
against 0 (e.g. for ogt, the value returned by the libcall is > 0 for
true).  We could introduce redundant comparisons for AEABI as well, but
they don't seem easy to remove afterwards, so we do different processing
based on whether or not the result really needs to be compared against
something (and just truncate if it doesn't).

llvm-svn: 307243
2017-07-06 09:09:33 +00:00
Daniel Sanders a6cfce6863 [globalisel][tablegen] Finish fixing compile-time regressions by merging the matcher and emitter state machines.
Summary:
Also, made a few minor tweaks to shave off a little more cumulative memory consumption:
* All rules share a single NewMIs instead of constructing their own. Only one
  will end up using it.
* Use MIs.resize(1) instead of MIs.clear();MIs.push_back(I) and prevent
  GIM_RecordInsn from changing MIs[0].

Depends on D33764

Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar

Reviewed By: ab

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33766

llvm-svn: 307159
2017-07-05 14:50:18 +00:00
Diana Picus fc1675eb16 [GlobalISel] Refactor Legalizer helpers for libcalls
We used to have a helper that replaced an instruction with a libcall.
That turns out to be too aggressive, since sometimes we need to replace
the instruction with at least two libcalls. Therefore, change our
existing helper to only create the libcall and leave the instruction
removal as a separate step. Also rename the helper accordingly.

llvm-svn: 307149
2017-07-05 12:57:24 +00:00
Diana Picus 97a5d9b5a7 [MachineIRBuilder] Fix formatting. NFC.
llvm-svn: 307144
2017-07-05 11:47:23 +00:00
Diana Picus 3e40b46bf0 [MachineIRBuilder] Add buildOr helper. NFC.
This isn't used anywhere yet, but I need it for a future commit.

llvm-svn: 307141
2017-07-05 11:32:12 +00:00
Diana Picus 05e704f453 [MachineIRBuilder] Add buildBinaryOp helper. NFC
Add a helper for building simple binary ops like add, mul, sub, and.
This can be used in the future for quickly adding support for or, xor.

llvm-svn: 307139
2017-07-05 11:02:31 +00:00
Daniel Sanders 1745076c59 [globalisel][tablegen] Fix an unused variable warning in release builds after r307133
llvm-svn: 307138
2017-07-05 10:16:48 +00:00
Daniel Sanders d93a35ae40 [globalisel][tablegen] Added instruction emission to the state-machine-based matcher.
Summary:
This further improves the compile-time regressions that will be caused by a
re-commit of r303259.

Also added included preliminary work in preparation for the multi-insn emitter
since I needed to change the relevant part of the API for this patch anyway.

Depends on D33758

Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar

Reviewed By: ab

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33764

llvm-svn: 307133
2017-07-05 09:39:33 +00:00
Daniel Sanders 6ab0daade8 [globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s)
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.

The following patches will expand on this further to fully fix the regressions.

Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar

Reviewed By: ab

Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33758

llvm-svn: 307079
2017-07-04 14:35:06 +00:00
Tim Northover ff5e7e1295 GlobalISel: add G_IMPLICIT_DEF instruction.
It looks like there are two target-independent but not GISel instructions that
need legalization, IMPLICIT_DEF and PHI. These are already anomalies since
their operands have important LLTs attached, so to make things more uniform it
seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF.

llvm-svn: 306875
2017-06-30 20:27:36 +00:00
Kristof Beyls b539ea5393 [GlobalISel] Make multi-step legalization work.
In r301116, a custom lowering needed to be introduced to be able to
legalize 8 and 16-bit divisions on ARM targets without a division
instruction, since 2-step legalization (WidenScalar from 8 bit to 32
bit, then Libcall the 32-bit division) doesn't work.

This fixes this and makes this kind of multi-step legalization, where
first the size of the type needs to be changed and then some action is
needed that doesn't require changing the size of the type,
straighforward to specify.

Differential Revision: https://reviews.llvm.org/D32529

llvm-svn: 306806
2017-06-30 08:26:20 +00:00
Aditya Nandakumar 20f6207013 [GISel]: New Opcode G_FLOG/G_FLOG2
https://reviews.llvm.org/D34837

llvm-svn: 306766
2017-06-29 23:43:44 +00:00
Tim Northover c990236ff9 GlobalISel: add some more sanity-checking to MachineInstrBuilder. NFC.
llvm-svn: 306481
2017-06-27 22:45:35 +00:00
Aditya Nandakumar cca75d2406 [GISel]: Add G_FEXP, G_FEXP2 opcodes
Also add IRTranslator support.
https://reviews.llvm.org/D34710

llvm-svn: 306475
2017-06-27 22:19:32 +00:00
Tim Northover 849fcca090 GlobalISel: verify that a COPY is trivial when created.
Without this check, COPY instructions can actually be one of the generic casts
in disguise. That's confusing and bad.

At some point during ISel this restriction has to be relaxed since the fully
selected instructions will usually use COPY for those purposes. Right now I
think it's possible that relaxation occurs during RegBankSelect (hence the
change there). I'm not convinced that's where it belongs long-term though.

llvm-svn: 306470
2017-06-27 21:41:40 +00:00
Eugene Zelenko 76bf48d932 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306341
2017-06-26 22:44:03 +00:00
Tim Northover c2d5e6d637 AArch64: legalize G_EXTRACT operations.
This is the dual problem to legalizing G_INSERTs so most of the code and
testing was cribbed from there.

llvm-svn: 306328
2017-06-26 20:34:13 +00:00
Tim Northover 4b4eec7009 GlobalISel: remove G_SEQUENCE instruction.
It was trying to do too many things. The basic lumping together of values for
legalization purposes is now handled by G_MERGE_VALUES. More complex things
involving gaps and odd sizes are handled by G_INSERT sequences.

llvm-svn: 306120
2017-06-23 16:15:55 +00:00
Tim Northover b57bf2ac79 GlobalISel: convert buildSequence to use non-deprecated instructions.
G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to
be taught how to emulate it with alternatives. We use G_MERGE_VALUES where
possible, and a sequence of G_INSERTs if not.

llvm-svn: 306119
2017-06-23 16:15:37 +00:00
Aditya Nandakumar c6a419123a [GISel]: Add G_FMA opcode for fused multiply adds
https://reviews.llvm.org/D34372

Reviewed by dsanders

llvm-svn: 305824
2017-06-20 19:25:23 +00:00
Daniel Sanders a6e2cebf98 [globalisel][tablegen] Add support for COPY_TO_REGCLASS.
Summary:
As part of this
* Emitted instructions now have named MachineInstr variables associated
  with them. This isn't particularly important yet but it's a small step
  towards multiple-insn emission.
* constrainSelectedInstRegOperands() is no longer hardcoded. It's now added
  as the ConstrainOperandsToDefinitionAction() action. COPY_TO_REGCLASS uses
  an alternate constraint mechanism ConstrainOperandToRegClassAction() which
  supports arbitrary constraints such as that defined by COPY_TO_REGCLASS.

Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar

Reviewed By: ab

Subscribers: javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33590

llvm-svn: 305791
2017-06-20 12:36:34 +00:00
Igor Breger 14535f0fc2 [GlobalISel] combine not symmetric merge/unmerge nodes.
Summary:
In some cases legalization ends up with not symmetric merge/unmerge nodes.
Transform it to merge/unmerge nodes.

Reviewers: t.p.northover, qcolombet, zvi

Reviewed By: t.p.northover

Subscribers: rovka, kristof.beyls, guyblank, llvm-commits

Differential Revision: https://reviews.llvm.org/D33626

llvm-svn: 305783
2017-06-20 08:54:17 +00:00
Diana Picus 02e11010b2 [ARM] GlobalISel: Add support for i32 modulo
Add support for modulo for targets that have hardware division and for
those that don't. When hardware division is not available, we have to
choose the correct libcall to use. This is generally straightforward,
except for AEABI.

The AEABI variant is trickier than the other libcalls because it
returns { quotient, remainder }, instead of just one value like the
other libcalls that we've seen so far. Therefore, we need to use custom
lowering for it. However, we don't want to have too much special code,
so we refactor the target-independent code in the legalizer by adding a
helper for replacing an instruction with a libcall. This helper is used
by the legalizer itself when dealing with simple calls, and also by the
custom ARM legalization for the more complicated AEABI divmod calls.

llvm-svn: 305459
2017-06-15 10:53:31 +00:00
Daniel Sanders 4e52366c2a [globalisel][legalizer] G_LOAD/G_STORE NarrowScalar should not emit G_GEP x, 0.
Summary:
When legalizing G_LOAD/G_STORE using NarrowScalar, we should avoid emitting
	%0 = G_CONSTANT ty 0
	%1 = G_GEP %x, %0
since it's cheaper to not emit the redundant instructions than it is to fold them
away later.

Reviewers: qcolombet, t.p.northover, ab, rovka, aditya_nandakumar, kristof.beyls

Reviewed By: qcolombet

Subscribers: javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D32746

llvm-svn: 305340
2017-06-13 23:42:32 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Volkan Keles ebe6bb9006 [GlobalISel] IRTranslator: Add MachineMemOperand to target memory intrinsics
Reviewers: qcolombet, ab, t.p.northover, aditya_nandakumar, dsanders

Reviewed By: qcolombet

Subscribers: rovka, kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D33724

llvm-svn: 304743
2017-06-05 22:17:17 +00:00
Quentin Colombet 73141d5b4b [Localizer] Don't trick to be smart for the insertion point
There is no guarantee that the first use of a constant that is traversed
is actually the first in the related basic block. Thus, if we use that
as the insertion point we may end up with definitions that don't
dominate there use.

llvm-svn: 304244
2017-05-30 20:53:06 +00:00
Davide Italiano af66659d6b [GlobalIsel] Fix a warning with GCC 7 -Wpedantic. NFCI.
llvm-svn: 304174
2017-05-29 20:13:22 +00:00
NAKAMURA Takumi a288ec412f Prune trailing whitespace. (To regenerate makefiles)
llvm-svn: 304112
2017-05-28 22:54:25 +00:00