Commit Graph

52024 Commits

Author SHA1 Message Date
Chen Zheng c4c407a0eb [PowerPC] use more meaningful name - NFC
llvm-svn: 361218
2019-05-21 03:54:42 +00:00
Matt Arsenault 6dd08e335f AMDGPU: Force skip branches over calls
Unfortunately the way SIInsertSkips works is backwards, and is
required for correctness. r338235 added handling of some special cases
where skipping is mandatory to avoid side effects if no lanes are
active. It conservatively handled asm correctly, but the same logic
needs to apply to calls.

Usually the call sequence code is larger than the skip threshold,
although the way the count is computed is really broken, so I'm not
sure if anything was likely to really hit this.

llvm-svn: 361202
2019-05-20 22:04:42 +00:00
Martin Storsjo 4ed18e5ef5 [AArch64] Handle lowering lround on windows, where long is 32 bit
Differential Revision: https://reviews.llvm.org/D62108

llvm-svn: 361192
2019-05-20 19:53:28 +00:00
Bjorn Pettersson eee0f2330d [AMDGPU] Fix std::array initializers to avoid warnings with older tool chains. NFC
A std::array is implemented as a template with an array
inside a struct. Older versions of clang, like 3.6,
require an extra set of curly braces around std::array
initializations to avoid warnings.

The C++ language was changed regarding this by CWG 1270.
So more modern tool chains does not complaing even if
leaving out one level of braces.

llvm-svn: 361171
2019-05-20 16:41:08 +00:00
Matt Arsenault 5239298b0d R600: Fix unconditional return in loop
llvm-svn: 361167
2019-05-20 16:22:11 +00:00
Cullen Rhodes 523789fa6b [AArch64][SVE2] Asm: add SADALP and UADALP instructions
Summary:
This patch adds support for the integer pairwise add and accumulate long
instructions SADALP/UADALP. These instructions are predicated.

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62001

llvm-svn: 361154
2019-05-20 13:50:15 +00:00
Petar Jovanovic e85bbf564d [DebugInfoMetadata] Refactor DIExpression::prepend constants (NFC)
Refactor DIExpression::With* into a flag enum in order to be less
error-prone to use (as discussed on D60866).

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D61943

llvm-svn: 361137
2019-05-20 10:35:57 +00:00
Cullen Rhodes 96c5929926 [AArch64][SVE2] Asm: add int halving add/sub (predicated) instructions
Summary:
This patch adds support for the predicated integer halving add/sub
instructions:

    * SHADD, UHADD, SRHADD, URHADD
    * SHSUB, UHSUB, SHSUBR, UHSUBR

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D62000

llvm-svn: 361136
2019-05-20 10:35:23 +00:00
Cullen Rhodes 0fc6347b35 [AArch64][SVE2] Asm: add saturating multiply-add interleaved long instructions
Summary:
Patch adds support for SQDMLALBT and SQDMLSLBT instructions.

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D61998

llvm-svn: 361135
2019-05-20 10:29:48 +00:00
Fangrui Song 68774edcd6 Use llvm::sort. NFC
llvm-svn: 361134
2019-05-20 10:18:35 +00:00
Carl Ritson 34e95ce259 [AMDGPU] gfx1010 Avoid SMEM WAR hazard for some s_waitcnt values
Summary:
Avoid introducing hazard mitigation when lgkmcnt is reduced to 0.
Clarify code comments to explain assumptions made for this hazard
mitigation.  Expand and correct test cases to cover variants of
s_waitcnt.

Reviewers: nhaehnle, rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62058

llvm-svn: 361124
2019-05-20 07:20:12 +00:00
Craig Topper 3164b50af7 [X86] Remove combineShift function. Just dispatch directly to the handler for each flavor from the main switch. NFC
llvm-svn: 361108
2019-05-19 01:01:46 +00:00
Matt Arsenault 2f29220d6d AMDGPU/GlobalISel: Implement s64->s64 [SU]ITOFP
llvm-svn: 361082
2019-05-17 23:05:18 +00:00
Matt Arsenault 02b5ca8cd1 GlobalISel: Implement lower for S64->S32 [SU]ITOFP
This is ported from the custom AMDGPU DAG implementation. I think this
is a better default expansion than what the DAG currently uses, at
least if the target has CTLZ.

This implements the signed version in terms of the unsigned
conversion, which is implemented with bit operations. SelectionDAG has
several other implementations that should eventually be ported
depending on what instructions are legal.

llvm-svn: 361081
2019-05-17 23:05:13 +00:00
Sam Clegg 13717bd54b [WebAssembly] Remove expected failure of builtin-location.C test
This seems to have been fixed by https://reviews.llvm.org/D61956

Yay

Differential Revision: https://reviews.llvm.org/D62075

llvm-svn: 361071
2019-05-17 19:55:17 +00:00
Simon Pilgrim 065431c82b [X86][SSE] Fold movmsk(not(x)) -> not(movmsk)
Helps to improve folding of comparisons with movmsk results.

llvm-svn: 361056
2019-05-17 17:56:25 +00:00
Simon Pilgrim 2c2f8e74b9 [X86][SSE] Match all-of bool scalar reductions into a bitcast/movmsk + cmp.
Same as what we do for vector reductions in combineHorizontalPredicateResult, use movmsk+cmp for scalar (and(extract(x,0),extract(x,1)) reduction patterns. 

llvm-svn: 361052
2019-05-17 17:25:55 +00:00
Dmitry Preobrazhensky 198611b0ff [AMDGPU][MC] Corrected parsing of NAME:VALUE modifiers
See bug 41298: https://bugs.llvm.org/show_bug.cgi?id=41298

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D61009

llvm-svn: 361045
2019-05-17 16:04:17 +00:00
Dmitry Preobrazhensky 5ae3113969 [AMDGPU][MC] Enabled labels with s_call_b64 and s_cbranch_i_fork
See https://bugs.llvm.org/show_bug.cgi?id=41888

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D62016

llvm-svn: 361040
2019-05-17 14:57:04 +00:00
Simon Pilgrim 279314e81b [X86][AVX] Remove LowerCTTZ's AVX1 custom vector handling.
We can now rely on generic expansion to handle this.

llvm-svn: 361038
2019-05-17 14:37:19 +00:00
Simon Pilgrim 62c7032c18 [X86][AVX] isNOT - add extract_subvector(xor X, -1) -> extract_subvector(X) fold.
Prep work for the removal of the remaining x86 CTTZ vector lowering.

llvm-svn: 361035
2019-05-17 14:04:56 +00:00
Dmitry Preobrazhensky 43fcc79837 [AMDGPU][MC] Enabled expressions for most operands which accept integer values
See bug 40873: https://bugs.llvm.org/show_bug.cgi?id=40873

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D60768

llvm-svn: 361031
2019-05-17 13:17:48 +00:00
Matt Arsenault 1a02d30c87 AMDGPU: Fix unused variable warnings in release builds
llvm-svn: 361030
2019-05-17 12:59:27 +00:00
Matt Arsenault a510b570c2 AMDGPU/GlobalISel: Legalize G_FCEIL
llvm-svn: 361028
2019-05-17 12:20:05 +00:00
Matt Arsenault 6aebcd5499 AMDGPU/GlobalISel: Legalize G_INTRINSIC_TRUNC
llvm-svn: 361027
2019-05-17 12:20:01 +00:00
Matt Arsenault 6aafc5e19d AMDGPU/GlobalISel: Legalize G_FRINT
llvm-svn: 361026
2019-05-17 12:19:57 +00:00
Matt Arsenault 1448f5689e AMDGPU/GlobalISel: Legalize G_FCOPYSIGN
llvm-svn: 361025
2019-05-17 12:19:52 +00:00
Matt Arsenault 568f193847 AMDGPU/GlobalISel: RegBankSelect for llvm.amdgcn.s.buffer.load
llvm-svn: 361023
2019-05-17 12:02:34 +00:00
Matt Arsenault a3b5a386fa AMDGPU/GlobalISel: Use subreg index instead of extra unmerge
This saves instructions and extra steps, but I'm not sure about
introducing subregister indexes at this point.

llvm-svn: 361022
2019-05-17 12:02:31 +00:00
Matt Arsenault b3dc73634c AMDGPU/GlobalISel: Use waterfall loop for buffer_load
This adds support for more complex waterfall loops that need to handle
operands > 32-bits, and multiple operands.

llvm-svn: 361021
2019-05-17 12:02:27 +00:00
Simon Pilgrim a6d3bd486b [X86] Pull out IsNOT helper. NFCI.
Return the input value for the NOT pattern: (xor X, -1) -> X

llvm-svn: 361012
2019-05-17 10:37:08 +00:00
Rhys Perry c4bc61bad7 [AMDGPU] detect WaW hazards when moving/merging load/store instructions
Summary:
    In order to combine memory operations efficiently, the load/store
    optimizer might move some instructions around. It's usually safe
    to move instructions down past the merged instruction because the
    pass checks if memory operations can be re-ordered.

    Though, the current logic doesn't handle Write-after-Write hazards.

    This fixes a reflection issue with Monster Hunter World and DXVK.

    v2: - rebased on top of master
        - clean up the test case
        - handle WaW hazards correctly

    Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40130

Original patch by Samuel Pitoiset.

Reviewers: tpr, arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: ronlieb, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D61313

llvm-svn: 361008
2019-05-17 09:32:23 +00:00
Cullen Rhodes 7f605c3550 [AArch64][SVE2] Asm: add saturating multiply-add long instructions
Summary:
Patch adds support for indexed and unpredicated vectors forms of the
following instructions:

    * SQDMLALB, SQDMLALT, SQDMLSLB, SQDMLSLT

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D61997

llvm-svn: 361005
2019-05-17 09:29:43 +00:00
Cullen Rhodes 334130a199 [AArch64][SVE2] Asm: add integer multiply-add long instructions
Summary:
Patch adds support for indexed and unpredicated vectors forms of the
following instructions:

    * SMLALB, SMLALT, UMLALB, UMLALT, SMLSLB, SMLSLT, UMLSLB, UMLSLT

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D61951

llvm-svn: 361003
2019-05-17 09:19:41 +00:00
Cullen Rhodes 0d47f00821 [AArch64][SVE2] Asm: add integer multiply long instructions
Summary:
Patch adds support for indexed and unpredicated vectors forms of the
following instructions:

    * SMULLB, SMULLT, UMULLB, UMULLT, SQDMULLB, SQDMULLT

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D61936

llvm-svn: 361002
2019-05-17 09:04:44 +00:00
Craig Topper ae1597d360 [X86] Add FeatureFastScalarShiftMasks and FeatureFastVectorShiftMasks to the ignore list for inlining compatibility.
These are tuning flags and won't cause any codegen issue if we inline a function
with a different value.

llvm-svn: 360992
2019-05-17 06:40:21 +00:00
Fangrui Song ad7199f3e6 [PowerPC] Support .reloc *, R_PPC{,64}_NONE, *
This can be used to create references among sections. When --gc-sections
is used, the referenced section will be retained if the origin section
is retained.

llvm-svn: 360990
2019-05-17 06:04:11 +00:00
Fangrui Song e18a6ad0b8 [MC][PowerPC] Clean up PPCAsmBackend
Replace the member variable Target with Triple
Use Triple instead of TheTarget.getName() to dispatch on 32-bit/64-bit.
Delete redundant parameters

llvm-svn: 360986
2019-05-17 05:44:26 +00:00
Fangrui Song 2463239777 [X86] Support .reloc *, R_{386,X86_64}_NONE, *
This can be used to create references among sections. When --gc-sections
is used, the referenced section will be retained if the origin section
is retained.

See R_MIPS_NONE (D13659), R_ARM_NONE (D61992), R_AARCH64_NONE (D61973) for similar changes.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D62014

llvm-svn: 360983
2019-05-17 03:25:39 +00:00
Fangrui Song aa6102ad8e [AArch64] Support .reloc *, R_AARCH64_NONE, *
Summary:
This can be used to create references among sections. When --gc-sections
is used, the referenced section will be retained if the origin section
is retained.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D61973

llvm-svn: 360981
2019-05-17 03:05:07 +00:00
Fangrui Song 43ca0e9eb8 [ARM] Support .reloc *, R_ARM_NONE, *
R_ARM_NONE can be used to create references among sections. When
--gc-sections is used, the referenced section will be retained if the
origin section is retained.

Add a generic MCFixupKind FK_NONE as this kind of no-op relocation is
ubiquitous on ELF and COFF, and probably available on many other binary
formats. See D62014.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D61992

llvm-svn: 360980
2019-05-17 02:51:54 +00:00
Jonas Paulsson 9427961c89 [SystemZ] Bugfix in SystemZTargetLowering::combineIntDIVREM()
Make sure to not unroll a vector division/remainder (with a constant splat
divisor) after type legalization, since the scalar type may then be illegal.

Review: Ulrich Weigand
https://reviews.llvm.org/D62036

llvm-svn: 360965
2019-05-17 00:50:35 +00:00
David L. Jones 4a5e01faa4 [X86][AsmParser] Add mnemonics missed in r360954.
These are valid Jcc, but aren't based on the EFLAGS condition codes (Intel 64
and IA-32 Architetcures Software Developer's Manual Vol. 1, Appendix B). These
are covered in clang/test, but not llvm/test.

llvm-svn: 360960
2019-05-17 00:19:20 +00:00
David L. Jones add7ed2281 [X86][AsmParser] Ignore "short" even harder in Intel syntax ASM.
In Intel syntax, it's not uncommon to see a "short" modifier on Jcc conditional
jumps, which indicates the offset should be a "short jump" (8-bit immediate
offset from EIP, -128 to +127). This patch expands to all recognized Jcc
condition codes, and removes the inline restriction.

Clang already ignores "jmp short" in inline assembly. However, only "jmp" and a
couple of Jcc are actually checked, and only inline (i.e., not when using the
integrated assembler for asm sources). A quick search through asm-containing
libraries at hand shows a pretty broad range of Jcc conditions spelled with
"short."

GAS ignores the "short" modifier, and instead uses an encoding based on the
given immediate. MS inline seems to do the same, and I suspect MASM does, too.
NASM will yield an error if presented with an out-of-range immediate value.

Example of GCC 9.1 and MSVC v19.20, "jmp short" with offsets that do and do not
fit within 8 bits: https://gcc.godbolt.org/z/aFZmjY

Differential Revision: https://reviews.llvm.org/D61990

llvm-svn: 360954
2019-05-16 23:27:07 +00:00
David L. Jones 11305984d0 [X86][AsmParser] Rename "ConditionCode" variable to "ConditionPredicate".
This better matches the verbiage in Intel documentation, and should help avoid
confusion between these two different kinds of values, both of which are parsed
from mnemonics.

llvm-svn: 360953
2019-05-16 23:27:05 +00:00
Reid Kleckner 08c15df29f [X86] Deduplicate symbol lowering logic, NFC
Summary:
This refactors four pieces of code that create SDNodes for references to
symbols:
- normal global address lowering (LEA, MOV, etc)
- callee global address lowering (CALL)
- external symbol address lowering (LEA, MOV, etc)
- external symbol address lowering (CALL)

Each of these pieces of code need to:
- classify the reference
- lower the symbol
- emit a RIP wrapper if needed
- emit a load if needed
- add offsets if needed

I think handling them all in one place will make the code easier to
maintain in the future.

Reviewers: craig.topper, RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61690

llvm-svn: 360952
2019-05-16 23:15:26 +00:00
Craig Topper f09b9d419f [X86] Use 0x9 instead of 0x1 as the immediate in some masked floor pattern. Similarly change 0x2 to 0xA for ceil.
This suppresses exceptions which is what we should be doing for ceil and floor. We already use the correct immediate
in patterns without masking.

llvm-svn: 360915
2019-05-16 16:53:50 +00:00
Matt Arsenault 99e6f4d11a AMDGPU: Introduce TokenFactor for ABI register copies in call sequence
The call was missing chain dependencies on the pre-call copies. I
don't think this was causing any real issues however.

llvm-svn: 360906
2019-05-16 15:10:27 +00:00
Matt Arsenault df24c92c0f AMDGPU: Assume xnack is enabled by default
This is the conservatively correct default. It is always safe to
assume xnack is enabled, but not the converse.

Introduce a feature to blacklist targets where xnack can never be
meaningfully enabled. I'm not sure the targets this is applied to is
100% correct.

llvm-svn: 360903
2019-05-16 14:48:34 +00:00
Adhemerval Zanella 2d28db6b9f [AArch64] Handle ISD::LROUND and ISD::LLROUND
This patch optimizes ISD::LROUND and ISD::LLROUND to fcvtas
instruction. It currently only handles the scalar version.

llvm-svn: 360894
2019-05-16 13:30:18 +00:00