Commit Graph

41154 Commits

Author SHA1 Message Date
Craig Topper 978fdb75a4 [X86] Add support for folding (insert_subvector vec1, (extract_subvector vec2, idx1), idx1) -> (blendi vec2, vec1).
llvm-svn: 294112
2017-02-04 23:26:46 +00:00
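The fold named in r294112 can be pictured on plain arrays. The sketch below is a scalar model, not LLVM code: `extractSubvector`, `insertSubvector`, and `blend` are hypothetical helpers, the 8-lane vector with 4-lane subvectors is an assumed example size, and the mask convention (a set bit picks the lane from the second operand) is an assumption of the model. It illustrates why inserting into vec1 the subvector taken from the same index of vec2 yields the same lanes as a per-lane blend of the two vectors.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec8 = std::array<int, 8>;
using Vec4 = std::array<int, 4>;

// Hypothetical scalar model of EXTRACT_SUBVECTOR: copy 4 lanes starting at Idx.
Vec4 extractSubvector(const Vec8 &V, std::size_t Idx) {
  assert(Idx + 4 <= V.size() && "subvector must fit inside the vector");
  Vec4 Sub{};
  for (std::size_t I = 0; I < Sub.size(); ++I)
    Sub[I] = V[Idx + I];
  return Sub;
}

// Hypothetical scalar model of INSERT_SUBVECTOR: overwrite 4 lanes at Idx.
Vec8 insertSubvector(Vec8 V, const Vec4 &Sub, std::size_t Idx) {
  assert(Idx + 4 <= V.size() && "subvector must fit inside the vector");
  for (std::size_t I = 0; I < Sub.size(); ++I)
    V[Idx + I] = Sub[I];
  return V;
}

// Hypothetical model of a blend with an immediate mask:
// bit I set selects lane I from B, bit I clear selects lane I from A.
Vec8 blend(const Vec8 &A, const Vec8 &B, unsigned Mask) {
  Vec8 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = ((Mask >> I) & 1u) ? B[I] : A[I];
  return R;
}
```

With `Idx == 4` the matching mask in this model is `0xF0`, and with `Idx == 0` it is `0x0F`; the equivalence only holds because the extract and the insert use the same index.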
Craig Topper 3d95228dbe [X86] Simplify the code that turns INSERT_SUBVECTOR into BLENDI. NFCI
llvm-svn: 294111
2017-02-04 23:26:42 +00:00
Eric Christopher b128abcf7a Remove a bunch of unnecessary casts to a target specific version of TII and TRI as we're working from a target specific STI.
llvm-svn: 294081
2017-02-04 01:52:17 +00:00
Eugene Zelenko 3f37f07c7f [Sparc] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MCExpr.h dependencies.

llvm-svn: 294072
2017-02-04 00:36:49 +00:00
Eugene Zelenko cd8ea02b4a [Mips] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MCExpr.h dependencies.

llvm-svn: 294069
2017-02-03 23:39:33 +00:00
Eugene Zelenko 06869c04f3 [SystemZ] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MCExpr.h dependencies.

llvm-svn: 294068
2017-02-03 23:39:06 +00:00
Eugene Zelenko e894b4dc59 [AMDGPU] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MCExpr.h dependencies.

llvm-svn: 294067
2017-02-03 23:38:40 +00:00
Eugene Zelenko 939f6b0167 [AArch64] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MCExpr.h dependencies.

llvm-svn: 294053
2017-02-03 21:49:13 +00:00
Eugene Zelenko 07dc38f67a [ARM] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MCExpr.h dependencies.

llvm-svn: 294052
2017-02-03 21:48:12 +00:00
Eugene Zelenko c3164b9c2f [XCore] Fix some Include What You Use warnings; other minor fixes (NFC).
This is preparation to reduce MCExpr.h dependencies.

llvm-svn: 294051
2017-02-03 21:46:55 +00:00
Matt Arsenault f15da6c419 AMDGPU: AsmParser cleanups
Use typedef, remove unnecessary enum, line wraps.

llvm-svn: 294039
2017-02-03 20:49:51 +00:00
Stanislav Mekhanoshin 81db53109d [AMDGPU] Bump -amdgpu-unroll-threshold-private to 2000
This has quite positive performance impact according to measurements.
Before the previous fixes to limit the optimization, this threshold was too high
and blew up compile time and scratch usage, but that problem is now gone and
we can bump the threshold.

Differential Revision: https://reviews.llvm.org/D29505

llvm-svn: 294032
2017-02-03 20:08:29 +00:00
Matt Arsenault 1fa5eacf9d AMDGPU: Set MCAsmInfo::PointerSize
llvm-svn: 294031
2017-02-03 20:02:23 +00:00
Matt Arsenault d9cd736585 AMDGPU: Don't unroll for private with dynamic allocas
This won't be eliminated, so this will just bloat code
if/when these are ever used/supported.

llvm-svn: 294030
2017-02-03 19:36:00 +00:00
Simon Pilgrim 034c1bd32c [X86][SSE] Add support for combining scalar_to_vector(extract_vector_elt) into a target shuffle.
Correctly flagging upper elements as undef.

llvm-svn: 294020
2017-02-03 17:59:58 +00:00
Simon Dardis 68e9d94055 [mips] Remove absolute size assertion for end directive
The .end <symbol> directive for MIPS marks the end of a symbol and sets the
symbol's size. Previously, the corresponding emitDirective handler asserted
that a function's size could be evaluated to an absolute value at that point
in time.

This cannot be done when directives like .align have been encountered;
instead, set the function's size to the corresponding symbolic expression and
let ELFObjectWriter resolve the expression to an absolute value. This avoids
a redundant call to evaluateAsAbsolute.

llvm-svn: 294012
2017-02-03 15:48:53 +00:00
Justin Lebar e90c468444 [NVPTX] Enable combineRepeatedFPDivisors for NVPTX.
Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D29477

llvm-svn: 294011
2017-02-03 15:13:50 +00:00
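The idea behind combining repeated FP divisors can be shown in a few lines. This is a sketch of the arithmetic rewrite only, not the DAG combine itself: the function names and the `Quotients` struct are hypothetical, and the tolerance in `nearlyEqual` is an arbitrary choice for illustration. Several divisions by the same divisor are replaced by one reciprocal and several multiplications.

```cpp
#include <cassert>
#include <cmath>

struct Quotients {
  float X, Y, Z;
};

// Straightforward form: three divisions that share the divisor D.
Quotients divideEach(float A, float B, float C, float D) {
  return {A / D, B / D, C / D};
}

// Rewritten form: one division to get the reciprocal, then three multiplies.
Quotients divideViaReciprocal(float A, float B, float C, float D) {
  const float Recip = 1.0f / D;
  return {A * Recip, B * Recip, C * Recip};
}

// Hypothetical helper: compare with a small relative tolerance, because the
// two forms are not guaranteed to round identically.
bool nearlyEqual(float L, float R) {
  const float Scale = std::fmax(std::fabs(L), std::fabs(R));
  return std::fabs(L - R) <= 1e-6f * Scale;
}
```

Because `A * (1 / D)` can round differently from `A / D`, a rewrite of this kind is generally only acceptable when relaxed floating-point semantics are allowed, and it only pays off when the divisor has enough uses to amortize the extra reciprocal.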
Artem Tamazov 43b61561b0 [AMDGPU][mc] Fix AddressSanitizer leftover issue in gfx7_asm_all test
Issue occurs when assembling "ds_ordered_count v0, v0 gds".

llvm-svn: 294004
2017-02-03 12:47:30 +00:00
Sanne Wouda a994185757 [ARM] Change TCReturn to tBL if tailcall optimization fails.
Summary:
The tail call optimisation is performed before register allocation, so
at that point we don't know if LR is being spilt or not. If LR was spilt
to the stack, then we cannot do a tail call optimisation. That would
involve popping back into LR which is not possible in Thumb1 code.

Reviewers: rengolin, jmolloy, rovka, olista01

Reviewed By: olista01

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D29020

llvm-svn: 294000
2017-02-03 11:15:53 +00:00
Stanislav Mekhanoshin f29602df65 [AMDGPU] Unroll preferences improvements
Exit loop analysis early if suitable private access found.
Do not account for GEPs which are invariant to loop induction variable.
Do not account for Allocas which are too big to fit into register file anyway.
Add option for tuning: -amdgpu-unroll-threshold-private.

Differential Revision: https://reviews.llvm.org/D29473

llvm-svn: 293991
2017-02-03 02:20:05 +00:00
Matt Arsenault e1b595306d AMDGPU: Fold fneg into fmin/fmax_legacy
llvm-svn: 293972
2017-02-03 00:51:50 +00:00
Craig Topper bbb2b95ce5 [X86] Mark 256-bit and 512-bit INSERT_SUBVECTOR operations as legal and remove the custom lowering.
llvm-svn: 293969
2017-02-03 00:24:49 +00:00
Matt Arsenault 2511c031de AMDGPU: Fold fneg into fminnum/fmaxnum
llvm-svn: 293968
2017-02-03 00:23:15 +00:00
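The fneg folds above presumably rest on the identity -min(a, b) = max(-a, -b) (and its mirror for max). The sketch below checks that identity with the C++ standard library; `negOfMin` and `maxOfNegs` are hypothetical names, and `std::fmin`/`std::fmax` are used here as a stand-in for the fminnum/fmaxnum semantics, which is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Negation applied to the result of the minimum.
float negOfMin(float A, float B) { return -std::fmin(A, B); }

// Negation pushed onto the operands; the minimum becomes a maximum.
float maxOfNegs(float A, float B) { return std::fmax(-A, -B); }
```

Pushing the negation onto the operands is useful when the target can negate an input for free, for example as a source modifier, so the separate fneg instruction disappears. Operands that differ only in the sign of zero are left out of this sketch, since `std::fmin` and `std::fmax` may return either zero.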
Matt Arsenault a8fcfadf46 AMDGPU: Check if users of fneg can fold mods
In multi-use cases this can save a few instructions.

llvm-svn: 293962
2017-02-02 23:21:23 +00:00
Eugene Zelenko fbd13c5c12 [X86] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293949
2017-02-02 22:55:55 +00:00
Reid Kleckner 3c467e225e [X86] Avoid sorted order check in release builds
Effectively reverts r290248 and fixes the unused function warning with
ifndef NDEBUG.

llvm-svn: 293945
2017-02-02 22:06:30 +00:00
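The pattern described in r293945 (a check that runs only in asserts-enabled builds, with the helper guarded so release builds do not warn about an unused function) looks roughly like this. It is a generic sketch, not the X86 code in question: `Entry`, `Table`, `tableIsSorted`, and `lookup` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

struct Entry {
  const char *Name;
  int Value;
};

// The table must stay sorted by Name so that binary search works.
static const Entry Table[] = {{"add", 1}, {"mul", 2}, {"sub", 3}};

#ifndef NDEBUG
// Only referenced from the assert in lookup(). When NDEBUG is defined the
// assert expands to nothing, so the helper is compiled out as well to avoid
// an unused-function warning.
static bool tableIsSorted() {
  return std::is_sorted(std::begin(Table), std::end(Table),
                        [](const Entry &L, const Entry &R) {
                          return std::strcmp(L.Name, R.Name) < 0;
                        });
}
#endif

// Binary-search the table; returns -1 if Name is not present.
int lookup(const char *Name) {
  assert(tableIsSorted() && "Table must be sorted by name");
  const Entry *I = std::lower_bound(std::begin(Table), std::end(Table), Name,
                                    [](const Entry &E, const char *Key) {
                                      return std::strcmp(E.Name, Key) < 0;
                                    });
  if (I != std::end(Table) && std::strcmp(I->Name, Name) == 0)
    return I->Value;
  return -1;
}
```

The sortedness check costs a full pass over the table, which is why it is confined to builds where `assert` is active.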
Craig Topper c45657375b [X86] Move turning 256-bit INSERT_SUBVECTORS into BLENDI from legalize to DAG combine.
On one test this seems to have given more chance for DAG combine to do other INSERT_SUBVECTOR/EXTRACT_SUBVECTOR combines before the BLENDI was created. Looks like we can still improve more by teaching DAG combine to optimize INSERT_SUBVECTOR/EXTRACT_SUBVECTOR with BLENDI.

llvm-svn: 293944
2017-02-02 22:02:57 +00:00
Reid Kleckner c35139ec0d [CodeGen] Remove dead call-or-prologue enum from CCState
This enum has been dead since Olivier Stannard re-implemented ARM byval
handling in r202985 (2014).

llvm-svn: 293943
2017-02-02 21:58:22 +00:00
Rafael Espindola 13a79bbfe5 Change how we handle section symbols on ELF.
On ELF every section can have a corresponding section symbol. When in
an assembly file we have

.quad .text

the '.text' refers to that symbol.

The way we used to handle them is to leave .text an undefined symbol
until the very end when the object writer would map them to the
actual section symbol.

The problem with that is that anything before the end would see an
undefined symbol. This could result in bad diagnostics
(test/MC/AArch64/label-arithmetic-diags-elf.s), or incorrect results
when using the asm streamer (est/MC/Mips/expansion-jal-sym-pic.s).

Fixing this will also allow using the section symbol earlier for
setting sh_link of SHF_METADATA sections.

This patch includes a few hacks to avoid changing our behaviour when
handling conflicts between section symbols and other symbols. I
reported pr31850 to track that.

llvm-svn: 293936
2017-02-02 21:26:06 +00:00
Javed Absar bb8dcc6aec [ARM] Classification Improvements to ARM Sched-Model. NFCI.
This is the second in the series of patches to make adding
machine sched-models for ARM processors easier and more compact.
This patch focuses on integer instructions and adds missing
sched definitions.

Reviewers: rovka, rengolin
Differential Revision: https://reviews.llvm.org/D29127

llvm-svn: 293935
2017-02-02 21:08:12 +00:00
Krzysztof Parzyszek d0d42f0ec8 [Hexagon] Adding opExtentBits and opExtentAlign to GPrel instructions
Patch by Colin LeMahieu.

llvm-svn: 293933
2017-02-02 20:35:12 +00:00
Michael Kuperstein e6d59fdca5 [X86] Add costs for non-AVX512 single-source permutation integer shuffles
Differential Revision: https://reviews.llvm.org/D29416

llvm-svn: 293932
2017-02-02 20:27:13 +00:00
Krzysztof Parzyszek e17b0bfb24 [Hexagon] Fix relocation kind for extended predicated calls
Patch by Sid Manning.

llvm-svn: 293931
2017-02-02 20:21:56 +00:00
Krzysztof Parzyszek 357b048666 [Hexagon] Remove A4_ext_* pseudo instructions
Patch by Colin LeMahieu.

llvm-svn: 293929
2017-02-02 19:58:22 +00:00
Krzysztof Parzyszek d67ab623f6 [Hexagon] Fix insertBranch for loops with multiple ENDLOOP instructions
llvm-svn: 293925
2017-02-02 19:36:37 +00:00
Dan Gohman b89f2d3d92 [WebAssembly] Add instruction definitions for drop and get/set_global.
llvm-svn: 293922
2017-02-02 19:29:44 +00:00
Nirav Dave 93f9d5ce04 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r293893 which is miscompiling lua on ARM and
bootstrapping for x86-windows.

llvm-svn: 293915
2017-02-02 18:24:55 +00:00
Simon Dardis 08ce5fb66b [mips] Expansion of BEQL and BNEL with immediate operands
Adds support for BEQL and BNEL macros with immediate operands.

Patch by: Srdjan Obucina

Reviewers: dsanders, zoran.jovanovic, vkalintiris, sdardis, obucina, seanbruno

Differential Revision: https://reviews.llvm.org/D17040

llvm-svn: 293905
2017-02-02 16:13:49 +00:00
Jonas Paulsson b7a2ef8375 [SystemZ] Add comment for ISD::FP_TO_UINT expansion.
(Copied from the fp-conv-10.ll test to SystemZISelLowering.cpp)

Review: Ulrich Weigand
llvm-svn: 293900
2017-02-02 15:42:14 +00:00
Krzysztof Parzyszek bc4dc9b4b9 [Hexagon] Emitting individual instructions without copying them
Patch by Colin LeMahieu.

llvm-svn: 293899
2017-02-02 15:32:26 +00:00
Krzysztof Parzyszek f65b8f14f4 [Hexagon] Rename TypeCOMPOUND to TypeCJ
llvm-svn: 293894
2017-02-02 15:03:30 +00:00
Nirav Dave 4442667fc5 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommitting after fixing X86 inc/dec chain bug.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as it separates non-interfering loads/stores from the
    store-merging logic.

    When merging stores, search up the chain through a single load, and
    find all possible stores by looking down through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly construct
    wider loads, but then promote them to float operations which appear
    to require more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward load-store values through tokenfactors containing
         {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 293893
2017-02-02 14:39:42 +00:00
Nirav Dave e14300e270 [X86,ISEL] Fix X86 increment chain dependence calculation
Merging Load-add-store pattern into an increment op previously dropped
the load's chain from the instruction's dependence if the store is
chained to a TokenFactor.

llvm-svn: 293892
2017-02-02 14:39:26 +00:00
Diana Picus 32cd9b434c [ARM] GlobalISel: Lower pointer args and returns
It is important to change the ArgInfo's type from pointer to integer, otherwise
the CC assign function won't know what to do. Instead of hacking it up, we use
ComputeValueVTs and introduce some of the helpers that we will need later on for
lowering more complex types.

llvm-svn: 293889
2017-02-02 14:01:00 +00:00
Diana Picus 0c11c7b5c7 [ARM] GlobalISel: Error out instead of asserting
Allow unknown types in TLI.getValueType, otherwise we get asserts for certain
types that we do not support yet (instead of returning that we don't support
them and falling through the normal error path).

llvm-svn: 293888
2017-02-02 14:00:54 +00:00
Diana Picus fc19a8ff07 [ARM] GlobalISel: Legalize loading pointers
Make it legal to load pointer values. Also check that pointers are assigned
to the GPR reg bank by default.

llvm-svn: 293886
2017-02-02 13:20:49 +00:00
Simon Pilgrim 20ab6b875a [X86][SSE] Use MOVMSK for all_of/any_of reduction patterns
This is a first attempt at using the MOVMSK instructions to replace all_of/any_of reduction patterns (i.e. an and/or + shuffle chain).

So far this only matches patterns where we are reducing an all/none bits source vector (i.e. a comparison result) but we should be able to expand on this in conjunction with improvements to 'bool vector' handling both in the x86 backend as well as the vectorizers etc.

Differential Revision: https://reviews.llvm.org/D28810

llvm-svn: 293880
2017-02-02 11:52:33 +00:00
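At the source level, the MOVMSK idea from r293880 corresponds to the following hand-written form. This is a sketch with SSE2 intrinsics, not the backend pattern match: `equalityMask`, `allLanesEqual`, and `anyLaneEqual` are hypothetical names, and the scalar fallback exists only so the example also builds where `__SSE2__` is not defined. The comparison produces all-ones or all-zeros lanes, MOVMSK collects one sign bit per lane, and the reduction becomes a single integer compare.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Returns a 4-bit mask; bit I is set when A[I] == B[I].
static int equalityMask(const std::int32_t *A, const std::int32_t *B) {
#if defined(__SSE2__)
  const __m128i VA = _mm_loadu_si128(reinterpret_cast<const __m128i *>(A));
  const __m128i VB = _mm_loadu_si128(reinterpret_cast<const __m128i *>(B));
  // Each 32-bit lane becomes all ones (equal) or all zeros (not equal).
  const __m128i Eq = _mm_cmpeq_epi32(VA, VB);
  // MOVMSKPS gathers the sign bit of each 32-bit lane into the low 4 bits.
  return _mm_movemask_ps(_mm_castsi128_ps(Eq));
#else
  // Scalar fallback that builds the same mask one lane at a time.
  int Mask = 0;
  for (int I = 0; I < 4; ++I)
    if (A[I] == B[I])
      Mask |= 1 << I;
  return Mask;
#endif
}

// all_of reduction: every lane compared equal.
bool allLanesEqual(const std::int32_t *A, const std::int32_t *B) {
  return equalityMask(A, B) == 0xF;
}

// any_of reduction: at least one lane compared equal.
bool anyLaneEqual(const std::int32_t *A, const std::int32_t *B) {
  return equalityMask(A, B) != 0;
}
```

This replaces a chain of shuffles and and/or operations across the vector with one mask extraction, which matches the commit's restriction to sources whose lanes are all-ones or all-zeros.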
Craig Topper 047a8be18a [X86] Remove some unused DAGCombinerInfo parameters. NFC
llvm-svn: 293873
2017-02-02 08:03:23 +00:00
Craig Topper 94ed54b49a [X86] Move some INSERT_SUBVECTOR optimizations from legalize to DAG combine.
This moves creation of SUBV_BROADCAST and merging of adjacent loads that are being inserted together.

This is a step towards removing legalizing of INSERT_SUBVECTOR except for vXi1 cases.

llvm-svn: 293872
2017-02-02 08:03:20 +00:00
Craig Topper b81e6c48f8 [AVX-512] Fix the implicit defs for VZEROALL/VZEROUPPER to include YMM16-YMM31.
llvm-svn: 293862
2017-02-02 04:17:18 +00:00