Commit Graph

14109 Commits

Author SHA1 Message Date
Colin LeMahieu 7cd0892729 [Hexagon] Enabling ASM parsing on Hexagon backend and adding instruction parsing tests. General updating of the code emission.
llvm-svn: 252443
2015-11-09 04:07:48 +00:00
Hal Finkel f046f72efa [PowerPC] Fix LoopPreIncPrep not to depend on SCEV constant simplifications
Under most circumstances, if SCEV can simplify X-Y to a constant, then it can
also simplify Y-X to a constant. However, there is no guarantee that this is
always true, and concensus is not to consider that a correctness bug in SCEV
(although it is undesirable).

PPCLoopPreIncPrep gathers pointers used to access memory (via loads, stores and
prefetches) into buckets, where in each bucket the relative pointer offsets are
constant. We used to keep each bucket as a multimap, where SCEV's subtraction
operation was used to define the ordering predicate. Instead, use a fixed SCEV
base expression for each bucket, record the constant offsets from that base
expression, and adjust it later, if desirable, once all pointers have been
collected.

Doing it this way should be more compile-time efficient than the previous
scheme (in addition to making the implementation less sensitive to SCEV
simplification quirks).

Fixes PR25170.

llvm-svn: 252417
2015-11-08 08:04:40 +00:00
David Majnemer e35244cf63 [WinEH] Update PHIs of CATCHRET successors
The TailDuplication machine pass ran across a malformed CFG: a PHI node
referred it's predecessor's predecessor instead of it's predecessor.
This occurred because we split the edge in X86ISelLowering when we
processed the CATCHRET but forgot to do something about the PHI nodes.

This fixes PR25444.

llvm-svn: 252413
2015-11-08 02:36:00 +00:00
Joseph Tremoulet f748c8937e [WinEH] Update exception pointer registers
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.

Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.

Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.


Reviewers: pgavlin, majnemer, rnk

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14344

llvm-svn: 252383
2015-11-07 01:11:31 +00:00
Ahmed Bougacha cf49b523a0 [AArch64][FastISel] Don't even try to select vector icmps.
We used to try to constant-fold them to i32 immediates.
Given that fast-isel doesn't otherwise support vNi1, when selecting
the result users, we'd fallback to SDAG anyway.
However, if the users were in another block, we'd insert broken
cross-class copies (GPR32 to FPR64).

Give up, let SDAG agree with itself on a vNi1 legalization strategy.

llvm-svn: 252364
2015-11-06 23:16:53 +00:00
Ahmed Bougacha b49eb3ab4b [X86] Fold (trunc (i32 (zextload i16))) into vbroadcast.
When matching non-LSB-extracting truncating broadcasts, we now insert
the necessary SRL. If the scalar resulted from a load, the SRL will be
folded into it, creating a narrower, offset, load.

However, i16 loads aren't Desirable, so we get i16->i32 zextloads.
We already catch i16 aextloads; catch these as well.

llvm-svn: 252363
2015-11-06 23:16:48 +00:00
Ahmed Bougacha 05a0514b12 [X86] SRL non-LSB extracts when folding to truncating broadcasts.
Now that we recognize this, we can support it instead of bailing out.
That is, we can fold:
  (v8i16 (shufflevector
    (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))),
    <1,1,...,1>))
into:
  (v8i16 (vbroadcast (i16 (trunc (srl Y, 16)))))

llvm-svn: 252362
2015-11-06 23:16:43 +00:00
Ahmed Bougacha 68614a36d1 [X86] Don't fold non-LSB extracts into truncating broadcasts.
We used to incorrectly assume that the offset we're extracting from
was a multiple of the element size. So, we'd fold:
  (v8i16 (shufflevector
    (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))),
    <1,1,...,1>))
into:
  (v8i16 (vbroadcast (i16 (trunc Y))))
whereas we should have extracted the higher bits from X.

Instead, bail out if the assumption doesn't hold.

llvm-svn: 252361
2015-11-06 23:16:38 +00:00
Tom Stellard 05691a678e DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload
Reviewers: resistor, arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13805

llvm-svn: 252349
2015-11-06 21:58:37 +00:00
Dan Gohman 3cb66c85b9 [WebAssembly] Use more explicit types in testcases.
llvm-svn: 252345
2015-11-06 21:32:42 +00:00
Dan Gohman 8d456e4200 [WebAssembly] Add more explicit pushes to the tests.
llvm-svn: 252344
2015-11-06 21:26:50 +00:00
Quentin Colombet 9a8efc08d3 [ShrinkWrapping] Teach shrink-wrapping how to analyze RegMask.
Previously we were conservatively assuming that RegMask operands clobber
callee saved registers.

llvm-svn: 252341
2015-11-06 21:00:13 +00:00
Andrew Kaylor 4731bea3e5 Improved the operands commute transformation for X86-FMA3 instructions.
All 3 operands of FMA3 instructions are commutable now.

Patch by Slava Klochkov

Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab).

Differential Revision: http://reviews.llvm.org/D13269

llvm-svn: 252335
2015-11-06 19:47:25 +00:00
Dan Gohman 4b96d8d1ff [WebAssembly] Make expression-stack pushing explicit
Modelling of the expression stack is evolving. This patch takes another
step by making pushes explicit.

Differential Revision: http://reviews.llvm.org/D14338

llvm-svn: 252334
2015-11-06 19:45:01 +00:00
Matt Arsenault 0c90e9501e AMDGPU: Create emergency stack slots during frame lowering
Test has a bogus verifier error which will be fixed by later commits.

llvm-svn: 252327
2015-11-06 18:17:45 +00:00
Matt Arsenault 3931948bb6 AMDGPU: Add pass to detect used kernel features
Mark kernels that use certain features that require user
SGPRs to support with kernel attributes. We need to know
before instruction selection begins because it impacts
the kernel calling convention lowering.

For now this only detects the workitem intrinsics.

llvm-svn: 252323
2015-11-06 18:01:57 +00:00
Matt Arsenault 623e6fd466 AMDGPU: Hack for VS_32 register pressure
For some reason VS_32 ends up factoring into the pressure heuristics
even though we should never see a virtual register with this class.

When SGPRs are reserved for register spilling, this for some reason
triggers reg-crit scheduling.

Setting isAllocatable = 0 may help with this since that seems to remove
it from the default implementation's generated table.

llvm-svn: 252321
2015-11-06 17:54:43 +00:00
Reid Kleckner b8fd162fc5 [WinEH] Mark funclet entries and exits as clobbering all registers
Summary:
In this implementation, LiveIntervalAnalysis invents a few register
masks on basic block boundaries that preserve no registers. The nice
thing about this is that it prevents the prologue inserter from thinking
it needs to spill all XMM CSRs, because it doesn't see any explicit
physreg defs in the MI.

Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer

Subscribers: MatzeB, llvm-commits

Differential Revision: http://reviews.llvm.org/D14407

llvm-svn: 252318
2015-11-06 17:06:38 +00:00
Jun Bum Lim 22fe15ee86 [AArch64]Enable the narrow ld promotion only on profitable microarchitectures
The benefit from converting narrow loads into a wider load (r251438) could be
micro-architecturally dependent, as it assumes that a single load with two bitfield
extracts is cheaper than two narrow loads. Currently, this conversion is
enabled only in cortex-a57 on which performance benefits were verified.

llvm-svn: 252316
2015-11-06 16:27:47 +00:00
Rafael Espindola 889d7bb4cb Bring r252305 back with a test fix.
We now create the .eh_frame section early, just like every other special
section.

This means that the special flags are visible in code that explicitly
asks for ".eh_frame".

llvm-svn: 252313
2015-11-06 15:30:45 +00:00
Vasileios Kalintiris b04672cade [mips] Define patterns for the atomic_{load,store}_{8,16,32,64} nodes.
Summary:
Without these patterns we would generate a complete LL/SC sequence.
This would be problematic for memory regions marked as WRITE-only or
READ-only, as the instructions LL/SC would read/write to the protected
memory regions correspondingly.

Reviewers: dsanders

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14397

llvm-svn: 252293
2015-11-06 12:07:20 +00:00
Tom Stellard 1e1b05db24 AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNEL
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13804

llvm-svn: 252291
2015-11-06 11:45:14 +00:00
NAKAMURA Takumi 9947cacebf Revert r252249 (and r252255, r252258), "[WinEH] Clone funclets with multiple parents"
It behaved flaky due to iterating pointer key values on std::set and std::map.

llvm-svn: 252279
2015-11-06 10:07:33 +00:00
Andrew Kaylor f05a87dff3 Temporarily disable flaky checks in wineh-multi-parent-cloning.
llvm-svn: 252258
2015-11-06 01:15:04 +00:00
Andrew Kaylor 29cd576554 [WinEH] Clone funclets with multiple parents
Windows EH funclets need to always return to a single parent funclet.  However, it is possible for earlier optimizations to combine funclets (probably based on one funclet having an unreachable terminator) in such a way that this condition is violated.

These changes add code to the WinEHPrepare pass to detect situations where a funclet has multiple parents and clone such funclets, fixing up the unwind and catch return edges so that each copy of the funclet returns to the correct parent funclet.

Differential Revision: http://reviews.llvm.org/D13274?id=39098

llvm-svn: 252249
2015-11-06 00:20:50 +00:00
Peter Collingbourne d4bff30370 DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Reid Kleckner 6ddae31045 [WinEH] Fix funclet prologues with stack realignment
We already had a test for this for 32-bit SEH catchpads, but those don't
actually create funclets. We had a bug that only appeared in funclet
prologues, where we would establish EBP and ESI as our FP and BP, and
then downstream prologue code would overwrite them.

While I was at it, I fixed Win64+funclets+stackrealign. This issue
doesn't come up as often there due to the ABI requring 16 byte stack
alignment, but now we can rest easy that AVX and WinEH will work well
together =P.

llvm-svn: 252210
2015-11-05 21:09:49 +00:00
Dan Gohman d7ffb919c1 [WebAssembly] Update wasm builtin functions to match spec changes.
The page_size operator has been removed from the spec, and the resize_memory
operator has been changed to grow_memory.

llvm-svn: 252202
2015-11-05 20:16:59 +00:00
Petar Jovanovic 99fba3c141 Add cfi instr for CFA calculation when movpc is expanded to call and pop
This fixes the issue of wrong CFA calculation in the following case:

0x08048400 <+0>:	push   %ebx
0x08048401 <+1>:	sub    $0x8,%esp
0x08048404 <+4>:	**call   0x8048409 <test+9>**
0x08048409 <+9>:	**pop    %eax**
0x0804840a <+10>:	add    $0x1bf7,%eax
0x08048410 <+16>:	mov    %eax,%ebx
0x08048412 <+18>:	call   0x80483f0 <bar>
0x08048417 <+23>:	add    $0x8,%esp
0x0804841a <+26>:	pop    %ebx
0x0804841b <+27>:	ret

The highlighted instructions are a product of movpc instruction. The call
instruction changes the stack pointer, and pop instruction restores its
value. However, the rule for computing CFA is not updated and is wrong on
the pop instruction. So, e.g. backtrace in gdb does not work when on the pop
instruction. This adds cfi instructions for both call and pop instructions.

cfi_adjust_cfa_offset** instruction is used with the appropriate offset for
setting the rules to calculate CFA correctly.

Patch by Violeta Vukobrat.

Differential Revision: http://reviews.llvm.org/D14021

llvm-svn: 252176
2015-11-05 17:19:59 +00:00
Derek Schuff 8a76b04a63 [WebAssembly] Rename ior operator to or to match the spec
Summary: The spec uses "or" for inclusive-or and "xor" for exclusive-or

Reviewers: sunfish

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: http://reviews.llvm.org/D14362

llvm-svn: 252174
2015-11-05 17:08:11 +00:00
Asaf Badouh f99c054ebc revert rev. 252153 due to build failure on ubuntu
[X86][AVX512] add comi with Sae

llvm-svn: 252154
2015-11-05 08:55:54 +00:00
Asaf Badouh 7fdabf0a35 [X86][AVX512] add comi with Sae
add builtin_ia32_vcomisd and builtin_ia32_vcomisd

Differential Revision: http://reviews.llvm.org/D14331

llvm-svn: 252153
2015-11-05 08:45:06 +00:00
Matt Arsenault a40450cba2 AMDGPU: Fix assert when legalizing atomic operands
The operand layout is slightly different for the atomic
opcodes from the usual MUBUF loads and stores.

This should only fix it on SI/CI. VI is still broken
because it still emits the addr64 replacement.

llvm-svn: 252140
2015-11-05 02:46:56 +00:00
Joseph Tremoulet 6afccf6120 [WinEH] Fix establisher param reg in CLR funclets
Summary:
The CLR's personality routine passes the pointer to the establisher frame
in RCX, not RDX.

Reviewers: pgavlin, majnemer, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14343

llvm-svn: 252135
2015-11-05 02:20:07 +00:00
Matt Arsenault b0f87ed692 AMDGPU: Add missing v2f64 fadd tests
llvm-svn: 252117
2015-11-05 01:03:11 +00:00
Quentin Colombet 421723cdd8 [x86] Teach the shrink-wrapping hooks to do the proper thing with Win64.
Win64 has some strict requirements for the epilogue. As a result, we disable
shrink-wrapping for Win64 unless the block that gets the epilogue is already an
exit block.

Fixes PR24193.

llvm-svn: 252088
2015-11-04 22:37:28 +00:00
Simon Pilgrim 7e6606f4f1 [X86][SSE] Add general memory folding for (V)INSERTPS instruction
This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction.

The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening.

This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done.

It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted.

Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll

Differential Revision: http://reviews.llvm.org/D13988

llvm-svn: 252074
2015-11-04 20:48:09 +00:00
Andrew Kaylor e41a8c4182 Created new X86 FMA3 opcodes (FMA*_Int) that are used now for lowering of scalar FMA intrinsics.
Patch by Slava Klochkov 

The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result.

Reviewers: Quentin Colombet, Elena Demikhovsky

Differential Revision: http://reviews.llvm.org/D13710

llvm-svn: 252060
2015-11-04 18:10:41 +00:00
James Molloy e7d679cf4c [ARM] Combine CMOV into BFI where possible
If we have a CMOV, OR and AND combination such as:
  if (x & CN)
    y |= CM;

And:
  * CN is a single bit;
  * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).

llvm-svn: 252057
2015-11-04 16:55:07 +00:00
Michael Kuperstein b34de72269 [X86] DAGCombine should not introduce FILD in soft-float mode
The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp 
directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode.

llvm-svn: 252042
2015-11-04 11:17:53 +00:00
Igor Laevsky 35fe692025 [StatepointLowering] Remove distinction between call and invoke safepoints
There is no point in having invoke safepoints handled differently than the
call safepoints. All relevant decisions could be made by looking at whether
or not gc.result and gc.relocate lay in a same basic block. This change will
 allow to lower call safepoints with relocates and results in a different 
basic blocks. See test case for example.

Differential Revision: http://reviews.llvm.org/D14158

llvm-svn: 252028
2015-11-04 01:16:10 +00:00
Derek Schuff cd9488d521 Address nit
llvm-svn: 252004
2015-11-03 22:40:45 +00:00
Derek Schuff 6b5c6da760 [WebAssembly] Support wasm select operator
Summary:
Add support for wasm's select operator, and lower LLVM's select DAG node
to it.

Reviewers: sunfish

Subscribers: dschuff, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D14295

llvm-svn: 252002
2015-11-03 22:40:40 +00:00
Simon Pilgrim b0d860a394 [X86][AVX] Tweaked shuffle stack folding tests
To avoid alternative lowerings.

llvm-svn: 251986
2015-11-03 21:58:35 +00:00
Simon Pilgrim df993479c9 [X86][AVX512] Fixed shuffle test name to match shuffle
llvm-svn: 251984
2015-11-03 21:39:30 +00:00
Simon Pilgrim e88dc04c48 [X86][XOP] Add support for the matching of the VPCMOV bit select instruction
XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) )

This patch adds tablegen pattern matching for this instruction.

Differential Revision: http://reviews.llvm.org/D8841

llvm-svn: 251975
2015-11-03 20:27:01 +00:00
Rafael Espindola 41302723de Remove unnecessary dependency on section and string positions.
llvm-svn: 251964
2015-11-03 19:24:17 +00:00
Michael Kuperstein 73dc85293f [X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments
When push instructions are being used to pass function arguments on
the stack, and either EH or debugging are enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.

Darwin does not support this well, so don't use pushes whenever it
would be required.

Differential Revision: http://reviews.llvm.org/D13767

llvm-svn: 251904
2015-11-03 08:17:25 +00:00
Matt Arsenault f1aebbf33a AMDGPU: Stop assuming vreg for build_vector
This was causing a variety of test failures when v2i64
is added as a legal type.

SIFixSGPRCopies should correctly handle the case of vector inputs
to a scalar reg_sequence, so this isn't necessary anymore. This
was hiding some deficiencies in how reg_sequence is handled later,
but this shouldn't be a problem anymore since the register class
copy of a reg_sequence is now done before the reg_sequence.

llvm-svn: 251860
2015-11-02 23:30:48 +00:00
Matt Arsenault d48da14269 AMDGPU: Error on graphics shaders with HSA
I've found myself pointlessly debugging problems from running
graphics tests with an HSA triple a few times, so stop this from
happening again.

llvm-svn: 251858
2015-11-02 23:23:02 +00:00