Commit Graph

12855 Commits

Author SHA1 Message Date
Andrea Di Biagio eb33134ce7 Simplify code; NFC.
Also, moved test cases from CodeGen/X86/fold-buildvector-bug.ll into
CodeGen/X86/buildvec-insertvec.ll and regenerated CHECK lines using
update_llc_test_checks.py.

llvm-svn: 239142
2015-06-05 10:29:55 +00:00
Simon Pilgrim b4c562de87 [X86][SSE] Added tests for i8/i16 vector shifts
Currently still scalarized, but D9474 should remedy that.

llvm-svn: 239136
2015-06-05 08:24:23 +00:00
Swaroop Sridhar 70d18df18f Statepoint: Fix handling of Far Immediate calls
gc.statepoint intrinsics with a far immediate call target 
were lowered incorrectly as pc-rel32 calls.

This change fixes the problem, and generates an indirect call 
via a scratch register.

For example: 

Intrinsic:
  %safepoint_token = call i32 (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* inttoptr (i64 140727162896504 to void ()*), i32 0, i32 0, i32 0, i32 0)

Old Incorrect Lowering:
  callq 140727162896504

New Correct Lowering:
  movabsq $140727162896504, %rax 
  callq *%rax

In lowerCallFromStatepoint(), the callee-target was modified and 
represented as a "TargetConstant" node, rather than a "Constant" node.
Undoing this modification enabled LowerCall() to generate the 
correct CALL instruction.

llvm-svn: 239114
2015-06-04 23:03:21 +00:00
Charles Davis da280728b6 [Target/X86] Don't use callee-saved registers in a Win64 tail call on non-Windows.
Summary:
A small bit that I missed when I updated the X86 backend to account for
the Win64 calling convention on non-Windows. Now we don't use dead
non-volatile registers when emitting a Win64 indirect tail call on
non-Windows.

Should fix PR23710.

Test Plan: Added test for the correct behavior based on the case I posted to PR23710.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10258

llvm-svn: 239111
2015-06-04 22:50:05 +00:00
Benjamin Kramer ff0fb6936b [SDAG switch lowering] Fix switch case -> or merging for 0 and INT_MIN
The big/small ordering here is based on signed values so SmallValue will
be INT_MIN and BigValue 0. This shouldn't be a problem but the code
assumed that BigValue always had more bits set than SmallValue.

We used to just miss the transformation, but a recent refactoring of
mine turned this into an assertion failure.

llvm-svn: 239105
2015-06-04 22:05:51 +00:00
Colin LeMahieu 348efdbd36 Shouldn't be XFAIL'ed.
llvm-svn: 239103
2015-06-04 21:49:43 +00:00
Colin LeMahieu c40be85adc Revert r239095 incorrect test tree.
llvm-svn: 239102
2015-06-04 21:32:42 +00:00
Jingyue Wu a2f6027a31 [NVPTX] roll forward r239082
NVPTXISelDAGToDAG translates "addrspacecast to param" to
NVPTX::nvvm_ptr_gen_to_param

Added an llc test in bug21465.

llvm-svn: 239100
2015-06-04 21:28:26 +00:00
Colin LeMahieu fc52c11d80 [Hexagon] Adding functionality for duplexing. Duplexing is a way to compress commonly used pairs of instructions in order to reduce code size. The test case duplex.ll normally would be 8 bytes, assign register to 0 and jump to link register. After duplexing this is only 4 bytes. This also tests the HexagonMCShuffler code path which is used to make sure duplexed instructions still follow slot requirements.
llvm-svn: 239095
2015-06-04 21:16:16 +00:00
Jingyue Wu b8f38668d5 Revert r239082
llc crashed for NVPTX backend

llvm-svn: 239094
2015-06-04 21:07:08 +00:00
Ahmed Bougacha 8207641251 [GlobalMerge] Take into account minsize on Global users' parents.
Now that we can look at users, we can trivially do this: when we would
have otherwise disabled GlobalMerge (currently -O<3), we can just run
it for minsize functions, as it's usually a codesize win.

Differential Revision: http://reviews.llvm.org/D10054

llvm-svn: 239087
2015-06-04 20:39:23 +00:00
Jingyue Wu f3a8079b75 [NVPTX] kernel pointer arguments point to the global address space
Summary:
With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a
pointer in the global address space. This change, along with
NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.*
and st.global.* for accessing kernel pointer arguments.

Minor changes:
1. refactor: extract function convertToPointerInAddrSpace
2. fix a bug in the test case in bug21465.ll

Test Plan: lower-kernel-ptr-arg.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: jholewinski

Subscribers: wengxt, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10154

llvm-svn: 239082
2015-06-04 20:19:38 +00:00
Alexei Starovoitov 310deada10 [bpf] add big- and host- endian support
Summary:
-march=bpf    -> host endian
-march=bpf_le -> little endian
-match=bpf_be -> big endian

Test Plan:
v1 was tested by IBM s390 guys and appears to be working there.
It bit rots too fast here.

Reviewers: chandlerc, tstellarAMD

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10177

llvm-svn: 239071
2015-06-04 19:15:05 +00:00
Andrea Di Biagio 9ac8a6b13d [DAGCombiner] Fix wrong folding of a build_vector into a blend with zero.
Method 'visitBUILD_VECTOR' in the DAGCombiner knows how to combine a
build_vector of a bunch of extract_vector_elt nodes and constant zero nodes
into a shuffle blend with a zero vector.

However, method 'visitBUILD_VECTOR' forgot that a floating point
build_vector may contain negative zero as well as positive zero.

Example:

define <2 x double> @example(<2 x double> %A) {
entry:
  %0 = extractelement <2 x double> %A, i32 0
  %1 = insertelement <2 x double> undef, double %0, i32 0
  %2 = insertelement <2 x double> %1, double -0.0, i32 1
  ret <2 x double> %2
}

Before this patch, llc (with -mattr=+sse4.1) wrongly generated
  movq   %xmm0, %xmm0  # xmm0 = xmm0[0],zero

So, the sign bit of the negative zero was effectively lost.

This patch fixes the problem by adding explicit checks for positive zero.

With this patch, llc produces the following code for the example above:
  movhpd .LCPI0_0(%rip), %xmm0

where .LCPI0_0 referes to a 'double -0'.

llvm-svn: 239070
2015-06-04 19:15:01 +00:00
Matt Arsenault 73e06fa262 R600/SI: Reimplement isLegalAddressingMode
Now that we sometimes know the address space, this can
theoretically do a better job.

This needs better test coverage, but this mostly depends on
first updating the loop optimizatiosn to provide the address
space.

llvm-svn: 239053
2015-06-04 16:17:42 +00:00
Matt Arsenault 81c7ae2bf5 R600/SI: Fix some cases for load / store of half
Mostly argument loads were producing broken zextloads
from an FP type.

llvm-svn: 239049
2015-06-04 16:00:27 +00:00
Hans Wennborg d922915685 Switch lowering: fix assert in buildBitTests (PR23738)
When checking (High - Low + 1).sle(BitWidth), BitWidth would be truncated
to the size of the left-hand side. In the case of this PR, the left-hand
side was i4, so BitWidth=64 got truncated to 0 and the assert failed.

llvm-svn: 239048
2015-06-04 15:55:00 +00:00
James Molloy 37593732a4 Don't create a MIN/MAX node if the underlying compare has more than one use.
If the compare in a select pattern has another use then it can't be removed, so we'd just
be creating repeated code if we created a min/max node.

Spotted by Matt Arsenault!

llvm-svn: 239037
2015-06-04 13:48:23 +00:00
Elena Demikhovsky 2f1a0dabd0 AVX-512: I brought back vector-shuffle-512-v8.ll test.
I re-generated it after all AVX-512 shuffle optimizations.

llvm-svn: 239026
2015-06-04 07:49:56 +00:00
Sanjay Patel 667a7e2a0f make reciprocal estimate code generation more flexible by adding command-line options (3rd try)
The first try (r238051) to land this was reverted due to ExecutionEngine build failure;
that was hopefully addressed by r238788.

The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure;
that was hopefully addressed by r238953.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 239001
2015-06-04 01:32:35 +00:00
Tom Stellard 1ba52feb96 R600: Re-enable sub-reg liveness
The bug in the R600 backend that this uncovered has been fixed.

llvm-svn: 238999
2015-06-04 01:20:04 +00:00
Matt Arsenault 5e740dbca6 R600/SI: Fix tests with triples in them
Only set the triple from the command line options.
Some of these were still testing SI features and using the
old r600-- triple.

llvm-svn: 238958
2015-06-03 20:04:05 +00:00
Colin LeMahieu 1ce7a11c9c [Hexagon] Test doesn't work on all platforms. At any rate the uninitialized variable issue was fixed. Removing re-registering ASM backend.
llvm-svn: 238949
2015-06-03 18:00:45 +00:00
Colin LeMahieu a675077310 [Hexagon] Reapply 238772 OSABI was not correctly set, added empty_elf test to make sure it is.
llvm-svn: 238947
2015-06-03 17:34:16 +00:00
Matthias Braun 125c9f5f7b ARM: Thumb2 LDRD/STRD supports independent input/output regs
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.

Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.

Differential Revision: http://reviews.llvm.org/D9694

Recommiting after the revert in r238821, the buildbot still failed with
the patch removed so there seems to be another reason for the breakage.

llvm-svn: 238935
2015-06-03 16:30:24 +00:00
Asaf Badouh 402ebb34af re-apply 238809
AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.
CR:
http://reviews.llvm.org/D9991

llvm-svn: 238923
2015-06-03 13:41:48 +00:00
Elena Demikhovsky 21de893377 AVX-512: VSHUFPD instruction selection - code improvements
llvm-svn: 238918
2015-06-03 11:21:01 +00:00
Rafael Espindola cf8beece97 Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)"
This reverts commit r238842.

It broke -DBUILD_SHARED_LIBS=ON build.

llvm-svn: 238900
2015-06-03 05:32:44 +00:00
Sanjoy Das 513aadecac [SelectionDAG] Fix PR23603.
Summary:
LLVM's MI level notion of invariant_load is different from LLVM's IR
level notion of invariant_load with respect to dereferenceability.  The
IR notion of invariant_load only guarantees that all *non-faulting*
invariant loads result in the same value.  The MI notion of invariant
load guarantees that the load can be legally moved to any location
within its containing function.  The MI notion of invariant_load is
stronger than the IR notion of invariant_load -- an MI invariant_load is
an IR invariant_load + a guarantee that the location being loaded from
is dereferenceable throughout the function's lifetime.

Reviewers: hfinkel, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10075

llvm-svn: 238881
2015-06-02 22:33:30 +00:00
Daniel Sanders c95f3f8c95 [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.

Previous attempts at committing this broke the buildbots due to bugs in IAS.
These bugs have now been fixed so trying again.

Reviewers: petarj

Reviewed By: petarj

Subscribers: srhines, joerg, tberghammer, llvm-commits

Differential Revision: http://reviews.llvm.org/D9669

llvm-svn: 238863
2015-06-02 20:32:50 +00:00
Sanjay Patel 6f031d848e make reciprocal estimate code generation more flexible by adding command-line options (2nd try)
The first try (r238051) to land this was reverted due to bot failures
that were hopefully addressed by r238788.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 238842
2015-06-02 15:28:15 +00:00
Elena Demikhovsky 44a129c533 AVX-512: Shorten implementation of lowerV16X32VectorShuffle()
using lowerVectorShuffleWithSHUFPS() and other shuffle-helpers routines.
Added matching of VALIGN instruction.

llvm-svn: 238830
2015-06-02 13:43:18 +00:00
Vasileios Kalintiris bb698c7d5f [mips] Add support for dynamic stack realignment.
Summary:
With this change we are able to realign the stack dynamically, whenever it
contains objects with alignment requirements that are larger than the
alignment specified from the given ABI.

We have to use the $fp register as the frame pointer when we perform
dynamic stack realignment. In complex stack frames, with variably-sized
objects, we reserve additionally the callee-saved register $s7 as the
base pointer in order to reference locals.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8633

llvm-svn: 238829
2015-06-02 13:14:46 +00:00
Renato Golin 3a7bec86bd Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs"
This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot.

Since self-hosting issues with Clang are hard to investigate, I'm taking the
liberty to revert now, so we can investigate it offline.

llvm-svn: 238821
2015-06-02 11:47:30 +00:00
Asaf Badouh 8d897dd05f revert 238809
llvm-svn: 238810
2015-06-02 07:45:19 +00:00
Asaf Badouh 17de10f37e AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.

llvm-svn: 238809
2015-06-02 07:18:14 +00:00
Matthias Braun e20dc1cd3a ARM: Thumb2 LDRD/STRD supports independent input/output regs
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.

Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.

Differential Revision: http://reviews.llvm.org/D9694

llvm-svn: 238795
2015-06-01 23:27:08 +00:00
Matthias Braun 72b8f74813 AArch64: Use CMP;CCMP sequences for and/or/setcc trees.
Previously CCMP/FCCMP instructions were only used by the
AArch64ConditionalCompares pass for control flow. This patch uses them
for SELECT like instructions as well by matching patterns in ISelLowering.

PR20927, rdar://18326194

Differential Revision: http://reviews.llvm.org/D8232

llvm-svn: 238793
2015-06-01 22:31:17 +00:00
Matthias Braun c1e029e93d LiveRangeEdit: Fix liveranges not shrinking on subrange kill.
If a dead instruction we may not only have a last-use in the main live
range but also in a subregister range if subregisters are tracked. We
need to partially rebuild live ranges in both cases.

The testcase only broke when subregister liveness was enabled. I
commited it in the current form because there is currently no flag to
enable/disable subregister liveness.

This fixes PR23720.

llvm-svn: 238785
2015-06-01 21:26:26 +00:00
Rafael Espindola b5815b4738 Revert "[Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath."
This reverts commit r238748.

It broke the msan bot:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/4372/steps/check-llvm%20msan/logs/stdio

llvm-svn: 238772
2015-06-01 19:20:47 +00:00
Vasileios Kalintiris cbbf8e0a39 [mips][FastISel] Implement bswap.
Summary: Implement bswap intrinsic for MIPS FastISel. It's very different for misp32 r1/r2 .

Based on a patch by Reed Kotler.

Test Plan:
bswap1.ll
test-suite

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D7219

llvm-svn: 238760
2015-06-01 16:40:45 +00:00
Vasileios Kalintiris bdb91b31f0 [mips][FastISel] Implement intrinsics memset, memcopy & memmove.
Summary:
Implement the intrinsics memset, memcopy and memmove in MIPS FastISel.
Make some needed infrastructure fixes so that this can work.

Based on a patch by Reed Kotler.

Test Plan:
memtest1.ll
The patch passes test-suite for mips32 r1/r2 and at O0/O2

Reviewers: rkotler, dsanders

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D7158

llvm-svn: 238759
2015-06-01 16:36:01 +00:00
Vasileios Kalintiris 8fcb3986d0 [mips][FastISel] Implement srem/urem and sdiv/udiv instructions.
Summary: Implement the LLVM assembly urem/srem and sdiv/udiv instructions in MIPS FastISel.

Based on a patch by Reed Kotler.

Test Plan:
srem1.ll
div1.ll
test-suite at O0/O2 for mips32 r1/r2

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D7028

llvm-svn: 238757
2015-06-01 16:17:37 +00:00
Vasileios Kalintiris 127f894b55 [mips][FastISel] Implement the select statement for MIPS FastISel.
Summary: Implement the LLVM IR select statement for MIPS FastISelsel.

Based on a patch by Reed Kotler.

Test Plan:
"Make check" test included now.
Passes test-suite at O2/O0 mips32 r1/r2.

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D6774

llvm-svn: 238756
2015-06-01 15:56:40 +00:00
Vasileios Kalintiris 7f680e156e [mips][FastISel] Clobber HI0/LO0 registers in MUL instructions.
Summary:
The contents of the HI/LO registers are unpredictable after the execution of
the MUL instruction. In addition to implicitly defining these registers in the
MUL instruction definition, we have to mark those registers as dead too.

Without this the fast register allocator is running out of registers when the
MUL instruction is followed by another one that tries to allocate the AC0
register.

Based on a patch by Reed Kotler.

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D9825

llvm-svn: 238755
2015-06-01 15:48:09 +00:00
Colin LeMahieu a739a4b3c7 [Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath.
llvm-svn: 238748
2015-06-01 14:51:26 +00:00
Elena Demikhovsky 67afb630e1 AVX-512: Optimized vector shuffle for v16f32 and v16i32 types.
llvm-svn: 238743
2015-06-01 13:26:18 +00:00
Luke Cheeseman 4c476858cc Removing commited assembly file.
llvm-svn: 238742
2015-06-01 13:18:53 +00:00
Luke Cheeseman 85fd06d389 Re-commit of r238201 with fix for building with shared libraries.
llvm-svn: 238739
2015-06-01 12:02:47 +00:00
Elena Demikhovsky 0c41088ebf AVX-512: Implemented vector shuffle lowering for v8i64 and v8f64 types.
I removed the vector-shuffle-512-v8.ll, it is auto-generated test, not valid any more.

llvm-svn: 238735
2015-06-01 09:49:53 +00:00