Commit Graph

3148 Commits

Author SHA1 Message Date
Matt Beaumont-Gay f3a9a38db4 Unix line endings
llvm-svn: 149115
2012-01-27 02:31:29 +00:00
Jakob Stoklund Olesen fc9dce25f7 Handle call-clobbered ymm registers on Win64.
The Win64 calling convention has xmm6-15 as callee-saved while still
clobbering all ymm registers.

Add a YMM_HI_6_15 pseudo-register that aliases the clobbered part of the
ymm registers, and mark that as call-clobbered.  This allows live xmm
registers across calls.

This hack wouldn't be necessary with RegisterMask operands representing
the call clobbers, but they are not quite operational yet.

llvm-svn: 149088
2012-01-26 22:59:28 +00:00
Victor Umansky 5f29b0e57b Fix for the following bug in AVX codegen for double-to-int conversions:
.	"fptosi" and "fptoui" IR instructions are defined with round-to-zero rounding mode.
.	Currently for AVX mode for <4xdouble> and <8xdouble>  the "VCVTPD2DQ.128" and "VCVTPD2DQ.256" instructions are selected (for .fp_to_sint. DAG node operation ) by AVX codegen. However they use round-to-nearest-even rounding mode.
.	Consequently, the conversion produces incorrect numbers.
 
The fix is to replace selection of VCVTPD2DQ instructions with VCVTTPD2DQ instructions. The latter use truncate (i.e. round-to-zero) rounding mode. 
As .fp_to_sint. DAG node operation is used only for lowering of  "fptosi" and "fptoui" IR instructions, the fix in X86InstrSSE.td definition file doesn.t have an impact on other LLVM flows.
 
The patch includes changes in the .td file, LIT test for the changes and a fix in a legacy LIT test (which produced asm code conflicting with LLVN IR spec). 

llvm-svn: 149056
2012-01-26 08:51:39 +00:00
Anton Korobeynikov 7722a2d4e3 Properly emit ctors / dtors with priorities into desired sections
and let linker handle the rest.

This finally fixes PR5329

llvm-svn: 148990
2012-01-25 22:24:19 +00:00
Elena Demikhovsky 0b0c5d8c4c ZERO_EXTEND operation is optimized for AVX.
v8i16 -> v8i32, v4i32 -> v4i64 - used vpunpck* instructions.

llvm-svn: 148803
2012-01-24 13:54:13 +00:00
Craig Topper 3469212c82 Add support for selecting 256-bit PALIGNR.
llvm-svn: 148532
2012-01-20 05:53:00 +00:00
Eli Friedman b359e67d2d Remove a low-quality test which was failing on Windows; test/CodeGen/X86/sret.ll is a better test for the relevant behavior.
llvm-svn: 148526
2012-01-20 02:06:40 +00:00
Eli Friedman 32c7c25dcb Support MSVC x86-32 sret convention. PR11688. Patch by Joe Groff.
llvm-svn: 148513
2012-01-20 00:05:46 +00:00
Nick Lewycky 9e2c7f659e Space after punctuation.
llvm-svn: 148451
2012-01-19 01:13:47 +00:00
Nick Lewycky ecc0084f72 Add a TargetOption for disabling tail calls.
llvm-svn: 148442
2012-01-19 00:34:10 +00:00
Nadav Rotem 3b8f0cc9fa Fix a bug in the type-legalization of vector integers. When we bitcast one vector type to another, we must not bitcast the result if one type is widened while the other is promoted.
llvm-svn: 148383
2012-01-18 08:33:18 +00:00
Nadav Rotem fb6ddee0e9 Transform: (EXTRACT_VECTOR_ELT( VECTOR_SHUFFLE )) -> EXTRACT_VECTOR_ELT.
llvm-svn: 148337
2012-01-17 21:44:01 +00:00
Nadav Rotem 86e5390dbf Fix 11769.
In CanXFormVExtractWithShuffleIntoLoad we assumed that EXTRACT_VECTOR_ELT can be later handled by the DAGCombiner.
However, in some cases on AVX, the EXTRACT_VECTOR_ELT is legalized to EXTRACT_SUBVECTOR + EXTRACT_VECTOR_ELT, which
currently is not handled by the DAGCombiner. In this patch I added a check that we only extract from the XMM part.

llvm-svn: 148298
2012-01-17 09:13:19 +00:00
Eli Friedman 206ca569aa Make sure the non-SSE lowering for fences correctly clobbers EFLAGS. PR11768.
llvm-svn: 148240
2012-01-16 16:42:21 +00:00
Nadav Rotem 57935243bd [AVX] Optimize x86 VSELECT instructions using SimplifyDemandedBits.
We know that the blend instructions only use the MSB, so if the mask is
sign-extended then we can convert it into a SHL instruction. This is a
common pattern because the type-legalizer sign-extends the i1 type which
is used by the LLVM-IR for the condition.

Added a new optimization in SimplifyDemandedBits for SIGN_EXTEND_INREG -> SHL.

llvm-svn: 148225
2012-01-15 19:27:55 +00:00
Chandler Carruth fd63436553 Relax the FileCheck assertion a bit -- all we really care about is that
we're loading from the global array, not how it is spelled in the asm.
This should fix the MSVC bots.

llvm-svn: 148214
2012-01-15 09:38:59 +00:00
Chandler Carruth 4bb2eb15d4 FileCheck-ize a test, make it more specific to directly test the shift
removal desired.

llvm-svn: 148213
2012-01-15 09:32:57 +00:00
Chad Rosier d027d528b5 Cleanup test case by adding checks for test names.
llvm-svn: 148166
2012-01-14 01:46:51 +00:00
Rafael Espindola 78a6477b28 Add a test showing how the Leh_func_endN symbol is used.
llvm-svn: 148161
2012-01-14 00:12:59 +00:00
Craig Topper 9f14d9f939 Add patterns for v16i16 and v32i8 immAllZerosV to select VPXOR to match v4i64 and v8i32.
llvm-svn: 148106
2012-01-13 06:59:47 +00:00
Elena Demikhovsky 060f6ccdb8 Fixed a bug in LowerVECTOR_SHUFFLE caused assertion failure
lc: X86ISelLowering.cpp:6480: llvm::SDValue llvm::X86TargetLowering::LowerVECTOR_SHUFFLE(llvm::SDValue, llvm::SelectionDAG&) const: Assertion `V1.getOpcode() != ISD::UNDEF&&  "Op 1 of shuffle should not be undef"' failed.
Added a test.

llvm-svn: 148044
2012-01-12 20:33:10 +00:00
Rafael Espindola 9a167eeaff Add error-reporting tests for platforms that don't support segmented stacks.
Patch by Brian Anderson.

llvm-svn: 148042
2012-01-12 20:26:13 +00:00
Rafael Espindola 00e861ed57 Support segmented stacks on 64-bit FreeBSD.
This patch uses tcb_spare field in the tcb structure to store info.
Patch by Jyun-Yan You.

llvm-svn: 148041
2012-01-12 20:24:30 +00:00
Rafael Espindola 10745d3381 Support segmented stacks on win32.
Uses the pvArbitrary slot of the TIB, which is reserved for applications. We
only support frames with a static size.

llvm-svn: 148040
2012-01-12 20:22:08 +00:00
Nadav Rotem 0a0a829bea Fix a bug in the AVX 256-bit shuffle code in cases where the splat element is on the boundary of two 128-bit vectors.
The attached testcase was stuck in an endless loop.

llvm-svn: 148027
2012-01-12 15:31:55 +00:00
Benjamin Kramer 5b3aa60b44 X86: Generalize the x << (y & const) optimization to also catch masks with more set bits set than 31 or 63.
llvm-svn: 148024
2012-01-12 12:41:34 +00:00
Nadav Rotem b5ce6ee835 On AVX, we can load v8i32 at a time. The bug happens when two uneven loads are used.
When we load the v12i32 type, the GenWidenVectorLoads method generates two loads: v8i32 and v4i32 
and attempts to use CONCAT_VECTORS to join them. In this fix I concat undef values to widen 
the smaller value. The test "widen_load-2.ll" also exposes this bug on AVX.

llvm-svn: 147964
2012-01-11 20:19:17 +00:00
Bill Wendling bb3c4ad135 Check to make sure that the CFString's back store ends up in the correct section.
llvm-svn: 147961
2012-01-11 19:33:37 +00:00
Rafael Espindola d90466bcbf Support segmented stacks on mac.
This uses TLS slot 90, which actually belongs to JavaScriptCore. We only support
frames with static size
Patch by Brian Anderson.

llvm-svn: 147960
2012-01-11 19:00:37 +00:00
Rafael Espindola 1e1b797989 Split segmented stacks tests into tests for static- and dynamic-size frames.
Patch by Brian Anderson.

llvm-svn: 147959
2012-01-11 18:51:03 +00:00
Rafael Espindola 4eecacb9c8 Generate the segmented stack prologue for fastcc too.
Patch by Brian Anderson.

llvm-svn: 147958
2012-01-11 18:41:19 +00:00
Chandler Carruth 3212a34269 Revert r147945 which disabled an addressing mode transformation. I had
hoped this would revive one of the llvm-gcc selfhost build bots, but it
didn't so it doesn't appear that my transform is the culprit.

If anyone else is seeing failures, please let me know!

llvm-svn: 147957
2012-01-11 18:36:12 +00:00
Rafael Espindola 2b89448d60 Use unsigned comparison in segmented stack prologue.
This is a comparison of two addresses, and GCC does the comparison unsigned.

Patch by Brian Anderson.

llvm-svn: 147954
2012-01-11 18:23:35 +00:00
Rafael Espindola 6635ae1c17 Explicitly set the scale to 1 on some segstack prologue instrs.
Patch by Brian Anderson.

llvm-svn: 147952
2012-01-11 18:14:03 +00:00
Jan Sjödin 21f83d9f36 Add XOP Intrinsics and tests
llvm-svn: 147949
2012-01-11 15:20:20 +00:00
Nadav Rotem baae7e4577 Fix a bug in the lowering of BUILD_VECTOR for AVX. SCALAR_TO_VECTOR does not zero untouched elements. Use INSERT_VECTOR_ELT instead.
llvm-svn: 147948
2012-01-11 14:07:51 +00:00
Chandler Carruth 9bc48e5215 Disable the transformation I added in r147936 to see if it fixes some
strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.

llvm-svn: 147945
2012-01-11 12:17:47 +00:00
Jakob Stoklund Olesen 6039983755 Fix undefined code and reenable test case.
I don't think the compact encoding code is right, but at least is has
defined behavior now.

llvm-svn: 147938
2012-01-11 09:08:04 +00:00
Chandler Carruth 55b2cdee26 Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

llvm-svn: 147936
2012-01-11 08:41:08 +00:00
NAKAMURA Takumi 0e60839e11 llvm/test/CodeGen/X86/zext-fold.ll: Relax an expression in stack offset.
llvm-svn: 147928
2012-01-11 07:34:22 +00:00
NAKAMURA Takumi 9e823f6ae1 llvm/test/CodeGen/X86/sub-with-overflow.ll: Add explicit -mtriple=i686-linux.
llvm-svn: 147927
2012-01-11 07:34:14 +00:00
Jakob Stoklund Olesen 05ff7f06fb Disable test that seems to expose an unrelated Linux issue.
llvm-svn: 147921
2012-01-11 03:42:27 +00:00
Jakob Stoklund Olesen 8b1d023a4a Detect when a value is undefined on an edge to a landing pad.
Consider this code:

int h() {
  int x;
  try {
    x = f();
    g();
  } catch (...) {
    return x+1;
  }
  return x;
}

The variable x is undefined on the first edge to the landing pad, but it
has the f() return value on the second edge to the landing pad.

SplitAnalysis::getLastSplitPoint() would assume that the return value
from f() was live into the landing pad when f() throws, which is of
course impossible.

Detect these cases, and treat them as if the landing pad wasn't there.
This allows spill code to be inserted after the function call to f().

<rdar://problem/10664933>

llvm-svn: 147912
2012-01-11 02:07:05 +00:00
Chad Rosier a415140a2c Add test case for r147881.
llvm-svn: 147891
2012-01-10 23:09:53 +00:00
Joerg Sonnenberger 96cd35cf6d Default stack alignment for 32bit x86 should be 4 Bytes, not 8 Bytes.
Add a test that checks the stack alignment of a simple function for
Darwin, Linux and NetBSD for 32bit and 64bit mode.

llvm-svn: 147888
2012-01-10 22:43:53 +00:00
Nadav Rotem 61bdf79035 Fix a bug in the legalization of shuffle vectors. When we emulate shuffles using BUILD_VECTORS we may be using a BV of different type. Make sure to cast it back.
llvm-svn: 147851
2012-01-10 14:28:46 +00:00
Craig Topper 430f3f1bd6 Fix a crash in AVX2 when trying to broadcast a double into a 128-bit vector. There is no vbroadcastsd xmm, but we do need to support 64-bit integers broadcasted into xmm. Also factor the AVX check into the isVectorBroadcast function. This makes more sense since the AVX2 check was already inside.
llvm-svn: 147844
2012-01-10 08:23:59 +00:00
Evan Cheng 0be4144a68 Allow machine-cse to look across MBB boundary when cse'ing instructions that
define physical registers. It's currently very restrictive, only catching
cases where the CE is in an immediate (and only) predecessor. But it catches
a surprising large number of cases.

rdar://10660865

llvm-svn: 147827
2012-01-10 02:02:58 +00:00
Chandler Carruth ea27c16adc Cleanup and FileCheck-ize a test.
llvm-svn: 147772
2012-01-09 09:44:26 +00:00
Craig Topper ef7f5bf8c9 Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing.
llvm-svn: 147767
2012-01-09 06:52:46 +00:00