Joerg Sonnenberger
5463e66768
Fix generation of the address size override prefix. Add assertions for
...
the invalid cases. At least 16bit operand in 64bit mode is currently not
rejected in the parser.
llvm-svn: 153166
2012-03-21 05:48:07 +00:00
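The prefix decision this commit touches can be sketched as follows. This is a minimal illustration, assuming the usual x86 rule that the 0x67 prefix switches between 16-bit and 32-bit addressing in 16/32-bit modes and selects 32-bit addressing in 64-bit mode. The names `Mode`, `AddrSize`, `Prefix` and `addressSizePrefix` are hypothetical and are not LLVM's identifiers.

```cpp
#include <cassert>

// Hypothetical types for illustration; not taken from the LLVM sources.
enum class Mode { Bits16, Bits32, Bits64 };
enum class AddrSize { Bits16, Bits32, Bits64 };
enum class Prefix { None, Override67, Invalid };

// Decide whether a memory operand with the given address size needs the
// 0x67 address size override prefix in the given CPU mode.
Prefix addressSizePrefix(Mode mode, AddrSize addr) {
  switch (mode) {
  case Mode::Bits16:
    if (addr == AddrSize::Bits16) return Prefix::None;
    if (addr == AddrSize::Bits32) return Prefix::Override67;
    return Prefix::Invalid; // 64-bit addressing needs 64-bit mode
  case Mode::Bits32:
    if (addr == AddrSize::Bits32) return Prefix::None;
    if (addr == AddrSize::Bits16) return Prefix::Override67;
    return Prefix::Invalid; // 64-bit addressing needs 64-bit mode
  case Mode::Bits64:
    if (addr == AddrSize::Bits64) return Prefix::None;
    if (addr == AddrSize::Bits32) return Prefix::Override67;
    return Prefix::Invalid; // 16-bit addressing cannot be encoded here
  }
  return Prefix::Invalid;
}
```

The `Invalid` results correspond to the cases an encoder would assert on rather than silently emit.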
Craig Topper
9cfc69c779
Spacing fixes and using 'unsigned' instead of 'int' to index to select shuffle elements for consistency with other shuffle code in X86 backend.
...
llvm-svn: 153154
2012-03-21 02:14:01 +00:00
Chad Rosier
4106917355
[avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmovdqu to
...
vextractf128 with 128-bit mem dest.
Combines
vextractf128 $0, %ymm0, %xmm0
vmovaps %xmm0, (%rdi)
to
vextractf128 $0, %ymm0, (%rdi)
rdar://11082570
llvm-svn: 153139
2012-03-20 21:43:40 +00:00
Chad Rosier
0158ae2e5b
[avx] Add the AddedComplexity to the VINSERTI128 avx2 patterns to give
...
precedence over the VINSERTF128 avx1 patterns.
llvm-svn: 153114
2012-03-20 19:45:07 +00:00
Chad Rosier
93d5427c69
Whitespace.
...
llvm-svn: 153105
2012-03-20 18:38:33 +00:00
Chad Rosier
5a6011267a
[avx] Move the vextractf128 patterns closer to the vextractf128 def. Remove
...
whitespace from test case. No functional change intended.
llvm-svn: 153103
2012-03-20 18:24:55 +00:00
Chad Rosier
07a4cb9382
[avx] Adjust the VINSERTF128rm pattern to allow for unaligned loads.
...
This allows sequences such as
vmovups 16(%rdi), %xmm0
vinsertf128 $1, %xmm0, %ymm0, %ymm0
to be combined to
vinsertf128 $1, 16(%rdi), %ymm0, %ymm0
rdar://11076953
llvm-svn: 153092
2012-03-20 17:08:51 +00:00
Craig Topper
b34d96c614
Remove code that prevented lowering shuffles if they are used by load and themselves used by an extract_vector_elt. This was done to allow the DAG combiner to collapse to a single element load. Unfortunately, sometimes the extract_vector_elt would disappear before DAG combine could do the transformation, leaving a vector_shuffle that isel couldn't handle. New code lets the shuffle be converted to a target specific node, but then adds a combine routine that can convert target specific nodes back to vector_shuffles if the folding criteria are met.
...
llvm-svn: 153080
2012-03-20 07:17:59 +00:00
Craig Topper
cbc96a6e90
Factor out target shuffle mask decoding from getShuffleScalarElt and use a SmallVector of int instead of unsigned for shuffle mask in decode functions. Preparation for another change.
...
llvm-svn: 153079
2012-03-20 06:42:26 +00:00
Preston Gurd
48ccc4df0b
This patch adds X86 instruction itineraries for non-pseudo opcodes in
...
X86InstrCompiler.td.
It also adds -mcpu=generic to the legalize-shift-64.ll test so the test
will pass if run on an Intel Atom CPU, which would otherwise
produce an instruction schedule which differs from that which the test expects.
llvm-svn: 153033
2012-03-19 14:10:12 +00:00
Benjamin Kramer
57003a6768
Add a note for -ffast-math optimization of vector norm.
...
llvm-svn: 153031
2012-03-19 00:43:34 +00:00
Craig Topper
129f9ef669
isCommutedMOVLMask should only look at 128-bit vectors to match isMOVLMask.
...
llvm-svn: 153027
2012-03-18 22:50:10 +00:00
Craig Topper
b25fda95f6
Reorder includes in Target backends to follow coding standards. Remove some superfluous forward declarations.
...
llvm-svn: 152997
2012-03-17 18:46:09 +00:00
Chad Rosier
b9b73170e3
[avx] Add patterns for VINSERTF128rm.
...
This allows sequences such as
vmovaps -96(%rbx), %xmm1
vinsertf128 $1, %xmm1, %ymm0, %ymm0
to be combined to
vinsertf128 $1, -96(%rbx), %ymm0, %ymm0
rdar://10643481
llvm-svn: 152762
2012-03-15 00:45:30 +00:00
Kevin Enderby
1ef22f33d0
Change the X86 assembler to not require a segment register on string
...
instruction's destination operand like it does for the source operand.
Also fix a typo in the comment for X86AsmParser::isSrcOp().
llvm-svn: 152654
2012-03-13 19:47:55 +00:00
Kevin Enderby
fb3110b5d2
Added a missing error check for X86 assembly with mismatched base and index
...
registers not both being 64-bit or both being 32-bit registers.
llvm-svn: 152580
2012-03-12 21:32:09 +00:00
Craig Topper
bef78fc2ee
Convert more static tables of registers used by calling convention to uint16_t to reduce space.
...
llvm-svn: 152538
2012-03-11 07:57:25 +00:00
Kay Tiong Khoo
57c8e7f364
*fix typo in comment; test of commit access
...
llvm-svn: 152507
2012-03-10 21:29:49 +00:00
Benjamin Kramer
adfc73d68f
C files in llvm still have to be C89 compliant, remove C++-style comments.
...
llvm-svn: 152495
2012-03-10 15:10:06 +00:00
Bill Wendling
ebb10df441
Fix disasm of iret, sysexit, and sysret when displayed with Intel syntax.
...
Patch by Kay Tiong Khoo!
llvm-svn: 152487
2012-03-10 07:37:27 +00:00
Kevin Enderby
deed5aaa41
Add the missing call to Error when a bad X86 scale expression is parsed.
...
llvm-svn: 152443
2012-03-09 22:24:10 +00:00
Kevin Enderby
014e1cde5f
Fix the x86 disassembler to at least print the lock prefix if it is the first
...
prefix. Added a FIXME to remind us this still does not work when it is not the
first prefix.
llvm-svn: 152414
2012-03-09 17:52:49 +00:00
Craig Topper
2dac962864
Use uint16_t to store opcodes in static tables in X86 backend.
...
llvm-svn: 152391
2012-03-09 07:45:21 +00:00
Chad Rosier
a281afc676
Fix a regression from r147481.
...
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
than a vxorps + vinsertf128 pair if the original vector came from a load.
Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078
llvm-svn: 152366
2012-03-09 02:00:48 +00:00
Eli Friedman
de850676e0
Fix the operand ordering on aliases for shld and shrd. PR12173, part 2.
...
llvm-svn: 152136
2012-03-06 19:58:46 +00:00
Jim Grosbach
fd93a59557
Make MCRegisterInfo available to the MCInstPrinter.
...
Used to allow context sensitive printing of super-register or sub-register
references.
llvm-svn: 152043
2012-03-05 19:33:20 +00:00
Chad Rosier
9424aa1c51
Address Evan's comments for r151877.
...
Specifically, remove the magic number when checking to see if the copy has a
glue operand and simplify the checking logic.
rdar://10930395
llvm-svn: 152041
2012-03-05 19:27:12 +00:00
Eli Friedman
a5a6d6aa8f
Make aliases for shld and shrd match gas. PR12173.
...
llvm-svn: 152014
2012-03-05 04:31:54 +00:00
Craig Topper
1d32658877
Use uint16_t to store register overlaps to reduce static data.
...
llvm-svn: 152001
2012-03-04 10:43:23 +00:00
Craig Topper
420525ce3b
Use uint16_t to store registers in callee saved register tables to reduce size of static data.
...
llvm-svn: 151996
2012-03-04 03:33:22 +00:00
Craig Topper
6dedbae429
Use uint8_t instead of enums to store values in X86 disassembler table. Shaves 150k off the size of X86DisassemblerDecoder.o
...
llvm-svn: 151995
2012-03-04 02:16:41 +00:00
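The space saving described here comes from storing small enum values in byte-wide table entries. A minimal sketch of the idea, using a hypothetical `OperandKind` enum and table size that are not taken from the X86 disassembler:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical enum standing in for a decoder attribute; not LLVM's own.
enum OperandKind { KindNone, KindReg, KindMem, KindImm };

constexpr std::size_t kEntries = 1024;

// Wide table: each entry takes whatever size the compiler picks for the
// enum (commonly 4 bytes).
struct WideTable {
  OperandKind kinds[kEntries];
};

// Compact table: each entry is stored in one byte and cast back on read.
struct CompactTable {
  std::uint8_t kinds[kEntries];
};

// Read an entry from the compact table as the original enum type.
inline OperandKind kindAt(const CompactTable &table, std::size_t index) {
  return static_cast<OperandKind>(table.kinds[index]);
}
```

The trade-off is that the type checking of the enum is lost inside the table, so reads go through a single accessor that restores it.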
Chad Rosier
f5e086f18e
Prevent obscure and incorrect tail-call optimization.
...
In this instance we are generating the tail-call during legalizeDAG. The 2nd
floor call can't be a tail call because it clobbers %xmm1, which is defined by
the first floor call. The first floor call can't be a tail-call because it's
not in the tail position. The only reasonable way I could think to fix this
in a target-independent manner was to check for glue logic on the copy reg.
rdar://10930395
llvm-svn: 151877
2012-03-02 02:50:46 +00:00
Michael J. Spencer
35145f830a
Minimal changes for LLVM to compile under VS11.
...
llvm-svn: 151849
2012-03-01 22:42:52 +00:00
Kevin Enderby
b119c08af3
Added annotations for x86 pc relative loads to llvm's 'C' disassembler.
...
So with darwin's otool(1) an x86_64 hello world .o file will print:
leaq L_.str(%rip), %rax ## literal pool for: Hello world
llvm-svn: 151769
2012-02-29 22:58:34 +00:00
Andrew Trick
6eb6528b98
Intel Atom instruction itineraries for mov sign extension and mov zero extension.
...
Patch by Tyler Nowicki!
llvm-svn: 151743
2012-02-29 19:44:41 +00:00
Derek Schuff
56b662ce0f
Make MemoryObject accessor members const again
...
llvm-svn: 151687
2012-02-29 01:09:06 +00:00
Evan Cheng
65f9d19c4f
Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call.
...
llvm-svn: 151645
2012-02-28 18:51:51 +00:00
Daniel Dunbar
ee7b899343
Revert r151623 "Some ARM implementations, e.g. A-series, do return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part.
...
llvm-svn: 151630
2012-02-28 15:36:07 +00:00
Evan Cheng
87c7b09d8d
Some ARM implementations, e.g. A-series, do return stack prediction. That is,
...
the processor keeps a return address stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.
Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt the RAS and cause 100% return misprediction, so LLVM should use an
unconditional branch instead. i.e.
mov lr, pc
b _foo
The "mov lr, pc" is issued in order to get proper backtrace.
rdar://8979299
llvm-svn: 151623
2012-02-28 06:42:03 +00:00
Preston Gurd
a49ef92a76
This patch adds instruction latencies for the SSE instructions
...
to the instruction scheduler for the Intel Atom.
llvm-svn: 151590
2012-02-27 23:35:03 +00:00
Chad Rosier
a72393a3f9
Add q suffix aliases for the fistp and fisttp mnemonics.
...
rdar://10921670
PR11935
llvm-svn: 151543
2012-02-27 19:43:12 +00:00
Craig Topper
6491c8020e
X86 disassembler support for jcxz, jecxz, and jrcxz. Fixes PR11643. Patch by Kay Tiong Khoo.
...
llvm-svn: 151510
2012-02-27 01:54:29 +00:00
NAKAMURA Takumi
bdf94879df
Target/X86: Fix assertion failures and warnings caused by r151382 _ftol2 lowering for i386-*-win32 targets. Patch by Joe Groff.
...
[Joe Groff] Hi everyone. My previous patch applied as r151382 had a few problems:
Clang raised a warning, and X86 LowerOperation would assert out for
fptoui f64 to i32 because it improperly lowered to an illegal
BUILD_PAIR. Here's a patch that addresses these issues. Let me know if
any other changes are necessary. Thanks.
llvm-svn: 151432
2012-02-25 03:37:25 +00:00
Michael J. Spencer
248d65e78b
Add WIN_FTOL_* pseudo-instructions to model the unique calling convention
...
used by the Win32 _ftol2 runtime function. Patch by Joe Groff!
llvm-svn: 151382
2012-02-24 19:01:22 +00:00
Pete Cooper
682c76b7d4
Turn avx insert intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove duplicate patterns for selecting the intrinsics
...
llvm-svn: 151342
2012-02-24 03:51:49 +00:00
Kevin Enderby
6fbcd8d439
Updated the llvm-mc disassembler C API to support the X86 target.
...
rdar://10873652
As part of this I updated the llvm-mc disassembler C API to always call the
SymbolLookUp call back even if there is no getOpInfo call back. If there is a
getOpInfo call back that is tried first and then if that gets no information
then the SymbolLookUp is called. I also made the code more robust by
memset(3)'ing the LLVMOpInfo1 struct to zero and then setting
SymbolicOp.Value before the call to getOpInfo. And also don't use any
values from the LLVMOpInfo1 struct if getOpInfo returns 0. And also don't
use any of the ReferenceType or ReferenceName values from SymbolLookUp if it
returns NULL. rdar://10873563 and rdar://10873683
For the X86 target also fixed bugs so the annotations get printed.
Also fixed a few places in the ARM target that were not producing symbolic
operands for some instructions. rdar://10878166
llvm-svn: 151267
2012-02-23 18:18:17 +00:00
Michael J. Spencer
8b98bf2d6b
Properly emit _fltused with FastISel. Refactor to share code with SDAG.
...
Patch by Joe Groff!
llvm-svn: 151183
2012-02-22 19:06:13 +00:00
Chad Rosier
5dfe6dab25
Remove extra semi-colons.
...
llvm-svn: 151169
2012-02-22 17:25:00 +00:00
Craig Topper
cc830f8cda
Declare register classes as const. Fix a couple pointers to register classes that weren't already const.
...
llvm-svn: 151138
2012-02-22 07:28:11 +00:00
Craig Topper
760b134ffa
Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified.
...
llvm-svn: 151134
2012-02-22 05:59:10 +00:00
Aaron Ballman
e67173e718
Adding support for Microsoft's thiscall calling convention. LLVM side of the patch.
...
llvm-svn: 151123
2012-02-22 03:04:40 +00:00
Ahmed Charles
636a3d618c
Remove dead code. Improve llvm_unreachable text. Simplify some control flow.
...
llvm-svn: 150918
2012-02-19 11:37:01 +00:00
Craig Topper
de121a1000
Remove some unneeded includes and fix ordering in X86ISelLowering.cpp. Remove unneeded 'using namespace'.
...
llvm-svn: 150916
2012-02-19 07:15:48 +00:00
Craig Topper
65a4ceea1e
Unify all shuffle mask checking functions to take a mask and VT instead of VectorShuffleSDNode.
...
llvm-svn: 150913
2012-02-19 05:41:45 +00:00
Craig Topper
3e5c04e432
Make a bunch of X86ISelLowering shuffle functions static now that they are no longer needed by isel.
...
llvm-svn: 150908
2012-02-19 02:53:47 +00:00
Jia Liu
e1d619691b
some comment fix for X86 and ARM
...
llvm-svn: 150902
2012-02-19 02:03:36 +00:00
Craig Topper
66a3597a4a
Add vmfunc instruction to X86 assembler and disassembler.
...
llvm-svn: 150899
2012-02-19 01:39:49 +00:00
Jia Liu
b22310fda6
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
...
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Craig Topper
57d3aaed78
Add X86InstrSVM.td that I forgot to add in r150873.
...
llvm-svn: 150874
2012-02-18 08:34:12 +00:00
Craig Topper
ed7aa46366
Add X86 assembler and disassembler support for AMD SVM instructions. Original patch by Kay Tiong Khoo. Few tweaks by me for code density and to reduce replication.
...
llvm-svn: 150873
2012-02-18 08:19:49 +00:00
Craig Topper
ba172d2d59
Remove the last of the old vector_shuffle patterns from X86 isel.
...
llvm-svn: 150795
2012-02-17 07:02:34 +00:00
Jakob Stoklund Olesen
bc6ba479b6
Remove the YMM_HI_6_15 hack.
...
Call clobbers are now represented with register mask operands. The
regmask can easily represent the fact that xmm6 is call-preserved while
ymm6 isn't. This is automatically computed by TableGen from the
CalleeSavedRegs containing xmm6.
llvm-svn: 150709
2012-02-16 17:56:06 +00:00
Jakob Stoklund Olesen
97e3115dc2
Use the same CALL instructions for Windows as for everything else.
...
The different calling conventions and call-preserved registers are
represented with regmask operands that are added dynamically.
llvm-svn: 150708
2012-02-16 17:56:02 +00:00
Jakob Stoklund Olesen
8a450cb2fa
Enable register mask operands for x86 calls.
...
Call instructions no longer have a list of 43 call-clobbered registers.
Instead, they get a single register mask operand with a bit vector of
call-preserved registers.
This saves a lot of memory, 42 x 32 bytes = 1344 bytes per call
instruction, and it speeds up building call instructions because those
43 imp-def operands no longer need to be added to use-def lists. (And
removed and shifted and re-added for every explicit call operand).
Passes like LiveVariables, LiveIntervals, RAGreedy, PEI, and
BranchFolding are significantly faster because they can deal with call
clobbers in bulk.
Overall, clang -O2 is between 0% and 8% faster, uniformly distributed
depending on call density in the compiled code. Debug builds using
clang -O0 are 0% - 3% faster.
I have verified that this patch doesn't change the assembly generated
for the LLVM nightly test suite when building with -disable-copyprop
and -disable-branch-fold.
Branch folding behaves slightly differently in a few cases because call
instructions have different hash values now.
Copy propagation flushes its data structures when it crosses a register
mask operand. This causes it to leave a few dead copies behind, on the
order of 20 instructions across the entire nightly test suite, including
SPEC. Fixing this properly would require the pass to use different data
structures.
llvm-svn: 150638
2012-02-16 00:02:50 +00:00
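The register mask idea in this commit, one bit per physical register with a set bit meaning "preserved across the call", can be sketched as below. This is an illustration under assumed names: `RegMask`, `kNumRegs` and the register numbers are hypothetical, and the class is not LLVM's MachineOperand API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical register count for illustration only.
constexpr unsigned kNumRegs = 64;

// Bit vector of call-preserved registers: bit N set means register N
// survives the call, bit N clear means the call clobbers it.
class RegMask {
public:
  RegMask() : words_((kNumRegs + 31) / 32, 0) {}

  void setPreserved(unsigned reg) {
    words_[reg / 32] |= 1u << (reg % 32);
  }

  bool isPreserved(unsigned reg) const {
    return ((words_[reg / 32] >> (reg % 32)) & 1u) != 0;
  }

  // A pass asks one question per register instead of scanning a long
  // list of implicit-def operands on the call instruction.
  bool clobbers(unsigned reg) const { return !isPreserved(reg); }

private:
  std::vector<std::uint32_t> words_;
};
```

With this layout a single mask can be shared by every call that uses the same calling convention, which is where the memory saving quoted above would come from.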
Chad Rosier
f0687634c3
Use a temporary variable, rather than a series of redundant calls.
...
llvm-svn: 150538
2012-02-15 00:36:26 +00:00
Pete Cooper
c21ebf5c41
Stop custom lowering for x86 DEC64m from happening if the load in the lowered sequence has more than 1 user
...
llvm-svn: 150537
2012-02-15 00:33:37 +00:00
Craig Topper
cfad98f745
Move old movl vector_shuffle patterns. Not needed anymore since vector_shuffles shouldn't reach isel.
...
llvm-svn: 150462
2012-02-14 08:14:53 +00:00
Craig Topper
8b19d78808
Still more vector_shuffle pattern removal.
...
llvm-svn: 150365
2012-02-13 07:23:41 +00:00
Ahmed Charles
32e983e4fc
Fix various issues (or do cleanups) found by enabling certain MSVC warnings.
...
- Use unsigned literals when the desired result is unsigned. This mostly allows unsigned/signed mismatch warnings to be less noisy even if they aren't on by default.
- Remove misplaced llvm_unreachable.
- Add static to a declaration of a function on MSVC x86 only.
- Change some instances of calling a static function through a variable to simply calling that function while removing the unused variable.
llvm-svn: 150364
2012-02-13 06:30:56 +00:00
Craig Topper
74650add0e
Remove more vector_shuffle patterns for unpack. These should be target specific nodes when they get to isel.
...
llvm-svn: 150363
2012-02-13 05:48:49 +00:00
Craig Topper
6d471c9e49
Recommit r150328. Previous test failures should be fixed by r150360.
...
llvm-svn: 150362
2012-02-13 05:10:10 +00:00
Craig Topper
87119fa37f
Update CanXFormVExtractWithShuffleIntoLoad to ensure bitcasts of loads only have one use. Matches DAGCombiner and prevents vector_shuffles from reaching isel.
...
llvm-svn: 150360
2012-02-13 04:30:38 +00:00
NAKAMURA Takumi
0826c17d00
Revert r150328, "Remove more vector_shuffle patterns."
...
It caused 3 failures on pre-penryn and non-x86(generic) hosts.
llvm-svn: 150357
2012-02-13 00:10:15 +00:00
Pete Cooper
71be57bb32
Fixed bug when custom lowering DEC64m on x86.
...
If the DEC node had more than one user, it was doing this lowering but
leaving the original DEC node around and so decrementing twice.
Fixes PR11964.
llvm-svn: 150356
2012-02-13 00:10:03 +00:00
Craig Topper
e24c94af81
Remove more vector_shuffle patterns.
...
llvm-svn: 150328
2012-02-12 08:14:35 +00:00
Craig Topper
d40d9eb2b3
Remove more vector_shuffle patterns.
...
llvm-svn: 150321
2012-02-12 01:07:34 +00:00
Craig Topper
330ca97700
Remove more vector_shuffle patterns.
...
llvm-svn: 150314
2012-02-11 23:31:01 +00:00
Anton Korobeynikov
c6b4017ce2
Add support for implicit TLS model used with MS VC runtime.
...
Patch by Kai Nacke!
llvm-svn: 150307
2012-02-11 17:26:53 +00:00
Benjamin Kramer
915e3d9568
Don't mix declarations and code.
...
llvm-svn: 150305
2012-02-11 16:01:02 +00:00
Benjamin Kramer
428704eb52
Make the EDis tables const.
...
llvm-svn: 150304
2012-02-11 14:51:07 +00:00
Benjamin Kramer
478e8de8ef
Reuse the enum names from X86Desc in the X86Disassembler.
...
This requires some gymnastics to make it available for C code. Remove the names
from the disassembler tables, making them relocation free.
llvm-svn: 150303
2012-02-11 14:50:54 +00:00
Craig Topper
981c6cf7b3
Remove some patterns for matching vector_shuffle instructions since vector_shuffles should be custom lowered before isel.
...
llvm-svn: 150299
2012-02-11 07:43:35 +00:00
Craig Topper
11826a6e10
Fix shuffle lowering code to stop creating temporary DAG nodes to do shuffle mask checks on. This seemed to be confusing things such that vector_shuffle ops got through to iselection. This is another step towards removing the vector_shuffle handling patterns from isel.
...
llvm-svn: 150296
2012-02-11 06:24:48 +00:00
Craig Topper
a0cd970b81
More tweaks to get the size of the X86 disassembler tables down.
...
llvm-svn: 150167
2012-02-09 08:58:07 +00:00
Craig Topper
487e744f66
Flatten some of the arrays in the X86 disassembler tables to reduce space needed to store pointers on 64-bit hosts and reduce relocations needed at startup. Part of PR11953.
...
llvm-svn: 150161
2012-02-09 07:45:30 +00:00
Jakob Stoklund Olesen
4519fd0b21
Handle register masks when searching for EFLAGS clobbers.
...
Calls clobber the flags, but when using register masks there is no
EFLAGS<imp-def> operand.
llvm-svn: 150117
2012-02-09 00:17:22 +00:00
Elena Demikhovsky
1adc1d53dd
Fixed a bug in printing "cmp" pseudo ops.
...
> This IR code
> %res = call <8 x float> @llvm.x86.avx.cmp.ps.256(<8 x float> %a0, <8 x float> %a1, i8 14)
> fails with assertion:
>
> llc: X86ATTInstPrinter.cpp:62: void llvm::X86ATTInstPrinter::printSSECC(const llvm::MCInst*, unsigned int, llvm::raw_ostream&): Assertion `0 && "Invalid ssecc argument!"' failed.
> 0 llc 0x0000000001355803
> 1 llc 0x0000000001355dc9
> 2 libpthread.so.0 0x00007f79a30575d0
> 3 libc.so.6 0x00007f79a23a1945 gsignal + 53
> 4 libc.so.6 0x00007f79a23a2f21 abort + 385
> 5 libc.so.6 0x00007f79a239a810 __assert_fail + 240
> 6 llc 0x00000000011858d5 llvm::X86ATTInstPrinter::printSSECC(llvm::MCInst const*, unsigned int, llvm::raw_ostream&) + 119
I added the full testing for all possible pseudo-ops of cmp.
I extended X86AsmPrinter.cpp and X86IntelInstPrinter.cpp.
You'll also see line alignments (unrelated to this fix) in X86ISelLowering.cpp from my previous check-in.
llvm-svn: 150068
2012-02-08 08:37:26 +00:00
Craig Topper
172b9243cd
Remove a couple unneeded intrinsic patterns
...
llvm-svn: 150067
2012-02-08 08:29:30 +00:00
Craig Topper
5405571fe0
Remove GCC builtins for vpermilp* intrinsics as clang no longer needs them. Custom lower the intrinsics to the vpermilp target specific node and remove intrinsic patterns.
...
llvm-svn: 150060
2012-02-08 06:36:57 +00:00
Evan Cheng
1b81fddd65
Use LEA to adjust stack ptr for Atom. Patch by Andy Zhang.
...
llvm-svn: 150008
2012-02-07 22:50:41 +00:00
Craig Topper
b27fd77c3f
Add instruction selection for 256-bit VPSHUFD and 128-bit VPERMILPS/VPERMILPD.
...
llvm-svn: 149968
2012-02-07 06:28:42 +00:00
Derek Schuff
8b2dcad4b5
Enable streaming of bitcode
...
This CL delays reading of function bodies from initial parse until
materialization, allowing overlap of compilation with bitcode download.
llvm-svn: 149918
2012-02-06 22:30:29 +00:00
Chris Lattner
8213c8af29
Remove some dead code and tidy things up now that vectors use ConstantDataVector
...
instead of always using ConstantVector.
llvm-svn: 149912
2012-02-06 21:56:39 +00:00
Benjamin Kramer
2496717052
X86: Don't call malloc for 4 bits. No functionality change.
...
llvm-svn: 149866
2012-02-06 12:06:18 +00:00
Craig Topper
1f71057747
Add shuffle decoding support for 256-bit pshufd. Merge vpermilp* and pshufd decoding.
...
llvm-svn: 149859
2012-02-06 07:17:51 +00:00
Duncan Sands
ae22c60f90
Persuade GCC that there is nothing worth warning about here (there isn't).
...
llvm-svn: 149834
2012-02-05 14:20:11 +00:00
Chandler Carruth
ebd90c58e6
Begin fleshing out more convenience predicates in llvm::Triple and
...
convert at least one client over to use them. Subsequent patches both to
LLVM and Clang will try to convert more people over to a common set of
predicates.
This round of predicates is focused on OS-categorization predicates.
llvm-svn: 149815
2012-02-05 08:26:40 +00:00
Craig Topper
c4965bce14
Convert assert(0) to llvm_unreachable
...
llvm-svn: 149814
2012-02-05 07:21:30 +00:00
Craig Topper
4ed7278ff4
Convert assert(0) to llvm_unreachable in X86 Target directory.
...
llvm-svn: 149809
2012-02-05 05:38:58 +00:00
Craig Topper
83f3bdaa45
Convert some assert(0) in default of switch statements to llvm_unreachable.
...
llvm-svn: 149808
2012-02-05 03:43:23 +00:00
Craig Topper
1d471e31ba
Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies.
...
llvm-svn: 149807
2012-02-05 03:14:49 +00:00
Craig Topper
4daa67483d
Remove most of the intrinsics for XOP VPCMOV instruction. They all aliased to the same instruction with different types. This would be better accomplished with casts in the not yet created xopintrin.h header file.
...
llvm-svn: 149795
2012-02-05 00:55:56 +00:00
Andrew Trick
f8ea108c05
TargetPassConfig: confine the MC configuration to TargetMachine.
...
Passes prior to instruction selection are now split into separate configurable stages.
Header dependencies are simplified.
The bulk of this diff is simply removal of the silly DisableVerify flags.
Sorry for the target header churn. Attempting to stabilize them.
llvm-svn: 149754
2012-02-04 02:56:59 +00:00
Craig Topper
47e6d26911
Remove getShuffleVPERMILPImmediate function, getShuffleSHUFImmediate performs the same calculation.
...
llvm-svn: 149683
2012-02-03 06:52:33 +00:00
Craig Topper
d5ffe0900d
Remove unnecessary qualification on 256-bit vector handling in LowerBUILD_VECTOR. Condition was already guaranteed by earlier code.
...
llvm-svn: 149680
2012-02-03 06:32:21 +00:00
Andrew Trick
ccb673659a
Added TargetPassConfig. The first little step toward configuring codegen passes.
...
Allows command line overrides to be centralized in LLVMTargetMachine.cpp.
LLVMTargetMachine can intercept common passes and give precedence to command line overrides.
Allows adding "internal" target configuration options without touching TargetOptions.
Encapsulates the PassManager.
Provides a good point to initialize all CodeGen passes so that Pass ID's can be used in APIs.
Allows modifying the target configuration hooks without rebuilding the world.
llvm-svn: 149672
2012-02-03 05:12:41 +00:00
Andrew Trick
808a7a6ce6
whitespace
...
llvm-svn: 149671
2012-02-03 05:12:30 +00:00
Lang Hames
bb682450f9
Incorporate Chad, Jakob and Evan's suggestions on r149957.
...
llvm-svn: 149655
2012-02-03 01:13:49 +00:00
Jakob Stoklund Olesen
5e1ac45b93
Require non-NULL register masks.
...
It doesn't seem worthwhile to give meaning to a NULL register mask
pointer. It complicates all the code using register mask operands.
llvm-svn: 149646
2012-02-02 23:52:57 +00:00
Elena Demikhovsky
6fbb4d2842
Minor change in signature of the getZeroVector()
...
llvm-svn: 149601
2012-02-02 09:20:18 +00:00
Elena Demikhovsky
fb44980b41
Optimization for SIGN_EXTEND operation on AVX.
...
Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32
extensions.
llvm-svn: 149600
2012-02-02 09:10:43 +00:00
Francois Pichet
26f302d568
Unbreak the MSVC build.
...
llvm-svn: 149599
2012-02-02 08:36:09 +00:00
Lang Hames
0269caafa6
Set EFLAGS correctly in EmitLoweredSelect on X86.
...
llvm-svn: 149597
2012-02-02 07:48:37 +00:00
Andrew Trick
8523b16ff5
Instruction scheduling itinerary for Intel Atom.
...
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
Adds a test to verify that the scheduler is working.
Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
Patch by Preston Gurd!
llvm-svn: 149558
2012-02-01 23:20:51 +00:00
Mon P Wang
9f05206659
Avoid creating an extract element to an illegal type after LegalizeTypes has run.
...
llvm-svn: 149548
2012-02-01 22:15:20 +00:00
Chad Rosier
e273cb08c4
Tidy up.
...
llvm-svn: 149521
2012-02-01 18:45:51 +00:00
Elena Demikhovsky
824eed70a6
Passing AVX 256-bit structures in Win64 was wrong.
...
Fixed Win64 calling conventions.
llvm-svn: 149494
2012-02-01 10:46:14 +00:00
Elena Demikhovsky
34cca175ab
Shortened code in shuffle masks
...
llvm-svn: 149493
2012-02-01 10:33:05 +00:00
Elena Demikhovsky
0e48c70ba7
Optimization for "truncate" operation on AVX.
...
Truncating v4i64 -> v4i32 and v8i32 -> v8i16 may be done with a set of shuffles.
llvm-svn: 149485
2012-02-01 07:56:44 +00:00
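Why a truncate can be expressed as a shuffle is easiest to see in a scalar model: view the four 64-bit elements as eight 32-bit lanes (little-endian) and keep the even lanes. The sketch below is an assumption-laden illustration, not the lowering code from the commit. The type aliases and function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4i64 = std::array<std::uint64_t, 4>;
using V8i32 = std::array<std::uint32_t, 8>;
using V4i32 = std::array<std::uint32_t, 4>;

// Model of a bitcast v4i64 -> v8i32 on a little-endian target: each
// 64-bit element becomes a (low, high) pair of 32-bit lanes.
V8i32 bitcastToV8i32(const V4i64 &v) {
  V8i32 out{};
  for (std::size_t i = 0; i < 4; ++i) {
    out[2 * i] = static_cast<std::uint32_t>(v[i]);
    out[2 * i + 1] = static_cast<std::uint32_t>(v[i] >> 32);
  }
  return out;
}

// Truncation as a shuffle: select lanes 0, 2, 4, 6 (the low halves).
V4i32 truncateByShuffle(const V4i64 &v) {
  const V8i32 lanes = bitcastToV8i32(v);
  const std::array<std::size_t, 4> mask = {{0, 2, 4, 6}};
  V4i32 out{};
  for (std::size_t i = 0; i < 4; ++i)
    out[i] = lanes[mask[i]];
  return out;
}
```

On real hardware the selection would be split across the two 128-bit halves of the register; the model ignores that detail.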
Craig Topper
9cdb8bdf04
Don't create VBROADCAST nodes if any nodes use the chain result from the load. Fixes PR11900.
...
llvm-svn: 149478
2012-02-01 06:51:58 +00:00
Devang Patel
a173ee56fd
Add assembler dialect attribute in asm parser which lets target specific asm parser change dialect on the fly.
...
llvm-svn: 149396
2012-01-31 18:14:05 +00:00
Craig Topper
b85e40f738
Remove pcmpgt/pcmpeq intrinsics as clang is not using them.
...
llvm-svn: 149367
2012-01-31 06:52:44 +00:00
Evan Cheng
4e7992eeba
PR11834: Use macros which are defined on Windows. Patch by Marina Yatsina.
...
llvm-svn: 149294
2012-01-30 23:10:32 +00:00
Devang Patel
7cdb2ff6b5
Intel syntax. Adjust special code, used to recognize cmp<comparison code>{ss,sd,ps,pd}, for intel syntax.
...
llvm-svn: 149291
2012-01-30 22:47:12 +00:00
Devang Patel
9a9bb5c5db
Intel syntax. Support .intel_syntax directive.
...
llvm-svn: 149270
2012-01-30 20:02:42 +00:00
Benjamin Kramer
396c590818
Fix refacto.
...
llvm-svn: 149269
2012-01-30 20:01:35 +00:00
Douglas Gregor
e577cfe172
Eliminate narrowing conversion in initializer list, to make C++11 happy
...
llvm-svn: 149254
2012-01-30 16:57:18 +00:00
Benjamin Kramer
20af25f47b
X86: Simplify shuffle mask generation code.
...
llvm-svn: 149248
2012-01-30 15:16:21 +00:00
Craig Topper
516cba3380
Fix pattern for memory form of PSHUFD for use with FP vectors to remove bitcast to an integer vector that normal code wouldn't have. Also remove bitcasts from code that turns splat vector loads into a shuffle as it was making the broken pattern necessary.
...
llvm-svn: 149232
2012-01-30 07:50:31 +00:00
Craig Topper
ca29bcfc10
Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic patterns with custom lowering to target specific nodes.
...
llvm-svn: 149216
2012-01-30 01:10:15 +00:00
Devang Patel
63fe5697f4
Intel Syntax: Parse mem operand with seg reg. QWORD PTR FS:[320]
...
llvm-svn: 149142
2012-01-27 19:48:28 +00:00
Craig Topper
5639e9e8fb
Move some patterns back near their instructions and use AddedComplexity to fix priority. Merge some patterns into their instruction definition.
...
llvm-svn: 149122
2012-01-27 07:09:40 +00:00
Jim Grosbach
8f28dbdde5
Keep source location information for X86 MCFixup's.
...
llvm-svn: 149106
2012-01-27 00:51:27 +00:00
Jakob Stoklund Olesen
fc9dce25f7
Handle call-clobbered ymm registers on Win64.
...
The Win64 calling convention has xmm6-15 as callee-saved while still
clobbering all ymm registers.
Add a YMM_HI_6_15 pseudo-register that aliases the clobbered part of the
ymm registers, and mark that as call-clobbered. This allows live xmm
registers across calls.
This hack wouldn't be necessary with RegisterMask operands representing
the call clobbers, but they are not quite operational yet.
llvm-svn: 149088
2012-01-26 22:59:28 +00:00
Victor Umansky
5f29b0e57b
Fix for the following bug in AVX codegen for double-to-int conversions:
...
- "fptosi" and "fptoui" IR instructions are defined with round-to-zero rounding mode.
- Currently for AVX mode for <4xdouble> and <8xdouble> the "VCVTPD2DQ.128" and "VCVTPD2DQ.256" instructions are selected (for the "fp_to_sint" DAG node operation) by AVX codegen. However, they use round-to-nearest-even rounding mode.
- Consequently, the conversion produces incorrect numbers.
The fix is to replace selection of VCVTPD2DQ instructions with VCVTTPD2DQ instructions. The latter use truncate (i.e. round-to-zero) rounding mode.
As the "fp_to_sint" DAG node operation is used only for lowering of "fptosi" and "fptoui" IR instructions, the fix in the X86InstrSSE.td definition file doesn't have an impact on other LLVM flows.
The patch includes changes in the .td file, a LIT test for the changes and a fix in a legacy LIT test (which produced asm code conflicting with the LLVM IR spec).
llvm-svn: 149056
2012-01-26 08:51:39 +00:00
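The difference between the two rounding modes named in this commit can be shown with the standard library. This is only a scalar illustration of the semantics, assuming the floating-point environment is left in its default round-to-nearest mode. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Round toward zero, the behavior the commit says fptosi requires.
inline int truncToInt(double x) {
  return static_cast<int>(std::trunc(x));
}

// Round to nearest with ties to even, assuming the default rounding mode
// is in effect (std::nearbyint follows the current mode).
inline int nearestEvenToInt(double x) {
  return static_cast<int>(std::nearbyint(x));
}
```

For an input such as 2.7 the two helpers disagree (2 versus 3), which is the kind of mismatch the commit describes for the non-truncating conversion instructions.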
Craig Topper
86e44bc829
Add HasXOP predicate check covering a bunch of XOP intrinsic patterns.
...
llvm-svn: 149054
2012-01-26 07:51:55 +00:00
Craig Topper
1c0e22f57a
Fix AVX vs SSE patterns ordering issue for VPCMPESTRM and VPCMPISTRM.
...
llvm-svn: 149053
2012-01-26 07:31:30 +00:00
Craig Topper
b91760eff8
Remove some more patterns by custom lowering intrinsics to target specific nodes.
...
llvm-svn: 149052
2012-01-26 07:18:03 +00:00
Chris Lattner
33633a90a0
fix a bug I introduced in r148929, this is not a splat!
...
Thanks to Eli for noticing.
llvm-svn: 148947
2012-01-25 09:56:22 +00:00
Craig Topper
7834900950
Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns.
...
llvm-svn: 148933
2012-01-25 06:43:11 +00:00
Chris Lattner
47a86bdbe2
use ConstantVector::getSplat in a few places.
...
llvm-svn: 148929
2012-01-25 06:02:56 +00:00
Craig Topper
ce4f9c5668
Custom lower phadd and phsub intrinsics to target specific nodes. Remove the patterns that are no longer necessary.
...
llvm-svn: 148927
2012-01-25 05:37:32 +00:00
Craig Topper
5bcf070e68
Remove AVX 256-bit unaligned load intrinsics. 128-bit versions had been removed a while ago.
...
llvm-svn: 148922
2012-01-25 04:42:03 +00:00
Craig Topper
3ad5bc019a
Merge intrinsic pattern and no pattern versions of VCVTSD2SI instruction definitions. Matches non-AVX version of same instructions.
...
llvm-svn: 148914
2012-01-25 03:52:09 +00:00
Devang Patel
a410ed3ced
Intel Syntax: Extend special hand coded logic, to recognize special instructions, for intel syntax.
...
llvm-svn: 148864
2012-01-24 21:43:36 +00:00
Elena Demikhovsky
0b0c5d8c4c
ZERO_EXTEND operation is optimized for AVX.
...
v8i16 -> v8i32, v4i32 -> v4i64 - used vpunpck* instructions.
llvm-svn: 148803
2012-01-24 13:54:13 +00:00
Craig Topper
0d8e67aebd
Add comments near load pattern fragments indicating that all integer vector loads are promoted to v2i64 or v4i64 so that no one tries to reintroduce pattern fragments for other types.
...
llvm-svn: 148771
2012-01-24 03:03:17 +00:00
Devang Patel
eba7d3dba9
Fix typo.
...
llvm-svn: 148751
2012-01-23 23:56:33 +00:00
Devang Patel
cf893a437e
Intel syntax: Robustify parsing of memory operand's displacement expression.
...
llvm-svn: 148737
2012-01-23 22:35:25 +00:00
Devang Patel
e660fdd953
Intel syntax: Parse memory operand with empty base reg, e.g. DWORD PTR [4*RDI]
...
llvm-svn: 148721
2012-01-23 20:20:06 +00:00
Devang Patel
880bc1644b
Intel syntax: Parse segment registers.
...
llvm-svn: 148712
2012-01-23 18:31:58 +00:00
Craig Topper
edd1d0acfc
Custom lower PCMPEQ/PCMPGT intrinsics to target specific nodes and remove the intrinsic patterns.
...
llvm-svn: 148687
2012-01-23 08:18:28 +00:00
Craig Topper
6b90c5d03e
Update more places to use target specific nodes for vector shifts instead of intrinsics.
...
llvm-svn: 148685
2012-01-23 06:46:22 +00:00
Craig Topper
5e80db4e4f
Custom lower vector shift intrinsics to target specific nodes and remove the patterns that are no longer needed.
...
llvm-svn: 148684
2012-01-23 06:16:53 +00:00
Craig Topper
20c98df340
Remove pattern fragments for v32i8, v16i16, v8i32, v16i8, v8i16, and v4i32 loads. All integer vector loads are promoted to v2i64 or v4i64 so these pattern fragments can never match. Fix or remove patterns that used these fragments.
...
llvm-svn: 148672
2012-01-23 00:06:44 +00:00
Craig Topper
0b7ad76bd0
Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching.
...
llvm-svn: 148670
2012-01-22 23:36:02 +00:00
Craig Topper
bd4884371b
Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types. Simplifying opcode selection and pattern matching.
...
llvm-svn: 148667
2012-01-22 22:42:16 +00:00
Craig Topper
094626414d
Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead.
...
llvm-svn: 148664
2012-01-22 19:15:14 +00:00
Craig Topper
a4ed5246d8
Make code a little less verbose.
...
llvm-svn: 148651
2012-01-22 03:07:48 +00:00
Craig Topper
cb3433cd58
Remove unused X86 ISD node type defines.
...
llvm-svn: 148644
2012-01-22 01:15:56 +00:00
Craig Topper
123adfa0f3
Move some vector shift patterns into their instruction definitions.
...
llvm-svn: 148643
2012-01-22 00:41:20 +00:00
Craig Topper
dcaa5fbd08
Add memory patterns for some of the fp<->integer conversion instructions. Fold some patterns into instruction definitions.
...
llvm-svn: 148641
2012-01-21 18:37:15 +00:00
Benjamin Kramer
5cff13a3fb
Remove unused variables.
...
llvm-svn: 148635
2012-01-21 10:42:44 +00:00
Craig Topper
39bc1e4d25
Fix PR11819 introduced by r148537. I'd commit the test case, but the generated code is terrible as it gets fully scalarized. Expect a future commit to fix that.
...
llvm-svn: 148632
2012-01-21 08:49:33 +00:00
Devang Patel
ce6a2ca8c8
Intel syntax: Robustify register parsing.
...
llvm-svn: 148591
2012-01-20 22:32:05 +00:00
David Blaikie
46a9f016c5
More dead code removal (using -Wunreachable-code)
...
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Devang Patel
d0930fff85
Intel syntax: Parse ... PTR [-8]
...
llvm-svn: 148570
2012-01-20 21:21:01 +00:00
Devang Patel
f36613cb45
Intel syntax: For now, disable ambiguous JMP64pcrel32 for intel syntax.
...
llvm-svn: 148569
2012-01-20 21:14:06 +00:00
Craig Topper
a409479023
Improve 256-bit shuffle splitting to allow 2 sources in each 128-bit lane. As long as only a single lane of the source is used in the lane in the destination. This makes the splitting match much closer to what happens with 256-bit shuffles when AVX is disabled and only 128-bit XMM is allowed.
...
llvm-svn: 148537
2012-01-20 09:29:03 +00:00
Craig Topper
3469212c82
Add support for selecting 256-bit PALIGNR.
...
llvm-svn: 148532
2012-01-20 05:53:00 +00:00
Eli Friedman
32c7c25dcb
Support MSVC x86-32 sret convention. PR11688. Patch by Joe Groff.
...
llvm-svn: 148513
2012-01-20 00:05:46 +00:00
Devang Patel
f83dcfd052
Post process 'and', 'sub' instructions and select better encoding, if available.
...
llvm-svn: 148489
2012-01-19 18:40:55 +00:00
Devang Patel
2529dd9e00
Intel syntax: There is no need to create unary expr for simple negative displacement.
...
llvm-svn: 148486
2012-01-19 18:15:51 +00:00
Devang Patel
4a62ff9bcb
Post process 'xor', 'or' and 'cmp' instructions and select better encoding, if available.
...
llvm-svn: 148485
2012-01-19 17:53:25 +00:00
Craig Topper
a875b7ccc7
Folding table additions and fixes for AVX.
...
llvm-svn: 148467
2012-01-19 08:50:38 +00:00
Craig Topper
80576e8d1f
Merge 128-bit and 256-bit SHUFPS/SHUFPD handling.
...
llvm-svn: 148466
2012-01-19 08:19:12 +00:00
Nick Lewycky
ecc0084f72
Add a TargetOption for disabling tail calls.
...
llvm-svn: 148442
2012-01-19 00:34:10 +00:00
Jakob Stoklund Olesen
ff482f733b
Add experimental -x86-use-regmask command line option.
...
It adds register mask operands to x86 call instructions. Once all the
backend passes support register mask operands, this will be permanently
enabled.
llvm-svn: 148438
2012-01-18 23:52:22 +00:00
Jakob Stoklund Olesen
f1fb1d2375
Ignore register mask operands when lowering instructions to MC.
...
This is similar to implicit register operands. MC doesn't understand
register liveness and call clobbers.
llvm-svn: 148437
2012-01-18 23:52:19 +00:00
Devang Patel
de47cced25
Process instructions after match to select alternative encoding which may be more desirable.
...
llvm-svn: 148431
2012-01-18 22:42:29 +00:00
Jim Grosbach
aba3de99c0
Tidy up. MCAsmBackend naming conventions.
...
llvm-svn: 148400
2012-01-18 18:52:16 +00:00
Jakob Stoklund Olesen
f43b599550
Add a CoveredBySubRegs property to Register descriptions.
...
When set, this bit indicates that a register is completely defined by
the value of its sub-registers.
Use the CoveredBySubRegs property to infer which super-registers are
call-preserved given a list of callee-saved registers. For example, the
ARM registers D8-D15 are callee-saved. This now automatically implies
that Q4-Q7 are call-preserved.
Conversely, Win64 callees save XMM6-XMM15, but the corresponding
YMM6-YMM15 registers are not call-preserved because they are not fully
defined by their sub-registers.
llvm-svn: 148363
2012-01-18 00:16:39 +00:00
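The inference described in r148363 can be sketched as follows. This is a hypothetical Python model, not the TableGen or C++ from the commit: the function name, the dictionary layout, and the register subsets are assumptions made for illustration.

```python
def call_preserved(callee_saved, sub_regs, covered_by_sub_regs):
    """Return callee_saved plus every super-register implied to be preserved.

    A super-register is implied only if it is flagged as covered by its
    sub-registers and every one of those sub-registers is preserved.
    """
    preserved = set(callee_saved)
    changed = True
    while changed:  # iterate to a fixed point so nested register levels work
        changed = False
        for reg, subs in sub_regs.items():
            if reg in preserved or reg not in covered_by_sub_regs:
                continue
            if subs and all(s in preserved for s in subs):
                preserved.add(reg)
                changed = True
    return preserved

# ARM: each Q register is fully defined by two D registers.
arm_subs = {f"Q{n}": [f"D{2 * n}", f"D{2 * n + 1}"] for n in range(4, 8)}
arm = call_preserved({f"D{n}" for n in range(8, 16)}, arm_subs, set(arm_subs))

# Win64: YMMn contains XMMn, but its upper half has no sub-register,
# so YMMn is not flagged as covered.
x86_subs = {f"YMM{n}": [f"XMM{n}"] for n in range(6, 16)}
win64 = call_preserved({f"XMM{n}" for n in range(6, 16)}, x86_subs, set())
```

With these inputs `arm` gains Q4 through Q7, while `win64` contains only the XMM registers, matching the two cases in the commit message.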
Jakob Stoklund Olesen
d51a710bde
Move X86 callee saved register lists to the X86CallConv .td file.
...
Add a trivial implementation of the getCallPreservedMask() hook.
llvm-svn: 148347
2012-01-17 22:47:01 +00:00
Devang Patel
c9ed518792
Intel syntax: Fix parser match class to check memory operand size.
...
llvm-svn: 148338
2012-01-17 21:48:03 +00:00
Devang Patel
a7143b6a2b
Intel syntax: Parse "BYTE PTR [RDX + RCX]"
...
llvm-svn: 148334
2012-01-17 21:25:10 +00:00
Devang Patel
2ed6718616
Untabify.
...
llvm-svn: 148322
2012-01-17 19:09:22 +00:00
Devang Patel
8b39be79ad
Intel syntax: Do not unnecessarily create plus expression for memory operand displacement.
...
llvm-svn: 148321
2012-01-17 19:08:07 +00:00
Devang Patel
41b9ddeb7a
Intel syntax: Robustify memory operand parsing.
...
llvm-svn: 148312
2012-01-17 18:00:18 +00:00
Nadav Rotem
86c3807b99
Fix warning.
...
llvm-svn: 148301
2012-01-17 09:31:09 +00:00
Nadav Rotem
86e5390dbf
Fix 11769.
...
In CanXFormVExtractWithShuffleIntoLoad we assumed that EXTRACT_VECTOR_ELT can be later handled by the DAGCombiner.
However, in some cases on AVX, the EXTRACT_VECTOR_ELT is legalized to EXTRACT_SUBVECTOR + EXTRACT_VECTOR_ELT, which
currently is not handled by the DAGCombiner. In this patch I added a check that we only extract from the XMM part.
llvm-svn: 148298
2012-01-17 09:13:19 +00:00
Craig Topper
9cafcd8baa
Remove unnecessary AVX check from an assert. hasSSE2 is enough.
...
llvm-svn: 148295
2012-01-17 08:23:44 +00:00
Craig Topper
37b10ef250
Fix a crasher when PerformShiftCombine receives a BUILD_VECTOR of all UNDEF. Probably could use better handling in DAG combine or getNode. Fixes PR11772.
...
llvm-svn: 148285
2012-01-17 04:44:50 +00:00
Eli Friedman
206ca569aa
Make sure the non-SSE lowering for fences correctly clobbers EFLAGS. PR11768.
...
llvm-svn: 148240
2012-01-16 16:42:21 +00:00
Eli Friedman
75e3db4c7a
Get rid of unused codegen-only instruction.
...
llvm-svn: 148239
2012-01-16 16:29:35 +00:00
Craig Topper
db8890aedd
Give priority to AVX over SSE for 128-bit floating point unpck instructions.
...
llvm-svn: 148233
2012-01-16 09:56:42 +00:00
Nadav Rotem
57935243bd
[AVX] Optimize x86 VSELECT instructions using SimplifyDemandedBits.
...
We know that the blend instructions only use the MSB, so if the mask is
sign-extended then we can convert it into a SHL instruction. This is a
common pattern because the type-legalizer sign-extends the i1 type which
is used by the LLVM-IR for the condition.
Added a new optimization in SimplifyDemandedBits for SIGN_EXTEND_INREG -> SHL.
llvm-svn: 148225
2012-01-15 19:27:55 +00:00
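A small Python model of why the rewrite in r148225 is safe, assuming 32-bit mask lanes and a blend that reads only the most significant bit of each lane. All function names here are hypothetical and the blend is an abstract model, not a specific instruction.

```python
LANE_BITS = 32
MASK = (1 << LANE_BITS) - 1

def sign_extend_i1(x):
    """Sign-extend bit 0 across the lane: all ones if bit 0 is set, else zero."""
    return MASK if x & 1 else 0

def shl_to_msb(x):
    """Shift bit 0 into the most significant bit of the lane."""
    return (x << (LANE_BITS - 1)) & MASK

def msb(x):
    """The only bit the modeled blend inspects in each mask lane."""
    return (x >> (LANE_BITS - 1)) & 1

def blend(mask_lanes, a, b):
    """Per lane: pick from b if the mask MSB is set, otherwise from a."""
    return [bv if msb(m) else av for m, av, bv in zip(mask_lanes, a, b)]

conds = [0, 1, 3, 2]  # only bit 0 of each condition lane is meaningful
a, b = [10, 11, 12, 13], [20, 21, 22, 23]
via_sext = blend([sign_extend_i1(c) for c in conds], a, b)
via_shl = blend([shl_to_msb(c) for c in conds], a, b)
```

Both masks select the same lanes because they agree in the top bit, even though the remaining bits differ.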
Benjamin Kramer
339ced4e34
Return an ArrayRef from ShuffleVectorSDNode::getMask and push it through CodeGen.
...
llvm-svn: 148218
2012-01-15 13:16:05 +00:00
Craig Topper
c10e1abaf3
Fix the memop type on a couple 256-bit AVX instructions that were using f128mem instead of f256mem.
...
llvm-svn: 148196
2012-01-14 18:29:57 +00:00
Craig Topper
d78429f850
Add a bunch of AVX instructions to the folding tables. Also fixed the alignment on 256-bit AVX2 instructions.
...
llvm-svn: 148194
2012-01-14 18:14:53 +00:00
Chad Rosier
71a185c5c6
Fix pasto from r146196.
...
llvm-svn: 148167
2012-01-14 01:50:21 +00:00
Devang Patel
7066d28043
Revert r148131, it was committed before it was ready.
...
llvm-svn: 148134
2012-01-13 19:28:58 +00:00
Devang Patel
7ecdc6d4f5
Refactor.
...
llvm-svn: 148131
2012-01-13 19:12:18 +00:00
Craig Topper
e52d86a740
Convert SHUFPD with the same register for both sources to PSHUFD if it would prevent a register copy. Similar to SHUFPS, but requires the mask to be converted.
...
llvm-svn: 148112
2012-01-13 09:21:41 +00:00
Craig Topper
b1c2ebf6ee
use v8i32 as optimal mem type over v8f32 if AVX2 is enabled. Similar to SSE2 vs SSE1.
...
llvm-svn: 148109
2012-01-13 08:32:21 +00:00
Craig Topper
cb7e13d7c0
Make X86 instruction selection use 256-bit VPXOR for build_vector of all ones if AVX2 is enabled. This gives the ExeDepsFix pass a chance to choose FP vs int as appropriate. Also use v8i32 as the type for getZeroVector if AVX2 is enabled. This is consistent with SSE2 preferring v4i32.
...
llvm-svn: 148108
2012-01-13 08:12:35 +00:00
Craig Topper
9f14d9f939
Add patterns for v16i16 and v32i8 immAllZerosV to select VPXOR to match v4i64 and v8i32.
...
llvm-svn: 148106
2012-01-13 06:59:47 +00:00
Craig Topper
a4c5a47b97
Use v8i32 constant pool entry for converting AVX2_SETALLONES. Possibly fixes PR11750.
...
llvm-svn: 148101
2012-01-13 06:12:41 +00:00
Craig Topper
2aa07f832e
Fix typo in PerformAddCombine that caused any vector type to be checked for horizontal add/sub if AVX2 is enabled. This caused an assert to fail for non 128/256-bit vectors when done before type legalizing. Fixes PR11749.
...
llvm-svn: 148096
2012-01-13 05:04:25 +00:00
Bill Wendling
9c8456f7ef
Fix off-by-one error.
...
llvm-svn: 148077
2012-01-13 00:41:53 +00:00
Bill Wendling
ee5eaebc58
Fix the code that was WRONG.
...
The registers are placed into the saved registers list in the reverse order,
which is why the original loop was written to loop backwards.
llvm-svn: 148064
2012-01-12 23:05:03 +00:00
Elena Demikhovsky
060f6ccdb8
Fixed a bug in LowerVECTOR_SHUFFLE that caused an assertion failure
...
llc: X86ISelLowering.cpp:6480: llvm::SDValue llvm::X86TargetLowering::LowerVECTOR_SHUFFLE(llvm::SDValue, llvm::SelectionDAG&) const: Assertion `V1.getOpcode() != ISD::UNDEF&& "Op 1 of shuffle should not be undef"' failed.
Added a test.
llvm-svn: 148044
2012-01-12 20:33:10 +00:00
Rafael Espindola
00e861ed57
Support segmented stacks on 64-bit FreeBSD.
...
This patch uses tcb_spare field in the tcb structure to store info.
Patch by Jyun-Yan You.
llvm-svn: 148041
2012-01-12 20:24:30 +00:00
Rafael Espindola
10745d3381
Support segmented stacks on win32.
...
Uses the pvArbitrary slot of the TIB, which is reserved for applications. We
only support frames with a static size.
llvm-svn: 148040
2012-01-12 20:22:08 +00:00
Devang Patel
4a6e778aae
Rename X86ATTAsmParser -> X86AsmParser
...
We are using one parser to parse att as well as intel style syntax.
llvm-svn: 148032
2012-01-12 18:03:40 +00:00
Benjamin Kramer
9ece950ddb
After Jakob's r147938 exception handling on i386 was completely broken.
...
Restore the (obviously wrong) behavior from before r147938 without relying on
undefined behavior. Add a fat FIXME note.
This should fix nightly tester failures.
llvm-svn: 148030
2012-01-12 17:37:18 +00:00
Nadav Rotem
0a0a829bea
Fix a bug in the AVX 256-bit shuffle code in cases where the splat element is on the boundary of two 128-bit vectors.
...
The attached testcase was stuck in an endless loop.
llvm-svn: 148027
2012-01-12 15:31:55 +00:00
Benjamin Kramer
5b3aa60b44
X86: Generalize the x << (y & const) optimization to also catch masks with more bits set than 31 or 63.
...
llvm-svn: 148024
2012-01-12 12:41:34 +00:00
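The idea behind r148024 can be modeled in a few lines of Python. This is a sketch under the assumption that a 32-bit x86 shift reads only the low 5 bits of its count (6 bits for 64-bit shifts); the function names are hypothetical and do not come from the LLVM source.

```python
def x86_shl32(x, count):
    """Model a 32-bit x86 SHL: only the low 5 bits of the count are used."""
    return (x << (count & 31)) & 0xFFFFFFFF

def mask_is_redundant(mask, width=32):
    """The explicit AND can be dropped if it keeps every bit the shifter reads."""
    needed = width - 1  # 31 for 32-bit shifts, 63 for 64-bit shifts
    return (mask & needed) == needed
```

A mask such as 0xFF has more bits set than 31, yet it still preserves the five bits the shifter reads, so `x << (y & 0xFF)` and `x << y` behave identically in this model.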
Devang Patel
fc6be102ae
Add a predicate method to check that the memory operand size matches, if available.
...
In AT&T-style asm syntax, the memory operand size is derived from the suffix attached to the mnemonic. In Intel-style asm syntax, it is part of the memory operand, hence a predicate method check is required to select the appropriate instruction.
llvm-svn: 148006
2012-01-12 01:51:42 +00:00
Devang Patel
46831de240
Add intel style operand parser skeleton.
...
This is a work in progress.
llvm-svn: 148002
2012-01-12 01:36:43 +00:00
Chandler Carruth
eb21da060b
Switch all of the uses of my InsertDAGNode helper to follow the exact
...
same pattern. We already had this pattern in a few places, but others
tried to make a rough approximation of an actual DAG structure. As not
everywhere went to this trouble, nothing could rely on this being done.
In fact, I've checked all references to these node Ids, and the ones
that are using the topo-sort properties are actually satisfied with
a strict-weak-ordering. The requirement appears to be that Use >= Def.
I've added a big blurb of comments to this bit of the transform to
clarify why the order is so important for the next reader of the code.
I'm starting with this change as it is very small, and trivially
reverted if something breaks or the >= above really does need to be >.
If that proves the case, we can hide the problem by reverting this
patch, but the problem exists elsewhere as well, and so a more
comprehensive solution will be needed.
llvm-svn: 148001
2012-01-12 01:34:44 +00:00
Rafael Espindola
d90466bcbf
Support segmented stacks on mac.
...
This uses TLS slot 90, which actually belongs to JavaScriptCore. We only support
frames with static size.
Patch by Brian Anderson.
llvm-svn: 147960
2012-01-11 19:00:37 +00:00
Rafael Espindola
4eecacb9c8
Generate the segmented stack prologue for fastcc too.
...
Patch by Brian Anderson.
llvm-svn: 147958
2012-01-11 18:41:19 +00:00
Chandler Carruth
3212a34269
Revert r147945 which disabled an addressing mode transformation. I had
...
hoped this would revive one of the llvm-gcc selfhost build bots, but it
didn't so it doesn't appear that my transform is the culprit.
If anyone else is seeing failures, please let me know!
llvm-svn: 147957
2012-01-11 18:36:12 +00:00
Rafael Espindola
2b89448d60
Use unsigned comparison in segmented stack prologue.
...
This is a comparison of two addresses, and GCC does the comparison unsigned.
Patch by Brian Anderson.
llvm-svn: 147954
2012-01-11 18:23:35 +00:00
Rafael Espindola
6635ae1c17
Explicitly set the scale to 1 on some segstack prologue instrs.
...
Patch by Brian Anderson.
llvm-svn: 147952
2012-01-11 18:14:03 +00:00
Jan Sjödin
21f83d9f36
Add XOP Intrinsics and tests
...
llvm-svn: 147949
2012-01-11 15:20:20 +00:00
Nadav Rotem
baae7e4577
Fix a bug in the lowering of BUILD_VECTOR for AVX. SCALAR_TO_VECTOR does not zero untouched elements. Use INSERT_VECTOR_ELT instead.
...
llvm-svn: 147948
2012-01-11 14:07:51 +00:00
Chandler Carruth
9bc48e5215
Disable the transformation I added in r147936 to see if it fixes some
...
strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.
llvm-svn: 147945
2012-01-11 12:17:47 +00:00
Chandler Carruth
3eacfb83fa
Hoist a really redundant code pattern into a helper function, and delete
...
lots of lines of code. No functionality changed.
llvm-svn: 147942
2012-01-11 11:04:36 +00:00
Chandler Carruth
b0049f4a43
Simplify the AND-rooted mask+shift checking code to match that of the
...
SRL-rooted code.
llvm-svn: 147941
2012-01-11 09:35:04 +00:00
Chandler Carruth
3dbcda8478
Unify the interface of the three mask+shift transform helpers, and
...
factor the differences that were hiding in one of them into its other
caller, the SRL handling code. No change in behavior.
llvm-svn: 147940
2012-01-11 09:35:02 +00:00
Chandler Carruth
aa01e6661a
Clarify and make explicit some of the requirements for transforming
...
mask+shift pairs at the beginning of the ISD::AND case block, and then
hoist the final pattern into a helper function, simplifying and
reflowing it appropriately. This should have no observable behavior
change, but several simplifications fell out of this such as directly
computing the new mask constant, etc.
llvm-svn: 147939
2012-01-11 09:35:00 +00:00
Jakob Stoklund Olesen
6039983755
Fix undefined code and reenable test case.
...
I don't think the compact encoding code is right, but at least it has
defined behavior now.
llvm-svn: 147938
2012-01-11 09:08:04 +00:00
Chandler Carruth
51d3076bbf
Hoist the logic to transform shift+mask combinations into sub-register
...
extracts and scaled addressing modes into its own helper function. No
functionality changed here, just hoisting and layout fixes falling out
of that hoisting.
llvm-svn: 147937
2012-01-11 08:48:20 +00:00
Chandler Carruth
55b2cdee26
Teach the X86 instruction selection to do some heroic transforms to
...
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:
unsigned x = my_accelerator_table[input >> 11];
Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):
*(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));
The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift left
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.
In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.
llvm-svn: 147936
2012-01-11 08:41:08 +00:00
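The arithmetic identity that r147936 relies on can be checked with a short Python sketch. It assumes a 32-bit unsigned `input` and 4-byte table entries as in the commit message; the function names are hypothetical.

```python
U32 = 0xFFFFFFFF

def index_as_written(x):
    """Byte offset as written in the source: (input >> 11) << 2."""
    return ((x >> 11) << 2) & U32

def index_canonicalized(x):
    """Canonical form: a shorter shift right plus a mask clearing the low two bits."""
    return (x >> 9) & ~0x3 & U32

def address_with_scale(base, x, scale=4):
    """Form the transform recovers: base + (input >> 11) * scale."""
    return base + (x >> 11) * scale
```

All three forms compute the same byte offset; the last one maps directly onto a scaled-index addressing mode, so the mask instruction is no longer needed.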
Lang Hames
995c63329a
Fixed order of operands in comment to match code.
...
llvm-svn: 147890
2012-01-10 22:53:20 +00:00
Joerg Sonnenberger
96cd35cf6d
Default stack alignment for 32bit x86 should be 4 Bytes, not 8 Bytes.
...
Add a test that checks the stack alignment of a simple function for
Darwin, Linux and NetBSD for 32bit and 64bit mode.
llvm-svn: 147888
2012-01-10 22:43:53 +00:00
Chad Rosier
1a8f0ccd8c
Add missing VEX predicates to VMOVSDto64rr/VMOVSDto64mr. This fixes a few
...
failing test cases on our internal AVX nightly tester.
rdar://10663637
llvm-svn: 147881
2012-01-10 22:14:06 +00:00
Bill Wendling
d5ab02600e
For i386, don't use the generic code.
...
As the comment around 7746 says, it's better to use the x87 extended precision
here than SSE. And the generic code doesn't know how to do that. It also regains
the speed lost for the uint64_to_float.c testcase.
<rdar://problem/10669858>
llvm-svn: 147869
2012-01-10 19:41:30 +00:00
Devang Patel
67bf992a8f
Add definition for intel asm variant.
...
Right now, this just adds additional entries in match table. The parser does not use them yet.
llvm-svn: 147859
2012-01-10 17:51:54 +00:00
David Blaikie
edbb58c577
Remove unnecessary default cases in switches that cover all enum values.
...
llvm-svn: 147855
2012-01-10 16:47:17 +00:00
Benjamin Kramer
077ae1d760
Add definitions for AMD's bobcat (aka btver1)
...
llvm-svn: 147846
2012-01-10 11:50:02 +00:00
Craig Topper
430f3f1bd6
Fix a crash in AVX2 when trying to broadcast a double into a 128-bit vector. There is no vbroadcastsd xmm, but we do need to support 64-bit integers broadcasted into xmm. Also factor the AVX check into the isVectorBroadcast function. This makes more sense since the AVX2 check was already inside.
...
llvm-svn: 147844
2012-01-10 08:23:59 +00:00
Craig Topper
b0c0f72ae6
Remove hasXMM/hasXMMInt functions. Move callers to hasSSE1/hasSSE2. This is the final piece to remove the AVX hack that disabled SSE.
...
llvm-svn: 147843
2012-01-10 06:54:16 +00:00
Craig Topper
d97bbd7b60
Remove hasSSE*orAVX functions and change all callers to use just hasSSE*. AVX is now an SSE level and no longer disables SSE checks.
...
llvm-svn: 147842
2012-01-10 06:37:29 +00:00
Craig Topper
eb8f9e9e5b
Instruction selection priority fixes to remove the XMM/XMMInt/orAVX predicates. Another commit will remove orAVX functions from X86SubTarget.
...
llvm-svn: 147841
2012-01-10 06:30:56 +00:00
Devang Patel
29ba4f97e6
Fix asm string wrt variants.
...
llvm-svn: 147805
2012-01-09 21:32:02 +00:00
Devang Patel
85d684a4d9
Split AsmParser into two components - AsmParser and AsmParserVariant
...
AsmParser holds info specific to target parser.
AsmParserVariant holds info specific to asm variants supported by the target.
llvm-svn: 147787
2012-01-09 19:13:28 +00:00
Chandler Carruth
c16622daff
Don't rely on the fact that shift values are never very large, and thus
...
this subtraction will result in small negative numbers at worst which
become very large positive numbers on assignment and are thus caught by
the <=4 check on the next line. The >0 check was clearly intended to catch
these as negative numbers.
Spotted by inspection, and impossible to trigger given the shift widths
that can be used.
llvm-svn: 147773
2012-01-09 09:47:25 +00:00
Craig Topper
f287a4509e
Remove AVX hack in X86Subtarget. AVX/AVX2 are now treated as an SSE level. Predicate functions have been altered to maintain previous names and behavior.
...
llvm-svn: 147770
2012-01-09 09:02:13 +00:00
Craig Topper
b89805c77d
Add HasAVX predicate to some of the AVX patterns.
...
llvm-svn: 147769
2012-01-09 08:34:00 +00:00
Craig Topper
a51f7f75c2
Reorder a bunch of patterns to put the AVX version first thus giving it priority over the SSE version. Another step towards trying to remove the AVX hack that disables SSE from X86Subtarget.
...
llvm-svn: 147768
2012-01-09 08:10:38 +00:00
Craig Topper
ef7f5bf8c9
Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing.
...
llvm-svn: 147767
2012-01-09 06:52:46 +00:00
Craig Topper
c1f5622ad3
Mark MOVNTI as being supported in SSE2 OR AVX mode. This instruction has no AVX equivalent so we should use the SSE version.
...
llvm-svn: 147766
2012-01-09 06:38:55 +00:00
Craig Topper
a081644f8a
Move SSE2 logical operations PAND/POR/PXOR/PANDN above SSE1 logical operations ANDPS/ORPS/XORPS/ANDNPS. This fixes a pattern ordering issue that meant that the SSE2 instructions could never be directly selected since the SSE1 patterns would always match first. This is largely moot with the ExeDepsFix pass, but I'm trying to audit for all such ordering issues.
...
llvm-svn: 147765
2012-01-09 05:07:01 +00:00
Craig Topper
210e4f81b3
Change some places that were checking for AVX OR SSE1/2 to use hasXMM/hasXMMInt instead. Also fix one place that checked SSE3, but accidentally excluded AVX to use hasSSE3orAVX. This is a step towards removing the AVX hack from the X86Subtarget.h
...
llvm-svn: 147764
2012-01-09 02:28:15 +00:00
Craig Topper
744f6311d3
Don't disable MMX support when AVX is enabled. Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level.
...
llvm-svn: 147762
2012-01-09 00:11:29 +00:00
Craig Topper
c1ab7afec8
Enable FISTTP* instructions when AVX is enabled.
...
llvm-svn: 147758
2012-01-08 23:04:21 +00:00
Victor Umansky
540651cf59
Reverted commit #147601 upon Evan's request.
...
llvm-svn: 147748
2012-01-08 17:20:33 +00:00
Craig Topper
f210619d08
Fix typo in the X86 backend readme. Patch from Jaeden Amero.
...
llvm-svn: 147739
2012-01-07 20:35:21 +00:00
Benjamin Kramer
6898db6269
Remove VectorExtras. This unused helper was written for a type of API that is discouraged now.
...
llvm-svn: 147738
2012-01-07 19:42:13 +00:00
Craig Topper
ca66bba45e
Remove unnecessary check of hasAVX(). It's already included in hasXMM().
...
llvm-svn: 147734
2012-01-07 18:48:43 +00:00
Eric Christopher
c206d46709
Make the 'x' constraint work for AVX registers as well.
...
Fixes rdar://10614894
llvm-svn: 147704
2012-01-07 01:02:09 +00:00
Craig Topper
29b0737452
Mark scalar FMA4 instructions as ignoring the VEX.L bit.
...
llvm-svn: 147602
2012-01-05 08:56:10 +00:00
Victor Umansky
9255b6d9fe
Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX.
...
Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)
Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov
llvm-svn: 147601
2012-01-05 08:46:19 +00:00
Bill Wendling
ac27f0c830
Replace the uint64_t -> double conversion algorithm with one that's more efficient.
...
This small bit of ASM code is sufficient to do what the old algorithm did:
movq %rax, %xmm0
punpckldq (c0), %xmm0 // c0: (uint4){ 0x43300000U, 0x45300000U, 0U, 0U }
subpd (c1), %xmm0 // c1: (double2){ 0x1.0p52, 0x1.0p52 * 0x1.0p32 }
#ifdef __SSE3__
haddpd %xmm0, %xmm0
#else
pshufd $0x4e, %xmm0, %xmm1
addpd %xmm1, %xmm0
#endif
It's arguably faster. One caveat, the 'haddpd' instruction isn't very fast on
all processors.
<rdar://problem/7719814>
llvm-svn: 147593
2012-01-05 02:13:20 +00:00
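The sequence in r147593 can be followed step by step in Python. This is a sketch that emulates the vector instructions with scalar arithmetic; `bits_to_double` and `u64_to_double` are hypothetical names, and the constants are the ones quoted in the commit message.

```python
import struct

def bits_to_double(bits):
    """Reinterpret a 64-bit integer as an IEEE-754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def u64_to_double(x):
    lo = x & 0xFFFFFFFF
    hi = x >> 32
    # punpckldq: pair each 32-bit half with an exponent word from c0.
    d_lo = bits_to_double((0x43300000 << 32) | lo)  # equals 2**52 + lo
    d_hi = bits_to_double((0x45300000 << 32) | hi)  # equals 2**84 + hi * 2**32
    # subpd: remove the biases 0x1.0p52 and 0x1.0p52 * 0x1.0p32 (both exact).
    d_lo -= 2.0 ** 52
    d_hi -= 2.0 ** 84
    # haddpd (or pshufd + addpd): one final addition, rounded once.
    return d_hi + d_lo
```

Because both subtractions are exact, the only rounding happens in the final addition, so the result agrees with a direct integer-to-double conversion.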
Benjamin Kramer
9c48f26341
Silence warnings of a mysterious compiler that still defaults to C89.
...
llvm-svn: 147553
2012-01-04 22:06:45 +00:00
Evan Cheng
104dbb0fd1
For x86, canonicalize max
...
(x > y) ? x : y
=>
(x >= y) ? x : y
So for something like
(x - y) > 0 ? (x - y) : 0
It will be
(x - y) >= 0 ? (x - y) : 0
This makes it possible to test sign-bit and eliminate a comparison against
zero. e.g.
subl %esi, %edi
testl %edi, %edi
movl $0, %eax
cmovgl %edi, %eax
=>
xorl %eax, %eax
subl %esi, %edi
cmovsl %eax, %edi
rdar://10633221
llvm-svn: 147512
2012-01-04 01:41:39 +00:00
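A Python model of the equivalence used in r147512, assuming 32-bit wrapping arithmetic as performed by `subl`. The function names are hypothetical; the third function models the sign flag left by the subtraction rather than any specific LLVM code.

```python
def wrap32(v):
    """Interpret v as a signed 32-bit value, as subl would leave it in a register."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v

def max0_gt(x, y):
    d = wrap32(x - y)
    return d if d > 0 else 0      # original form: needs a compare against zero

def max0_ge(x, y):
    d = wrap32(x - y)
    return d if d >= 0 else 0     # canonical form: same result when d == 0

def max0_sign(x, y):
    d = wrap32(x - y)
    sign_flag = (d >> 31) & 1     # SF as set by the subtraction itself
    return 0 if sign_flag else d  # models cmovs replacing the difference with zero
```

The `>` and `>=` forms differ only when the difference is zero, where both yield zero, which is why the canonical form can rely on the sign bit alone.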
Chad Rosier
6ca97df951
Fix 80-column violations.
...
llvm-svn: 147495
2012-01-03 23:19:12 +00:00
Nadav Rotem
6d31bac85e
Revert 147426 because it caused pr11696.
...
llvm-svn: 147485
2012-01-03 22:19:42 +00:00
Chad Rosier
493c1b3152
Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather
...
than a vxorps + vinsertf128 pair if the original vector came from a load.
rdar://10594409
llvm-svn: 147481
2012-01-03 21:05:52 +00:00
Devang Patel
c1215324a3
Intel style asm variant does not need '%' prefix.
...
llvm-svn: 147453
2012-01-03 18:22:10 +00:00
Craig Topper
5bacb7e9e5
Miscellaneous shuffle lowering cleanup. No functional changes. Primarily converting the indexing loops to unsigned to be consistent across functions.
...
llvm-svn: 147430
2012-01-02 09:17:37 +00:00
Craig Topper
53d559641f
Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. Also make it return false if there's not even a load at all. This makes the code better match the code in DAGCombiner that it tries to match. These two changes prevent some cases where vector_shuffles were making it to instruction selection and causing the older shuffle selection code to be triggered. Also needed to fix a bad pattern that this change exposed. This is the first step towards getting rid of the old shuffle selection support. No test cases yet because there's no way to tell whether a shuffle was handled in the legalize stage or at instruction selection.
...
llvm-svn: 147428
2012-01-02 08:46:48 +00:00
Nadav Rotem
6c7a0e6c8b
Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend instructions only look at the highest bit.
...
llvm-svn: 147426
2012-01-02 08:05:46 +00:00
Craig Topper
b910984458
Allow CRC32 instructions to be selected when AVX is enabled.
...
llvm-svn: 147411
2012-01-01 19:51:58 +00:00
Craig Topper
1c064e0a89
Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers.
...
llvm-svn: 147409
2012-01-01 19:40:22 +00:00
Benjamin Kramer
47aecca51a
X86Disassembler: Fix undefined behavior found by GCC 4.6
...
llvm-svn: 147404
2012-01-01 17:55:36 +00:00
Craig Topper
6e54ba7eee
Merge X86 SHUFPS and SHUFPD node types.
...
llvm-svn: 147394
2011-12-31 23:50:21 +00:00
Craig Topper
d51092d93a
Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load.
...
llvm-svn: 147393
2011-12-31 23:24:49 +00:00
Craig Topper
0e796fee11
Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected.
...
llvm-svn: 147392
2011-12-31 23:15:11 +00:00
Craig Topper
a5d1fc2cc7
Make FMA4 imply AVX so that YMM registers would be available. Necessitates removing from Bulldozer CPU types since it would enable AVX code generation implicitly. Also make SSE4A imply SSE3. Without some level of SSE implied, XMM registers wouldn't be legal.
...
llvm-svn: 147369
2011-12-30 07:16:00 +00:00
Craig Topper
2ba766ae84
Add disassembler support for VPERMIL2PD and VPERMIL2PS.
...
llvm-svn: 147368
2011-12-30 06:23:39 +00:00
Craig Topper
03a0beda88
Add FMA4 instructions to disassembler.
...
llvm-svn: 147367
2011-12-30 05:20:36 +00:00
Craig Topper
cd93de93fa
Separate the concept of having memory access in operand 4 from the concept of having the W bit set for XOP instructions. Removes ORing W-bits in the encoder and will similarly simplify the disassembler implementation.
...
llvm-svn: 147366
2011-12-30 04:48:54 +00:00
Craig Topper
c0f9bcb5d5
Combine FMA4 SS/SD patterns with the instruction definitions.
...
llvm-svn: 147365
2011-12-30 03:33:59 +00:00
Craig Topper
51fe43fcd9
Combine FMA4 PS/PD patterns with the instruction definitions.
...
llvm-svn: 147364
2011-12-30 03:17:15 +00:00
Craig Topper
6c08930c5e
Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms.
...
llvm-svn: 147361
2011-12-30 02:18:36 +00:00
Craig Topper
2ca79b9d4b
Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere.
...
llvm-svn: 147360
2011-12-30 01:49:53 +00:00
Craig Topper
d773607eee
Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms of FMA3 instructions.
...
llvm-svn: 147353
2011-12-29 20:43:40 +00:00
Craig Topper
8cab06a214
Expose FMA3 instructions to the disassembler.
...
llvm-svn: 147351
2011-12-29 20:03:14 +00:00
Craig Topper
e1bd05128e
Make FMA3 imply AVX, particularly because 256-bit types aren't valid unless AVX is enabled.
...
llvm-svn: 147349
2011-12-29 19:46:19 +00:00
Craig Topper
dd286a5201
Change XOP detection to use the correct CPUID bit instead of using the FMA4 bit.
...
llvm-svn: 147348
2011-12-29 19:25:56 +00:00
Craig Topper
a060afb5ba
Add FeaturePOPCNT to all CPU types that lost it when it was removed from SSE42/SSE4A in r147339.
...
llvm-svn: 147347
2011-12-29 18:47:31 +00:00
Craig Topper
97f05c5768
Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled along with CLMUL. That's required for the XMM registers to be valid for integer data. Doesn't change any behavior since the CLMUL instructions don't have patterns yet.
...
llvm-svn: 147345
2011-12-29 18:08:36 +00:00
Craig Topper
1559123c77
Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along with AES. Since that's required for the XMM registers to be valid for integer data. Doesn't change any behavior though since you can't use an intrinsic with an illegal type anyway. Just makes it consistent with the VEX forms.
...
llvm-svn: 147344
2011-12-29 18:00:08 +00:00
Craig Topper
9e61291bf5
Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
...
llvm-svn: 147342
2011-12-29 17:41:56 +00:00
Craig Topper
7bd3305f3e
Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A.
...
llvm-svn: 147339
2011-12-29 15:51:45 +00:00
Craig Topper
0fdf720ded
Make LowerBUILD_VECTOR keep node vector types consistent when creating MOVL for v16i16 and v32i8.
...
llvm-svn: 147337
2011-12-29 03:34:54 +00:00
Craig Topper
862c9b65be
Remove some elses after returns.
...
llvm-svn: 147336
2011-12-29 03:20:51 +00:00
Craig Topper
274e20a499
Remove trailing spaces. Fix an assert to use && instead of || before string. Add same assert on similar code path.
...
llvm-svn: 147335
2011-12-29 03:09:33 +00:00
Eli Friedman
3a01ddb7e9
Fix type-checking for load transformation which is not legal on floating-point types. PR11674.
...
llvm-svn: 147323
2011-12-28 21:24:44 +00:00
Elena Demikhovsky
b3515a8d4b
Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR.
...
Matching MOVLP mask for AVX (256-bit vectors) was wrong.
The failure was detected by conformance tests.
llvm-svn: 147308
2011-12-28 08:14:01 +00:00
Craig Topper
df34d152bd
Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for consistency. Add comments and an assert for BMI instructions to PerformXorCombine since the enabling of the combine is conditional on it, but the function itself isn't.
...
llvm-svn: 147287
2011-12-27 06:27:23 +00:00
Rafael Espindola
a56ab0ede7
Section relative fixups are a COFF concept, not an x86 one. Replace the
...
x86 specific reloc_coff_secrel32 with a generic FK_SecRel_4.
llvm-svn: 147252
2011-12-24 14:47:52 +00:00
Chandler Carruth
a3d54fe0ae
Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the
...
LZCNT instructions are available. Force promotion to i32 to get
a smaller encoding since the fix-ups necessary are just as complex for
either promoted type.
We can't do standard promotion for CTLZ when lowering through BSR
because it results in poor code surrounding the 'xor' at the end of this
instruction. Essentially, if we promote the entire CTLZ node to i32, we
end up doing the xor on a 32-bit CTLZ implementation, and then
subtracting appropriately to get back to an i8 value. Instead, our
custom logic just uses the knowledge of the incoming size to compute
a perfect xor. I'd love to know of a way to fix this, but so far I'm
drawing a blank. I suspect the legalizer could be more clever and/or it
could collude with the DAG combiner, but how... ;]
llvm-svn: 147251
2011-12-24 12:12:34 +00:00
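The trade-off described in r147251 can be sketched in C++ as follows. This is an illustration under stated assumptions, not the backend code: `msbIndex` is a hypothetical stand-in for `bsr`, and the two wrappers model the custom i8 lowering and the standard promotion to i32.

```cpp
#include <cassert>
#include <cstdint>

// Index of the most significant set bit, which is what 'bsr' computes.
// Like 'bsr', this is only meaningful for a nonzero input.
unsigned msbIndex(uint32_t X) {
  assert(X != 0 && "bsr result is undefined for zero");
  unsigned Index = 0;
  while (X >>= 1)
    ++Index;
  return Index;
}

// Custom i8 lowering: a single xor against 7, using the known 8-bit width.
unsigned ctlz8Custom(uint8_t X) { return 7u ^ msbIndex(X); }

// Standard promotion to i32: xor against 31, then subtract the 24 extra
// leading zeros introduced by zero-extension.
unsigned ctlz8Promoted(uint8_t X) { return (31u ^ msbIndex(X)) - 24u; }
```

Both forms give the same answer; the promoted form simply needs the extra subtraction that the commit message calls poor code around the 'xor'.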
Chandler Carruth
38ce24455d
Add systematic testing for cttz as well, and fix the bug I spotted by
...
inspection earlier.
llvm-svn: 147250
2011-12-24 11:46:10 +00:00
Benjamin Kramer
767bbe48c1
Chandler fixed this.
...
llvm-svn: 147247
2011-12-24 11:23:32 +00:00
Chandler Carruth
c9fcde2347
Expand more when we have a nice 'tzcnt' instruction, to avoid generating
...
'bsf' instructions here.
This one is actually debatable to my eyes. It's not clear that any chip
implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless
EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding.
Still, this restores the old behavior with 'tzcnt' enabled for now.
llvm-svn: 147246
2011-12-24 11:11:38 +00:00
Chandler Carruth
7e9453e916
Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the
...
X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:
(sizeof(x)*8 - 1) ^ __builtin_clz(x)
Which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.
The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.
Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.
These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improved on in this
type of code.
llvm-svn: 147244
2011-12-24 10:55:54 +00:00
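To make the redundancy described in r147244 concrete, here is a small C++ sketch. `bsr32` is a hypothetical software model of the 'bsr' instruction; the point is only that the user-level idiom applies a second xor that cancels the one introduced by the lowering.

```cpp
#include <cassert>
#include <cstdint>

// Software model of 'bsr': index of the most significant set bit.
unsigned bsr32(uint32_t X) {
  assert(X != 0 && "bsr result is undefined for zero");
  unsigned Index = 0;
  while (X >>= 1)
    ++Index;
  return Index;
}

// CTLZ_ZERO_UNDEF lowered as an xor wrapped around bsr. For an index in
// [0, 31], 31 ^ Index equals 31 - Index because 31 is all ones in 5 bits.
unsigned clz32(uint32_t X) { return 31u ^ bsr32(X); }

// The source idiom from the commit message. The two xors with 31 cancel,
// so the result is exactly what bsr produced in the first place.
unsigned msbFromClz(uint32_t X) {
  return static_cast<unsigned>((sizeof(X) * 8 - 1) ^ clz32(X));
}
```

If the outer xor is matched only after instruction selection, both xors survive into the output even though they cancel.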
Rafael Espindola
908d2ed14e
Move x86 specific bits of the COFF writer to lib/Target/X86.
...
llvm-svn: 147231
2011-12-24 02:14:02 +00:00
Chad Rosier
00bbedff03
Fix 80-column violations.
...
llvm-svn: 147192
2011-12-22 22:35:21 +00:00
Chad Rosier
3172488cc0
Fix 80-column violations.
...
llvm-svn: 147095
2011-12-21 20:59:09 +00:00
Chad Rosier
3ede414127
No case stmt for BUILD_VECTOR in PerformDAGCombine(), so I assume this isn't
...
necessary. Please chime in if I'm mistaken.
llvm-svn: 147065
2011-12-21 19:14:52 +00:00
Rafael Espindola
b264d33854
Move the X86 specific bits of the ELF writer to the Target/X86 directory.
...
Other targets will follow shortly.
llvm-svn: 147060
2011-12-21 17:30:17 +00:00
Rafael Espindola
1ad4095d6b
Reduce the exposure of Triple::OSType in the ELF object writer. This will
...
avoid including ADT/Triple.h in many places when the target specific bits are
moved.
llvm-svn: 147059
2011-12-21 17:00:36 +00:00
Craig Topper
b8b1b4c1de
Remove mode specific disassembler classes and just call X86GenericDisassembler constructor with appropriate argument in the creation functions. This removes a few tables that needed to be anchored.
...
llvm-svn: 147046
2011-12-21 08:06:52 +00:00
Craig Topper
f30188418b
Fix typo in a couple comments
...
llvm-svn: 147045
2011-12-21 06:30:53 +00:00
Elena Demikhovsky
ec7e6e0946
This is the second fix related to VZEXT_MOVL node.
...
The failure that I see in the current version is:
LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14]
0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13]
0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12]
0x18b9870: v4i64 = undef [ID=4]
0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10]
0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9770: i32 = TargetConstant<0> [ID=6]
0x18b9970: i32 = Constant<0> [ID=3]
0x18b9170: v2i64 = undef [ORD=1] [ID=1]
0x18b9570: i32 = Constant<2> [ID=5]
llvm-svn: 146975
2011-12-20 13:34:28 +00:00
Chandler Carruth
24680c24d8
Begin teaching the X86 target how to efficiently codegen patterns that
...
use the zero-undefined variants of CTTZ and CTLZ. These are just simple
patterns for now, there is more to be done to make real world code using
these constructs be optimized and codegen'ed properly on X86.
The existing tests are spiffed up to check that we no longer generate
unnecessary cmov instructions, and that we generate the very important
'xor' to transform bsr which counts the index of the most significant
one bit to the number of leading (most significant) zero bits. Also they
now check that when the variant with defined zero result is used, the
cmov is still produced.
llvm-svn: 146974
2011-12-20 11:19:37 +00:00
Chandler Carruth
e805b16e3d
Fix up the CMake build for the new files added in r146960, they're
...
likely to stay either way that discussion ends up resolving itself.
llvm-svn: 146966
2011-12-20 08:42:11 +00:00
David Blaikie
a379b18173
Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
...
llvm-svn: 146960
2011-12-20 02:50:00 +00:00
Jakob Stoklund Olesen
c7b437ae34
Emit a getMatchingSuperRegClass() implementation for every target.
...
Use information computed while inferring new register classes to emit
accurate, table-driven implementations of getMatchingSuperRegClass().
Delete the old manual, error-prone implementations in the targets.
llvm-svn: 146873
2011-12-19 16:53:34 +00:00
Benjamin Kramer
1b54835a10
Another variadics tweak.
...
llvm-svn: 146852
2011-12-18 20:51:31 +00:00
Benjamin Kramer
530b820500
Use the fancy new VariadicFunction template instead of a plain variadic function.
...
Some compilers were complaining about passing StringRef to it.
llvm-svn: 146850
2011-12-18 19:59:20 +00:00
Craig Topper
a913dde0ef
Remove an unused X86ISD node type.
...
llvm-svn: 146833
2011-12-17 19:16:44 +00:00
Benjamin Kramer
792edd3c75
X86: Factor the bswap asm matching to be slightly less horrible to read.
...
llvm-svn: 146831
2011-12-17 14:36:05 +00:00
Rafael Espindola
d3df3d3527
Add back the MC bits of 126425. Original patch by Nathan Jeffords. I added the
...
asm parsing and testcase.
llvm-svn: 146801
2011-12-17 01:14:52 +00:00
Lang Hames
da07b3ad42
Make sure that the lower bits on the VSELECT condition are properly set.
...
llvm-svn: 146800
2011-12-17 01:08:46 +00:00
Craig Topper
a4d411cb1b
Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes.
...
llvm-svn: 146726
2011-12-16 08:06:31 +00:00
Eli Friedman
64944090ff
Make sure we correctly note the existence of an i8 immediate for vblendvps and friends, so we compute fixups correctly. PR11586.
...
llvm-svn: 146709
2011-12-15 23:46:18 +00:00
Chad Rosier
41dbf59e12
Add missing zmovl AVX patterns which were causing crashes.
...
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!
llvm-svn: 146689
2011-12-15 22:11:31 +00:00
Chad Rosier
75ed9dcbc6
Fix assert in LowerBUILD_VECTOR for v16i16 type on AVX.
...
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!
llvm-svn: 146684
2011-12-15 21:34:44 +00:00
Lang Hames
c44b5e469b
Fix VSELECT operand order. Was previously backwards, causing bogus vector shift results - <rdar://problem/10559581>.
...
llvm-svn: 146671
2011-12-15 18:57:27 +00:00
Chad Rosier
b7a0b89ff0
Use SmallVector/assign(), rather than std::vector/push_back().
...
llvm-svn: 146627
2011-12-15 01:16:09 +00:00
Chad Rosier
1940baa76b
Add support for lowering fneg when AVX is enabled.
...
rdar://10566486
llvm-svn: 146625
2011-12-15 01:02:25 +00:00
Bill Wendling
ae94fb4009
The saved registers weren't being processed in the correct order. This led to
...
the compact unwind claiming that one register was saved before another, which
isn't all that great in general. Process them in the natural order. Reverse the
list only when necessary for the algorithm.
llvm-svn: 146612
2011-12-14 23:53:24 +00:00
Evan Cheng
7fae11b231
- Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
...
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasicBlock and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
prevent IT blocks from being broken apart.
llvm-svn: 146542
2011-12-14 02:11:42 +00:00
Chandler Carruth
637cc6a8aa
Initial CodeGen support for CTTZ/CTLZ where a zero input produces an
...
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.
Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.
Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.
llvm-svn: 146466
2011-12-13 01:56:10 +00:00
Daniel Dunbar
8889bb08b8
LLVMBuild: Introduce a common section which currently has a list of the
...
subdirectories to traverse into.
- Originally I wanted to avoid this and just autoscan, but this has one key
flaw in that new subdirectories can not automatically trigger a rerun of the
llvm-build tool. This is particularly a pain when switching back and forth
between trees where one has added a subdirectory, as the dependencies will
tend to be wrong. This also eliminates a FIXME implicitly.
llvm-svn: 146436
2011-12-12 22:45:54 +00:00
Daniel Dunbar
27a7489a03
LLVMBuild: Remove trailing newline, which irked me.
...
llvm-svn: 146409
2011-12-12 19:48:00 +00:00
Jan Sjödin
7c0face455
XOP instructions and encoding tests.
...
llvm-svn: 146407
2011-12-12 19:37:49 +00:00
Jan Sjödin
6dd2488383
XOP encoding bits and logic.
...
llvm-svn: 146397
2011-12-12 19:12:26 +00:00
Craig Topper
1fdfec63a4
Remove some remnants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate since the only caller already had done the cast.
...
llvm-svn: 146344
2011-12-11 19:12:35 +00:00
Rafael Espindola
c7f355b8e1
Handle expressions of the form _GLOBAL_OFFSET_TABLE_-symbol the same way gas
...
does. The _GLOBAL_OFFSET_TABLE_ is still magical in that we get a R_386_GOTPC,
but it doesn't change the immediate in the same way as when the expression
has no right hand side symbol.
llvm-svn: 146311
2011-12-10 02:28:43 +00:00
Benjamin Kramer
863683c590
This is now implemented.
...
llvm-svn: 146258
2011-12-09 15:45:57 +00:00
Benjamin Kramer
16bbfbec66
X86: Add patterns for the various rounding ops for SSE4.1 and AVX.
...
llvm-svn: 146257
2011-12-09 15:44:03 +00:00
Benjamin Kramer
2dc5dec41d
X86: Split (v)rounds[sd] into a normal and an intrinsic version.
...
llvm-svn: 146256
2011-12-09 15:43:55 +00:00
Evan Cheng
557cda7f1d
Remove hasSSE1orAVX(). It's the same as hasXMM().
...
llvm-svn: 146246
2011-12-09 06:32:46 +00:00
Evan Cheng
b96bca81e7
Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417
...
llvm-svn: 146196
2011-12-08 22:30:45 +00:00
Evan Cheng
2a217be25f
Add various missing AVX patterns whose absence was causing crashes. Sadly, the generated
...
code looks pretty bad compared to SSE.
rdar://10538793
llvm-svn: 146191
2011-12-08 22:05:28 +00:00
Owen Anderson
57a7f41d5d
Don't explicitly mark libm rounding ops as legal on SSE4.1/AVX. There don't seem to be patterns for these, so I don't know why they were marked legal in the first place.
...
Fixes failures caused by r146171.
llvm-svn: 146180
2011-12-08 20:51:38 +00:00
Owen Anderson
0b9b9da6c8
Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
...
llvm-svn: 146171
2011-12-08 19:32:14 +00:00
Evan Cheng
4d1a2d449f
Many of the SSE patterns should not be selected when AVX is available. This led to the following code in X86Subtarget.cpp
...
if (HasAVX)
X86SSELevel = NoMMXSSE;
This is so patterns that are predicated on hasSSE3, etc. would not be selected when avx is available. Instead, the AVX variant is selected.
However, this breaks instructions which do not have AVX variants.
The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX().
Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change.
However, we need to audit all the patterns before we make the change. This patch is a workaround that fixes one specific case,
the prefetch instructions. rdar://10538297
llvm-svn: 146163
2011-12-08 19:00:42 +00:00
Jan Sjödin
d19760a40c
Src2 and src3 were accidentally swapped for the FMA4 rr patterns. Undo this and fix the encoding.
...
llvm-svn: 146151
2011-12-08 14:43:19 +00:00
Craig Topper
1d578e8835
Fix a bunch of SSE/AVX patterns to use proper memop types. In particular, not using integer loads other than v2i64/v4i64 since the others are all promoted.
...
llvm-svn: 146031
2011-12-07 08:30:53 +00:00
Bill Wendling
302cf8d5d0
Adjust the stack by one pointer size for all frameless stacks.
...
llvm-svn: 146030
2011-12-07 07:58:55 +00:00
Bill Wendling
3c86459997
Fix off-by-one error when encoding the stack size for a frameless stack.
...
llvm-svn: 146029
2011-12-07 07:49:49 +00:00
Evan Cheng
7f8e563a69
Add bundle aware API for querying instruction properties and switch the code
...
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.
For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.
llvm-svn: 146026
2011-12-07 07:15:52 +00:00
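The any/all distinction described in r146026 can be sketched in C++ like this. The types and names (`InstrFlags`, `QueryKind`, `bundleHasProperty`) are hypothetical and do not reproduce the LLVM API; the sketch shows only the aggregation rule.

```cpp
#include <cassert>
#include <vector>

// How a per-instruction property is aggregated over a bundle.
enum QueryKind { AnyInBundle, AllInBundle };

// Hypothetical per-instruction property flags.
struct InstrFlags {
  bool MayLoad;
  bool Predicable;
};

// Returns whether the bundle as a whole has the property selected by Flag.
bool bundleHasProperty(const std::vector<InstrFlags> &Bundle,
                       bool InstrFlags::*Flag, QueryKind Kind) {
  assert(!Bundle.empty() && "sketch assumes a non-empty bundle");
  for (const InstrFlags &I : Bundle) {
    if (Kind == AnyInBundle && I.*Flag)
      return true;  // one instruction with the property is enough
    if (Kind == AllInBundle && !(I.*Flag))
      return false; // one instruction without it is enough to fail
  }
  // Any: nothing matched. All: nothing failed.
  return Kind == AllInBundle;
}
```

A property such as mayLoad would be queried with `AnyInBundle`, while isPredicable would use `AllInBundle`.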
Bill Wendling
67a70c995a
Explicitly check for the different SUB instructions.
...
llvm-svn: 145976
2011-12-06 22:14:27 +00:00
Bill Wendling
5a173cd367
Encode the total stack if there isn't a frame.
...
llvm-svn: 145969
2011-12-06 21:34:01 +00:00
Bill Wendling
a73c0c99ea
* Add a macro to remove a magic number.
...
* Rename variables to reflect what they're actually used for.
llvm-svn: 145968
2011-12-06 21:23:42 +00:00
Bill Wendling
87571b6392
Check the correct value for small stack sizes. Also modify some comments.
...
llvm-svn: 145954
2011-12-06 19:16:17 +00:00
Bill Wendling
a4e87944a8
For a small sized stack, we encode that value directly with no "stack adjust" value.
...
llvm-svn: 145952
2011-12-06 19:09:06 +00:00
Craig Topper
83320e03e6
Add X86ISD::HADD/HSUB to getTargetNodeName
...
llvm-svn: 145929
2011-12-06 09:31:36 +00:00
Craig Topper
6572e0f203
Fix a bunch of SSE/AVX patterns to use v2i64/v4i64 loads since all other integer vector loads are promoted to those.
...
llvm-svn: 145927
2011-12-06 09:04:59 +00:00
Craig Topper
8d4ba198d6
Merge floating point and integer UNPCK X86ISD node types.
...
llvm-svn: 145926
2011-12-06 08:21:25 +00:00
Craig Topper
3cb802c775
Clean up some of the shuffle decoding code for UNPCK instructions. Add instruction commenting for AVX/AVX2 forms for integer UNPCKs.
...
llvm-svn: 145924
2011-12-06 05:31:16 +00:00
Craig Topper
bf41eb3a98
Merge isSHUFPMask and isCommutedSHUFPMask into single function that can do both. Do the same for the 256-bit version. Use loops to reduce size of isVSHUFPYMask. Fix test cases that were incorrectly passing due to isCommutedSHUFPMask not checking for the vector being 128-bit. This caused some 256-bit shuffles to be incorrectly commuted.
...
llvm-svn: 145921
2011-12-06 04:59:07 +00:00
Bill Wendling
4e87e850a2
Add a comment.
...
llvm-svn: 145896
2011-12-06 01:57:48 +00:00
Jakob Stoklund Olesen
10e1252269
Use logarithmic units for basic block alignment.
...
This was actually a bit of a mess. TLI.setPrefLoopAlignment was clearly
documented as taking log2(bytes) units, but the x86 target would still
set a preferred loop alignment of '16'.
CodePlacementOpt passed this number on to the basic block, and
AsmPrinter interpreted it as bytes.
Now both MachineFunction and MachineBasicBlock use logarithmic
alignments.
Obviously, MachineConstantPool still measures alignments in bytes, so we
can emulate the thrill of using as.
llvm-svn: 145889
2011-12-06 01:26:19 +00:00
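The unit mismatch fixed in r145889 is easy to see with two small conversion helpers. This is a C++ sketch with hypothetical function names, not code from the patch.

```cpp
#include <cassert>

// Convert a power-of-two byte alignment to log2 units.
unsigned log2Alignment(unsigned Bytes) {
  assert(Bytes != 0 && (Bytes & (Bytes - 1)) == 0 && "must be a power of two");
  unsigned Log = 0;
  while (Bytes > 1) {
    Bytes >>= 1;
    ++Log;
  }
  return Log;
}

// Convert a log2 alignment back to bytes.
unsigned alignmentInBytes(unsigned Log2) {
  assert(Log2 < 32 && "shift would overflow a 32-bit unsigned");
  return 1u << Log2;
}
```

A 16-byte loop alignment is 4 in log2 units. Passing the literal 16 to an interface that expects log2 units would instead request 65536-byte alignment, which is why the units have to agree end to end.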
Bill Wendling
f7cef7ecad
The compact encodings of the registers are 3 bits each. Make sure we shift the
...
value over that much.
llvm-svn: 145888
2011-12-06 01:26:14 +00:00
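A C++ sketch of the kind of packing r145888 refers to, where each register number occupies its own 3-bit field. The field order, the six-slot limit, and the function name are assumptions made for illustration; the real compact unwind layout is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack register numbers into consecutive 3-bit fields, lowest field first.
uint32_t packRegisters(const unsigned *Regs, size_t Count) {
  assert(Count <= 6 && "sketch assumes at most six 3-bit fields");
  uint32_t Encoding = 0;
  for (size_t I = 0; I < Count; ++I) {
    assert(Regs[I] < 8 && "each register number must fit in 3 bits");
    // Shift by a full field width so neighbouring fields do not overlap.
    Encoding |= static_cast<uint32_t>(Regs[I]) << (I * 3);
  }
  return Encoding;
}
```

Shifting by less than 3 bits per slot would make adjacent register numbers overlap and corrupt one another.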
Jim Grosbach
25b63fa117
Move target-specific logic out of generic MCAssembler.
...
Whether a fixup needs relaxation for the associated instruction is a
target-specific function, as the FIXME indicated. Create a hook for that
and use it.
llvm-svn: 145881
2011-12-06 00:47:03 +00:00
Craig Topper
51bec1a37a
Remove some leftover remnants that once tried to create 64-bit MMX PALIGNR instructions.
...
llvm-svn: 145804
2011-12-05 07:27:14 +00:00
Craig Topper
6a55b1dd9f
Clean up and optimizations to the X86 shuffle lowering code. No functional change.
...
llvm-svn: 145803
2011-12-05 06:56:46 +00:00
Sanjoy Das
006e43bcc0
Check for stack space more intelligently.
...
libgcc sets the stack limit field in TCB to 256 bytes above the actual
allocated stack limit. This means if the function's stack frame needs
less than 256 bytes, we can just compare the stack pointer with the
stack limit. This should result in fewer calls to __morestack.
llvm-svn: 145766
2011-12-03 09:32:07 +00:00
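The check described in r145766 can be modelled in C++ roughly as follows. The function and constant names are hypothetical, and the exact comparison direction and boundary handling are assumptions; only the 256-byte slack and the two-path structure come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Slack that libgcc leaves between the advertised stack limit and the
// actual end of the allocated stack, per the commit message.
const uint64_t StackLimitSlack = 256;

// Returns true if the prologue should call __morestack.
bool needsMoreStack(uint64_t StackPointer, uint64_t StackLimit,
                    uint64_t FrameSize) {
  assert(StackPointer >= FrameSize && "sketch does not model wraparound");
  if (FrameSize < StackLimitSlack)
    return StackPointer < StackLimit; // small frame: compare sp directly
  return StackPointer - FrameSize < StackLimit; // large frame: full check
}
```

For a small frame the slack absorbs the frame itself, so no subtraction is needed before the comparison.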
Sanjoy Das
165ca1d4ba
Fix a bug in the x86-32 code generated for segmented stacks.
...
Currently LLVM pads the call to __morestack with an add and sub of 8
bytes to esp. This isn't correct since __morestack expects the call
to be followed directly by a ret.
This commit also adjusts the relevant test-case.
llvm-svn: 145765
2011-12-03 09:21:07 +00:00
Nick Lewycky
8fd1254a0a
Creating multiple JITs on X86 in multiple threads causes multiple writes (of
...
the same value) to this variable. This code could be refactored, but it doesn't
matter since the old JIT is going away. Add tsan annotations to ignore the
race.
llvm-svn: 145745
2011-12-03 02:45:50 +00:00
Nick Lewycky
50f02cb21b
Move global variables in TargetMachine into new TargetOptions class. As an API
...
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.
One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.
llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Jan Sjödin
1280eb1d06
Add XOP feature flag.
...
llvm-svn: 145682
2011-12-02 15:14:37 +00:00
Craig Topper
b67440367f
Reduce duplicate code in isHorizontalBinOp and add some asserts to protect assumptions
...
llvm-svn: 145681
2011-12-02 08:18:41 +00:00
Craig Topper
abeb79eee3
Add instruction selection support for horizontal add/sub of 256-bit floating point vectors. Also add the test case for 256-bit integer vectors.
...
llvm-svn: 145680
2011-12-02 07:16:01 +00:00
Sanjoy Das
f60485c4cf
Dummy commit to check commit access.
...
llvm-svn: 145619
2011-12-01 19:15:08 +00:00
Eric Christopher
9da7f305a4
For 64-bit the rest of the general regs are ok for the q constraint. Make
...
sure we can emit both the high and low versions of those registers.
Fixes rdar://10392864
llvm-svn: 145579
2011-12-01 08:12:41 +00:00
Eli Friedman
d61887dd0a
Pass AVX vectors which are arguments to varargs functions on the stack. <rdar://problem/10463281>.
...
llvm-svn: 145573
2011-12-01 04:49:21 +00:00
Jan Sjödin
9430e284a9
Support for encoding all FMA4 instructions and tablegen patterns for all
...
remaining FMA4 instructions and intrinsics with tests.
llvm-svn: 145525
2011-11-30 22:09:42 +00:00
Benjamin Kramer
5feb3dab79
X86: Turns out bulldozer also supports sse42 and lzcnt.
...
While at it remove the barcelona/instanbul/shanghai subtargets, they're
unsupported by GCC and look pretty broken.
llvm-svn: 145494
2011-11-30 15:48:16 +00:00
Benjamin Kramer
981f32327d
X86: Add subtargets for AMD's bulldozer.
...
llvm-svn: 145493
2011-11-30 15:27:46 +00:00
Nadav Rotem
96923cc2bb
X86: PerformOrCombine introduced a vselect node with a wrong order of operands. This bug was introduced when a dedicated blend sdnode was replaced with the vselect node (in 139479).
...
llvm-svn: 145488
2011-11-30 10:13:37 +00:00
Craig Topper
c4977ba413
Add instruction selection support for AVX2 horizontal add/sub instructions.
...
llvm-svn: 145487
2011-11-30 09:10:50 +00:00
Craig Topper
0a672eaf9e
Merge VPERM2F128/VPERM2I128 ISD node types.
...
llvm-svn: 145485
2011-11-30 07:47:51 +00:00
Craig Topper
bafd224c8b
Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.
...
llvm-svn: 145483
2011-11-30 06:25:25 +00:00
Evan Cheng
648e48d02e
Add another missing pattern. llvm-gcc likes f64 but clang likes i64 so it was generating poor code for some SSE builtins.
...
llvm-svn: 145448
2011-11-29 22:48:34 +00:00
Jakob Stoklund Olesen
bde32d36bb
Make X86::FsFLD0SS / FsFLD0SD real pseudo-instructions.
...
Like V_SET0, these instructions are expanded by ExpandPostRA to xorps /
vxorps so they can participate in execution domain swizzling.
This also makes the AVX variants redundant.
llvm-svn: 145440
2011-11-29 22:27:25 +00:00
Daniel Dunbar
539d0a8a09
build/CMake: Finish removal of add_llvm_library_dependencies.
...
llvm-svn: 145420
2011-11-29 19:25:30 +00:00
Michael J. Spencer
de3a2118db
MC/X86/COFF: Allow quotes in names when targeting MS/Windows,
...
as MC is the only assembler we support.
This splits MS/Windows and GNU/Windows ASM infos into two separate classes.
While there is currently only one difference, full MS C++ ABI support will
require many more.
llvm-svn: 145409
2011-11-29 18:00:06 +00:00
Elena Demikhovsky
7a81dea516
Fixed vsqrt.ss intrinsic usage - order of input operands was wrong.
...
Added a test.
Thanks Bruno for reviewing the patch.
llvm-svn: 145403
2011-11-29 15:00:45 +00:00
Craig Topper
1d63ae3731
Fix shuffle decoding for memory forms for (V)SHUFPS/D.
...
llvm-svn: 145392
2011-11-29 07:58:09 +00:00
Craig Topper
c16db840be
Fix issues in shuffle decoding around VPERM* instructions. Fix shuffle decoding for VSHUFPS/D for 256-bit types. Add pattern matching for memory forms of VPERMILPS/VPERMILPD.
...
llvm-svn: 145390
2011-11-29 07:49:05 +00:00
Craig Topper
12b72def4e
Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled.
...
llvm-svn: 145376
2011-11-29 05:37:58 +00:00
Craig Topper
897a7d4b9c
Correctly mark VPERM2F128 as being an FP instruction and add execution domain fixing support to convert it to VPERM2I128 for AVX2.
...
llvm-svn: 145370
2011-11-29 03:57:34 +00:00
Evan Cheng
aa93ceb164
Add missing avx pattern.
...
llvm-svn: 145272
2011-11-28 20:27:23 +00:00
Craig Topper
818a983e93
Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
...
llvm-svn: 145238
2011-11-28 10:14:51 +00:00
Craig Topper
b0456936da
Make isCommutedVSHUFP more like the way isCommutedSHUFP is handled.
...
llvm-svn: 145218
2011-11-28 01:14:24 +00:00
Craig Topper
79ee88a511
Merge detecting and handling for VSHUFPSY and VSHUFPDY since a lot of the code was similar for both.
...
llvm-svn: 145199
2011-11-27 21:41:12 +00:00
Craig Topper
51280d565b
Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonicalization that occurs when shuffle nodes are created.
...
llvm-svn: 145153
2011-11-26 22:55:48 +00:00
Craig Topper
7704bd7ac3
Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.
...
llvm-svn: 145148
2011-11-26 20:47:44 +00:00
Bruno Cardoso Lopes
0f9a1f5e6c
This patch contains support for encoding FMA4 instructions and
...
tablegen patterns for scalar FMA4 operations and intrinsic. Also
add tests for vfmaddsd.
Patch by Jan Sjodin
llvm-svn: 145133
2011-11-25 19:33:42 +00:00
Craig Topper
d65a444478
Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
...
llvm-svn: 145126
2011-11-24 22:57:10 +00:00
Craig Topper
d26466748b
Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.
...
llvm-svn: 145125
2011-11-24 22:20:08 +00:00
Benjamin Kramer
651db37352
X86: alias cqo to cqto.
...
llvm-svn: 145121
2011-11-24 12:02:46 +00:00
Benjamin Kramer
ebcb451874
X86: Use btq for bit tests if the immediate can't be encoded in 32 bits.
...
Before:
movabsq $4294967296, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00]
testq %rax, %rdi ## encoding: [0x48,0x85,0xf8]
jne LBB0_2 ## encoding: [0x75,A]
After:
btq $32, %rdi ## encoding: [0x48,0x0f,0xba,0xe7,0x20]
jb LBB0_2 ## encoding: [0x72,A]
btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off
saving one register and a giant movabsq.
llvm-svn: 145103
2011-11-23 13:54:17 +00:00
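The decision described in r145103 can be sketched in C++ as below. The function name is hypothetical, and the 32-bit unsigned cutoff is a simplifying assumption about when the mask can no longer be encoded as an immediate.

```cpp
#include <cassert>
#include <cstdint>

// Returns the bit index to use with 'bt', or -1 to keep using 'test'.
int bitTestIndex(uint64_t Mask) {
  bool IsSingleBit = Mask != 0 && (Mask & (Mask - 1)) == 0;
  bool FitsIn32Bits = Mask <= 0xFFFFFFFFull;
  if (!IsSingleBit || FitsIn32Bits)
    return -1; // 'test' with an immediate still works, or 'bt' cannot apply
  int Index = 0;
  while ((Mask >>= 1) != 0)
    ++Index;
  assert(Index >= 32 && Index <= 63 && "only high bits reach this point");
  return Index;
}
```

For the mask 4294967296 from the commit message this yields 32, matching the `btq $32, %rdi` in the "After" listing.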
Elena Demikhovsky
779ba6d7b7
Added several lines to the X86 code generator that allow choosing
...
VSHUFPS/VSHUFPD instructions while lowering a VECTOR_SHUFFLE node. A commuted VSHUFP mask is also checked.
The patch was reviewed by Bruno.
llvm-svn: 145099
2011-11-23 10:23:16 +00:00
Jakob Stoklund Olesen
02845410f9
Fix PR11422.
...
This was a bug in keeping track of the available domains when merging
domain values.
The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.
Also add an assertion to catch future attempts at emitting AVX2
instructions.
llvm-svn: 145096
2011-11-23 04:03:08 +00:00
Craig Topper
83c4592619
More fixes to the X86InstComments for shuffle instructions. In particular add AVX flavors of many instructions and fix the destination operand for some of the existing AVX entries.
...
llvm-svn: 145063
2011-11-22 14:27:57 +00:00
Craig Topper
ccb7097509
Fix shuffle decoding logic to handle UNPCKLPS/UNPCKLPD on 256-bit vectors correctly. Add support for decoding UNPCKHPS/UNPCKHPD for AVX 128-bit and 256-bit forms.
...
llvm-svn: 145055
2011-11-22 01:57:35 +00:00
Craig Topper
f563977795
Add methods for querying minimum SSE version along with AVX. Simplifies all the places that had to check a version of SSE and AVX.
...
llvm-svn: 145053
2011-11-22 00:44:41 +00:00
Craig Topper
6270d072c5
Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
...
llvm-svn: 145028
2011-11-21 08:26:50 +00:00
Craig Topper
669199ca94
Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled.
...
llvm-svn: 145026
2011-11-21 06:57:39 +00:00
Craig Topper
a065238c6e
Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use AVX2 shifts when AVX2 is enabled.
...
llvm-svn: 145022
2011-11-21 01:12:36 +00:00
Craig Topper
e79761df73
Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
...
llvm-svn: 145005
2011-11-20 00:12:05 +00:00
Craig Topper
a3a6583694
Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled.
...
llvm-svn: 145004
2011-11-19 22:34:59 +00:00
Craig Topper
bac86038ac
Remove some of the special classes that worked around an old tablegen limitation of not being able to remove redundant bitconverts from patterns.
...
llvm-svn: 145003
2011-11-19 21:01:54 +00:00
Craig Topper
3af6ae089f
Custom lower AVX2 variable shift intrinsics to shl/srl/sra nodes and remove the intrinsic patterns.
...
llvm-svn: 144999
2011-11-19 17:46:46 +00:00
Craig Topper
f984efbfce
Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.
...
llvm-svn: 144989
2011-11-19 09:02:40 +00:00
Craig Topper
81390be00f
Collapse X86 PSIGNB/PSIGNW/PSIGND node types.
...
llvm-svn: 144988
2011-11-19 07:33:10 +00:00
Craig Topper
de6b73bb4d
Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
...
llvm-svn: 144987
2011-11-19 07:07:26 +00:00
Craig Topper
66e2b5a61e
Remove unused parameters from the AVX maskmov classes.
...
llvm-svn: 144985
2011-11-19 04:49:22 +00:00
Nadav Rotem
1ec141d0f9
Add AVX2 vpbroadcast support
...
llvm-svn: 144967
2011-11-18 02:49:55 +00:00
Craig Topper
f41e1d0246
Fix SSE/AVX integer comparison patterns to understand that all integer vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments.
...
llvm-svn: 144896
2011-11-17 07:49:38 +00:00
Craig Topper
f17b600577
Remove seemingly unnecessary duplicate VROUND definitions.
...
llvm-svn: 144885
2011-11-17 07:04:00 +00:00
Eli Friedman
20439a42b0
Turn on vzeroupper insertion on call boundaries for AVX; it works as far as I know, and I'd like to see wider testing.
...
llvm-svn: 144867
2011-11-17 00:21:52 +00:00
Evan Cheng
011538dc79
Another missing X86ISD::MOVLPD pattern. rdar://10450317
...
llvm-svn: 144839
2011-11-16 22:24:44 +00:00
Pete Cooper
48784ed5b7
Added missing comment about new custom lowering of DEC64
...
llvm-svn: 144811
2011-11-16 19:03:23 +00:00
Evan Cheng
ecb2908bf9
Sink codegen optimization level into MCCodeGenInfo alongside relocation model
...
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.
llvm-svn: 144788
2011-11-16 08:38:26 +00:00
Craig Topper
3ed7d9ee5a
Fix the execution domain on a bunch of SSE/AVX instructions.
...
llvm-svn: 144784
2011-11-16 07:30:46 +00:00
Craig Topper
07d8b5e2c9
Remove code to enable execution dependency fix pass on VR256. VR128 is sufficient after r144636.
...
llvm-svn: 144777
2011-11-16 05:02:04 +00:00
Nadav Rotem
37010002f2
AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code.
...
llvm-svn: 144720
2011-11-15 22:50:37 +00:00
Pete Cooper
7c7ba1baa1
Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used
...
by later instructions.
Only done for DEC64m right now.
Fixes <rdar://problem/6172640>
llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Jay Foad
0745e645e0
Remove some unnecessary includes of PseudoSourceValue.h.
...
llvm-svn: 144631
2011-11-15 07:24:32 +00:00
Craig Topper
649d1c5eec
Fix PR11370 for real. Prevents converting 256-bit FP instruction to AVX2 256-bit integer instructions when AVX2 isn't enabled.
...
llvm-svn: 144629
2011-11-15 06:39:01 +00:00
Craig Topper
05baa85f58
Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370.
...
llvm-svn: 144622
2011-11-15 05:55:35 +00:00
Jakob Stoklund Olesen
f8ad336bc4
Break false dependencies before partial register updates.
...
Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.
The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previous N instructions.
The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.
llvm-svn: 144602
2011-11-15 01:15:30 +00:00
Evan Cheng
fb13d32b3f
Add a missing pattern for X86ISD::MOVLPD. rdar://10436044
...
llvm-svn: 144566
2011-11-14 20:35:52 +00:00
Pete Cooper
890e02e854
Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered
...
Constant idx case is still done in tablegen but other cases are then expanded
Fixes <rdar://problem/10435460>
llvm-svn: 144557
2011-11-14 19:38:42 +00:00
Craig Topper
182b00a2e0
Add AVX2 version of instructions to load folding tables. Also add a bunch of missing SSE/AVX instructions.
...
llvm-svn: 144525
2011-11-14 08:07:55 +00:00
Craig Topper
a331515c82
Add neverHasSideEffects, mayLoad, and mayStore to many patternless SSE/AVX instructions. Remove MMX check from LowerVECTOR_SHUFFLE since MMX vector types won't go through it anyway.
...
llvm-svn: 144522
2011-11-14 06:46:21 +00:00
Craig Topper
b8bcb473e2
Add BLSI, BLSMSK, and BLSR to getTargetNodeName.
...
llvm-svn: 144502
2011-11-13 17:31:07 +00:00
Craig Topper
3dc75f9e3b
Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code.
...
llvm-svn: 144457
2011-11-12 09:58:49 +00:00
Daniel Dunbar
52823cc91c
build: Attempt to rectify inconsistencies between CMake and LLVMBuild versions of explicit dependencies.
...
- The hope is that we have a tool/test to verify these are accurate (and tight) soon.
llvm-svn: 144444
2011-11-12 02:10:57 +00:00