llvm-project

Commit Graph

Author	SHA1	Message	Date
Bill Wendling	ae94fb4009	The saved registers weren't being processed in the correct order. This led to the compact unwind claiming that one register was saved before another, which isn't all that great in general. Process them in the natural order. Reverse the list only when necessary for the algorithm. llvm-svn: 146612	2011-12-14 23:53:24 +00:00
Evan Cheng	7fae11b231	- Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function to finalize MI bundles (i.e. add BUNDLE instruction and computing register def and use lists of the BUNDLE instruction) and a pass to unpack bundles. - Teach more of MachineBasic and MachineInstr methods to be bundle aware. - Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to prevent IT blocks from being broken apart. llvm-svn: 146542	2011-12-14 02:11:42 +00:00
Chandler Carruth	637cc6a8aa	Initial CodeGen support for CTTZ/CTLZ where a zero input produces an undefined result. This adds new ISD nodes for the new semantics, selecting them when the LLVM intrinsic indicates that the undef behavior is desired. The new nodes expand trivially to the old nodes, so targets don't actually need to do anything to support these new nodes besides indicating that they should be expanded. I've done this for all the operand types that I could figure out for all the targets. Owners of various targets, please review and let me know if any of these are incorrect. Note that the expand behavior is conservatively correct, and exactly matches LLVM's current behavior with these operations. Ideally this patch will not change behavior in any way. For example the regtest suite finds the exact same instruction sequences coming out of the code generator. That's why there are no new tests here -- all of this is being exercised by the existing test suite. Thanks to Duncan Sands for reviewing the various bits of this patch and helping me get the wrinkles ironed out with expanding for each target. Also thanks to Chris for clarifying through all the discussions that this is indeed the approach he was looking for. That said, there are likely still rough spots. Further review much appreciated. llvm-svn: 146466	2011-12-13 01:56:10 +00:00
Daniel Dunbar	8889bb08b8	LLVMBuild: Introduce a common section which currently has a list of the subdirectories to traverse into. - Originally I wanted to avoid this and just autoscan, but this has one key flaw in that new subdirectories cannot automatically trigger a rerun of the llvm-build tool. This is particularly a pain when switching back and forth between trees where one has added a subdirectory, as the dependencies will tend to be wrong. This will also eliminate a FIXME implicitly. llvm-svn: 146436	2011-12-12 22:45:54 +00:00
Daniel Dunbar	27a7489a03	LLVMBuild: Remove trailing newline, which irked me. llvm-svn: 146409	2011-12-12 19:48:00 +00:00
Jan Sjödin	7c0face455	XOP instructions and encoding tests. llvm-svn: 146407	2011-12-12 19:37:49 +00:00
Jan Sjödin	6dd2488383	XOP encoding bits and logic. llvm-svn: 146397	2011-12-12 19:12:26 +00:00
Craig Topper	1fdfec63a4	Remove some remnants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate since the only caller already had done the cast. llvm-svn: 146344	2011-12-11 19:12:35 +00:00
Rafael Espindola	c7f355b8e1	Handle expressions of the form _GLOBAL_OFFSET_TABLE_-symbol the same way gas does. The _GLOBAL_OFFSET_TABLE_ is still magical in that we get a R_386_GOTPC, but it doesn't change the immediate in the same way as when the expression has no right hand side symbol. llvm-svn: 146311	2011-12-10 02:28:43 +00:00
Benjamin Kramer	863683c590	This is now implemented. llvm-svn: 146258	2011-12-09 15:45:57 +00:00
Benjamin Kramer	16bbfbec66	X86: Add patterns for the various rounding ops for SSE4.1 and AVX. llvm-svn: 146257	2011-12-09 15:44:03 +00:00
Benjamin Kramer	2dc5dec41d	X86: Split (v)rounds[sd] into a normal and an intrinsic version. llvm-svn: 146256	2011-12-09 15:43:55 +00:00
Evan Cheng	557cda7f1d	Remove hasSSE1orAVX(). It's the same as hasXMM(). llvm-svn: 146246	2011-12-09 06:32:46 +00:00
Evan Cheng	b96bca81e7	Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417 llvm-svn: 146196	2011-12-08 22:30:45 +00:00
Evan Cheng	2a217be25f	Add various missing AVX patterns which were causing crashes. Sadly, the generated code looks pretty bad compared to SSE. rdar://10538793 llvm-svn: 146191	2011-12-08 22:05:28 +00:00
Owen Anderson	57a7f41d5d	Don't explicitly mark libm rounding ops as legal on SSE4.1/AVX. There don't seem to be patterns for these, so I don't know why they were marked legal in the first place. Fixes failures caused by r146171. llvm-svn: 146180	2011-12-08 20:51:38 +00:00
Owen Anderson	0b9b9da6c8	Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise. llvm-svn: 146171	2011-12-08 19:32:14 +00:00
Evan Cheng	4d1a2d449f	Many of the SSE patterns should not be selected when AVX is available. This led to the following code in X86Subtarget.cpp: if (HasAVX) X86SSELevel = NoMMXSSE; This is so patterns that are predicated on hasSSE3, etc. would not be selected when AVX is available. Instead, the AVX variant is selected. However, this breaks instructions which do not have AVX variants. The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX(). Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change. However, we need to audit all the patterns before we make the change. This patch is a workaround that fixes one specific case, the prefetch instructions. rdar://10538297 llvm-svn: 146163	2011-12-08 19:00:42 +00:00
Jan Sjödin	d19760a40c	Src2 and src3 were accidentally swapped for the FMA4 rr patterns. Undo this and fix the encoding. llvm-svn: 146151	2011-12-08 14:43:19 +00:00
Craig Topper	1d578e8835	Fix a bunch of SSE/AVX patterns to use proper memop types. In particular, not using integer loads other than v2i64/v4i64 since the others are all promoted. llvm-svn: 146031	2011-12-07 08:30:53 +00:00
Bill Wendling	302cf8d5d0	Adjust the stack by one pointer size for all frameless stacks. llvm-svn: 146030	2011-12-07 07:58:55 +00:00
Bill Wendling	3c86459997	Fix off-by-one error when encoding the stack size for a frameless stack. llvm-svn: 146029	2011-12-07 07:49:49 +00:00
Evan Cheng	7f8e563a69	Add bundle aware API for querying instruction properties and switch the code generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if all of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026	2011-12-07 07:15:52 +00:00
Bill Wendling	67a70c995a	Explicitly check for the different SUB instructions. llvm-svn: 145976	2011-12-06 22:14:27 +00:00
Bill Wendling	5a173cd367	Encode the total stack if there isn't a frame. llvm-svn: 145969	2011-12-06 21:34:01 +00:00
Bill Wendling	a73c0c99ea	* Add a macro to remove a magic number. * Rename variables to reflect what they're actually used for. llvm-svn: 145968	2011-12-06 21:23:42 +00:00
Bill Wendling	87571b6392	Check the correct value for small stack sizes. Also modify some comments. llvm-svn: 145954	2011-12-06 19:16:17 +00:00
Bill Wendling	a4e87944a8	For a small sized stack, we encode that value directly with no "stack adjust" value. llvm-svn: 145952	2011-12-06 19:09:06 +00:00
Craig Topper	83320e03e6	Add X86ISD::HADD/HSUB to getTargetNodeName llvm-svn: 145929	2011-12-06 09:31:36 +00:00
Craig Topper	6572e0f203	Fix a bunch of SSE/AVX patterns to use v2i64/v4i64 loads since all other integer vector loads are promoted to those. llvm-svn: 145927	2011-12-06 09:04:59 +00:00
Craig Topper	8d4ba198d6	Merge floating point and integer UNPCK X86ISD node types. llvm-svn: 145926	2011-12-06 08:21:25 +00:00
Craig Topper	3cb802c775	Clean up some of the shuffle decoding code for UNPCK instructions. Add instruction commenting for AVX/AVX2 forms for integer UNPCKs. llvm-svn: 145924	2011-12-06 05:31:16 +00:00
Craig Topper	bf41eb3a98	Merge isSHUFPMask and isCommutedSHUFPMask into single function that can do both. Do the same for the 256-bit version. Use loops to reduce size of isVSHUFPYMask. Fix test cases that were incorrectly passing due to isCommutedSHUFPMask not checking for the vector being 128-bit. This caused some 256-bit shuffles to be incorrectly commuted. llvm-svn: 145921	2011-12-06 04:59:07 +00:00
Bill Wendling	4e87e850a2	Add a comment. llvm-svn: 145896	2011-12-06 01:57:48 +00:00
Jakob Stoklund Olesen	10e1252269	Use logarithmic units for basic block alignment. This was actually a bit of a mess. TLI.setPrefLoopAlignment was clearly documented as taking log2(bytes) units, but the x86 target would still set a preferred loop alignment of '16'. CodePlacementOpt passed this number on to the basic block, and AsmPrinter interpreted it as bytes. Now both MachineFunction and MachineBasicBlock use logarithmic alignments. Obviously, MachineConstantPool still measures alignments in bytes, so we can emulate the thrill of using as. llvm-svn: 145889	2011-12-06 01:26:19 +00:00
Bill Wendling	f7cef7ecad	The compact encoding of the registers are 3-bits each. Make sure we shift the value over that much. llvm-svn: 145888	2011-12-06 01:26:14 +00:00
Jim Grosbach	25b63fa117	Move target-specific logic out of generic MCAssembler. Whether a fixup needs relaxation for the associated instruction is a target-specific function, as the FIXME indicated. Create a hook for that and use it. llvm-svn: 145881	2011-12-06 00:47:03 +00:00
Craig Topper	51bec1a37a	Remove some leftover remnants that once tried to create 64-bit MMX PALIGNR instructions. llvm-svn: 145804	2011-12-05 07:27:14 +00:00
Craig Topper	6a55b1dd9f	Clean up and optimizations to the X86 shuffle lowering code. No functional change. llvm-svn: 145803	2011-12-05 06:56:46 +00:00
Sanjoy Das	006e43bcc0	Check for stack space more intelligently. libgcc sets the stack limit field in TCB to 256 bytes above the actual allocated stack limit. This means if the function's stack frame needs less than 256 bytes, we can just compare the stack pointer with the stack limit. This should result in fewer calls to __morestack. llvm-svn: 145766	2011-12-03 09:32:07 +00:00
Sanjoy Das	165ca1d4ba	Fix a bug in the x86-32 code generated for segmented stacks. Currently LLVM pads the call to __morestack with an add and sub of 8 bytes to esp. This isn't correct since __morestack expects the call to be followed directly by a ret. This commit also adjusts the relevant test-case. llvm-svn: 145765	2011-12-03 09:21:07 +00:00
Nick Lewycky	8fd1254a0a	Creating multiple JITs on X86 in multiple threads causes multiple writes (of the same value) to this variable. This code could be refactored, but it doesn't matter since the old JIT is going away. Add tsan annotations to ignore the race. llvm-svn: 145745	2011-12-03 02:45:50 +00:00
Nick Lewycky	50f02cb21b	Move global variables in TargetMachine into new TargetOptions class. As an API change, now you need a TargetOptions object to create a TargetMachine. Clang patch to follow. One small functionality change in PTX. PTX had commented out the machine verifier parts in their copy of printAndVerify. That now calls the version in LLVMTargetMachine. Users of PTX who need verification disabled should rely on not passing the command-line flag to enable it. llvm-svn: 145714	2011-12-02 22:16:29 +00:00
Jan Sjödin	1280eb1d06	Add XOP feature flag. llvm-svn: 145682	2011-12-02 15:14:37 +00:00
Craig Topper	b67440367f	Reduce duplicate code in isHorizontalBinOp and add some asserts to protect assumptions llvm-svn: 145681	2011-12-02 08:18:41 +00:00
Craig Topper	abeb79eee3	Add instruction selection support for horizontal add/sub of 256-bit floating point vectors. Also add the test case for 256-bit integer vectors. llvm-svn: 145680	2011-12-02 07:16:01 +00:00
Sanjoy Das	f60485c4cf	Dummy commit to check commit access. llvm-svn: 145619	2011-12-01 19:15:08 +00:00
Eric Christopher	9da7f305a4	For 64-bit the rest of the general regs are ok for the q constraint. Make sure we can emit both the high and low versions of those registers. Fixes rdar://10392864 llvm-svn: 145579	2011-12-01 08:12:41 +00:00
Eli Friedman	d61887dd0a	Pass AVX vectors which are arguments to varargs functions on the stack. <rdar://problem/10463281>. llvm-svn: 145573	2011-12-01 04:49:21 +00:00
Jan Sjödin	9430e284a9	Support for encoding all FMA4 instructions and tablegen patterns for all remaining FMA4 instructions and intrinsics with tests. llvm-svn: 145525	2011-11-30 22:09:42 +00:00
Benjamin Kramer	5feb3dab79	X86: Turns out bulldozer also supports sse42 and lzcnt. While at it remove the barcelona/istanbul/shanghai subtargets, they're unsupported by GCC and look pretty broken. llvm-svn: 145494	2011-11-30 15:48:16 +00:00
Benjamin Kramer	981f32327d	X86: Add subtargets for AMD's bulldozer. llvm-svn: 145493	2011-11-30 15:27:46 +00:00
Nadav Rotem	96923cc2bb	X86: PerformOrCombine introduced a vselect node with a wrong order of operands. This bug was introduced when a dedicated blend sdnode was replaced with the vselect node (in 139479). llvm-svn: 145488	2011-11-30 10:13:37 +00:00
Craig Topper	c4977ba413	Add instruction selection support for AVX2 horizontal add/sub instructions. llvm-svn: 145487	2011-11-30 09:10:50 +00:00
Craig Topper	0a672eaf9e	Merge VPERM2F128/VPERM2I128 ISD node types. llvm-svn: 145485	2011-11-30 07:47:51 +00:00
Craig Topper	bafd224c8b	Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128. llvm-svn: 145483	2011-11-30 06:25:25 +00:00
Evan Cheng	648e48d02e	Add another missing pattern. llvm-gcc likes f64 but clang likes i64 so it was generating poor code for some SSE builtins. llvm-svn: 145448	2011-11-29 22:48:34 +00:00
Jakob Stoklund Olesen	bde32d36bb	Make X86::FsFLD0SS / FsFLD0SD real pseudo-instructions. Like V_SET0, these instructions are expanded by ExpandPostRA to xorps / vxorps so they can participate in execution domain swizzling. This also makes the AVX variants redundant. llvm-svn: 145440	2011-11-29 22:27:25 +00:00
Daniel Dunbar	539d0a8a09	build/CMake: Finish removal of add_llvm_library_dependencies. llvm-svn: 145420	2011-11-29 19:25:30 +00:00
Michael J. Spencer	de3a2118db	MC/X86/COFF: Allow quotes in names when targeting MS/Windows, as MC is the only assembler we support. This splits MS/Windows and GNU/Windows ASM infos into two separate classes. While there is currently only one difference, full MS C++ ABI support will require many more. llvm-svn: 145409	2011-11-29 18:00:06 +00:00
Elena Demikhovsky	7a81dea516	Fixed vsqrt.ss intrinsic usage - order of input operands was wrong. Added a test. Thanks Bruno for reviewing the patch. llvm-svn: 145403	2011-11-29 15:00:45 +00:00
Craig Topper	1d63ae3731	Fix shuffle decoding for memory forms for (V)SHUFPS/D. llvm-svn: 145392	2011-11-29 07:58:09 +00:00
Craig Topper	c16db840be	Fix issues in shuffle decoding around VPERM* instructions. Fix shuffle decoding for VSHUFPS/D for 256-bit types. Add pattern matching for memory forms of VPERMILPS/VPERMILPD. llvm-svn: 145390	2011-11-29 07:49:05 +00:00
Craig Topper	12b72def4e	Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled. llvm-svn: 145376	2011-11-29 05:37:58 +00:00
Craig Topper	897a7d4b9c	Correctly mark VPERM2F128 as being an FP instruction and add execution domain fixing support to convert it to VPERM2I128 for AVX2. llvm-svn: 145370	2011-11-29 03:57:34 +00:00
Evan Cheng	aa93ceb164	Add missing avx pattern. llvm-svn: 145272	2011-11-28 20:27:23 +00:00
Craig Topper	818a983e93	Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar. llvm-svn: 145238	2011-11-28 10:14:51 +00:00
Craig Topper	b0456936da	Make isCommutedVSHUFP more like the way isCommutedSHUFP is handled. llvm-svn: 145218	2011-11-28 01:14:24 +00:00
Craig Topper	79ee88a511	Merge detecting and handling for VSHUFPSY and VSHUFPDY since a lot of the code was similar for both. llvm-svn: 145199	2011-11-27 21:41:12 +00:00
Craig Topper	51280d565b	Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonicalization that occurs when shuffle nodes are created. llvm-svn: 145153	2011-11-26 22:55:48 +00:00
Craig Topper	7704bd7ac3	Collapse X86ISD node types for PUNPCKH, PUNPCKL, UNPCKLP, and UNPCKHP to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type. llvm-svn: 145148	2011-11-26 20:47:44 +00:00
Bruno Cardoso Lopes	0f9a1f5e6c	This patch contains support for encoding FMA4 instructions and tablegen patterns for scalar FMA4 operations and intrinsic. Also add tests for vfmaddsd. Patch by Jan Sjodin llvm-svn: 145133	2011-11-25 19:33:42 +00:00
Craig Topper	d65a444478	Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64. llvm-svn: 145126	2011-11-24 22:57:10 +00:00
Craig Topper	d26466748b	Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish. llvm-svn: 145125	2011-11-24 22:20:08 +00:00
Benjamin Kramer	651db37352	X86: alias cqo to cqto. llvm-svn: 145121	2011-11-24 12:02:46 +00:00
Benjamin Kramer	ebcb451874	X86: Use btq for bit tests if the immediate can't be encoded in 32 bits. Before: movabsq $4294967296, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00] testq %rax, %rdi ## encoding: [0x48,0x85,0xf8] jne LBB0_2 ## encoding: [0x75,A] After: btq $32, %rdi ## encoding: [0x48,0x0f,0xba,0xe7,0x20] jb LBB0_2 ## encoding: [0x72,A] btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off saving one register and a giant movabsq. llvm-svn: 145103	2011-11-23 13:54:17 +00:00
Elena Demikhovsky	779ba6d7b7	I added several lines to the X86 code generator that allow it to choose VSHUFPS/VSHUFPD instructions while lowering a VECTOR_SHUFFLE node. I check a commuted VSHUFP mask. The patch was reviewed by Bruno. llvm-svn: 145099	2011-11-23 10:23:16 +00:00
Jakob Stoklund Olesen	02845410f9	Fix PR11422. This was a bug in keeping track of the available domains when merging domain values. The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr to the integer domain which is only available in AVX2. Also add an assertion to catch future attempts at emitting AVX2 instructions. llvm-svn: 145096	2011-11-23 04:03:08 +00:00
Craig Topper	83c4592619	More fixes to the X86InstComments for shuffle instructions. In particular add AVX flavors of many instructions and fix the destination operand for some of the existing AVX entries. llvm-svn: 145063	2011-11-22 14:27:57 +00:00
Craig Topper	ccb7097509	Fix shuffle decoding logic to handle UNPCKLPS/UNPCKLPD on 256-bit vectors correctly. Add support for decoding UNPCKHPS/UNPCKHPD for AVX 128-bit and 256-bit forms. llvm-svn: 145055	2011-11-22 01:57:35 +00:00
Craig Topper	f563977795	Add methods for querying minimum SSE version along with AVX. Simplifies all the places that had to check a version of SSE and AVX. llvm-svn: 145053	2011-11-22 00:44:41 +00:00
Craig Topper	6270d072c5	Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled. llvm-svn: 145028	2011-11-21 08:26:50 +00:00
Craig Topper	669199ca94	Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled. llvm-svn: 145026	2011-11-21 06:57:39 +00:00
Craig Topper	a065238c6e	Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use AVX2 shifts when AVX2 is enabled. llvm-svn: 145022	2011-11-21 01:12:36 +00:00
Craig Topper	e79761df73	Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine. llvm-svn: 145005	2011-11-20 00:12:05 +00:00
Craig Topper	a3a6583694	Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled. llvm-svn: 145004	2011-11-19 22:34:59 +00:00
Craig Topper	bac86038ac	Remove some of the special classes that worked around an old tablegen limitation of not being able to remove redundant bitconverts from patterns. llvm-svn: 145003	2011-11-19 21:01:54 +00:00
Craig Topper	3af6ae089f	Custom lower AVX2 variable shift intrinsics to shl/srl/sra nodes and remove the intrinsic patterns. llvm-svn: 144999	2011-11-19 17:46:46 +00:00
Craig Topper	f984efbfce	Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors. llvm-svn: 144989	2011-11-19 09:02:40 +00:00
Craig Topper	81390be00f	Collapse X86 PSIGNB/PSIGNW/PSIGND node types. llvm-svn: 144988	2011-11-19 07:33:10 +00:00
Craig Topper	de6b73bb4d	Extend VPBLENDVB and VPSIGN lowering to work for AVX2. llvm-svn: 144987	2011-11-19 07:07:26 +00:00
Craig Topper	66e2b5a61e	Remove unused parameters from the AVX maskmov classes. llvm-svn: 144985	2011-11-19 04:49:22 +00:00
Nadav Rotem	1ec141d0f9	Add AVX2 vpbroadcast support llvm-svn: 144967	2011-11-18 02:49:55 +00:00
Craig Topper	f41e1d0246	Fix SSE/AVX integer comparison patterns to understand that all integer vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments. llvm-svn: 144896	2011-11-17 07:49:38 +00:00
Craig Topper	f17b600577	Remove seemingly unnecessary duplicate VROUND definitions. llvm-svn: 144885	2011-11-17 07:04:00 +00:00
Eli Friedman	20439a42b0	Turn on vzeroupper insertion on call boundaries for AVX; it works as far as I know, and I'd like to see wider testing. llvm-svn: 144867	2011-11-17 00:21:52 +00:00
Evan Cheng	011538dc79	Another missing X86ISD::MOVLPD pattern. rdar://10450317 llvm-svn: 144839	2011-11-16 22:24:44 +00:00
Pete Cooper	48784ed5b7	Added missing comment about new custom lowering of DEC64 llvm-svn: 144811	2011-11-16 19:03:23 +00:00
Evan Cheng	ecb2908bf9	Sink codegen optimization level into MCCodeGenInfo along side relocation model and code model. This eliminates the need to pass OptLevel flag all over the place and makes it possible for any codegen pass to use this information. llvm-svn: 144788	2011-11-16 08:38:26 +00:00
Craig Topper	3ed7d9ee5a	Fix the execution domain on a bunch of SSE/AVX instructions. llvm-svn: 144784	2011-11-16 07:30:46 +00:00
Craig Topper	07d8b5e2c9	Remove code to enable execution dependency fix pass on VR256. VR128 is sufficient after r144636. llvm-svn: 144777	2011-11-16 05:02:04 +00:00
Nadav Rotem	37010002f2	AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code. llvm-svn: 144720	2011-11-15 22:50:37 +00:00
Pete Cooper	7c7ba1baa1	Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used by later instructions. Only done for DEC64m right now. Fixes <rdar://problem/6172640> llvm-svn: 144705	2011-11-15 21:57:53 +00:00
Jay Foad	0745e645e0	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144631	2011-11-15 07:24:32 +00:00
Craig Topper	649d1c5eec	Fix PR11370 for real. Prevents converting 256-bit FP instruction to AVX2 256-bit integer instructions when AVX2 isn't enabled. llvm-svn: 144629	2011-11-15 06:39:01 +00:00
Craig Topper	05baa85f58	Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370. llvm-svn: 144622	2011-11-15 05:55:35 +00:00
Jakob Stoklund Olesen	f8ad336bc4	Break false dependencies before partial register updates. Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix about instructions with partial register updates causing false unwanted dependencies. The ExecutionDepsFix pass will break the false dependencies if the updated register was written in the previous N instructions. The small loop added to sse-domains.ll runs twice as fast with dependency-breaking instructions inserted. llvm-svn: 144602	2011-11-15 01:15:30 +00:00
Evan Cheng	fb13d32b3f	Add a missing pattern for X86ISD::MOVLPD. rdar://10436044 llvm-svn: 144566	2011-11-14 20:35:52 +00:00
Pete Cooper	890e02e854	Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered. Constant idx case is still done in tablegen but other cases are then expanded. Fixes <rdar://problem/10435460> llvm-svn: 144557	2011-11-14 19:38:42 +00:00
Craig Topper	182b00a2e0	Add AVX2 version of instructions to load folding tables. Also add a bunch of missing SSE/AVX instructions. llvm-svn: 144525	2011-11-14 08:07:55 +00:00
Craig Topper	a331515c82	Add neverHasSideEffects, mayLoad, and mayStore to many patternless SSE/AVX instructions. Remove MMX check from LowerVECTOR_SHUFFLE since MMX vector types won't go through it anyway. llvm-svn: 144522	2011-11-14 06:46:21 +00:00
Craig Topper	b8bcb473e2	Add BLSI, BLSMSK, and BLSR to getTargetNodeName. llvm-svn: 144502	2011-11-13 17:31:07 +00:00
Craig Topper	3dc75f9e3b	Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code. llvm-svn: 144457	2011-11-12 09:58:49 +00:00
Daniel Dunbar	52823cc91c	build: Attempt to rectify inconsistencies between CMake and LLVMBuild versions of explicit dependencies. - The hope is that we have a tool/test to verify these are accurate (and tight) soon. llvm-svn: 144444	2011-11-12 02:10:57 +00:00
Craig Topper	ea28a34c43	Add lowering for AVX2 shift instructions. llvm-svn: 144380	2011-11-11 07:39:23 +00:00
Bill Wendling	8df8204554	If we have to reset the calculation of the compact encoding, then also reset the "saved register" index. <rdar://problem/10430076> llvm-svn: 144350	2011-11-11 00:59:14 +00:00
Daniel Dunbar	6d617b48c7	LLVMBuild: Add explicit information on whether targets define an assembly printer, assembly parser, or disassembler. llvm-svn: 144344	2011-11-11 00:23:56 +00:00
Nadav Rotem	0a2f797dec	AVX2: Add variable shift from memory. Note: These patterns only work in some cases because many times the load sd node is bitcasted from a load node of a different type. llvm-svn: 144266	2011-11-10 06:54:20 +00:00
Daniel Dunbar	233c9304a8	llvm-build: Add --native-target and --enable-targets options, and add logic to handle defining the "magic" target related components (like native, nativecodegen, and engine). - We still require these components to be in the project (currently in lib/Target) so that we have a place to document them and hopefully make it more obvious that they are "magic". llvm-svn: 144253	2011-11-10 00:50:07 +00:00
Daniel Dunbar	82219ad4dc	llvm-build: Add an explicit component type to represent targets. - Gives us a place to hang target specific metadata (like whether the target has a JIT). llvm-svn: 144250	2011-11-10 00:49:51 +00:00
Nadav Rotem	1938482bfa	AVX2: Add patterns for variable shift operations llvm-svn: 144212	2011-11-09 21:22:13 +00:00
Devang Patel	2f70bcdb94	Remove unnecessary include. llvm-svn: 144211	2011-11-09 21:11:02 +00:00
Nadav Rotem	79135d844d	Add AVX2 support for vselect of v32i8 llvm-svn: 144187	2011-11-09 13:21:28 +00:00
Craig Topper	f87a2bef51	Enable execution dependency fix pass for YMM registers when AVX2 is enabled. Add AVX2 logical operations to list of replaceable instructions. llvm-svn: 144179	2011-11-09 09:37:21 +00:00
Craig Topper	c9eb09d3b8	Add instruction selection for AVX2 integer comparisons. llvm-svn: 144176	2011-11-09 08:06:13 +00:00
Craig Topper	8c8a431057	Add AVX2 instruction lowering for add, sub, and mul. llvm-svn: 144174	2011-11-09 07:28:55 +00:00
Pete Cooper	82cd9e81fc	Added invariant field to the DAG.getLoad method and changed all calls. When this field is true it means that the load is from a constant (run-time or compile-time) and so can be hoisted from loops or moved around other memory accesses. llvm-svn: 144100	2011-11-08 18:42:53 +00:00
Evan Cheng	91b56e0390	Add x86 isel logic and patterns to match movlps from clang generated IR for _mm_loadl_pi(). rdar://10134392, rdar://10050222 llvm-svn: 144052	2011-11-08 00:31:58 +00:00
Jakob Stoklund Olesen	0241308954	Expand V_SET0 to xorps by default. The xorps instruction is smaller than pxor, so prefer that encoding. The ExecutionDepsFix pass will switch the encoding to pxor and xorpd when appropriate. llvm-svn: 143996	2011-11-07 19:15:58 +00:00
Craig Topper	a6d409d543	Add AVX2 variable shift instructions and intrinsics. llvm-svn: 143915	2011-11-07 08:26:24 +00:00
Craig Topper	ff39be0afc	Add AVX2 VPMOVMASK instructions and intrinsics. llvm-svn: 143904	2011-11-07 03:20:35 +00:00
Craig Topper	e122dcbf4a	Add AVX2 VEXTRACTI128 and VINSERTI128 instructions. Fix VPERM2I128 to be qualified with HasAVX2 instead of HasAVX. Mark VINSERTF128 and VEXTRACTF128 as never having side effects. llvm-svn: 143902	2011-11-07 02:00:04 +00:00
Craig Topper	f01f1b5cb9	More AVX2 instructions and their intrinsics. llvm-svn: 143895	2011-11-06 23:04:08 +00:00
Benjamin Kramer	20baffb257	Replace (Lower|Upper)caseString in favor of StringRef's newest methods. llvm-svn: 143891	2011-11-06 20:37:06 +00:00
Craig Topper	05d1cb98e7	Add more AVX2 instructions and intrinsics. llvm-svn: 143861	2011-11-06 06:12:20 +00:00
Benjamin Kramer	f3da529028	Add more PRI.64 macros for MSVC and use them throughout the codebase. llvm-svn: 143799	2011-11-05 08:57:40 +00:00
Eli Friedman	8f249600e7	Enhanced vzeroupper insertion pass that avoids inserting vzeroupper where it is unnecessary through local analysis. Patch from Bruno Cardoso Lopes, with some additional changes. I'm going to wait for any review comments and perform some additional testing before turning this on by default. llvm-svn: 143750	2011-11-04 23:46:11 +00:00
Daniel Dunbar	4a9c6426ff	build/cmake: Use tblgen macro directly instead of llvm_tablegen, which just added a layer of indirection with no value (not even conciseness). llvm-svn: 143727	2011-11-04 19:04:23 +00:00
Craig Topper	caba032f48	Add intrinsics for X86 vcvtps2ph and vcvtph2ps instructions llvm-svn: 143683	2011-11-04 06:59:49 +00:00
Dan Gohman	198b7ffc11	Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660	2011-11-03 21:49:52 +00:00
Daniel Dunbar	bf9bba47a1	build: Add initial cut at LLVMBuild.txt files. llvm-svn: 143634	2011-11-03 18:53:17 +00:00
Craig Topper	0e7cbbabea	Add new X86 AVX2 VBROADCAST instructions. llvm-svn: 143612	2011-11-03 07:35:53 +00:00
Craig Topper	a47b05c7f3	More AVX2 instructions and intrinsics. llvm-svn: 143536	2011-11-02 06:54:17 +00:00
Craig Topper	682b850602	Add a bunch more X86 AVX2 instructions and their corresponding intrinsics. llvm-svn: 143529	2011-11-02 04:42:13 +00:00
Eli Friedman	3f5eccbe7a	Teach the x86 backend a couple tricks for dealing with v16i8 sra by a constant splat value. Fixes PR11289. llvm-svn: 143498	2011-11-01 21:18:39 +00:00
Craig Topper	cfcfdf2aab	Begin adding AVX2 instructions. No selection support yet other than intrinsics. llvm-svn: 143331	2011-10-31 02:15:10 +00:00
Craig Topper	228d9131aa	Add intrinsics and feature flag for read/write FS/GS base instructions. Also add AVX2 feature flag. llvm-svn: 143319	2011-10-30 19:57:21 +00:00
Benjamin Kramer	7402ee6ec2	X86: Emit logical shift by constant splat of <16 x i8> as a <8 x i16> shift and zero out the bits where zeros should've been shifted in. llvm-svn: 143315	2011-10-30 17:31:21 +00:00
Nadav Rotem	c602b2c4de	Fix pr11266. On x86: (shl V, 1) -> add V,V Hardware support for vector-shift is sparse and in many cases we scalarize the result. Additionally, on sandybridge padd is faster than shl. llvm-svn: 143311	2011-10-30 13:24:22 +00:00
Dan Gohman	9b9c970148	Revert r143206, as there are still some failing tests. llvm-svn: 143262	2011-10-29 00:41:52 +00:00
Dan Gohman	73057ad24f	Reapply r143177 and r143179 (reverting r143188), with scheduler fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206	2011-10-28 17:55:38 +00:00
Duncan Sands	225a7037d6	Speculatively disable Dan's commits 143177 and 143179 to see if it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188	2011-10-28 09:55:57 +00:00
Dan Gohman	4db3f7dd83	Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177	2011-10-28 01:29:32 +00:00
Kevin Enderby	49e6a0da7e	Change the sysexit mnemonic (and sysexitl) to never have the REX.W prefix and not depend on In32BitMode. Use the sysexitq mnemonic for the version with the REX.W prefix and allow it only in In64BitMode. rdar://9738584 llvm-svn: 143112	2011-10-27 17:40:41 +00:00
Lang Hames	58dba012b6	Rename NonScalarIntSafe to something more appropriate. llvm-svn: 143080	2011-10-26 23:50:43 +00:00
Rafael Espindola	b3285224cd	Fixes an issue reported by -verify-machineinstrs. Patch by Sanjoy Das. llvm-svn: 143064	2011-10-26 21:16:41 +00:00
Rafael Espindola	66393c127d	This commit introduces two fake instructions MORESTACK_RET and MORESTACK_RET_RESTORE_R10; which are lowered to a RET and a RET followed by a MOV respectively. Having a fake instruction prevents the verifier from seeing a MachineBasicBlock end with a non-terminator (MOV). It also prevents the rather eccentric case of a MachineBasicBlock ending with RET but having successors nevertheless. Patch by Sanjoy Das. llvm-svn: 143062	2011-10-26 21:12:27 +00:00
Eli Friedman	b72d55353a	Add support to the old JIT for acquire/release loads and stores on x86. PR11207. llvm-svn: 142841	2011-10-24 20:24:21 +00:00
Craig Topper	b05d9e9bea	Add X86 SARX, SHRX, and SHLX instructions. llvm-svn: 142779	2011-10-23 22:18:24 +00:00
Craig Topper	980d59832a	Add X86 RORX instruction llvm-svn: 142741	2011-10-23 07:34:00 +00:00
Craig Topper	e94d277db8	Add X86 MULX instruction for disassembler. llvm-svn: 142738	2011-10-23 00:33:32 +00:00
Craig Topper	7412aa9886	Remove some duplicate specifying of neverHasSideEffects and mayLoad from X86 multiply instructions. llvm-svn: 142737	2011-10-22 23:13:53 +00:00
Nadav Rotem	e649d66552	Fix pr11193. SHL inserts zeros from the right, thus even when the original sign_extend_inreg value was of 1-bit, we need to sra. llvm-svn: 142724	2011-10-22 12:39:25 +00:00
Craig Topper	039a79067a	Remove intrinsics for X86 BLSI, BLSMSK, and BLSR instructions and replace with custom isel lowering code. llvm-svn: 142642	2011-10-21 06:55:01 +00:00
Evan Cheng	54d678fff4	Fix TLS lowering bug. The CopyFromReg must be glued to the TLSCALL. rdar://10291355 llvm-svn: 142550	2011-10-19 22:22:54 +00:00
Craig Topper	ef309c3384	Rename PEXTR to PEXT. Add intrinsics for BMI instructions. llvm-svn: 142480	2011-10-19 07:48:35 +00:00
Eric Christopher	16ec8c103a	Revert "Turn on the vzeroupper pass by default." This reverts commit 494f7ac3e8d2ab3d94e52317abf9c42a949fe1f3. llvm-svn: 142455	2011-10-18 23:10:11 +00:00
Eric Christopher	9bede2dd92	Turn on the vzeroupper pass by default. I'll remove/rename the option in a few days. llvm-svn: 142439	2011-10-18 22:50:17 +00:00
Lang Hames	7d2f7b5a33	Teach fast isel about vector stores, and make DoSelectCall return false when it fails to emit a store. This fixes <rdar://problem/10215997>. llvm-svn: 142432	2011-10-18 22:11:33 +00:00
Duncan Sands	d278d35b13	Fix a bunch of unused variable warnings when doing a release build with gcc-4.6. llvm-svn: 142350	2011-10-18 12:44:00 +00:00
David Meyer	49045ddb4c	Remove NaClMode llvm-svn: 142338	2011-10-18 05:29:23 +00:00
Craig Topper	e20793a4f1	Don't use inline assembly in 64-bit Visual Studio. Unfortunately, this means that cpuid leaf 7 can't be queried on versions of Visual Studio earlier than VS 2008 SP1. Fixes PR11147. llvm-svn: 142177	2011-10-17 05:33:10 +00:00
Craig Topper	96fa597828	Add X86 PEXTR and PDEP instructions. llvm-svn: 142141	2011-10-16 16:50:08 +00:00
Benjamin Kramer	1930b003fe	Add AsmToken::getEndLoc and use it to add ranges to x86 asm register parsing. <stdin>:1:12: error: register %rax is only available in 64-bit mode incl %rax ^~~~ llvm-svn: 142137	2011-10-16 12:10:27 +00:00
Benjamin Kramer	d416bae5f2	X86AsmParser: Synthesize EndLoc for tokens out of StartLoc + Length and print ranges for invalid operands. <stdin>:1:4: error: invalid instruction mnemonic 'abc' abc incl %edi ^~~ llvm-svn: 142135	2011-10-16 11:28:29 +00:00
Craig Topper	aea148c366	Add X86 BZHI instruction as well as BMI2 feature detection. llvm-svn: 142122	2011-10-16 07:55:05 +00:00
Craig Topper	0ae8d4d738	Add X86 INVPCID instruction. Add 32/64-bit predicates to INVEPT, INVVPID, VMREAD, and VMWRITE to remove hack from X86RecognizableInstr. llvm-svn: 142117	2011-10-16 07:05:40 +00:00
Chris Lattner	a3a0681083	Enhance llvm::SourceMgr to support diagnostic ranges, the same way clang does. Enhance the X86 asmparser to produce ranges in the one case that was annoying me, for example: test.s:10:15: error: invalid operand for instruction movl 0(%rax), 0(%edx) ^~~~~~~ It should be straight-forward to enhance filecheck, tblgen, and/or the .ll parser to use ranges where appropriate if someone is interested. llvm-svn: 142106	2011-10-16 04:47:35 +00:00
Craig Topper	25ea4e5ad3	Add X86 BEXTR instruction. This instruction uses VEX.vvvv to encode Operand 3 instead of Operand 2 so needs special casing in the disassembler and code emitter. Ultimately, should pass this information from tablegen llvm-svn: 142105	2011-10-16 03:51:13 +00:00
Craig Topper	6c8879e3ab	Add X86 feature detection support for BMI instructions. Added new cpuid function for accessing leafs with sub leafs specified in ECX. Also added code to keep track of the max cpuid level supported in both basic and extended leaves and qualified the existing cpuid calls and the new call to leaf 7. llvm-svn: 142089	2011-10-16 00:21:51 +00:00
Craig Topper	27ad12539d	Add support for X86 blsr, blsmsk, and blsi instructions. Required extra work because these are the first VEX encoded instructions to use the reg field as an opcode extension. llvm-svn: 142082	2011-10-15 20:46:47 +00:00
Benjamin Kramer	5fb5e3b384	SmallVector -> array llvm-svn: 142073	2011-10-15 13:28:31 +00:00
Evan Cheng	06fdaeb5d9	A few 80-col violations. llvm-svn: 141988	2011-10-14 20:36:23 +00:00
Craig Topper	965de2c197	Add X86 ANDN instruction. Including instruction selection. llvm-svn: 141947	2011-10-14 07:06:56 +00:00
Craig Topper	3657fe4b17	Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 processor which is gcc's name for Haswell. llvm-svn: 141939	2011-10-14 03:21:46 +00:00
Jakob Stoklund Olesen	d9444d455e	Ban rematerializable instructions with side effects. TableGen infers unmodeled side effects on instructions without a pattern. Fix some instruction definitions where that was overlooked. Also raise an error if a rematerializable instruction has unmodeled side effects. That doesn't make any sense. llvm-svn: 141929	2011-10-14 01:00:49 +00:00
Jakob Stoklund Olesen	eafa9d50c2	V_SET0 has no side effects. TableGen will mark any pattern-less instruction as having unmodeled side effects. This is extra bad for V_SET0 which gets rematerialized a lot. This was part of the cause for PR11125, but the real bug was fixed in r141923. llvm-svn: 141924	2011-10-14 00:39:50 +00:00
Eli Friedman	a5abd03a8d	Simplify assertion, and avoid undefined shift. Based on patch by Ahmed Charles. llvm-svn: 141912	2011-10-13 23:27:48 +00:00
Bill Wendling	25f6d3e321	More closely follow libgcc, which has code after the `ret' instruction to release the stack segment and reset the stack pointer. Place the code in its own MBB to make the verifier happy. llvm-svn: 141859	2011-10-13 08:24:19 +00:00
Bill Wendling	063f55ffdd	Revert r141854 because it was causing failures: http://lab.llvm.org:8011/builders/llvm-x86_64-linux/builds/101 --- Reverse-merging r141854 into '.': U test/MC/Disassembler/X86/x86-32.txt U test/MC/Disassembler/X86/simple-tests.txt D test/CodeGen/X86/bmi.ll U lib/Target/X86/X86InstrInfo.td U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86.td U lib/Target/X86/X86Subtarget.h llvm-svn: 141857	2011-10-13 07:48:07 +00:00
Bill Wendling	22a690e3db	Should not add instructions to a BB after a return instruction. The machine instruction verifier doesn't like this, nor do I. llvm-svn: 141856	2011-10-13 07:42:32 +00:00
Craig Topper	8cc9388073	Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 processor which is gcc's name for Haswell. llvm-svn: 141854	2011-10-13 07:09:14 +00:00
Craig Topper	2fdcb1f045	Add 'implicit EFLAGS' to patterns for popcnt and lzcnt llvm-svn: 141853	2011-10-13 06:18:52 +00:00
Nick Lewycky	064c1c0e77	Fix indent in comment. llvm-svn: 141749	2011-10-12 00:14:12 +00:00
Craig Topper	63bc541196	Add HasPOPCNT predicate to the POPCNT instructions. Also mark POPCNT as modifying EFLAGS. llvm-svn: 141656	2011-10-11 07:13:09 +00:00
Craig Topper	0fbca75c17	Make Ivy Bridge 16-bit floating point conversion instructions require AVX. llvm-svn: 141654	2011-10-11 07:01:37 +00:00
Craig Topper	271064e873	Add X86 LZCNT instruction. Including instruction selection support. llvm-svn: 141651	2011-10-11 06:44:02 +00:00
Craig Topper	a697852386	Fix disassembling of popcntw. Also remove some code that says it accounts for 64BIT_REXW_XD not existing, but it does exist. llvm-svn: 141642	2011-10-11 04:34:23 +00:00
Lang Hames	f22f46bf25	Fixed natural stack alignment for Linux x86-32. Thanks Eli. llvm-svn: 141616	2011-10-11 00:51:36 +00:00
Lang Hames	de7ab801cc	Add a natural stack alignment field to TargetData, and prevent InstCombine from promoting allocas to preferred alignments that exceed the natural alignment. This avoids some potentially expensive dynamic stack realignments. The natural stack alignment is set in target data strings via the "S<size>" option. Size is in bits and must be a multiple of 8. The natural stack alignment defaults to "unspecified" (represented by a zero value), and the "unspecified" value does not prevent any alignment promotions. Target maintainers that care about avoiding promotions should explicitly add the "S<size>" option to their target data strings. llvm-svn: 141599	2011-10-10 23:42:08 +00:00
Eli Friedman	8ec0897db6	Make sure the X86 backend doesn't explode on 128-bit shuffles in AVX mode. Fixes PR11102. llvm-svn: 141585	2011-10-10 22:28:47 +00:00
Benjamin Kramer	874c519337	X86: Add a subtarget definition for core-avx-i, which is GCC's name for ivy bridge. llvm-svn: 141571	2011-10-10 19:35:07 +00:00
Nadav Rotem	814598563f	Fix 10892 - When lowering SIGN_EXTEND_INREG do not lower v2i64 because the instruction set has no 64-bit SRA support. llvm-svn: 141570	2011-10-10 19:31:45 +00:00
Benjamin Kramer	42c0330a79	X86: Add patterns for the movbe instruction (mov + bswap, only available on atom) llvm-svn: 141563	2011-10-10 18:34:56 +00:00
Craig Topper	a14c5723eb	Put a bunch of calls to ToggleFeature behind proper if statements. llvm-svn: 141527	2011-10-10 05:34:02 +00:00
Craig Topper	fe9179fa4f	Add Ivy Bridge 16-bit floating point conversion instructions for the X86 disassembler. llvm-svn: 141505	2011-10-09 07:31:39 +00:00
Jakob Stoklund Olesen	513d1213cc	Prevent potential NOREX bug. A GR8_NOREX virtual register is created when extracting a sub_8bit_hi sub-register: %vreg2<def> = COPY %vreg1:sub_8bit_hi; GR8_NOREX:%vreg2 %GR64_ABCD:%vreg1 TEST8ri_NOREX %vreg2, 1, %EFLAGS<imp-def>; GR8_NOREX:%vreg2 If such a live range is ever split, its register class must not be inflated to GR8. The sub-register copy can only target GR8_NOREX. I don't have a test case for this theoretical bug. llvm-svn: 141500	2011-10-08 20:20:03 +00:00
Jakob Stoklund Olesen	729abd360e	Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies. In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot target all GR8 registers, only those in GR8_NOREX. To enforce this, we ensure that all instructions using the EXTRACT_SUBREG are GR8_NOREX constrained. This fixes PR11088. llvm-svn: 141499	2011-10-08 18:28:28 +00:00
Jakob Stoklund Olesen	464fcc0035	Constrain both operands on MOVZX32_NOREXrr8. This instruction is explicitly encoded without an REX prefix, so both operands must be *_NOREX. Also add an assertion to copyPhysReg() that fires when the MOV8rr_NOREX constraints are not satisfied. This fixes a miscompilation in 20040709-2 in the gcc test suite. llvm-svn: 141410	2011-10-07 20:15:54 +00:00
Evan Cheng	74db300f37	High bits of movmskp{s|d} and pmovmskb are known zero. rdar://10247336 llvm-svn: 141371	2011-10-07 17:21:44 +00:00
Craig Topper	d9cfddc5cd	Add X86 disassembler support for RDFSBASE, RDGSBASE, WRFSBASE, and WRGSBASE. llvm-svn: 141358	2011-10-07 07:02:24 +00:00
Craig Topper	bf136764ae	Add X86 disassembler support for XSAVE, XRSTOR, and XSAVEOPT. llvm-svn: 141354	2011-10-07 05:53:50 +00:00
Craig Topper	5aebebe18d	Revert part of r141274. Only need to change encoding for xchg %eax, %eax in 64-bit mode. This is because in 64-bit mode xchg %eax, %eax implies zeroing the upper 32-bits of RAX which makes it not a NOP. In 32-bit mode using NOP encoding is fine. llvm-svn: 141353	2011-10-07 05:35:38 +00:00
Craig Topper	23eb468b1f	Fix assembling of xchg %eax, %eax to not use the NOP encoding of 0x90. This was done by creating a new register group that excludes AX registers. Fixes PR10345. Also added aliases for flipping the order of the operands of xchg <reg>, %eax. llvm-svn: 141274	2011-10-06 06:44:41 +00:00
Peter Collingbourne	fb3d935649	Build system infrastructure for multiple tblgens. llvm-svn: 141266	2011-10-06 01:51:51 +00:00
Jakob Stoklund Olesen	ee9b576a2a	Override TRI::getSubClassWithSubReg for X86. There are fewer registers with sub_8bit sub-registers in 32-bit mode than in 64-bit mode. In 32-bit mode, sub_8bit behaves the same as sub_8bit_hi. llvm-svn: 141206	2011-10-05 20:26:33 +00:00
Craig Topper	b58a9665bd	Change C++ style comments to C style comments in X86 disassembler. Patch from Joe Abbey. llvm-svn: 141162	2011-10-05 03:29:32 +00:00
Owen Anderson	0ca562ec4c	Teach the MC to output code/data region marker labels in MachO and ELF modes. These are used by disassemblers to provide better disassembly, particularly on targets like ARM Thumb that like to intermingle data in the TEXT segment. llvm-svn: 141135	2011-10-04 23:26:17 +00:00
Craig Topper	f18c896337	Add support in the disassembler for ignoring the L-bit on certain VEX instructions. Mark instructions that have this behavior. Fixes PR10676. llvm-svn: 141065	2011-10-04 06:30:42 +00:00
Craig Topper	786bdb9e14	Add support for MOVBE and RDRAND instructions for the assembler and disassembler. Includes feature flag checking, but no intrinsic support. Fixes PR10832, PR11026 and PR11027. llvm-svn: 141007	2011-10-03 17:28:23 +00:00
Craig Topper	0d0be47d03	Treat VEX.vvvv as a 3-bit field outside of 64-bit mode. Prevents access to registers xmm8-xmm15 outside 64-bit mode. llvm-svn: 140997	2011-10-03 08:14:29 +00:00
Craig Topper	31854ba017	Fix VEX disassembling to ignore REX.RXBW bits in 32-bit mode. llvm-svn: 140993	2011-10-03 07:51:09 +00:00
Craig Topper	7aea69d949	Fix some Intel syntax disassembly issues with instructions that implicitly use AL/AX/EAX/RAX such as ADD/SUB/ADC/SUBB/XOR/OR/AND/CMP/MOV/TEST. llvm-svn: 140974	2011-10-02 21:08:12 +00:00
Craig Topper	21c33657d6	Special case disassembler handling of REX.B prefix on NOP instruction to decode as XCHG R8D, EAX instead. Fixes PR10344. llvm-svn: 140971	2011-10-02 16:56:09 +00:00
Craig Topper	d07a59f288	Fix disassembling of INVEPT and INVVPID to take operands llvm-svn: 140955	2011-10-01 21:20:14 +00:00
Craig Topper	88cb33e0d4	Fix disassembler handling of CRC32 which is an odd instruction that uses 0xf2 as an opcode extension and allows the opsize prefix. This necessitated adding IC_XD_OPSIZE and IC_64BIT_XD_OPSIZE contexts. Unfortunately, this increases the size of the disassembler tables. Fixes PR10702. llvm-svn: 140954	2011-10-01 19:54:56 +00:00
Jakob Stoklund Olesen	237dceff90	Store sub-class lists as a bit vector. This uses less memory and it reduces the complexity of sub-class operations: - hasSubClassEq() and friends become O(1) instead of O(N). - getCommonSubClass() becomes O(N) instead of O(N^2). In the future, TableGen will infer register classes. This makes it cheap to add them. llvm-svn: 140898	2011-09-30 22:19:07 +00:00
Jakob Stoklund Olesen	dd1904e7a6	Expand the x86 V_SET0* pseudos right after register allocation. This also makes it possible to reduce the number of pseudo instructions and get rid of the encoding information. llvm-svn: 140776	2011-09-29 05:10:54 +00:00
Eli Friedman	2fb357a5b0	PR11033: Make sure we don't generate PCMPGTQ and PCMPEQQ if the target CPU does not support them. llvm-svn: 140723	2011-09-28 21:00:25 +00:00
Jakob Stoklund Olesen	934b7d7645	Rename SSEDomainFix -> lib/CodeGen/ExecutionDepsFix. I'll clean up the source in the next commit. llvm-svn: 140663	2011-09-28 00:01:54 +00:00
Jakob Stoklund Olesen	30c811246f	Remove X86-dependent stuff from SSEDomainFix. This also enables domain swizzling for AVX code which required a few trivial test changes. The pass will be moved to lib/CodeGen shortly. llvm-svn: 140659	2011-09-27 23:50:46 +00:00
Jakob Stoklund Olesen	b48c994cc0	Promote the X86 Get/SetSSEDomain functions to TargetInstrInfo. I am going to unify the SSEDomainFix and NEONMoveFix passes into a single target independent pass. They are essentially doing the same thing. llvm-svn: 140652	2011-09-27 22:57:18 +00:00
Craig Topper	45faba98b4	Fix VEX decoding in i386 mode. Fixes PR11008. llvm-svn: 140515	2011-09-26 05:12:43 +00:00
Jakob Stoklund Olesen	55cf2ed148	Only run MF.verify() with EXPENSIVE_CHECKS=1. llvm-svn: 140441	2011-09-24 01:11:19 +00:00
Duncan Sands	a54fd541c2	Implement Chris's suggestion of legalizing the various SSE and AVX hadd/hsub intrinsics into the new fhadd/fhsub X86 node. llvm-svn: 140383	2011-09-23 16:10:22 +00:00
Eli Friedman	87c844cdf8	PR10991: make fast-isel correctly check whether accessing a global through an alias involves thread-local storage. (I'm not entirely sure how this is supposed to work, but this patch makes fast-isel consistent with the normal isel path.) llvm-svn: 140355	2011-09-22 23:41:28 +00:00
Jakob Stoklund Olesen	f05864ad7d	Add support for GR32 <-> FR32 cross class copies. We already support GR64 <-> VR128 copies. All of these copies break partial register dependencies by zeroing the high part of the target register. llvm-svn: 140348	2011-09-22 22:45:24 +00:00
Duncan Sands	0e4fcb8e3b	Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from floating point add/sub of appropriate shuffle vectors. Does not synthesize the 256 bit AVX versions because they work differently. llvm-svn: 140332	2011-09-22 20:15:48 +00:00
Craig Topper	6d1872b77a	Fix register printing in disassembling of push/pop of segment registers and in/out in Intel syntax mode. Fixes PR10960 llvm-svn: 140299	2011-09-22 07:01:50 +00:00
Benjamin Kramer	cfd26cd744	The SSE version differences for fmin/fmax are more involved than I thought. - x87: no min or max. - SSE1: min/max for single precision scalars and vectors. - SSE2: min/max for single and double precision scalars and vectors. - AVX: as SSE2, but also supports the wider ymm vectors. (this is covered by the isTypeLegal check) llvm-svn: 140296	2011-09-22 03:27:22 +00:00
Benjamin Kramer	dc397a6402	X86: Don't form min/max nodes if the target is missing SSE. llvm-svn: 140294	2011-09-22 03:01:42 +00:00
Benjamin Kramer	e5e189f669	X86Disassembler: if verbose logging is going to nulls(), disable logging completely. Otherwise we'll spend a ridiculous amount of time pretty printing debug output and then discarding it. llvm-svn: 140276	2011-09-21 21:47:35 +00:00
Nadav Rotem	50f123d8e5	fix comment llvm-svn: 140258	2011-09-21 17:14:40 +00:00
Nadav Rotem	c1cd8506ce	Insert a sanity check on the combining of x86 truncing-store nodes. This comes to replace the problematic check that was removed in r139995. llvm-svn: 140246	2011-09-21 08:45:10 +00:00
Richard Trieu	a318b8dce6	Change: assert(!"error message"); To: assert(0 && "error message"); which is more consistent across the code base. llvm-svn: 140234	2011-09-21 03:09:09 +00:00
Owen Anderson	69fa8ffeef	In the disassembler C API, be careful not to confuse the comment streamer that the disassembler outputs annotations on with the streamer that the InstPrinter will print them on. llvm-svn: 140217	2011-09-21 00:25:23 +00:00
Bruno Cardoso Lopes	8058234b32	Revert r140097, working on a better approach llvm-svn: 140203	2011-09-20 23:19:29 +00:00
Bruno Cardoso Lopes	f7638e1e51	Simplify max/minp[s|d] dagcombine matching llvm-svn: 140199	2011-09-20 22:34:45 +00:00
Bruno Cardoso Lopes	60aa85b672	Tidy up a bit more, fix tab and remove trailing whitespaces llvm-svn: 140186	2011-09-20 21:45:26 +00:00
Bruno Cardoso Lopes	33e91a6cf7	The wrong relocation was being emitted for several SSSE3 instructions. This fixes PR10963. Thanks to Benjamin for finding the wrong tablegen declaration. llvm-svn: 140184	2011-09-20 21:39:21 +00:00
Bruno Cardoso Lopes	05f3f4939a	Tidy up code! llvm-svn: 140183	2011-09-20 21:39:06 +00:00
Craig Topper	68c92d86da	Extend changes from r139986 to produce 256-bit AVX minps/minpd/maxps/maxpd. llvm-svn: 140140	2011-09-20 07:38:59 +00:00
Bruno Cardoso Lopes	c4398d2c7b	Fix PR10949. Fix the encoding of VMOVPQIto64rr. llvm-svn: 140098	2011-09-19 23:36:59 +00:00
Bruno Cardoso Lopes	51792dcc4d	Based on the small opt Zvi's patch was trying to achieve, eliminate 128-bit undef subvector insertion into a 256-bit vector llvm-svn: 140097	2011-09-19 23:36:50 +00:00
Bruno Cardoso Lopes	d4a3d452d4	Match X86ISD::FSETCCsd and X86ISD::FSETCCss while in AVX mode. This fixes PR10955 and PR10948. llvm-svn: 140069	2011-09-19 21:29:24 +00:00
Nadav Rotem	763c11cc12	Fix typos in my prev commit, found by Tobi. llvm-svn: 140003	2011-09-18 19:00:23 +00:00
Nadav Rotem	261a10a007	setOperationAction should be done on the return value of the type, not the operands. llvm-svn: 140001	2011-09-18 14:57:03 +00:00
Nadav Rotem	7ae11279e9	When promoting integer vectors we often create ext-loads. This patch adds a dag-combine optimization to implement the ext-load efficiently (using shuffles). For example the type <4 x i8> is stored in memory as i32, but it needs to find its way into a <4 x i32> register. Previously we scalarized the memory access, now we use shuffles. llvm-svn: 139995	2011-09-18 10:39:32 +00:00
Craig Topper	d9d01917ee	Fix typo by changing Lower256IntVETCC to Lower256IntVSETCC. llvm-svn: 139993	2011-09-18 08:03:58 +00:00
Duncan Sands	f2b8c854dd	Synthesize x86 max/min instructions also for vectors (i.e. produce maxps and maxpd). This broke the sse41-blend.ll testcase by causing maxpd to be produced rather than a cmp+blend pair, which is the reason I tweaked it. Gives a small speedup on doduc with dragonegg when the GCC vectorizer is used. llvm-svn: 139986	2011-09-17 16:49:39 +00:00
Bruno Cardoso Lopes	4641efe304	Describe more AVX 128-bit convert instructions without patterns to have mayLoad = 1 llvm-svn: 139973	2011-09-16 23:41:29 +00:00
Bruno Cardoso Lopes	5389ed5dfb	Add mayLoad attribute to AVX convert instructions, since none of them are declared with load patterns. This fixes the crash in PR10941. No testcases, since a fold is triggered and then converted back to the register form afterwards. llvm-svn: 139953	2011-09-16 22:02:14 +00:00
Bruno Cardoso Lopes	2d406f02bf	Fix PR10884. This PR basically reports a problem where a crash in generated code happened due to %rbp being clobbered: pushq %rbp movq %rsp, %rbp .... vmovmskps %ymm12, %ebp .... movq %rbp, %rsp popq %rbp ret Since Eric's r123367 commit, the default stack alignment for x86 32-bit has changed to be 16-bytes. Since then, the MaxStackAlignmentHeuristicPass hasn't been really used, but with AVX it becomes useful again, since per ABI compliance we don't always align the stack to 256-bit, but only when there are 256-bit incoming arguments. ReserveFP was only used by this pass, but there's no RA target hook that uses getReserveFP() to check for the presence of FP (since nothing was triggering the pass to run, the uses of getReserveFP() were removed through time without being noticed). Change this pass to use setForceFramePointer, which is properly called by MachineFunction hasFP method. The testcase is very big and dependent on RA, not sure if it's worth adding to test/CodeGen/X86. llvm-svn: 139939	2011-09-16 20:58:28 +00:00
Owen Anderson	a0c3b97221	Don't attach annotations to MCInst's. Instead, have the disassembler return, and the printer accept, an annotation string which can be passed through if the client cares about annotations. llvm-svn: 139876	2011-09-15 23:38:46 +00:00
Bruno Cardoso Lopes	7b43568a93	Add a fixme note! llvm-svn: 139872	2011-09-15 23:04:24 +00:00
Bruno Cardoso Lopes	c69d68a150	Add the remaining AVX versions of instructions to X86InstrInfo, this time for describing high latency ones and for recognizing loads from the same base pointer llvm-svn: 139864	2011-09-15 22:15:52 +00:00
Bruno Cardoso Lopes	6b302955b1	Factor out partial register update checks for some SSE instructions. Also add the AVX versions and add comments! llvm-svn: 139854	2011-09-15 21:42:23 +00:00
Owen Anderson	d1814791ad	Add support for stored annotations to MCInst, and provide facilities for MC-based InstPrinters to print them out. Enhance the ARM and X86 InstPrinter's to do so in verbose mode. llvm-svn: 139820	2011-09-15 18:36:29 +00:00
Bruno Cardoso Lopes	fa1ca3070b	Change all checks regarding the presence of any SSE level to always take into consideration the presence of AVX. This change, together with the SSEDomainFix enabled for AVX, makes AVX codegen always (hopefully) emit the same code as SSE for 128-bit vector ops. I don't have a testcase for this, but AVX now beats SSE in performance for 128-bit ops in the majority of programs in the llvm testsuite llvm-svn: 139817	2011-09-15 18:27:36 +00:00
Bruno Cardoso Lopes	62d79875d3	Enable SSEDomainFix pass for AVX mode. llvm-svn: 139816	2011-09-15 18:27:32 +00:00
Eli Friedman	da5f010177	Fix the code creating VZEXT_LOAD so that it creates the right memoperand. Issue spotted in -debug output. I can't think of any practical effects at the moment, but it might matter if we start doing more aggressive alias analysis in CodeGen. llvm-svn: 139758	2011-09-14 23:42:45 +00:00
Craig Topper	ee8157cb41	Fix mem type for VEX.128 form of VROUNDP*. Remove filter preventing VROUND from being recognized by disassembler. llvm-svn: 139691	2011-09-14 06:41:26 +00:00
Craig Topper	96e00e5a24	Make disassembling of VBLEND* print immediate as a XMM/YMM register name. Fixes PR10917. llvm-svn: 139690	2011-09-14 05:55:28 +00:00
Bruno Cardoso Lopes	d560b8c8e9	Teach the foldable tables about 128-bit AVX instructions and make the alignment check for 256-bit classes more strict. There're no testcases but we catch more folding cases for AVX while running single and multi sources in the llvm testsuite. Since some 128-bit AVX instructions have a different number of operands than their SSE counterparts, they are placed in different tables. 256-bit AVX instructions should also be added in the table soon. And there are a few more 128-bit versions to be handled, which should come in the following commits. llvm-svn: 139687	2011-09-14 02:36:58 +00:00
Bruno Cardoso Lopes	333a59eced	Vector shuffle mask <i32 4, i32 5, i32 2, i32 3> should yield "movsd", not "movss". llvm-svn: 139686	2011-09-14 02:36:14 +00:00
Nadav Rotem	9cfbeaff15	swap vselect operand order - pr10907 llvm-svn: 139630	2011-09-13 19:56:38 +00:00
Bruno Cardoso Lopes	03d6002d68	Add 256-bit versions of alignedstore and alignedload, to be more strict about the alignment checking. This was found by inspection and I don't have any testcases so far, although the llvm testsuite runs without any problem. llvm-svn: 139625	2011-09-13 19:33:03 +00:00
Bruno Cardoso Lopes	56d9b51caf	Revert the remaining part of r139528. According to PR10907 the bug seems to be in the VSELECT operands order, so I'll leave the fix for Nadav. llvm-svn: 139624	2011-09-13 19:33:00 +00:00
Nadav Rotem	52202fbf2d	Add vselect target support for targets that do not support blend but do support xor/and/or (For example SSE2). llvm-svn: 139623	2011-09-13 19:17:42 +00:00
Craig Topper	8dd7bbcc80	Only disassemble instructions with vvvv != 1111 if the instruction actually uses the vvvv field to encode an operand. Fixes PR10851. llvm-svn: 139591	2011-09-13 07:37:44 +00:00
Craig Topper	e98d8a5c84	Remove filter that was preventing MOVDQU/MOVDQA and their VEX forms from being disassembled. Also added encodings for the other register/register form of these instructions. Fixes PR10848. llvm-svn: 139588	2011-09-13 06:54:58 +00:00
Craig Topper	b7ae29e404	Fix encoding of VMOVDQU to not simultaneously be 'TB OpSize' and 'XS'. 'XS' is correct and seems to have been taking priority. llvm-svn: 139587	2011-09-13 06:39:34 +00:00
Eli Friedman	d68a727bd0	Fix the assembler strings for a couple of atomic instructions. Doesn't really matter much in practice, but it's a bit cleaner. llvm-svn: 139563	2011-09-13 00:27:04 +00:00
Bruno Cardoso Lopes	ff8d8a830e	Fix PR10845. SUBREG_TO_REG shouldn't be used when the input and destination types are equal! llvm-svn: 139553	2011-09-12 22:59:23 +00:00
Bruno Cardoso Lopes	973d2921e8	Revert the wrong part of r139528, and fix testcases. llvm-svn: 139541	2011-09-12 21:24:07 +00:00
Bruno Cardoso Lopes	be7a086f58	Not sure how CMPPS and CMPPD ever worked; I guess they didn't. However, with this fix they do now. Basically, the operand order for the x86 target-specific node is not the same as the instruction's, but since the intrinsic needs that specific order at the instruction definition, just change the order during legalization. Also, there were some wrong inversions of condition codes, such as GE => LE, GT => LT; fix that too. Fix PR10907. llvm-svn: 139528	2011-09-12 19:30:40 +00:00
Bruno Cardoso Lopes	f6382979f2	Organize a bit the operand names for CMPPS and CMPPD llvm-svn: 139527	2011-09-12 19:30:36 +00:00
Bruno Cardoso Lopes	2e4bee16bb	Realign BLEND patterns to match the general style for patterns in .td file. llvm-svn: 139526	2011-09-12 19:30:33 +00:00
Bruno Cardoso Lopes	9c9f64918c	Fix 80-columns llvm-svn: 139525	2011-09-12 19:30:29 +00:00
Nadav Rotem	c0c71e162a	Format patterns, remove unused X86blend patterns llvm-svn: 139491	2011-09-12 08:41:50 +00:00
Craig Topper	48f2b36911	Fix disassembling of one of the register/register forms of MOVUPS/MOVUPD/MOVAPS/MOVAPD/MOVSS/MOVSD and their VEX equivalents. Fixes PR10877. llvm-svn: 139486	2011-09-11 23:19:54 +00:00
Craig Topper	a88e356017	Fix disassembling of reverse register/register forms of ADD/SUB/XOR/OR/AND/SBB/ADC/CMP/MOV. llvm-svn: 139485	2011-09-11 21:41:45 +00:00
Nadav Rotem	b873b18721	CR fixes per Bruno's request. Undo the changes from r139285 which added custom lowering to vselect. Add tablegen lowering for vselect. llvm-svn: 139479	2011-09-11 15:02:23 +00:00
Eli Friedman	7f50e00203	r139454 activates an assert in a case where we were doing the right thing anyway. Make that explicit, and un-XFAIL the testcase. llvm-svn: 139458	2011-09-10 02:01:42 +00:00
Richard Trieu	74996f2a79	Fix the asserts in lib/Target/X86/X86ELFWriterInfo.cpp and lib/ExecutionEngine/MCJIT/MCJIT.cpp from: assert("error"); to: assert(0 && "error"); llvm-svn: 139456	2011-09-10 01:42:07 +00:00
Richard Trieu	d9917bef6c	Fixed an assert from: assert("not implemented for target shuffle node"); to: assert(0 && "not implemented for target shuffle node"); This causes a test failure in CodeGen/X86/palignr.ll which has been marked as XFAIL for the time being. Test failure filed at PR10901. llvm-svn: 139454	2011-09-10 01:26:21 +00:00
Nadav Rotem	de838daefd	Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type llvm-svn: 139400	2011-09-09 20:29:17 +00:00
Craig Topper	5d5134014f	Fix handling of Intel syntax disassembling of movs and stos to stop being blank. Also fixed scas and cmps to always print the size suffix in Intel syntax, since it's ambiguous without arguments. Fixes PR10875. llvm-svn: 139353	2011-09-09 05:40:53 +00:00
Nadav Rotem	b5df62036b	Fix the 80-columns and remove unsupported v8i16 type from the list of legal vselect types. llvm-svn: 139324	2011-09-08 22:17:35 +00:00
Bruno Cardoso Lopes	46b9cde019	Add a AVX version of a simple i64 -> f64 bitcast. This could be triggered using llc with -O0, which wouldn't let it be folded and expose the lack of this pattern. llvm-svn: 139320	2011-09-08 21:52:33 +00:00
Bruno Cardoso Lopes	23eb5265b4	* Combines Alignment, AuxInfo, and TB_NOT_REVERSABLE flag into a single field (Flags), which is a bitwise OR of items from the TB_* enum. This makes it easier to add new information in the future. * Gives every static array an equivalent layout: { RegOp, MemOp, Flags } * Adds a helper function, AddTableEntry, to avoid duplication of the insertion code. * Renames TB_NOT_REVERSABLE to TB_NO_REVERSE. * Adds TB_NO_FORWARD, which is analogous to TB_NO_REVERSE, except that it prevents addition of the Reg->Mem entry. (This is going to be used by Native Client, in the next CL). Patch by David Meyer llvm-svn: 139311	2011-09-08 18:35:57 +00:00
Bruno Cardoso Lopes	fb113a0051	Add AVX versions of blend vector operations and fix some issues noticed in Nadav's r139285 and r139287 commits. 1) Rename vsel.ll to a more descriptive name 2) Change the order of BLEND operands to "Op1, Op2, Cond", this is necessary because PBLENDVB is already used in different places with this order, and it was being emitted in the wrong way for vselect 3) Add AVX patterns and tests for the same SSE41 instructions llvm-svn: 139305	2011-09-08 18:05:08 +00:00
Bruno Cardoso Lopes	ea8d803bb0	Fix PR10844: Add patterns to cover non foldable versions of X86vzmovl. Triggered using llc -O0. Also fix some SET0PS patterns to their AVX forms and test it on the testcase. llvm-svn: 139304	2011-09-08 18:05:02 +00:00
Nadav Rotem	2550ba2a27	Add X86-SSE4 codegen support for vector-select. llvm-svn: 139285	2011-09-08 08:11:19 +00:00
Eli Friedman	02f2f89a98	Fix atomic load and store on x86 to pass -verify-machineinstrs (and possibly fix some subtle bugs involving passes which check mayStore()). This isn't exactly ideal, but it is good enough for the moment. llvm-svn: 139245	2011-09-07 18:48:32 +00:00
James Molloy	4c493e8050	Refactor instprinter and mcdisassembler to take a SubtargetInfo. Add -mattr= handling to llvm-mc. Reviewed by Owen Anderson. llvm-svn: 139237	2011-09-07 17:24:38 +00:00
Rafael Espindola	6559656e73	Detect attempt to use segmented stacks on non ELF systems and error (not assert) early. llvm-svn: 139233	2011-09-07 16:10:57 +00:00
Bill Wendling	226c4ed92a	Reenable compact unwind by default. However, also emit the old version of unwind information for older linkers. llvm-svn: 139206	2011-09-06 23:47:14 +00:00
Rafael Espindola	9d96c94278	Fix comment. Noticed by Duncan. llvm-svn: 139161	2011-09-06 19:29:31 +00:00
Duncan Sands	f2641e1bc1	Add codegen support for vector select (in the IR this means a select with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159	2011-09-06 19:07:46 +00:00
Rafael Espindola	db5823dc77	Fix style issues and typos found by Duncan. llvm-svn: 139154	2011-09-06 18:43:08 +00:00
Duncan Sands	a098436b32	Split the init.trampoline intrinsic, which currently combines GCC's init.trampoline and adjust.trampoline intrinsics, into two intrinsics like in GCC. While having one combined intrinsic is tempting, it is not natural because typically the trampoline initialization needs to be done in one function, and the result of adjust trampoline is needed in a different (nested) function. To get around this llvm-gcc hacks the nested function lowering code to insert an additional parent variable holding the adjust.trampoline result that can be accessed from the child function. Dragonegg doesn't have the luxury of tweaking GCC code, so it stored the result of adjust.trampoline in the memory GCC set aside for the trampoline itself (this is always available in the child function), and set up some new memory (using an alloca) to hold the trampoline. Unfortunately this breaks Go which allocates trampoline memory on the heap and wants to use it even after the parent has exited (!). Rather than doing even more hacks to get Go working, it seemed best to just use two intrinsics like in GCC. Patch mostly by Sanjoy Das. llvm-svn: 139140	2011-09-06 13:37:06 +00:00
Nick Lewycky	73df7e3830	Add a new MC bit for NaCl (Native Client) mode. NaCl requires that certain instructions are more aligned than the CPU requires, and adds some additional directives, to follow in future patches. Patch by David Meyer! llvm-svn: 139125	2011-09-05 21:51:43 +00:00
Benjamin Kramer	7859d2e148	Use internal storage for command line option. llvm-svn: 139079	2011-09-03 03:45:06 +00:00
Bruno Cardoso Lopes	07d9914620	Add AVX versions to match AESENC/AESDEC intrinsics. This hopefully ends the cycle of missing AVX counterparts of already present SSE* patterns llvm-svn: 139073	2011-09-03 00:47:08 +00:00
Bruno Cardoso Lopes	1d5c2d9227	Add AVX version of a SSE4.1 VPBLENDVB pattern llvm-svn: 139072	2011-09-03 00:47:05 +00:00
Bruno Cardoso Lopes	212a8c4357	Add AVX versions of SSE4.1 EXTRACTPS patterns llvm-svn: 139071	2011-09-03 00:47:03 +00:00
Bruno Cardoso Lopes	3d581a36b6	Add AVX versions for SSE4.1 MOVZX* patterns llvm-svn: 139070	2011-09-03 00:47:01 +00:00
Bruno Cardoso Lopes	6d701fcef0	Add one more AVX pattern for MOVZPQILo2PQI llvm-svn: 139069	2011-09-03 00:46:58 +00:00
Bruno Cardoso Lopes	9923c51564	Move PUNPCKLQDQ splat pattern close to the instruction definition and duplicate it for AVX mode. llvm-svn: 139068	2011-09-03 00:46:56 +00:00
Bruno Cardoso Lopes	96b11f39e2	Add AVX pattern versions for PSHUFB,PSIGN{B,W,D} llvm-svn: 139067	2011-09-03 00:46:54 +00:00
Bruno Cardoso Lopes	9a0da1e57a	Add AVX versions of MOVZDI2PDI patterns. Use SUBREG_TO_REG to indicate that the AVX versions (even the 128-bit ones) all clear the upper part of the destination register. llvm-svn: 139066	2011-09-03 00:46:51 +00:00
Bruno Cardoso Lopes	903952223a	Enforce subtarget checks in a few places to be explicit when the pattern should be matched llvm-svn: 139065	2011-09-03 00:46:49 +00:00
Bruno Cardoso Lopes	521b0cfdc6	Tidy up code moving patterns to their appropriate place! llvm-svn: 139064	2011-09-03 00:46:47 +00:00
Bruno Cardoso Lopes	aad5e50ded	Add AVX versions of FsMOVAPS and FsMOVAPS. Teach X86InstrInfo how to use it! llvm-svn: 139063	2011-09-03 00:46:45 +00:00
Bruno Cardoso Lopes	d893fc92af	Teach X86FastISel to use AVX versions of instructions when possible llvm-svn: 139062	2011-09-03 00:46:42 +00:00
Bruno Cardoso Lopes	006c9371a1	Fix 80-column and style llvm-svn: 139061	2011-09-03 00:46:40 +00:00
Bruno Cardoso Lopes	dbb40015ff	Tidy up some SSE/AVX convert intrinsics. Also add an AVX version of OptForSize pattern llvm-svn: 139060	2011-09-03 00:46:38 +00:00
Jakob Stoklund Olesen	1f72dd40c7	Pseudo CMOV instructions don't clobber EFLAGS. The explanation about a 0 argument being materialized as xor is no longer valid. Rematerialization will check if EFLAGS is live before clobbering it. The code produced by X86TargetLowering::EmitLoweredSelect does not clobber EFLAGS. This causes one less testb instruction to be generated in the cmov.ll test case. llvm-svn: 139057	2011-09-02 23:52:55 +00:00
Jakob Stoklund Olesen	f08354d183	Check for EFLAGS live-out before clobbering it. It is only allowed to clobber EFLAGS at the end of a block if it isn't live-in to any successor. llvm-svn: 139056	2011-09-02 23:52:52 +00:00
Jakob Stoklund Olesen	d0c8a31c8b	Use existing function. llvm-svn: 139055	2011-09-02 23:52:49 +00:00
Jakob Stoklund Olesen	38019e3188	Remove unused variables. llvm-svn: 139047	2011-09-02 22:41:25 +00:00
Eli Friedman	f3dd6da7a8	Don't fast-isel for atomic load/store; some cases require extra handling missing from fast-isel. llvm-svn: 139044	2011-09-02 22:33:24 +00:00
Kevin Enderby	5b03f72292	Change X86 disassembly to print immediate values as signed by default. Special-case those instructions whose immediate is not sign-extended. radr://8795217 llvm-svn: 139028	2011-09-02 20:01:23 +00:00
Bill Wendling	4e1d018935	Revert r138826 until PR10834 can be fixed. llvm-svn: 139018	2011-09-02 18:15:04 +00:00
Bruno Cardoso Lopes	f61d1c072e	Fix vbroadcast matching logic to early unmatch if the node doesn't have only one use. Fix PR10825. llvm-svn: 138951	2011-09-01 18:15:06 +00:00
Bruno Cardoso Lopes	a0d85139e5	Move more code around and duplicate AVX patterns: MOVHPS and MOVLPS llvm-svn: 138897	2011-08-31 21:15:32 +00:00
Bruno Cardoso Lopes	21a180367b	Move MOVAPS,MOVUPS patterns close to the instructions definition llvm-svn: 138896	2011-08-31 21:15:29 +00:00
Bruno Cardoso Lopes	941001312a	Remove "_Int" forms of MOVUPSmr and MOVAPSmr llvm-svn: 138895	2011-08-31 21:15:22 +00:00
Rafael Espindola	6e31dfea35	Spelling and grammar fixes to problems found by Duncan. llvm-svn: 138858	2011-08-31 16:43:33 +00:00
Eli Friedman	635d9692b6	Make sure we don't crash when -miphoneos-version-min is specified on x86. Hopefully this will fix gcc testsuite failures. llvm-svn: 138856	2011-08-31 16:19:51 +00:00
Eric Christopher	72d1d5e193	Rework this conditional a bit. Patch by Sanjoy Das llvm-svn: 138853	2011-08-31 04:17:21 +00:00
Bruno Cardoso Lopes	9fc6b8be03	- Move all MOVSS and MOVSD patterns close to their definitions - Duplicate some store patterns to their AVX forms! - Caught a bug while restricting the patterns subtarget, fix it and update a testcase to check it properly llvm-svn: 138851	2011-08-31 03:04:20 +00:00
Bruno Cardoso Lopes	aa1daa63da	Remove unnecessary AVX checks llvm-svn: 138850	2011-08-31 03:04:14 +00:00
Bruno Cardoso Lopes	db520db514	Teach more places to use VMOVAPS,VMOVUPS instead of MOVAPS,MOVUPS, whenever AVX is enabled. llvm-svn: 138849	2011-08-31 03:04:09 +00:00
Evan Cheng	cb1e5bae4c	Fix (movhps load) lowering / pattern to match more cases. rdar://10050549 llvm-svn: 138848	2011-08-31 02:05:24 +00:00
Bill Wendling	6470e07e20	Fix off-by-one error Benjamin noticed. llvm-svn: 138832	2011-08-30 21:23:24 +00:00
Bill Wendling	7a9c3033a4	Enable compact unwind info by default. This only applies to Darwin when CFI is disabled. llvm-svn: 138826	2011-08-30 20:54:11 +00:00
Jeffrey Yasskin	065c35726f	Fix C++0x narrowing errors when char is unsigned. In the case of EDInstInfo, this would actually cause a bug when -1 became 255 and was then compared >=0 in llvm-mc/Disassembler.cpp. llvm-svn: 138825	2011-08-30 20:53:29 +00:00
Rafael Espindola	94d3253626	Adds support for variable sized allocas. For a variable sized alloca, code is inserted to first check if the current stacklet has enough space. If so, space is allocated by simply decrementing the stack pointer. Otherwise a runtime routine (__morestack_allocate_stack_space in libgcc) is called which allocates the required memory from the heap. Patch by Sanjoy Das. llvm-svn: 138818	2011-08-30 19:47:04 +00:00
Rafael Espindola	3353017668	Adds a SelectionDAG node X86SegAlloca which will be custom lowered from DYNAMIC_STACKALLOC. Two new pseudo instructions (SEG_ALLOCA_32 and SEG_ALLOCA_64) which will match X86SegAlloca (based on word size) are also added. They will be custom emitted to inject the actual stack handling code. Patch by Sanjoy Das. llvm-svn: 138814	2011-08-30 19:43:21 +00:00
Rafael Espindola	c21742112b	Emit segmented-stack specific code into function prologues for X86. Modify the pass added in the previous patch to call this new code. The new prologues generated will call a libgcc routine (__morestack) to allocate more stack space from the heap when required. Patch by Sanjoy Das. llvm-svn: 138812	2011-08-30 19:39:58 +00:00
Eli Friedman	850b9a9a84	Explicitly zero out parts of a vector which are required to be zero by the algorithm in LowerUINT_TO_FP_i32. This only has a substantial effect on the generated code when the input is extracted from a vector register; other ways of loading an i32 do the appropriate zeroing implicitly. Fixes PR10802. llvm-svn: 138768	2011-08-29 21:15:46 +00:00
Bruno Cardoso Lopes	50e0170fa5	Move non-instruction patterns to a more appropriate place! llvm-svn: 138744	2011-08-29 17:51:24 +00:00
Nicolas Geoffray	7ea09c9462	Remove premature previous commit. llvm-svn: 138725	2011-08-28 14:52:51 +00:00
Nicolas Geoffray	f786bae6ac	Encoding of instructions referencing segments has changed. Do what X86MCCodeEmitter does. llvm-svn: 138723	2011-08-28 13:07:57 +00:00
Benjamin Kramer	61a1ff543c	Silence GCC warnings and make an array const. llvm-svn: 138706	2011-08-27 17:36:14 +00:00
Eli Friedman	5e5704277f	Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction. llvm-svn: 138660	2011-08-26 21:21:21 +00:00
Craig Topper	c66d50d1a2	Fix disassembling of VCVTSD2SI llvm-svn: 138623	2011-08-26 04:49:29 +00:00
Bruno Cardoso Lopes	ed834810be	Do the same as r138461. Mark VZEROALL as clobbering all YMM registers llvm-svn: 138592	2011-08-25 22:23:58 +00:00
Bruno Cardoso Lopes	8347b86293	Add support for AVX 256-bit version of MOVDDUP! llvm-svn: 138588	2011-08-25 21:40:37 +00:00
Bruno Cardoso Lopes	388eacee2c	Make isMOVDDUP mask check more strict and update comments! llvm-svn: 138587	2011-08-25 21:40:34 +00:00
Craig Topper	14380ff9a0	Add more missing TB encodings to VEX instructions to allow them to be disassembled. Fixes remainder of PR10678. llvm-svn: 138553	2011-08-25 08:11:01 +00:00
Craig Topper	e1541838f9	Add TB encoding to VZEROALL, VZEROUPPER, and VCVTPS2PD to allow them to be disassembled. Fixes PR10723. llvm-svn: 138551	2011-08-25 06:57:46 +00:00
Bruno Cardoso Lopes	296256fb32	Add support for 256-bit versions of VSHUFPD and VSHUFPS. llvm-svn: 138546	2011-08-25 02:58:26 +00:00
Bruno Cardoso Lopes	54366cc332	Add memory version of SHUFPD to mask decoding! llvm-svn: 138545	2011-08-25 02:58:21 +00:00
Bruno Cardoso Lopes	50d74211df	Create a section for non-instructions patterns in the beginning of the file, and move more code around! llvm-svn: 138521	2011-08-24 23:18:11 +00:00
Bruno Cardoso Lopes	2fb51d38e6	Move code around! llvm-svn: 138520	2011-08-24 23:18:09 +00:00
Bruno Cardoso Lopes	fb702fe8d6	Organize UNPCK* patterns, also add remaining for AVX. llvm-svn: 138519	2011-08-24 23:18:06 +00:00
Bruno Cardoso Lopes	9ade17b7f2	Move remaining MOVDDUP patterns close to MOVDDUP definition and duplicate the missing ones for AVX. llvm-svn: 138518	2011-08-24 23:18:04 +00:00
Bruno Cardoso Lopes	c1e1e7ab97	Organize and tidy up MOVDDUP section. Also update comments! llvm-svn: 138517	2011-08-24 23:18:02 +00:00
Bruno Cardoso Lopes	813891a215	Move MOVHLPS patterns close to MOVHLPS definition, and duplicate the pattern for 128-bit AVX mode. llvm-svn: 138516	2011-08-24 23:17:59 +00:00
Bruno Cardoso Lopes	9566a66a7c	Move all PSHUF* patterns close to the PSHUF* definitions. Also be explicit about which subtarget they refer to, and add AVX versions of the ones we currently don't. Remove old and now wrong comments! llvm-svn: 138515	2011-08-24 23:17:57 +00:00
Bruno Cardoso Lopes	2953d7b320	Move all SHUFP* patterns close to the SHUFP* definitions. Also be explicit about which subtarget they refer to, and add AVX versions of the ones we currently don't. Make the mask check more strict, to be clear it won't be used to match to 256-bit versions! llvm-svn: 138514	2011-08-24 23:17:55 +00:00
Eli Friedman	9c73a57b20	Hook up 64-bit atomic load/store on x86-32. I plan to write more efficient implementations eventually. llvm-svn: 138505	2011-08-24 22:33:28 +00:00
Eli Friedman	38cd821dc4	Fix whitespace. llvm-svn: 138487	2011-08-24 21:17:30 +00:00
Eli Friedman	342e8df0e0	Basic x86 code generation for atomic load and store instructions. llvm-svn: 138478	2011-08-24 20:50:09 +00:00
Bruno Cardoso Lopes	ce02840633	Mark VZEROALL as clobbering all YMM registers llvm-svn: 138461	2011-08-24 18:48:33 +00:00
Evan Cheng	2bb4035707	Move TargetRegistry and TargetSelect from Target to Support where they belong. These are strictly utilities for registering targets and components. llvm-svn: 138450	2011-08-24 18:08:43 +00:00
Craig Topper	de92622aa5	Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711. llvm-svn: 138427	2011-08-24 06:14:18 +00:00
Bruno Cardoso Lopes	9e9f2ce32d	Fix a nasty bug where a v4i64 was being wrongly emitted with 32-bit permutations. Also tidy up some patterns and make them close to their instruction definition! llvm-svn: 138392	2011-08-23 22:06:37 +00:00
Evan Cheng	4d6c9d711d	Some refactoring so TargetRegistry.h no longer has to include any files from MC. llvm-svn: 138367	2011-08-23 20:15:21 +00:00
Nick Lewycky	4c8ff77f1b	PerformSubCombine to work on integers larger than i128. Fixes a crasher. llvm-svn: 138354	2011-08-23 19:01:24 +00:00
Craig Topper	6612e35b0d	Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding scalarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712. llvm-svn: 138321	2011-08-23 04:36:33 +00:00
Bruno Cardoso Lopes	2a3ffb5d97	Introduce a pass to insert vzeroupper instructions to avoid AVX to SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper" llc command line option. This is only the first step (very naive and conservative one) to sketch out the idea, but proper DFA is coming next to allow smarter decisions. Comments and ideas now and in further commits will be very appreciated. llvm-svn: 138317	2011-08-23 01:14:17 +00:00
Benjamin Kramer	9dc808e74d	X86: Add some operand types required to identify calls. llvm-svn: 138285	2011-08-22 22:55:32 +00:00
Bruno Cardoso Lopes	74f090d44c	Add support for breaking 256-bit int VSETCC into two 128-bit ones, avoiding scalarization of the compare. Reduces code from 59 to 6 instructions. Fix PR10712. llvm-svn: 138271	2011-08-22 20:31:04 +00:00
Bruno Cardoso Lopes	6e62ca940a	Add 128-bit AVX codegen for PCMP* family of integer instructions llvm-svn: 138270	2011-08-22 20:31:00 +00:00
Bruno Cardoso Lopes	d126347f32	Re-write part of VEX encoding logic, to be easier to read! Also fix a bug and add a testcase! llvm-svn: 138123	2011-08-19 22:27:29 +00:00
Craig Topper	ba6c2a52c7	Add TB encoding to VEX versions of SSE fp logical operations to fix disassembler llvm-svn: 138034	2011-08-19 05:28:50 +00:00
Bruno Cardoso Lopes	22241acc29	Fix PR10677. Initial patch and idea by Peter Cooper but I've changed the implementation! llvm-svn: 138029	2011-08-19 02:23:56 +00:00
Bruno Cardoso Lopes	5647d84aa4	Re-encoded 128-bit AVX versions of SQRT, RSQRT, RCP have 3 operands instead of 2. They were already defined this way in their regular version, but not for the intrinsics versions (_Int), and that would work for assembly emission but not for object code, since a MachineOperand would be missing. This commit fix PR10697. Also removed the {VSQRT,VRSQRT,VRCP}r_Int forms and match the intrinsic via INSERT_SUBREG+EXTRACT_SUBREG patterns. The same couldn't be done for memory versions because sse_load_f32/sse_load_f64 operand need special handling and don't work like regular "addr" operands. There are right now 114 "_Int" and 98 "Int_*" forms! I'm slowly removing them as I step through, but hope we can get rid of these someday, they are really annoying :) llvm-svn: 138012	2011-08-18 23:59:21 +00:00
Bruno Cardoso Lopes	3c7d6eb64c	Cleanup vector logical ops in AVX and add use int versions for simple v2i64 llvm-svn: 137919	2011-08-18 02:11:34 +00:00
Bruno Cardoso Lopes	1a87fcb9ba	Fix PR10688. Add support for splitting 256-bit vector shifts when the shift amount is variable llvm-svn: 137885	2011-08-17 22:12:20 +00:00
Owen Anderson	a4043c4b32	Allow the MCDisassembler to return a "soft fail" status code, indicating an instruction that is disassemblable, but invalid. Only used for ARM UNPREDICTABLE instructions at the moment. Patch by James Molloy. llvm-svn: 137830	2011-08-17 17:44:15 +00:00
Bruno Cardoso Lopes	be5e987379	Introduce matching patterns for vbroadcast AVX instruction. The idea is to match splats in the form (splat (scalar_to_vector (load ...))) whenever the load can be folded. All the logic and instruction emission is working but because of PR8156, there are no ways to match loads, cause they can never be folded for splats. Thus, the tests are XFAILed, but I've tested and exercised all the logic using a relaxed version for checking the foldable loads, as if the bug was already fixed. This should work out of the box once PR8156 gets fixed since MayFoldLoad will work as expected. llvm-svn: 137810	2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes	6d33c7f303	Update comments about vector splat handling in x86 llvm-svn: 137808	2011-08-17 02:29:13 +00:00
Bruno Cardoso Lopes	ed786a346e	Now that we have a canonical way to handle 256-bit splats: vinsertf128 $1 + vpermilps $0, remove the old code that used to first do the splat in a 128-bit vector and then insert it into a larger one. This is better because the handling code gets simpler and also makes a better room for the upcoming vbroadcast! llvm-svn: 137807	2011-08-17 02:29:10 +00:00
Bruno Cardoso Lopes	2e99f1b3aa	Instead of always leaving the work to the generic legalizer when there is no support for native 256-bit shuffles, be more smart in some cases, for example, when you can extract specific 128-bit parts and use regular 128-bit shuffles for them. Example: For this shuffle: shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 0, i32 7, i32 6> This was expanded to: vextractf128 $1, %ymm1, %xmm2 vpextrq $0, %xmm2, %rax vmovd %rax, %xmm1 vpextrq $1, %xmm2, %rax vmovd %rax, %xmm2 vpunpcklqdq %xmm1, %xmm2, %xmm1 vpextrq $0, %xmm0, %rax vmovd %rax, %xmm2 vpextrq $1, %xmm0, %rax vmovd %rax, %xmm0 vpunpcklqdq %xmm2, %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 ret Now we get: vshufpd $1, %xmm0, %xmm0, %xmm0 vextractf128 $1, %ymm1, %xmm1 vshufpd $1, %xmm1, %xmm1, %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 llvm-svn: 137733	2011-08-16 18:21:54 +00:00
Bruno Cardoso Lopes	c1676e41c0	While I'm here, remove the "_alt" hacks to a series of INSERT_SUBREG and also add the AVX versions of the 128-bit patterns llvm-svn: 137685	2011-08-15 23:36:51 +00:00
Bruno Cardoso Lopes	67005029bc	Reorder declarations of vmovmskp* and also put the necessary AVX predicate and TB encoding fields. This fixes the encoding for the attached testcase. This fixes PR10625. llvm-svn: 137684	2011-08-15 23:36:45 +00:00
Jim Grosbach	120a96a721	MCTargetAsmParser target match predicate support. Allow a target assembly parser to do context sensitive constraint checking on a potential instruction match. This will be used, for example, to handle Thumb2 IT block parsing. llvm-svn: 137675	2011-08-15 23:03:29 +00:00
Bruno Cardoso Lopes	cbe7feeab9	Fix PR10656. It's only profitable to use 128-bit inserts and extracts when AVX mode is on. Otherwise it is just more work for the type legalizer. llvm-svn: 137661	2011-08-15 21:45:54 +00:00
Bruno Cardoso Lopes	c53dd2ac01	Fix comment! llvm-svn: 137521	2011-08-12 21:54:42 +00:00
Bruno Cardoso Lopes	f15dfe5818	The VPERM2F128 is a AVX instruction which permutes between two 256-bit vectors. It operates on 128-bit elements instead of regular scalar types. Recognize shuffles that are suitable for VPERM2F128 and teach the x86 legalizer how to handle them. llvm-svn: 137519	2011-08-12 21:48:26 +00:00
Bruno Cardoso Lopes	960c8f71aa	Move code around and add comments llvm-svn: 137518	2011-08-12 21:48:22 +00:00
Duncan Sands	a41634e307	Silence a bunch (but not all) "variable written but not read" warnings when building with assertions disabled. llvm-svn: 137460	2011-08-12 14:54:45 +00:00
Andrew Trick	210bf8351d	findDeadCallerSavedReg fix: Missing NULL terminator in register arrays. Fix by Ivan Baev. Sorry I don't have a unit test, but the fix is obvious so I don't want to delay it. llvm-svn: 137404	2011-08-12 00:49:19 +00:00
Bruno Cardoso Lopes	8fbf023c9b	Add a dag combine to xform 256-bit shuffles into simple vector inserts and extracts. This simple combine makes us generate only 1 instruction instead of 11 in the v8 case. llvm-svn: 137362	2011-08-11 21:50:44 +00:00
Bruno Cardoso Lopes	043c820800	Fix PR10492 by teaching MOVHLPS and MOVLPS mask matching to be more strict. llvm-svn: 137324	2011-08-11 18:59:13 +00:00
Nadav Rotem	efdd183f52	Add a comment, per Bruno's CR. llvm-svn: 137313	2011-08-11 17:05:47 +00:00
Nadav Rotem	1542d5a00a	[AVX] If the data which is going to be saved is already in two XMM registers (for example, after integer operation), do not pack the registers into a YMM before saving. Its better to save as two XMM registers. Before: vinsertf128 $1, %xmm3, %ymm0, %ymm3 vinsertf128 $0, %xmm1, %ymm3, %ymm1 vmovaps %ymm1, 416(%rsp) After: vmovaps %xmm3, 416+16(%rsp) vmovaps %xmm1, 416(%rsp) llvm-svn: 137308	2011-08-11 16:41:21 +00:00
Bruno Cardoso Lopes	dbd1352c80	Cleanup: Remove Int_ CVTSS2SI* forms llvm-svn: 137297	2011-08-11 02:52:36 +00:00
Bruno Cardoso Lopes	a2d8bb97b9	Splats for v8i32/v8f32 can be handled by VPERMILPSY. This was causing infinite recursive calls in legalize. Fix PR10562 llvm-svn: 137296	2011-08-11 02:49:44 +00:00
Bruno Cardoso Lopes	572c9aaf53	Use the splat index to generate the desired shuffle. Otherwise we could only get undefs and the vector shuffle becomes an undef, generating wrong code. llvm-svn: 137295	2011-08-11 02:49:41 +00:00
Eli Friedman	3ae39f8ad1	Fix X86TargetLowering::LowerExternalSymbol so that it actually works in non-trivial cases. This hasn't been an issue before because the function isn't normally called (but apparently is used to generate a tail-call to sin() on ELF x86-32 with PIC and SSE2). Fixes PR9693. llvm-svn: 137292	2011-08-11 01:48:05 +00:00
Nadav Rotem	410a11fe82	When performing a truncating store, it is sometimes possible to rearrange the data in-register prior to saving to memory. When we reorder the data in memory we prevent the need to save multiple scalars to memory, making a single regular store. llvm-svn: 137238	2011-08-10 19:30:14 +00:00
Bruno Cardoso Lopes	3ff111c12d	The following X86 pattern is incorrect: def : Pat<(X86Movss VR128:$src1, (bc_v4i32 (v2i64 (load addr:$src2)))), (MOVLPSrm VR128:$src1, addr:$src2)>; This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase is added and illustrates the bug and also modified the one that was already present. Patch by Tanya Lattner. llvm-svn: 137227	2011-08-10 17:45:17 +00:00
Bruno Cardoso Lopes	278ffd7d8e	Fix a bug in vpermilps mask checking. Fix PR10560 llvm-svn: 137194	2011-08-10 01:54:17 +00:00
Bruno Cardoso Lopes	72323966c8	Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556 llvm-svn: 137179	2011-08-09 23:27:13 +00:00
Bruno Cardoso Lopes	fc481959d2	Add v16i16 and v32i8 store patterns llvm-svn: 137166	2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes	6963062a99	Use fp unpack instructions to unpack int types. Until we have AVX2, this is the best we can do for these patterns. This fixes PR10554. llvm-svn: 137161	2011-08-09 22:18:37 +00:00
Eli Friedman	4ef2426b87	Fix a couple ridiculous copy-paste errors. rdar://9914773 . llvm-svn: 137160	2011-08-09 22:17:39 +00:00
Bruno Cardoso Lopes	bed48dc8ff	Reapply a more appropriate solution than in r137114. AVX supports v4f64 = sitofp v4i32. This fix PR10559. Also add support for v4i32 = fptosi v4f64. llvm-svn: 137128	2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes	24dd1d4a27	Revert r137114 llvm-svn: 137127	2011-08-09 17:39:01 +00:00
Bruno Cardoso Lopes	ad3453cf2d	Handle sitofp between v4f64 <- v4i32. Fix PR10559 llvm-svn: 137114	2011-08-09 05:48:01 +00:00
Bruno Cardoso Lopes	1155b1eafa	Add support for avx vector fextend llvm-svn: 137105	2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes	0d0964d099	Add AVX versions of 128-bit sitofp and fptosi llvm-svn: 137104	2011-08-09 03:04:25 +00:00
Bruno Cardoso Lopes	2fc107365b	Add two patterns to match special vmovss and vmovsd cases. Also fix the patterns already there to be more strict regarding the predicate. This fixes PR10558 llvm-svn: 137100	2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes	af6a85484c	Make LowerVSETCC aware of AVX types and add patterns to match them. llvm-svn: 137090	2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes	c96953c12a	Add support for several vector shifts operations while in AVX mode. Fix PR10581 llvm-svn: 137067	2011-08-08 21:31:08 +00:00
Jakob Stoklund Olesen	daa2cad723	Hoist hasLoadFromStackSlot and hasStoreToStackSlot. These methods are target-independent since they simply scan the memory operands. They can live in TargetInstrInfoImpl. llvm-svn: 137063	2011-08-08 20:53:24 +00:00
Jakob Stoklund Olesen	4f0ace5674	Don't clobber pending ST regs when FP regs are killed. X86FloatingPoint keeps track of pending ST registers for an upcoming inline asm instruction with fixed stack register constraints. It does this by remembering which FP register holds the value that should appear at a fixed stack position for the inline asm. When that FP register is killed before the inline asm, make sure to duplicate it to a scratch register, so the ST register still has a live FP reference. This could happen when the same FP register was copied to two ST registers, or when a spill instruction is inserted between the ST copy and the inline asm. This fixes PR10602. llvm-svn: 137050	2011-08-08 17:15:43 +00:00
Chandler Carruth	2536b51aae	Silence unused variable warnings in release builds. llvm-svn: 136956	2011-08-05 01:08:21 +00:00
Jason W Kim	239370cb3f	Fix http://llvm.org/bugs/show_bug.cgi?id=10583 - test for 1 and 2 byte fixups to be added llvm-svn: 136954	2011-08-05 00:53:03 +00:00
Evan Cheng	19e3f80579	Fix an obvious typo. Patch by Ivan Krasin. llvm-svn: 136899	2011-08-04 18:38:15 +00:00
Duncan Sands	00f39c1521	Add obviously missing "break". Noticed by Andrey Karpov with the PVS-studio tool. llvm-svn: 136878	2011-08-04 15:45:59 +00:00
Jason W Kim	e4df09f7ba	Fix http://llvm.org/bugs/show_bug.cgi?id=10568 Move the reloc size assert into AsmBackend - where it is more apropos. llvm-svn: 136855	2011-08-04 00:38:45 +00:00
Bill Wendling	e234f6ae0c	Only access both operands of an INSERT_SUBVECTOR if it is an INSERT_SUBVECTOR. Fixes PR10527. llvm-svn: 136853	2011-08-04 00:32:58 +00:00
Benjamin Kramer	103e2ec2df	Remove unused variables. llvm-svn: 136803	2011-08-03 19:53:48 +00:00
Jakob Stoklund Olesen	da618420ee	Handle IMPLICIT_DEF instructions in X86FloatingPoint. This fixes PR10575. llvm-svn: 136787	2011-08-03 16:33:19 +00:00
Eli Friedman	04c5025cd5	Don't create a ridiculous EXTRACT_ELEMENT. PR10563. The testcase looks extremely fragile, so I'm adding an assertion which should catch any cases like this. llvm-svn: 136711	2011-08-02 18:38:35 +00:00
Bruno Cardoso Lopes	5ada908140	Make this kind of lowering to be supported by 256-bit instructions: shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0> To: shuffle (vload ptr)), undef, <1, 1, 1, 1> Fix PR10494 llvm-svn: 136691	2011-08-02 16:06:18 +00:00
Nick Lewycky	a530a4d925	Bail from FastISel when we encounter a volatile memset intrinsic. Patch by Ivan Krasin! llvm-svn: 136663	2011-08-02 00:40:16 +00:00
Bruno Cardoso Lopes	a8e3673816	Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise the legalizer. This commit together with the two previous ones fixes PR10495. llvm-svn: 136654	2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes	616fe60548	Teach PreprocessISelDAG to be aware of vector types and to not process them. llvm-svn: 136653	2011-08-01 21:54:05 +00:00
Bruno Cardoso Lopes	bd30a4b584	Lower CONCAT_VECTORS to use two VINSERTF128 instructions instead of using a stack store. llvm-svn: 136652	2011-08-01 21:54:02 +00:00
Bruno Cardoso Lopes	7513939ddd	Since vectors with all ones can't be created with a 256-bit instruction, avoid returning early for v8i32 types, which would only be valid for vector with all zeros. Also split the handling of zeros and ones into separate checking logic since they are handled differently. This fixes PR10547 llvm-svn: 136642	2011-08-01 19:51:53 +00:00
Douglas Gregor	d41f3a161f	Update CMake target names for tablegen-generated data in the X86 and ARM targets. This should fix the CMake build with MSVC. llvm-svn: 136621	2011-08-01 16:29:27 +00:00
Eli Friedman	adec587d5c	Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to be working on x86 (at least for trivial testcases); other architectures will need more work so that they actually emit the appropriate instructions for orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC, Mips, and Alpha backends need such changes.) llvm-svn: 136457	2011-07-29 03:05:32 +00:00
Bruno Cardoso Lopes	65ce5ea3ba	Fix two tests that I crashed in the previous commits. The mask elts on the second half must be reindexed. llvm-svn: 136454	2011-07-29 02:05:28 +00:00
Bruno Cardoso Lopes	81eb193f2e	Match VPERMIL masks more strictly and update the target specific mask generation to always catch the weird cases. llvm-svn: 136453	2011-07-29 01:31:15 +00:00
Bruno Cardoso Lopes	795f558532	Add DecodeShuffle shuffle support for VPERMILPD variants llvm-svn: 136452	2011-07-29 01:31:11 +00:00
Bruno Cardoso Lopes	d23709b18c	Add v8i32 and v4i64 vpermil patterns llvm-svn: 136451	2011-07-29 01:31:07 +00:00
Bruno Cardoso Lopes	c00f6728bc	Fix a bug while generating target specific VPERMIL masks: skip undef mask elements. This fixes PR10529. llvm-svn: 136450	2011-07-29 01:31:04 +00:00
Bruno Cardoso Lopes	b9ba465de8	Enable usage of SSE4 extracts and inserts in their 128-bit AVX forms. Also tidy up code a bit. llvm-svn: 136449	2011-07-29 01:31:02 +00:00
Bruno Cardoso Lopes	6aee388423	Cleanup PALIGNR handling and remove the old palign pattern fragment. Also make PALIGNR masks not match 256-bit vectors, which aren't supported. It's also a step toward solving PR10489 llvm-svn: 136448	2011-07-29 01:30:59 +00:00
Chandler Carruth	9d7feab3e0	Rewrite the CMake build to use explicit dependencies between libraries, specified in the same file that the library itself is created. This is more idiomatic for CMake builds, and also allows us to correctly specify dependencies that are missed due to bugs in the GenLibDeps perl script, or change from compiler to compiler. On Linux, this returns CMake to a place where it can reliably rebuild several targets of LLVM. I have tried not to change the dependencies from the ones in the current auto-generated file. The only places I've really diverged are in places where I was seeing link failures, and added a dependency. The goal of this patch is not to start changing the dependencies, merely to move them into the correct location, and an explicit form that we can control and change when necessary. This also removes a serialization point in the build because we don't have to scan all the libraries before we begin building various tools. We no longer have a step of the build that regenerates a file inside the source tree. A few other associated cleanups fall out of this. This isn't really finished yet though. After talking to dgregor he urged switching to a single CMake macro to construct libraries with both sources and dependencies in the arguments. Migrating from the two macros to that style will be a follow-up patch. Also, llvm-config is still generated with GenLibDeps.pl, which means it still has slightly buggy dependencies. The internal CMake 'llvm-config-like' macro uses the correct explicitly specified dependencies however. A future patch will switch llvm-config generation (when using CMake) to be based on these deps as well. This may well break Windows. I'm getting a machine set up now to dig into any failures there. If anyone can chime in with problems they see or ideas of how to solve them for Windows, much appreciated. llvm-svn: 136433	2011-07-29 00:14:25 +00:00
Oscar Fuentes	a8666a3cdb	Explicitly declare a library dependency of LLVMDesc to LLVMAsmPrinter. GenLibDeps.pl fails to detect vtable references. As this is the only referenced symbol from LLVMDesc to LLVMAsmPrinter on optimized builds, the algorithm that creates the list of libraries to be linked into tools doesn't know about the dependency and sometimes places the libraries in the wrong order, yielding error messages like this: ../../lib/libLLVMARMDesc.a(ARMMCTargetDesc.cpp.o): In function `llvm::ARMInstPrinter::ARMInstPrinter(llvm::MCAsmInfo const&)': ARMMCTargetDesc.cpp:(.text._ZN4llvm14ARMInstPrinterC1ERKNS_9MCAsmInfoE [llvm::ARMInstPrinter::ARMInstPrinter(llvm::MCAsmInfo const&)]+0x2a): undefined reference to `vtable for llvm::ARMInstPrinter' llvm-svn: 136328	2011-07-28 02:33:52 +00:00
Bruno Cardoso Lopes	8c19a8b5d5	Invert the subvector insertion to be more likely to be taken as a COPY llvm-svn: 136324	2011-07-28 01:26:53 +00:00
Bruno Cardoso Lopes	76bc28bac6	Add patterns to generate copies for extract_subvector instead of using vextractf128. This will reduce the number of issued instructions for several avx codes. llvm-svn: 136323	2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes	3fb0b635bd	movd/movq write zeros in the high 128-bit part of the vector. Use them to match 256-bit scalar_to_vector+zext. llvm-svn: 136322	2011-07-28 01:26:46 +00:00
Bruno Cardoso Lopes	eca99c4b5a	Add a few patterns to match allzeros without having to use the fp unit. Take advantage that the 128-bit vpxor zeros the higher part and use it. This also fixes PR10491 llvm-svn: 136321	2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes	9e2a301216	Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move a convert pattern close to the instruction definition. llvm-svn: 136320	2011-07-28 01:26:39 +00:00
Evan Cheng	eda1d4f3ba	Emit an error if the asm parser parsed X86_64-only registers, e.g. %rax, %sil. This can happen in cases where the TableGen-generated asm matcher cannot check whether a register operand is in the right register class, e.g. mem operands. rdar://8204588 llvm-svn: 136292	2011-07-27 23:22:03 +00:00
Kevin Enderby	5ef6c453a6	Fix llvm-mc handling of x86 instructions that take 8-bit unsigned immediates. llvm-mc gives an "invalid operand" error for instructions that take an unsigned immediate which have the high bit set such as: pblendw $0xc5, %xmm2, %xmm1 llvm-mc treats all x86 immediates as signed values and range checks them. A small number of x86 instructions use the imm8 field as a set of bits. This change only changes those instructions and where the high bit is not ignored. The others remain unchanged. llvm-svn: 136287	2011-07-27 23:01:50 +00:00
Eli Friedman	26a484852e	Code generation for 'fence' instruction. llvm-svn: 136283	2011-07-27 22:21:52 +00:00
Eli Friedman	e6d1853e74	X86ISD::MEMBARRIER does not require SSE2; it doesn't actually generate any code, and all x86 processors will honor the required semantics. llvm-svn: 136249	2011-07-27 19:43:50 +00:00
Jeffrey Yasskin	6381c0100b	Explicitly cast narrowing conversions inside {}s that will become errors in C++0x. llvm-svn: 136211	2011-07-27 06:22:51 +00:00
Bruno Cardoso Lopes	f9324f4f6b	Move some code around to open opportunity for more shuffle matching llvm-svn: 136201	2011-07-27 00:56:37 +00:00
Bruno Cardoso Lopes	27a30a7792	The vpermilps and vpermilpd have different behaviour regarding the usage of the shuffle bitmask. Both work in 128-bit lanes without crossing, but in the former the mask of the high part is the same used by the low part while in the latter both lanes have independent masks. Handle this properly and add support for vpermilpd. llvm-svn: 136200	2011-07-27 00:56:34 +00:00
Bruno Cardoso Lopes	db5fb91491	Remove more dead code! llvm-svn: 136199	2011-07-27 00:56:27 +00:00
Evan Cheng	481ebb0133	Support .code32 and .code64 in X86 assembler. llvm-svn: 136197	2011-07-27 00:38:12 +00:00
Benjamin Kramer	124ac2b997	Add a neat little two's complement hack for x86. On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add, which can be commuted and encoded efficiently. This code is generated for __builtin_clz and friends. llvm-svn: 136167	2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes	f8fe47bd2b	Recognize unpckh* masks and match 256-bit versions. The new versions are different from the previous 128-bit because they work in lanes. Update a few comments and add testcases llvm-svn: 136157	2011-07-26 22:03:40 +00:00
Eli Friedman	93dc04d5ca	Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148	2011-07-26 21:02:58 +00:00
Bruno Cardoso Lopes	53bc328071	Remove now unused patterns. 0 insertions(+), 98 deletions(-) llvm-svn: 136109	2011-07-26 18:22:39 +00:00
Bruno Cardoso Lopes	2e8f3c6f25	Cleanup old matching for PUNPCK* variants llvm-svn: 136108	2011-07-26 18:22:27 +00:00
Bill Wendling	ee61946783	The compact unwinding offsets are divided by 8 on 64-bit machines. llvm-svn: 136065	2011-07-26 08:03:49 +00:00
Bruno Cardoso Lopes	d600a0f878	Add 256-bit isel for movsldup/movshdup llvm-svn: 136051	2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes	d77b383199	More movsldup/movshdup cleanup. Rewrite the mask matching function and add support for 256-bit versions (but no instruction selection yet, coming next). llvm-svn: 136050	2011-07-26 02:39:28 +00:00
Bruno Cardoso Lopes	5b268a4b82	More cleanup, subtarget info isn't used here. llvm-svn: 136049	2011-07-26 02:39:25 +00:00
Bruno Cardoso Lopes	de7aaf5c7f	Add 128-bit AVX versions of movshdup/movsldup llvm-svn: 136048	2011-07-26 02:39:23 +00:00
Bruno Cardoso Lopes	957a6a13e0	Cleanup movsldup/movshdup matching. 27 insertions(+), 62 deletions(-) llvm-svn: 136047	2011-07-26 02:39:13 +00:00
Evan Cheng	3a79225b4c	Rename createCodeEmitter to createMCCodeEmitter; createObjectStreamer to createMCObjectStreamer. llvm-svn: 136031	2011-07-26 00:42:34 +00:00
Evan Cheng	1142444565	Rename TargetAsmParser to MCTargetAsmParser and TargetAsmLexer to MCTargetAsmLexer; rename createAsmLexer to createMCAsmLexer and createAsmParser to createMCAsmParser. llvm-svn: 136027	2011-07-26 00:24:13 +00:00
Chandler Carruth	97c069c1d2	Clean up a pile of hacks in our CMake build relating to TableGen. The first problem to fix is to stop creating synthetic Table_gen targets next to all of the LLVM libraries. These had no real effect as CMake specifies that add_custom_command(OUTPUT ...) directives (what the 'tablegen(...)' stuff expands to) are implicitly added as dependencies to all the rules in that CMakeLists.txt. These synthetic rules started to cause problems as we started more and more heavily using tablegen files from subdirectories* of the one where they were generated. Within those directories, the set of tablegen outputs was still available and so these synthetic rules added them as dependencies of those subdirectories. However, they were no longer properly associated with the custom command to generate them. Most of the time this "just worked" because something would get to the parent directory first, and run tablegen there. Once run, the files existed and the build proceeded happily. However, as more and more subdirectories have started using this, the probability of this failing to happen has increased. Recently with the MC refactorings, it became quite common for me when touching a large enough number of targets. To add insult to injury, several of the backends tried to fix this by adding explicit dependencies back to the parent directory's tablegen rules, but those dependencies didn't work as expected -- they weren't forming a linear chain, they were adding another thread in the race. This patch removes these synthetic rules completely, and adds a much simpler function to declare explicitly that a collection of tablegen'ed files are referenced by other libraries. From that, we can add explicit dependencies from the smaller libraries (such as every architecture's Desc library) on this and correctly form a linear sequence. All of the backends are updated to use it, sometimes replacing the existing attempt at adding a dependency, sometimes adding a previously missing dependency edge. Please let me know if this causes any problems, but it fixes a rather persistent and problematic source of build flakiness on our end. llvm-svn: 136023	2011-07-26 00:09:08 +00:00
Evan Cheng	5928e69d20	Rename TargetAsmBackend to MCAsmBackend; rename createAsmBackend to createMCAsmBackend. llvm-svn: 136010	2011-07-25 23:24:55 +00:00
Bruno Cardoso Lopes	9212bf275d	Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128 This also fixes PR10452 llvm-svn: 136004	2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes	ec21941de0	Add remaining 256-bit vector bitcasts. This also fixes PR10451 llvm-svn: 136003	2011-07-25 23:05:28 +00:00
Bruno Cardoso Lopes	123dff0f58	- Handle special scalar_to_vector case: splats. Using a native 128-bit shuffle before inserting on a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoiding emitting one more instruction. llvm-svn: 136002	2011-07-25 23:05:25 +00:00
Bruno Cardoso Lopes	276eb8debf	Reintroduce r135730, this is indeed the right approach, there is no native 256-bit vector instruction to do scalar_to_vector. llvm-svn: 136001	2011-07-25 23:05:16 +00:00
Benjamin Kramer	c956033947	Add a note about efficient codegen for binary log. llvm-svn: 135996	2011-07-25 22:30:00 +00:00
Eli Friedman	ea8c66fea5	Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask. Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle. llvm-svn: 135980	2011-07-25 21:36:45 +00:00
Evan Cheng	61faa55b74	Separate MCInstPrinter registration from AsmPrinter registration. llvm-svn: 135974	2011-07-25 21:20:24 +00:00
Evan Cheng	f60768a14e	Fix last bits of MC layer issues. llvm-mc doesn't need to initialize TargetMachine's anymore. llvm-svn: 135963	2011-07-25 20:53:02 +00:00
Evan Cheng	f5bf19530b	Code clean up. llvm-svn: 135954	2011-07-25 20:18:48 +00:00
Bill Wendling	43ab71a9a8	Update the comment. This feature is available only on Darwin at the moment. Though it's not Darwin-specific. llvm-svn: 135951	2011-07-25 20:15:15 +00:00
Oscar Fuentes	47d4aaf8ad	Unbreak the build. llvm-svn: 135949	2011-07-25 20:13:36 +00:00
Evan Cheng	b25310095f	More refactoring. llvm-svn: 135939	2011-07-25 19:33:48 +00:00
Evan Cheng	7e763d86ba	Refactor X86 target to separate MC code from Target code. llvm-svn: 135930	2011-07-25 18:43:53 +00:00
Bill Wendling	2dc0005b3c	Changed disabled code into a flag. llvm-svn: 135924	2011-07-25 18:04:49 +00:00
Bill Wendling	1d10909cb7	Remove dead variable. llvm-svn: 135923	2011-07-25 18:01:27 +00:00
Bill Wendling	b97270d58a	After we've modified the prolog to save volatile registers, generate the compact unwind encoding for that function. This simply crawls through the prolog looking for machine instrs marked as "frame setup". It can calculate from these what the compact unwind should look like. This is currently disabled because of needed linker support. But initial tests look good. llvm-svn: 135922	2011-07-25 18:00:28 +00:00
Evan Cheng	f2596bc62a	Move TargetAsmParser.h TargetAsmBackend.h and TargetAsmLexer.h to MC where they belong. llvm-svn: 135833	2011-07-23 00:45:41 +00:00
Evan Cheng	6376593ed1	createXXXMCCodeGenInfo should be static. llvm-svn: 135826	2011-07-23 00:01:04 +00:00
Evan Cheng	8c886a40d2	Combine all MC initialization routines into one. e.g. InitializeX86MCAsmInfo, InitializeX86MCInstrInfo, etc. are combined into InitializeX86TargetMC. llvm-svn: 135812	2011-07-22 21:58:54 +00:00
Bruno Cardoso Lopes	a89039998d	Fix PR10422 by adding the necessary AVX UCOMISD memory versions to load folding logic llvm-svn: 135801	2011-07-22 20:53:20 +00:00
Bruno Cardoso Lopes	d23a324132	Add v8f32->v8i32 bitcast. Fixes PR10440 llvm-svn: 135794	2011-07-22 19:51:02 +00:00
Rafael Espindola	77242dd537	Turn shuffles into unpacks for VT == MVT::v2i64 and MVT::v2f64 too. Patch by Jeff Muizelaar. llvm-svn: 135789	2011-07-22 18:56:05 +00:00
Dan Gohman	c535278cf1	Fix x86's XALUO lowering to return its replacement values instead of doing the RAUW calls for the overflow value itself. This makes it more consistent with how the rest of LegalizeDAG works. llvm-svn: 135788	2011-07-22 18:45:15 +00:00
Benjamin Kramer	959b7e9df7	GCC complains about the angle of this line. Remove the escaped newline. llvm-svn: 135739	2011-07-22 01:02:57 +00:00
Bruno Cardoso Lopes	1872173841	Remove the 128-bit special handling from SCALAR_TO_VECTOR. This isn't the way to go. Doing this here will prevent several node matches later, and would have to force looking all the way through several VINSERTF128/VEXTRACTF128 chains to optimize simple things. llvm-svn: 135730	2011-07-22 00:15:10 +00:00
Bruno Cardoso Lopes	612e56174b	- Inspected an AVX code block added by someone in early Feb. This was never used and was actually very wrong, fix it and make it simpler. Also remove the ConcatVectors function, which is unused now. - Fix an introduction of useless nodes in r126664 and r126264. The VUNPCKL* should never be introduced because we don't want duplicate nodes for 128 AVX and non-AVX modes, the actual instruction difference only exists during isel, but not for target specific DAG nodes. We only introduce V* target nodes when there is no 128-bit version already there. - Fix a fragile test and make it more useful. llvm-svn: 135729	2011-07-22 00:15:07 +00:00
Bruno Cardoso Lopes	91eff5140f	Add a DAGCombine for transforming 128->256 casts into a simple vxorps + vinsertf128 pair of instructions llvm-svn: 135727	2011-07-22 00:15:00 +00:00
Bruno Cardoso Lopes	dbebd01269	Introduce a new function to lower 256-bit vectors which are not directly supported and should be promoted and handled by smaller shuffles llvm-svn: 135726	2011-07-22 00:14:56 +00:00
Bruno Cardoso Lopes	95d037721b	Rename function to be more specific and be more strict about its usage llvm-svn: 135725	2011-07-22 00:14:53 +00:00
Bruno Cardoso Lopes	178fb40612	- Register v16i16 as valid VR256 register class - Add more bitcasts for v16i16 - Since 135661 and 135662 already added the splat logic, just add one more splat test for v16i16 llvm-svn: 135663	2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes	b878caa5e2	Add support for 256-bit versions of VPERMIL instruction. This is a new instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32- or 64-bit elements inside a lane, and restricts the second lane to have the same permutation of the first one. With the improved splat support introduced earlier today, adding codegen for this instruction enables more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662	2011-07-21 01:55:47 +00:00
Bruno Cardoso Lopes	fb4920eb25	Improve splat promotion to handle AVX types: v32i8 and v16i16. Also refactor the code and add a bunch of comments. The final shuffle emitted by handling 256-bit types is suitable for the VPERM shuffle instruction which is going to be introduced in a next commit (with a testcase which covers this commit) llvm-svn: 135661	2011-07-21 01:55:42 +00:00
Bruno Cardoso Lopes	18a8d25b62	Add additional patterns for vextractf128 instruction llvm-svn: 135660	2011-07-21 01:55:39 +00:00
Bruno Cardoso Lopes	2389881b69	Add additional patterns for vinsertf128 instruction llvm-svn: 135659	2011-07-21 01:55:36 +00:00
Bruno Cardoso Lopes	0a57b22588	Add v16i16 type to VR256 class llvm-svn: 135658	2011-07-21 01:55:33 +00:00
Bruno Cardoso Lopes	e6f8832631	Move code around. No functionality changes llvm-svn: 135657	2011-07-21 01:55:30 +00:00
Bruno Cardoso Lopes	0bdeacf03b	Tidy up code llvm-svn: 135656	2011-07-21 01:55:27 +00:00
Bill Wendling	28b6e12d9d	Mark instructions which are part of the frame setup with the MachineInstr::FrameSetup flag. llvm-svn: 135645	2011-07-21 00:44:56 +00:00
Bill Wendling	ed93564c7a	Remove unused function. llvm-svn: 135635	2011-07-20 23:07:42 +00:00
Bill Wendling	01bd7d9dc0	Remove the now defunct getCompactUnwindEncoding method from the frame lowering code. llvm-svn: 135634	2011-07-20 23:04:09 +00:00
Evan Cheng	bbf3b0de8b	Goodbye TargetAsmInfo. This eliminates the last bit of CodeGen and Target in llvm-mc. There is still a bit more refactoring left to do in Targets. But we are now very close to fixing all the layering issues in MC. llvm-svn: 135611	2011-07-20 19:50:42 +00:00
Eli Friedman	ae60b6b008	Extend the hack for _GLOBAL_OFFSET_TABLE_ slightly; PR10389. llvm-svn: 135607	2011-07-20 19:36:11 +00:00
Evan Cheng	efd9b4240f	- Move CodeModel from a TargetMachine global option to MCCodeGenInfo. - Introduce JITDefault code model. This tells targets to set different default code model for JIT. This eliminates the ugly hack in TargetMachine where code model is changed after construction. llvm-svn: 135580	2011-07-20 07:51:56 +00:00
NAKAMURA Takumi	b66d255595	X86Subtarget.h: Assume "x86_64-cygwin", though it has not been released yet, to appease test/CodeGen/X86 on cygwin. llvm-svn: 135564	2011-07-20 04:02:20 +00:00
Evan Cheng	2129f59637	Introduce MCCodeGenInfo, which keeps information that can affect codegen (including compilation, assembly). Move relocation model Reloc::Model from TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine. llvm-svn: 135468	2011-07-19 06:37:02 +00:00
Evan Cheng	67c033e6b8	Move getInitialFrameState from TargetFrameInfo to MCAsmInfo (suggestions for better location welcome). llvm-svn: 135438	2011-07-18 22:29:13 +00:00
Evan Cheng	d60fa58ba1	Sink getDwarfRegNum, getLLVMRegNum, getSEHRegNum from TargetRegisterInfo down to MCRegisterInfo. Also initialize the mapping at construction time. This patch eliminates TargetRegisterInfo from TargetAsmInfo. It's another step towards fixing the layering violation. llvm-svn: 135424	2011-07-18 20:57:22 +00:00
Bruno Cardoso Lopes	50c1d9816c	Be more smart with VCVTSS2SD. Also place the patterns close to the definitions. llvm-svn: 135407	2011-07-18 18:11:25 +00:00
Bruno Cardoso Lopes	4208cace5f	Add AVX 128-bit sqrt versions llvm-svn: 135404	2011-07-18 17:51:40 +00:00
Chris Lattner	229907cd11	land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375	2011-07-18 04:54:35 +00:00
Bruno Cardoso Lopes	4480040191	Add AVX 128-bit patterns for sint_to_fp llvm-svn: 135332	2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes	8df9cfc279	Fix a couple of things: 1) Make non-legal 256-bit loads to be promoted to v4i64. This lets us canonicalize the loads and handle things the same way we used to handle for 128-bit registers. Despite what one of the removed comments explained, the load promotion would not mess with VPERM, it's only a matter of doing the appropriate bitcasts when this instruction comes to be introduced. Also make LOAD v8i32 legal. 2) Doing 1) exposed two bugs: - v4i64 was being promoted to itself for several opcodes (introduced in r124447 by David Greene) causing endless recursion and the stack to explode. - there was no support for allOnes BUILD_VECTORs and ANDNP would fail to match because it was generating early target constant pools during lowering. 3) The testcases are already checked-in, doing 1) exposed the bugs in the current testcases. 4) Tidy up code to be more clear and explicit about AVX. llvm-svn: 135313	2011-07-15 22:24:33 +00:00
Bruno Cardoso Lopes	1fe1377e65	Add a few patterns for 256-bit bitcasts. No testcases now, they are coming together with other tests. llvm-svn: 135312	2011-07-15 22:24:17 +00:00
Eli Friedman	3846acc98e	PR10370: Make sure we know how to relax push correctly on x86-64. llvm-svn: 135303	2011-07-15 21:28:39 +00:00
Chandler Carruth	65667dbf2d	Remove an unnecessary header from this file. I don't think this header was really intended, and it may have been required prior to some of the recent refactors. Including it however causes LLVMX86Desc to need symbols from LLVMX86CodeGen, forming a dependency cycle. This was masked in almost all builds: Clang, and GCC w/ optimizations didn't actually emit the symbols! llvm-svn: 135242	2011-07-15 04:16:38 +00:00
Evan Cheng	a83b37a9db	Move some parts of TargetAsmInfo down to MCAsmInfo. This is not the greatest solution but it is a small step towards removing the horror that is TargetAsmInfo. llvm-svn: 135237	2011-07-15 02:09:41 +00:00
Chandler Carruth	9a0001aedb	Major update to CMake build to reflect changes in r135219 in the backend. Moved some MCAsmInfo files down into the MCTargetDesc sublibraries, removed some (i suspect long) dead files from other parts of the CMake build, etc. Also copied the include directory hack from the Makefile. Finally, updated the lib deps. I spot checked this, and think it's correct, but review appreciated there. llvm-svn: 135234	2011-07-15 00:40:52 +00:00
Evan Cheng	1705ab00ab	Rename createAsmInfo to createMCAsmInfo and move registration code to MCTargetDesc to prepare for next round of changes. llvm-svn: 135219	2011-07-14 23:50:31 +00:00
Bill Wendling	2d825b5ecf	* Redo the permutation encoding for frameless stacks to be more like what the unwind library expects. * Comment the permutation encoding for frameless stacks. llvm-svn: 135202	2011-07-14 22:01:34 +00:00
Benjamin Kramer	9654eef493	Port operand types for ARM and X86 over from EDIS to the .td files. llvm-svn: 135198	2011-07-14 21:47:22 +00:00
Evan Cheng	bc153d49b7	Next round of MC refactoring. This patch factors MC table instantiations, MC registration and creation code into XXXMCDesc libraries. llvm-svn: 135184	2011-07-14 20:59:42 +00:00
Eric Christopher	92464be28c	Check register class matching instead of width of type matching when determining validity of matching constraint. Allow i1 types access to the GR8 reg class for x86. Fixes PR10352 and rdar://9777108 llvm-svn: 135180	2011-07-14 20:13:52 +00:00
Bruno Cardoso Lopes	6778597deb	Add 256-bit load/store recognition and matching in several places. llvm-svn: 135171	2011-07-14 18:50:58 +00:00
Nadav Rotem	771f29677f	[VECTOR-SELECT] During type legalization we often use the SIGN_EXTEND_INREG SDNode. When this SDNode is legalized during the LegalizeVector phase, it is scalarized because non-simple types are automatically marked to be expanded. In this patch we add support for lowering SIGN_EXTEND_INREG manually. This fixes CodeGen/X86/vec_sext.ll when running with the '-promote-elements' flag. llvm-svn: 135144	2011-07-14 11:11:14 +00:00
Eli Friedman	bc2ae1c865	Fix up assertion in r135018 so it doesn't trigger on 32-bit; when we're in 32-bit, it doesn't matter whether the operation overflows because the computed address is not wider than the immediate. llvm-svn: 135120	2011-07-14 00:22:31 +00:00
Bill Wendling	d11ea81db0	Add code to handle a "frameless" unwind stack. The frameless unwind stack has a special encoding, the algorithm for which is in "permuteEncode". llvm-svn: 135103	2011-07-13 23:03:31 +00:00
Bruno Cardoso Lopes	9613b64916	Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more general version of X86ISD::ANDNP also opened the room for a little bit of refactoring. llvm-svn: 135088	2011-07-13 21:36:51 +00:00
Bruno Cardoso Lopes	7ba479d22f	The target specific node PANDN name is misleading. That happens because it's later selected to a ANDNPD/ANDNPS instruction instead of the PANDN instruction. Rename it. llvm-svn: 135087	2011-07-13 21:36:47 +00:00
Eli Friedman	344ec79715	Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084	2011-07-13 21:29:53 +00:00
Eli Friedman	ef67e7d623	Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress. Part of <rdar://problem/9763308>. llvm-svn: 135079	2011-07-13 20:44:23 +00:00
Jim Grosbach	602aa90ab8	Update MCParsedAsmOperand debug methods. Update the debug output interface for MCParsedAsmOperand to have a print() method which takes an output stream argument, an << operator which invokes the print method using the given stream, and a dump() method which prints the operand to the dbgs() stream. This makes the interface more consistent with the rest of LLVM, and more convenient to use at the debugger command line. llvm-svn: 135043	2011-07-13 15:34:57 +00:00
Bruno Cardoso Lopes	1021b4a9dd	AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd llvm-svn: 135023	2011-07-13 01:15:33 +00:00
Bill Wendling	ee6e776be2	Don't emit the FDE end label if the last thing emitted was a compact unwind and not the FDE llvm-svn: 135020	2011-07-13 00:49:09 +00:00
Eli Friedman	16323380cd	Add an assert (which should never trigger) that triggers on a testcase I'm looking at. llvm-svn: 135018	2011-07-13 00:44:29 +00:00
Bill Wendling	0402e8fe4b	Assign variable before we test it. llvm-svn: 135015	2011-07-13 00:23:39 +00:00
Bill Wendling	ed3c44224b	Fix obvious think-o. llvm-svn: 135014	2011-07-13 00:20:09 +00:00
Bill Wendling	929b90ff32	Clean up the handling of an EBP/RBP unwind frame pointer. In particular, don't assert when the frame pointer is -1 (i.e., the function is "frameless"). Still to do: "frameless" unwind information. llvm-svn: 135013	2011-07-13 00:16:14 +00:00
Evan Cheng	c5e6d2f519	- Eliminate MCCodeEmitter's dependency on TargetMachine. It now uses MCInstrInfo and MCSubtargetInfo. - Added methods to update subtarget features (used when targets automatically detect subtarget features or switch modes). - Teach X86Subtarget to update MCSubtargetInfo feature bits since the MCSubtargetInfo layer can be shared with other modules. - This fixes .code 16 / .code 32 support since the mode switch is updated in MCSubtargetInfo so the MC code emitter can do the right thing. llvm-svn: 134884	2011-07-11 03:57:24 +00:00
Evan Cheng	91111d2706	Change createAsmParser to take a MCSubtargetInfo instead of triple, CPU, and feature string. Parsing some asm directives can change subtarget state (e.g. .code 16) and it must be reflected in other modules (e.g. MCCodeEmitter). That is, the MCSubtargetInfo instance must be shared. llvm-svn: 134795	2011-07-09 05:47:46 +00:00
Eli Friedman	fe2088bb1f	Really force on 64bit for 64-bit targets. Should fix remaining failures on unknown x86/non-x86 targets. llvm-svn: 134773	2011-07-08 23:43:01 +00:00
Eli Friedman	5286833f4a	Revert earlier unnecessary hack. Make sure we correctly force on 64bit and cmov for 64-bit targets. llvm-svn: 134768	2011-07-08 23:07:42 +00:00
Evan Cheng	60fc0fca5c	Restore old behavior. Always auto-detect features unless cpu or features are specified. llvm-svn: 134757	2011-07-08 22:30:25 +00:00
Eli Friedman	e2f76c4ade	Default 64-bit target features and SSE2 on when a triple specifies x86-64. Clean up all the other hacks which are now unnecessary. llvm-svn: 134753	2011-07-08 22:16:47 +00:00
Julien Lerouge	112fcc164a	Add _allrem, _aullrem and _allmul to the runtime for MSVC. http://llvm.org/bugs/show_bug.cgi?id=10305 llvm-svn: 134744	2011-07-08 21:40:25 +00:00
Cameron Zwarich	f03fa189ca	Add an intrinsic and codegen support for fused multiply-accumulate. The intent is to use this for architectures that have a native FMA instruction. llvm-svn: 134742	2011-07-08 21:39:21 +00:00
Evan Cheng	964cb5feb0	For non-x86 hosts, use generic as the CPU name. llvm-svn: 134741	2011-07-08 21:14:14 +00:00
Benjamin Kramer	debe69fb37	Plug a leak by giving the AsmParser ownership of the MCSubtargetInfo. Found by valgrind. llvm-svn: 134738	2011-07-08 21:06:23 +00:00
Evan Cheng	22e9d8f40e	TargetAsmParser doesn't need reference to Target. llvm-svn: 134721	2011-07-08 19:33:14 +00:00
Evan Cheng	4d1ca96bfc	Eliminate asm parser's dependency on TargetMachine: - Each target asm parser now creates its own MCSubtargetInfo (if needed). - Changed AssemblerPredicate to take subtarget features which tablegen uses to generate asm matcher subtarget feature queries. e.g. "ModeThumb,FeatureThumb2" is translated to "(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0". llvm-svn: 134678	2011-07-08 01:53:10 +00:00
Nick Lewycky	9badf60203	Let the inline asm 'q' constraint match float, and on 64-bit double too. Fixes PR9602! llvm-svn: 134665	2011-07-08 00:19:27 +00:00
Eric Christopher	7a2a0f80de	Go ahead and emit the barrier on x86-64 even without sse2. The processor supports it just fine. Fixes PR9675 and rdar://9740801 llvm-svn: 134664	2011-07-08 00:04:56 +00:00
Eric Christopher	719c29702f	Handle fpcr register. Part of PR10299 and rdar://9740322 llvm-svn: 134653	2011-07-07 22:54:12 +00:00
Eric Christopher	9721396dab	Add support for the X86 'l' constraint. Fixes PR10149 and rdar://9738585 llvm-svn: 134648	2011-07-07 22:29:07 +00:00
Evan Cheng	13bcc6c1c7	Add Mode64Bit feature and sink it down to MC layer. llvm-svn: 134641	2011-07-07 21:06:52 +00:00
Evan Cheng	1a72add615	Compute feature bits at time of MCSubtargetInfo initialization. llvm-svn: 134606	2011-07-07 07:07:08 +00:00
Bill Wendling	667be58220	Use ArrayRef instead of a std::vector&. llvm-svn: 134595	2011-07-07 04:42:01 +00:00
Bill Wendling	b6adf46f62	Add a target hook to encode the compact unwind information. llvm-svn: 134577	2011-07-07 00:54:13 +00:00
Evan Cheng	3ddfbd325d	Rename files for consistency. llvm-svn: 134546	2011-07-06 22:01:53 +00:00
Bill Wendling	5ace8edfd6	Constify getCompactUnwindRegNum. llvm-svn: 134527	2011-07-06 20:33:48 +00:00
Evan Cheng	ab37af9af3	createMCInstPrinter doesn't need TargetMachine anymore. llvm-svn: 134525	2011-07-06 19:45:42 +00:00
Kevin Enderby	6ee1d2bd78	Changed the X86 PUSH64i8 record to use the i64i8imm ParserMatchClass so that a push with a small constant produces a 2-byte push. llvm-svn: 134501	2011-07-06 17:23:46 +00:00
Evan Cheng	4d806e2830	Remove the AsmWriterEmitter (unused) feature that relies on TargetSubtargetInfo. llvm-svn: 134457	2011-07-06 02:02:33 +00:00
Eli Friedman	415412e82f	Add assembler/disassembler support for non-AVX pclmulqdq. While I'm here, use proper aliases for the pclmullqlqdq and friends. PR10269. llvm-svn: 134424	2011-07-05 18:21:20 +00:00
Jakob Stoklund Olesen	e925f22b40	Consistent diagnostic capitalization and redundant context elimination. llvm-svn: 134311	2011-07-02 07:23:40 +00:00
Jakob Stoklund Olesen	25a404eb81	Include a source location when complaining about bad inline assembly. Add a MI->emitError() method that the backend can use to report errors related to inline assembly. Call it from X86FloatingPoint.cpp when the constraints are wrong. This enables proper clang diagnostics from the backend: $ clang -c pr30848.c pr30848.c:5:12: error: Inline asm output regs must be last on the x87 stack __asm__ ("" : "=u" (d)); /* { dg-error "output regs" } */ ^ 1 error generated. llvm-svn: 134307	2011-07-02 03:53:34 +00:00
Eric Christopher	a8a56f7e5c	TargetConstant immediates won't be placed into registers so tighten up the valid constant check earlier. rdar://9692967 llvm-svn: 134286	2011-07-01 23:04:38 +00:00
Evan Cheng	c9c090d7a5	Rename XXXGenSubtarget.inc to XXXGenSubtargetInfo.inc for consistency. llvm-svn: 134281	2011-07-01 22:36:09 +00:00
Evan Cheng	0711c4d489	Add MCSubtargetInfo target registry stuff. llvm-svn: 134279	2011-07-01 22:25:04 +00:00
Eli Friedman	d24a7da658	Calling-convention specifications for illegal types are no-ops. Simplify based on this. llvm-svn: 134264	2011-07-01 21:33:28 +00:00
Evan Cheng	0d639a28aa	Rename TargetSubtarget to TargetSubtargetInfo for consistency. llvm-svn: 134259	2011-07-01 21:01:15 +00:00
Evan Cheng	54b68e3432	- Added MCSubtargetInfo to capture subtarget features and scheduling itineraries. - Refactor TargetSubtarget to be based on MCSubtargetInfo. - Change tablegen generated subtarget info to initialize MCSubtargetInfo and hide more details from targets. llvm-svn: 134257	2011-07-01 20:45:01 +00:00
Evan Cheng	703a0fbf39	Hide the call to InitMCInstrInfo into tblgen generated ctor. llvm-svn: 134244	2011-07-01 17:57:27 +00:00
Bill Wendling	3f049b8b7e	Use the correct registers on X86_64. llvm-svn: 134208	2011-06-30 23:47:14 +00:00
Jakob Stoklund Olesen	d0e2352b65	Fix a problem with fast-isel return values introduced in r134018. We would put the return value from long double functions in the wrong register. This fixes gcc.c-torture/execute/conversion.c llvm-svn: 134205	2011-06-30 23:42:18 +00:00
Bill Wendling	b403f0c4ed	Add a target hook to get the register number used by the compact unwind encoding for the registers it knows about. Return -1 if it can't handle that register. llvm-svn: 134202	2011-06-30 23:20:32 +00:00
Jakob Stoklund Olesen	2034261972	Tweak error messages to match GCC. Should fix gcc.target/i386/pr30848.c llvm-svn: 134193	2011-06-30 21:30:30 +00:00
Evan Cheng	fe6e405e8c	Fix the ridiculous SubtargetFeatures API where it implicitly expects the CPU name to be encoded as the first feature. It then uses the CPU name to look up features / scheduling itinerary even though clients know full well the CPU name being used to query these properties. The fix is to just have the clients explicitly pass the CPU name! llvm-svn: 134127	2011-06-30 01:53:36 +00:00
Joerg Sonnenberger	91e5662075	Recognize the xstorerng alias for VIA PadLock's xstore instruction. llvm-svn: 134126	2011-06-30 01:38:03 +00:00
Eric Christopher	c932173773	Fix a small thinko for constant i64 lock/orq optimization where we didn't have an opcode for 64-bit constant or expressions. Fixes rdar://9692967 llvm-svn: 134121	2011-06-30 00:48:30 +00:00
Jakob Stoklund Olesen	9f4cc4645b	Always adjust the stack pointer immediately after the call. Some x86-32 calls pop values off the stack, and we need to readjust the stack pointer after the call. This happens when ADJCALLSTACKUP is eliminated. It could happen that spill code was inserted between the CALL and ADJCALLSTACKUP instructions, and we would compute wrong stack pointer offsets for those frame index references. Fix this by inserting the stack pointer adjustment immediately after the call instead of where the ADJCALLSTACKUP instruction was erased. I don't have a test case since we don't currently insert code in that position. We will soon, though. I am testing a regalloc patch that didn't work on Linux because of this. llvm-svn: 134113	2011-06-29 23:11:39 +00:00
Eric Christopher	7e5f2350d3	Use getRegForInlineAsmConstraint instead of custom defining regclasses via vectors. Part of rdar://9643582 llvm-svn: 134079	2011-06-29 17:23:50 +00:00
Evan Cheng	194c3dc01f	Move CallFrameSetupOpcode and CallFrameDestroyOpcode to TargetInstrInfo. llvm-svn: 134030	2011-06-28 21:14:33 +00:00
Evan Cheng	0beca53a29	Hide more details in tablegen generated MCRegisterInfo ctor function. llvm-svn: 134027	2011-06-28 20:44:22 +00:00
Evan Cheng	df8974ef2f	Add MCInstrInfo registration machinery. llvm-svn: 134026	2011-06-28 20:29:03 +00:00
Evan Cheng	1e210d08d8	Merge XXXGenRegisterNames.inc into XXXGenRegisterInfo.inc llvm-svn: 134024	2011-06-28 20:07:07 +00:00
Evan Cheng	6cc775f905	- Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Change TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021	2011-06-28 19:10:37 +00:00
Jakob Stoklund Olesen	7297e7e223	Clean up the handling of the x87 fp stack to make it more robust. Drop the FpMov instructions, use plain COPY instead. Drop the FpSET/GET instruction for accessing fixed stack positions. Instead use normal COPY to/from ST registers around inline assembly, and provide a single new FpPOP_RETVAL instruction that can access the return value(s) from a call. This is still necessary since you cannot tell from the CALL instruction alone if it returns anything on the FP stack. Teach fast isel to use this. This provides a much more robust way of handling fixed stack registers - we can tolerate arbitrary FP stack instructions inserted around calls and inline assembly. Live range splitting could sometimes break x87 code by inserting spill code in unfortunate places. As a bonus we handle floating point inline assembly correctly now. llvm-svn: 134018	2011-06-28 18:32:28 +00:00
Evan Cheng	8d71a75777	More refactoring. Move getRegClass from TargetOperandInfo to TargetInstrInfo. llvm-svn: 133944	2011-06-27 21:26:13 +00:00
Evan Cheng	d9997acd14	Merge XXXGenRegisterDesc.inc XXXGenRegisterNames.inc XXXGenRegisterInfo.h.inc into XXXGenRegisterInfo.inc. llvm-svn: 133922	2011-06-27 18:32:37 +00:00
Jakob Stoklund Olesen	ff653a2eed	Grow the X86FloatingPoint register map to hold 16 registers. This allows for more live scratch registers which is needed to handle live ST registers before return and inline asm instructions. llvm-svn: 133903	2011-06-27 04:08:36 +00:00
Chad Rosier	15db390f8f	Replace dyn_cast<> with cast<> since the cast is already guarded by the necessary check. llvm-svn: 133874	2011-06-25 18:51:28 +00:00
Chad Rosier	bde13d3f76	Enable tail call optimization in the presence of a byval (x86-32 and x86-64). <rdar://problem/9483883> llvm-svn: 133858	2011-06-25 02:04:56 +00:00
Douglas Gregor	03bf47c0f0	Unbreak CMake build llvm-svn: 133853	2011-06-25 00:51:50 +00:00
Evan Cheng	b2681bef4f	Add include guard. llvm-svn: 133847	2011-06-24 23:59:54 +00:00
Evan Cheng	3b960aca17	Rename TargetDesc to MCTargetDesc llvm-svn: 133846	2011-06-24 23:53:19 +00:00
Jim Grosbach	28fcafb502	Refactor MachO relocation generation into the Target directories. Move the target-specific RecordRelocation logic out of the generic MC MachObjectWriter and into the target-specific object writers. This allows nuking quite a bit of target knowledge from the supposedly target-independent bits in lib/MC. llvm-svn: 133844	2011-06-24 23:44:37 +00:00
Chad Rosier	e553e75b15	Hoist simple check above more complex checking to avoid unnecessary overheads. No functional change intended. llvm-svn: 133824	2011-06-24 21:15:36 +00:00
Evan Cheng	e862d59eee	- Add MCRegisterInfo registration machinery. Also added x86 registration routines. - Rename TargetRegisterDesc to MCRegisterDesc. llvm-svn: 133820	2011-06-24 20:42:09 +00:00
Evan Cheng	247533179a	Starting to refactor Target to separate out code that's needed to fully describe target machine from those that are only needed by codegen. The goal is to sink the essential target description into MC layer so we can start building MC based tools without needing to link in the entire codegen. First step is to refactor TargetRegisterInfo. This patch added a base class MCRegisterInfo which TargetRegisterInfo is derived from. Changed TableGen to separate register description from the rest of the stuff. llvm-svn: 133782	2011-06-24 01:44:41 +00:00
Eli Friedman	5c958bb528	Add support for movntil/movntiq mnemonics. Reported on llvmdev. llvm-svn: 133759	2011-06-23 21:07:47 +00:00
Evan Cheng	8b2a2a1158	Rename TargetOptions::StackAlignment to StackAlignmentOverride. llvm-svn: 133739	2011-06-23 18:15:47 +00:00
Evan Cheng	3a0c5e52ff	Remove TargetOptions.h dependency from X86Subtarget. llvm-svn: 133726	2011-06-23 17:54:54 +00:00
Evan Cheng	ee9b90a727	Get rid of one getStackAlignment(). RegisterInfo shouldn't need to know about stack alignment. llvm-svn: 133679	2011-06-23 01:53:43 +00:00
Nick Lewycky	ef9c497e4c	Add support for assembling "movq" when it's correct to do so, while continuing to emit "movd" across the board to continue supporting a Darwin assembler bug. This is the reincarnation of r133452. llvm-svn: 133565	2011-06-21 22:45:41 +00:00
Bob Wilson	646dd0f4d1	Revert r133452: "Emit movq for 64-bit register to XMM register moves..." This is breaking compiler-rt and llvm-gcc builds on MacOSX when not using the integrated assembler. llvm-svn: 133524	2011-06-21 17:35:13 +00:00
Nick Lewycky	c7df192279	Emit movq for 64-bit register to XMM register moves, but continue to accept movd when assembling. llvm-svn: 133452	2011-06-20 18:33:26 +00:00
Benjamin Kramer	25e17b0f89	Remove unused but set variables. llvm-svn: 133347	2011-06-18 11:09:41 +00:00
Jakob Stoklund Olesen	3337f7d50a	Switch x86 to using AltOrders instead of MethodBodies. llvm-svn: 133325	2011-06-18 01:14:43 +00:00
Jakob Stoklund Olesen	157e6a79a1	SI, DI, BP, and SP don't have 8-bit sub-registers in x86 mode. llvm-svn: 133308	2011-06-17 23:15:00 +00:00
Dan Gohman	8eb36ef497	Add a comment describing why transforming (shl x, 1) to (add x, x) is to be considered safe enough in this context. llvm-svn: 133159	2011-06-16 15:55:48 +00:00
Bruno Cardoso Lopes	bbf2ab990f	Add AVX support for fpextend. Original patch by Syoyo Fujita with more comments by me. llvm-svn: 133153	2011-06-16 07:03:21 +00:00
Jakob Stoklund Olesen	99f35eab45	Use set operations instead of plain lists to enumerate register classes. This simplifies many of the target description files since it is common for register classes to be related or contain sequences of numbered registers. I have verified that this doesn't change the files generated by TableGen for ARM and X86. It alters the allocation order of MBlaze GPR and Mips FGR32 registers, but I believe the change is benign. llvm-svn: 133105	2011-06-15 23:28:14 +00:00
John McCall	4b7a8d68ae	Add a new function attribute, nonlazybind, which inhibits lazy-loading optimizations when emitting calls to the function; instead those calls may use faster relocations which require the function to be immediately resolved upon loading the dynamic object featuring the call. This is useful when it is known that the function will be called frequently and pervasively and therefore there is no merit in delaying binding of the function. Currently only implemented for x86-64, where it turns into a call through the global offset table. Patch by Dan Gohman, who assures me that he's going to add LangRef documentation for this once it's committed. llvm-svn: 133080	2011-06-15 20:36:13 +00:00
Bruno Cardoso Lopes	dc9ff3a4b1	Add one more argument to the prefetch intrinsic to indicate whether it's a data or instruction cache access. Update the targets to match it and also teach autoupgrade. llvm-svn: 132976	2011-06-14 04:58:37 +00:00
Nick Lewycky	34a425b075	Fit banner in 80-col and adjust whitespace. No functionality changes. llvm-svn: 132964	2011-06-14 03:23:52 +00:00
Rafael Espindola	defd4b0875	AnalyzeBranch doesn't change which successors a bb has, just the order we try to branch to them. Before we were creating successor lists with duplicated entries. Fixing that found a bug in isBlockOnlyReachableByFallthrough that would cause it to return the wrong answer for ----------- ... jne foo jmp bar foo: ---------- llvm-svn: 132882	2011-06-12 03:20:32 +00:00
Charles Davis	7ed40cbded	Put FrameSetup flag on x86 instructions that set up the call frame. No functionality change. Later on, we'll use the flag to emit SEH pseudo-ops that describe how the call frame was built. llvm-svn: 132880	2011-06-12 01:45:54 +00:00
Eli Friedman	1735b29196	Make sure to pass OpFlags into MachineInstrBuilder::addExternalSymbol; the memcpy/memset symbol doesn't get marked up correctly in PIC modes otherwise. Should fix llvm-x86_64-linux-checks buildbot. Followup to r132864. llvm-svn: 132869	2011-06-11 01:55:07 +00:00
Eli Friedman	cd2124a3f0	Add full x86 fast-isel support for memcpy and memset. rdar://9431466 llvm-svn: 132864	2011-06-10 23:39:36 +00:00
Eli Friedman	87ef38784e	PR10092 (second try): Don't crash on a load without a memoperand; fast-isel creates loads like this. llvm-svn: 132826	2011-06-10 01:13:01 +00:00
Eli Friedman	5abfd79900	Chris fixed this README a while back by changing how clang generates code for structs like the given struct. llvm-svn: 132815	2011-06-09 23:02:19 +00:00
Eli Friedman	9008377c2d	Revert 132789; it breaks tests. My mistake. llvm-svn: 132795	2011-06-09 19:33:30 +00:00
Eli Friedman	c095116710	Add a check to make sure we don't crash with strange configurations where we do fast-isel, then try to fold instructions. PR10092. llvm-svn: 132789	2011-06-09 18:55:00 +00:00
Jakob Stoklund Olesen	5750ca7089	Remove custom allocation order boilerplate that is no longer needed. The register allocators automatically filter out reserved registers and place the callee saved registers last in the allocation order, so custom methods are no longer necessary just for that. Some targets still use custom allocation orders: ARM/Thumb: The high registers are removed from GPR in thumb mode. The NEON allocation orders prefer to use non-VFP2 registers first. X86: The GR8 classes omit AH-DH in x86-64 mode to avoid REX trouble. SystemZ: Some of the allocation orders are omitting R12 aliases without explanation. I don't understand this target well enough to fix that. It looks like all the boilerplate could be removed by reserving the right registers. llvm-svn: 132781	2011-06-09 16:56:59 +00:00
Eric Christopher	0713a9d8fc	Add a parameter to CCState so that it can access the MachineFunction. No functional change. Part of PR6965 llvm-svn: 132763	2011-06-08 23:55:35 +00:00
Stuart Hastings	e0d3426e1a	Followup to 132458, omit unnecessary stack copy when x87 input is a load. rdar://problem/6373334 llvm-svn: 132696	2011-06-06 23:15:58 +00:00
Stuart Hastings	be605494ac	Reapply 132424 with fixes. This fixes PR10068. rdar://problem/5993888 llvm-svn: 132606	2011-06-03 23:53:54 +00:00
Eric Christopher	de9399bf76	Have LowerOperandForConstraint handle multiple character constraints. Part of rdar://9119939 llvm-svn: 132510	2011-06-02 23:16:42 +00:00
Jakob Stoklund Olesen	60cdf8e727	Flag unallocatable register classes instead of giving them empty allocation orders. llvm-svn: 132509	2011-06-02 23:07:24 +00:00
Rafael Espindola	aa318ae495	Revert 132424 to fix PR10068. llvm-svn: 132479	2011-06-02 19:57:47 +00:00
Stuart Hastings	8d530ad22a	Omit unnecessary stack copy when x87 input is a load. rdar://problem/6373334 llvm-svn: 132458	2011-06-02 15:57:11 +00:00
Jakob Stoklund Olesen	aff1060207	Use TRI::has{Sub,Super}ClassEq() where possible. No functional change. llvm-svn: 132455	2011-06-02 05:43:46 +00:00
Rafael Espindola	d6860522b2	Don't hardcode the %reg format in the streamer. llvm-svn: 132451	2011-06-02 02:34:55 +00:00
Stuart Hastings	7adc95f69e	Recommit 132404 with fixes. rdar://problem/5993888 llvm-svn: 132424	2011-06-01 21:33:14 +00:00
Stuart Hastings	aab130d995	Revert 132404 to appease a buildbot. rdar://problem/5993888 llvm-svn: 132419	2011-06-01 19:52:20 +00:00
Stuart Hastings	7b7c102f2c	Add support for x86 CMPEQSS and friends. These instructions do a floating-point comparison, generate a mask of 0s or 1s, and generally DTRT with NaNs. Only profitable when the user wants a materialized 0 or 1 at runtime. rdar://problem/5993888 llvm-svn: 132404	2011-06-01 17:17:45 +00:00
Jakob Stoklund Olesen	56ce3a0f01	Fix PR10059 and future variations by handling all register subclasses. Add TargetRegisterInfo::hasSubClassEq and use it to check for compatible register classes instead of trying to list all register classes in X86's getLoadStoreRegOpcode. llvm-svn: 132398	2011-06-01 15:32:10 +00:00
Stuart Hastings	9f20804216	FGETSIGN support for x86, using movmskps/pd. Will be enabled with a patch to TargetLowering.cpp. rdar://problem/5660695 llvm-svn: 132388	2011-06-01 04:39:42 +00:00
Rafael Espindola	08600bcf65	Use the dwarf->llvm mapping to print register names in the cfi directives. Fixes PR9826. llvm-svn: 132317	2011-05-30 20:20:15 +00:00
Rafael Espindola	ddffa0e160	Introduce the DwarfRegAlias class for declaring that two registers have the same dwarf number. This will be used for creating a dwarf number to register mapping. The only case that needs this so far is the XMM/YMM registers that unfortunately do have the same numbers. llvm-svn: 132314	2011-05-30 17:49:59 +00:00
Rafael Espindola	ea8ca34e3a	Mark the 32 bit registers as invalid in 64 bit mode. In 64 bit mode they are subregisters of the 64 bit ones. llvm-svn: 132313	2011-05-30 16:04:54 +00:00
Rafael Espindola	19fea7a840	Add 132187 back now that the real problem is fixed. llvm-svn: 132238	2011-05-28 00:24:37 +00:00
Rafael Espindola	a5149b5cea	It looks like 132187 might have broken the llvm-gcc bootstrap. Revert while I check. llvm-svn: 132230	2011-05-27 23:36:02 +00:00
Cameron Zwarich	75d99e4b70	Add a GR32_NOREX_NOSP register class and fix a bug where getMatchingSuperRegClass() was saying that the matching superregister class of GR32_NOREX in GR64_NOREX_NOSP is GR64_NOREX, which drops the NOSP constraint. This fixes PR10032. llvm-svn: 132225	2011-05-27 22:26:04 +00:00
Jakob Stoklund Olesen	6019944901	Delete MethodBodies that only filtered reserved registers. The register allocators know to filter reserved registers from the allocation orders, so we don't need all of this boilerplate. llvm-svn: 132199	2011-05-27 18:27:13 +00:00
Rafael Espindola	2daba3380d	Remove dwarf numbers from subregs. We should use DW_OP_bit_piece to refer to them. I tested this with both check-all and the gdb testsuite. llvm-svn: 132187	2011-05-27 15:08:24 +00:00
Chad Rosier	b362884ca9	Renamed llvm.x86.sse42.crc32 intrinsics; crc64 doesn't exist. crc32.[8|16|32] have been renamed to .crc32.32.[8|16|32] and crc64.[8|16|32] have been renamed to .crc32.64.[8|64]. llvm-svn: 132163	2011-05-26 23:13:19 +00:00
Stuart Hastings	493a12bf5e	Reverting 132105: it broke some LLVM-GCC DejaGNU tests. llvm-svn: 132108	2011-05-26 04:09:49 +00:00
Stuart Hastings	276f231c2f	Correctly handle a one-word struct passed byval on x86_64. rdar://problem/6920088 llvm-svn: 132105	2011-05-26 02:44:56 +00:00
Eli Friedman	c70355195c	Rewrite fast-isel integer cast handling to handle more cases, and to be simpler and more consistent. The practical effects here are that x86-64 fast-isel can now handle trunc from i8 to i1, and ARM fast-isel can handle many more constructs involving integers narrower than 32 bits (including loads, stores, and many integer casts). rdar://9437928 . llvm-svn: 132099	2011-05-25 23:49:02 +00:00
Francois Pichet	85ec52125b	Remove unused OpcodeMask enumerator. llvm-svn: 132062	2011-05-25 17:02:53 +00:00
Francois Pichet	58b09c9366	Fix MSVC warning: "is out of range for enum constant" MSVC doesn't support 64 bit enum. OpcodeMask is not used anywhere in the code base. llvm-svn: 132057	2011-05-25 15:58:10 +00:00
Rafael Espindola	fc9bae6f8b	Replace the -unwind-tables option with a per function flag. This is more LTO friendly as we can now correctly merge files compiled with or without -fasynchronous-unwind-tables. llvm-svn: 132033	2011-05-25 03:44:17 +00:00
Charles Davis	97019c709d	Add a method to TargetRegisterInfo to get the register number that the Win64 EH scheme uses internally. Implement it for x86 (the only architecture that LLVM supports for which this matters right now). llvm-svn: 131969	2011-05-24 16:57:53 +00:00
Evan Cheng	88f9137fd7	- Teach SelectionDAG::isKnownNeverZero to return true for (op x, c) when c is non-zero. - Teach X86 cmov optimization to eliminate the cmov from ctlz, cttz extension when the source of X86ISD::BSR / X86ISD::BSF is proven to be non-zero. rdar://9490949 llvm-svn: 131948	2011-05-24 01:48:22 +00:00
Chris Lattner	e240dc52ff	add a missing alias to make us more bug compatible with gcc, PR9378 llvm-svn: 131874	2011-05-22 22:31:57 +00:00
Benjamin Kramer	e30b70073a	X86: smulo -> add is now done target-independently in DAGCombiner, remove the patterns. llvm-svn: 131801	2011-05-21 18:32:01 +00:00
Cameron Zwarich	faeb520c97	Fix PR9978 by adding RIP to GR64_TC so it can be used as an address in PIC code. It is already in GR64 for the same reasons. Since it isn't allocatable it can't cause any problems. llvm-svn: 131787	2011-05-21 04:13:49 +00:00
Eli Friedman	60afcc2a6f	Add fast-isel support for byval calls on x86. llvm-svn: 131764	2011-05-20 22:21:04 +00:00
Stuart Hastings	91f1d24736	Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends. rdar://problem/8614450 llvm-svn: 131746	2011-05-20 19:04:40 +00:00
Benjamin Kramer	0bf26746d9	Rename the "sandybridge" subtarget to "corei7-avx", for GCC compatibility. llvm-svn: 131730	2011-05-20 15:11:26 +00:00
Chad Rosier	552f8c4819	Don't attempt to tail call optimize for Win64. llvm-svn: 131709	2011-05-20 00:59:28 +00:00
Evan Cheng	e8d2e9eb35	Revert r131664 and fix it in instcombine instead. rdar://9467055 llvm-svn: 131708	2011-05-20 00:54:37 +00:00
Eli Friedman	22da799428	Add fast-isel support for zeroext and signext ret instructions on x86. llvm-svn: 131689	2011-05-19 22:16:13 +00:00
Eric Christopher	4014e5e208	Oddly people want to use the 'r' constraint for fp constants on x86. Fixes rdar://9218925 Fixes PR9601 llvm-svn: 131682	2011-05-19 21:33:47 +00:00
Rafael Espindola	0fc5e89c82	ADD64ri32 sign extends its argument, so we need to use a R_X86_64_32S. Fixes PR9934. We really need to start tblgening the relocation info :-( llvm-svn: 131669	2011-05-19 20:32:34 +00:00
Evan Cheng	2b9bd38678	crc32 with 64-bit output zeros upper 32-bits. rdar://9467055 llvm-svn: 131664	2011-05-19 18:57:12 +00:00
Stuart Hastings	c72240bbd9	Reverting 131641 to investigate 'bot complaint. llvm-svn: 131654	2011-05-19 17:54:42 +00:00
Stuart Hastings	b476b0cc9f	Revise MOVSX16rr8/MOVZX16rr8 (and rm variants) to no longer be pseudos. rdar://problem/8614450 llvm-svn: 131641	2011-05-19 16:59:50 +00:00
Eli Friedman	6fc94dd687	Revert unintentional commit. llvm-svn: 131597	2011-05-18 23:13:10 +00:00
Eli Friedman	1754a25977	More instcombine simplifications towards better debug locations. llvm-svn: 131596	2011-05-18 23:11:30 +00:00
Cameron Zwarich	9ddeceff19	Reserve the segment registers on x86 to fix verifier failures in any code that uses them. llvm-svn: 131591	2011-05-18 22:24:48 +00:00
Chad Rosier	f4e832b14e	Enables vararg functions that pass all arguments via registers to be optimized into tail-calls when possible. llvm-svn: 131560	2011-05-18 19:59:50 +00:00
Mon P Wang	6f6b44d19d	Enable autodetect of popcnt llvm-svn: 131476	2011-05-17 18:33:37 +00:00
Eli Friedman	7b27942fe7	Add x86 fast-isel for calls returning first-class aggregates. rdar://9435872. This is r131438 with a couple small fixes. llvm-svn: 131474	2011-05-17 18:29:03 +00:00
Eli Friedman	d000a2c26e	Clean up the mess created by r131467+r131469. llvm-svn: 131471	2011-05-17 18:02:22 +00:00
Stuart Hastings	c65d8eda7b	Revert 131467 due to buildbot complaint. llvm-svn: 131469	2011-05-17 16:59:46 +00:00
Stuart Hastings	3cf5308890	Fix an obscure issue in X86_64 parameter passing: if a tiny byval is passed as the fifth parameter, ensure it's passed correctly (in R9). rdar://problem/6920088 llvm-svn: 131467	2011-05-17 16:45:55 +00:00
Nadav Rotem	d8edb1d5cc	Fix a bug in PerformEXTRACT_VECTOR_ELTCombine. The code created an ADD SDNode with two different types, in cases where the index and the ptr had different types. llvm-svn: 131461	2011-05-17 08:31:57 +00:00
Eric Christopher	56a42ebf15	Update comment. llvm-svn: 131459	2011-05-17 08:16:14 +00:00
Eric Christopher	a1d9e29552	Support XOR and AND optimization with no return value. Finishes off rdar://8470697 llvm-svn: 131458	2011-05-17 08:10:18 +00:00
Eric Christopher	abfe3131e3	Couple less magic numbers. llvm-svn: 131457	2011-05-17 07:50:41 +00:00
Eric Christopher	eb47a2a1e5	Make this code a little less magic number laden. llvm-svn: 131456	2011-05-17 07:47:55 +00:00
Chris Lattner	1e81f57bf0	add a note llvm-svn: 131455	2011-05-17 07:22:33 +00:00
Eli Friedman	7335e8a720	Back out r131444 and r131438; they're breaking nightly tests. I'll look into it more tomorrow. llvm-svn: 131451	2011-05-17 02:36:59 +00:00
Eli Friedman	83ba150f3a	Add x86 fast-isel for calls returning first-class aggregates. rdar://9435872. llvm-svn: 131438	2011-05-17 00:13:47 +00:00
Eli Friedman	d4a3609d30	Remove dead code. Fix associated test to use FileCheck. llvm-svn: 131424	2011-05-16 21:28:22 +00:00
Eli Friedman	a4d4a0162d	Make fast-isel work correctly with s/uadd.with.overflow intrinsics. llvm-svn: 131420	2011-05-16 21:06:17 +00:00
Eli Friedman	8f1e11cde9	Fix a FIXME by moving the fast-isel implementation of the objectsize intrinsic from the x86 code to the generic code. llvm-svn: 131332	2011-05-14 00:47:51 +00:00
Rafael Espindola	df9db7ed92	Don't produce a vmovntdq if we don't have AVX support. llvm-svn: 131330	2011-05-14 00:30:01 +00:00
Eli Friedman	f080a57b81	Zap useless code; this hasn't done anything useful since fast-isel switched to being bottom-up (a very long time ago). llvm-svn: 131329	2011-05-14 00:19:32 +00:00
Eric Christopher	2a9dbbbb12	Turn this into a table, this will make more sense shortly. Part of rdar://8470697 llvm-svn: 131200	2011-05-11 21:44:58 +00:00
Nadav Rotem	8f971c27fb	Add custom lowering of X86 vector SRA/SRL/SHL when the shift amount is a splat vector. llvm-svn: 131179	2011-05-11 08:12:09 +00:00
Eric Christopher	4a34e61e53	Optimize atomic lock or that doesn't use the result value. Next up: xor and and. Part of rdar://8470697 llvm-svn: 131171	2011-05-10 23:57:45 +00:00
Eric Christopher	e33464663f	Refactor lock versions of binary operators to be a little less cut and paste. llvm-svn: 131139	2011-05-10 18:36:16 +00:00
Benjamin Kramer	d724a590e5	X86: Add a bunch of peeps for add and sub of SETB. "b + ((a < b) ? 1 : 0)" compiles into "cmpl %esi, %edi; adcl $0, %esi" instead of "cmpl %esi, %edi; sbbl %eax, %eax; andl $1, %eax; addl %esi, %eax". This saves a register, a false dependency on %eax (Intel's CPUs still don't ignore it) and it's shorter. llvm-svn: 131070	2011-05-08 18:36:07 +00:00
Eli Friedman	2518f8376d	Make the logic for determining function alignment more explicit. No functionality change. llvm-svn: 131012	2011-05-06 20:34:06 +00:00
Rafael Espindola	a716096677	Dead code elimination. llvm-svn: 130984	2011-05-06 14:56:22 +00:00
Eli Friedman	f1e2b50a30	PR9848: pandn is not commutative. No test because I can't think of any way to write one that won't break quickly. llvm-svn: 130932	2011-05-05 17:45:31 +00:00
Jakob Stoklund Olesen	808dca12f8	Fix X86RegisterInfo::getMatchingSuperRegClass for sub_8bit_hi. It is OK for B to be any GR8_ABCD_H superclass, the returned register class doesn't have to map surjectively onto B. llvm-svn: 130892	2011-05-04 23:54:54 +00:00
Bill Wendling	db0996c822	Replace the "movnt" intrinsics with a native store + nontemporal metadata bit. <rdar://problem/8460511> llvm-svn: 130791	2011-05-03 21:11:17 +00:00
Michael J. Spencer	9973738b65	Add pentium{3,4}m cpus. Patch by Alexander Best! llvm-svn: 130749	2011-05-03 03:42:50 +00:00
Eric Christopher	d2aa241378	xmm0 is an implicit parameter in this and so shouldn't be in the string template. Fixes rdar://8493866 llvm-svn: 130747	2011-05-03 01:28:32 +00:00
Rafael Espindola	fc8223670a	Add r130623 back now that ELF has been fixed to work with -fno-dwarf2-cfi-asm. llvm-svn: 130658	2011-05-01 15:44:13 +00:00
Chandler Carruth	ddc91b25e3	Remove an unused variable from this function introduced in r130637, likely a result of copy/paste. llvm-svn: 130640	2011-05-01 06:14:10 +00:00
Rafael Espindola	750cb61553	GCC uses a different encoding of pointers in the FDE when using -fno-dwarf2-cfi-asm. Implement the same behavior. llvm-svn: 130637	2011-05-01 04:49:54 +00:00
Rafael Espindola	95215a76cd	I forgot these files in the previous commit. llvm-svn: 130635	2011-05-01 04:19:24 +00:00
Rafael Espindola	b7c2286055	Revert the previous patch while I figure out how to make llvm-gcc less aggressive about disabling cfi on linux :-( llvm-svn: 130626	2011-04-30 23:03:44 +00:00
Jakob Stoklund Olesen	2348cdd67f	X86AsmPrinter doesn't know how to handle the X86II::MO_GOT_ABSOLUTE_ADDRESS flag after folding ADD32ri to ADD32mi, so don't do that. This only happens when the greedy register allocator gets itself in trouble and spills %vreg9 here: 16L %vreg9<def> = MOVPC32r 0, %ESP<imp-use>; GR32:%vreg9 48L %vreg9<def> = ADD32ri %vreg9, <es:_GLOBAL_OFFSET_TABLE_>[TF=1], %EFLAGS<imp-def,dead>; GR32:%vreg9 That should never happen, the live range should be split instead. llvm-svn: 130625	2011-04-30 23:00:05 +00:00
Rafael Espindola	5265bc483e	Enable CFI on OS X. Currently the output should be almost identical to the one produced by CodeGen to make the transition easier. The only two differences I know of are: * Some files get an extra advance loc of size 0. This will be fixed when relaxations are enabled. * The optimization of declaring an EH symbol as an external variable is not implemented. This is a subset of adding the nounwind attribute, so if we really want this at -O0 we should probably do it at the IL level. llvm-svn: 130623	2011-04-30 22:29:54 +00:00
Benjamin Kramer	6708499b6d	This is done. llvm-svn: 130499	2011-04-29 14:09:57 +00:00
Chris Lattner	011eae7512	clean up after Sean's r127646 patch. llvm-svn: 130475	2011-04-29 05:40:18 +00:00
Daniel Dunbar	a86188bf8e	Target/X86/MC: Add an option for disabling arith relaxation, for my own testing purposes. llvm-svn: 130438	2011-04-28 21:23:31 +00:00
Eli Friedman	7cd5101ad3	fast-isel sret calls, try 2. We actually do need to do something on x86-32. rdar://problem/9303592 . llvm-svn: 130429	2011-04-28 20:19:12 +00:00
Eli Friedman	d5a80ca3c8	Revert r130348; causing buildbot issues on x86-32. llvm-svn: 130412	2011-04-28 18:06:10 +00:00
Rafael Espindola	c5dac4df2e	Add a getExprForPersonalitySymbol method to MCAsmInfo. Use it when converting the symbol passed to .cfi_personality into bytes in the file. llvm-svn: 130400	2011-04-28 16:09:09 +00:00
Chris Lattner	2a75c72e1c	move PR9803 to this readme. llvm-svn: 130385	2011-04-28 05:33:16 +00:00
Eli Friedman	8bd572fc58	fast-isel sret. We actually don't need to do anything special on x86. :) rdar://problem/9303592 . llvm-svn: 130348	2011-04-27 23:58:52 +00:00
Rafael Espindola	ce83fc3463	Remove unnecessary argument. llvm-svn: 130343	2011-04-27 23:17:57 +00:00
Rafael Espindola	08704349da	Rename getPersonalityPICSymbol to getCFIPersonalitySymbol, document it, and give it a bit more responsibility. Also implement it for MachO. If hacked to use cfi, 32 bit MachO will produce .cfi_personality 155, L___gxx_personality_v0$non_lazy_ptr and 64 bit will produce .cfi_personality ___gxx_personality_v0 The general idea is that .cfi_personality gets passed the final symbol. It is up to codegen to produce it if using indirect representation (like 32 bit MachO), but it is up to MC to decide which relocations to create. llvm-svn: 130341	2011-04-27 23:08:15 +00:00
Eli Friedman	406c471b69	Make the fast-isel code for literal 0.0 a bit shorter/faster, since 0.0 is common. rdar://problem/9303592 . llvm-svn: 130338	2011-04-27 22:41:55 +00:00
Eli Friedman	bcc6914146	Refactor out code to fast-isel a memcpy operation with a small constant length. (I'm planning to use this to implement byval.) llvm-svn: 130274	2011-04-27 01:45:07 +00:00
Eli Friedman	0eea0293d9	Fix an edge case involving branches in fast-isel on x86. rdar://problem/9303306 . llvm-svn: 130272	2011-04-27 01:34:27 +00:00
Jakob Stoklund Olesen	803a200077	Add a TRI::getLargestLegalSuperClass hook to provide an upper limit on register class inflation. The hook will be used by the register allocator when recomputing register classes after removing constraints. Thumb1 code doesn't allow anything larger than tGPR, and x86 needs to ensure that the spill size doesn't change. llvm-svn: 130228	2011-04-26 18:52:33 +00:00
Rafael Espindola	80cb3cb1d6	Print all the moves at a given label instead of just the first one. Remove previous DwarfCFI hack. llvm-svn: 130187	2011-04-26 03:58:56 +00:00
Benjamin Kramer	3db054650b	Silence an overzealous uninitialized variable warning from GCC. llvm-svn: 130053	2011-04-23 08:21:06 +00:00
Benjamin Kramer	4c81624735	X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) << C1. (Part of PR5039) This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is uint64_t foo(uint64_t x) { return (x&1) << 42; } which used to compile into bloated code: shlq $42, %rdi ## encoding: [0x48,0xc1,0xe7,0x2a] movabsq $4398046511104, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00] andq %rdi, %rax ## encoding: [0x48,0x21,0xf8] ret ## encoding: [0xc3] with this patch we can fold the immediate into the and: andq $1, %rdi ## encoding: [0x48,0x83,0xe7,0x01] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] shlq $42, %rax ## encoding: [0x48,0xc1,0xe0,0x2a] ret ## encoding: [0xc3] It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing that without making this code even more complicated. See the TODOs in the code. llvm-svn: 129990	2011-04-22 15:30:40 +00:00
Rafael Espindola	5395f44fe8	Compute the size of the FDE encoding instead of hard coding it. Update X8664_ELFTargetObjectFile::getFDEEncoding to match reality. llvm-svn: 129959	2011-04-22 00:08:43 +00:00
Jakob Stoklund Olesen	0e34c1dfac	Prefer cheap registers for busy live ranges. On the x86-64 and thumb2 targets, some registers are more expensive to encode than others in the same register class. Add a CostPerUse field to the TableGen register description, and make it available from TRI->getCostPerUse. This represents the cost of a REX prefix or a 32-bit instruction encoding required by choosing a high register. Teach the greedy register allocator to prefer cheap registers for busy live ranges (as indicated by spill weight). llvm-svn: 129864	2011-04-20 18:19:48 +00:00
Nick Lewycky	4dae63e35b	This should always be signed chars, so use int8_t. This fixes a miscompile when llvm is built with unsigned chars where an immediate such as 0xff would be zero extended to 64-bits, turning "cmp $0xff,%eax" into "cmp $0xffffffffffffffff,%eax". llvm-svn: 129845	2011-04-20 03:19:42 +00:00
Daniel Dunbar	cd01ed5bd6	ADT/Triple: Rename isOSX... methods to isMacOSX for consistency with the OS triple component. llvm-svn: 129838	2011-04-20 00:14:25 +00:00
Daniel Dunbar	2b9b0e3748	ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows() predicates. llvm-svn: 129816	2011-04-19 21:14:45 +00:00
Daniel Dunbar	100455a3c8	Target/X86: Eliminate uses of getDarwinVers(). llvm-svn: 129813	2011-04-19 21:04:12 +00:00
Daniel Dunbar	44b530369d	Target/X86: Add getTargetTriple() accessor. llvm-svn: 129812	2011-04-19 21:01:47 +00:00
Eli Friedman	ee92a6b332	Add support for FastISel'ing varargs calls. llvm-svn: 129765	2011-04-19 17:22:22 +00:00
Chris Lattner	91328b317b	Implement support for x86 fastisel of small fixed-sized memcpys, which are generated en masse for C++ PODs. On my c++ test file, this cuts the fast isel rejects by 10x and shrinks the generated .s file by 5% llvm-svn: 129755	2011-04-19 05:52:03 +00:00
Chris Lattner	34a08c2344	tidy up llvm-svn: 129753	2011-04-19 05:15:59 +00:00
Chris Lattner	5f4b783426	Implement support for fast isel of calls of i1 arguments, even though they are illegal, when they are a truncate from something else. This eliminates fully half of all the fastisel rejections on a test c++ file I'm working with, which should make a substantial improvement for -O0 compile of c++ code. This fixed rdar://9297003 - fast isel bails out on all functions taking bools llvm-svn: 129752	2011-04-19 05:09:50 +00:00
Chris Lattner	d7f7c93914	Handle i1/i8/i16 constant integer arguments to calls by prepromoting them. Before we would bail out on i1 arguments altogether, now we just bail on non-constant ones. Also, we used to emit extraneous code. e.g. test12 was: "movb $0, %al; movzbl %al, %edi; callq _test12" and test13 was: "movb $0, %al; xorl %edi, %edi; movb %al, 7(%rsp); callq _test13f". Now we get: "movl $0, %edi; callq _test12" and: "movl $0, %edi; callq _test13f" llvm-svn: 129751	2011-04-19 04:42:38 +00:00
Chris Lattner	c59290a34c	be layout aware, to produce: testb $1, %al je LBB0_2 ## BB#1: ## %if.then movb $0, %al instead of: testb $1, %al jne LBB0_1 jmp LBB0_2 LBB0_1: ## %if.then movb $0, %al how 'bout that. llvm-svn: 129749	2011-04-19 04:26:32 +00:00
Chris Lattner	2c8a4c3b1b	fix rdar://9297006 - fast isel bails out on trunc to i1 -> bools cry, a common cause of fast isel rejects on c++ code. llvm-svn: 129748	2011-04-19 04:22:17 +00:00
Eric Christopher	2e3fbaab39	Invert the meaning of printAliasInstr's return value. It now returns true on success and false on failure. Update callers. llvm-svn: 129722	2011-04-18 21:28:11 +00:00
Chris Lattner	80254a53cc	Add a new bit that ImmLeaf's can opt into, which allows them to duck out of the generated FastISel. X86 doesn't need to generate code to match ADD16ri8 since ADD16ri will do just fine. This is a small codesize win in the generated instruction selector. llvm-svn: 129692	2011-04-18 06:36:55 +00:00
Chris Lattner	c479e0631f	switch the rest of the x86 immediate patterns over to ImmLeaf, simplifying them and exposing more information to tblgen. It would be nice if other target authors adopted this as well, particularly arm since it has fastisel. llvm-svn: 129676	2011-04-17 22:12:55 +00:00
Chris Lattner	2ff8c1a25f	now that predicates have a decent abstraction layer on them, introduce a new kind of predicate: one that is specific to imm nodes. The predicate function specified here just checks an int64_t directly instead of messing around with SDNode's. The virtue of this is that it means that fastisel and other things can reason about these predicates. llvm-svn: 129675	2011-04-17 22:05:17 +00:00
Chris Lattner	514e292b72	Rework our internal representation of node predicates to expose more structure and fix some fixmes. We now have a TreePredicateFn class that handles all of the decoding of these things. This is an internal cleanup that has no impact on the code generated by tblgen. llvm-svn: 129670	2011-04-17 21:38:24 +00:00
Chris Lattner	b53ccb8e36	1. merge fast-isel-shift-imm.ll into fast-isel-x86-64.ll 2. implement rdar://9289501 - fast isel should fold trivial multiplies to shifts 3. teach tblgen to handle shift immediates that are different sizes than the shifted operands, eliminating some code from the X86 fast isel backend. 4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function instead of FastEmit_ri to simplify code. llvm-svn: 129666	2011-04-17 20:23:29 +00:00
Chris Lattner	eb729d48ff	fix an x86 fast isel issue where we'd completely give up on folding an address when we have a global variable base and an index. Instead, just give up on folding the global variable. Before we'd generate: _test: ## @test ## BB#0: movq _rtx_length@GOTPCREL(%rip), %rax leaq (%rax), %rax addq %rdi, %rax movzbl (%rax), %eax ret now we generate: _test: ## @test ## BB#0: movq _rtx_length@GOTPCREL(%rip), %rax movzbl (%rax,%rdi), %eax ret The difference is even more significant when there is a scale involved. This fixes rdar://9289558 - total fail with addr mode formation at -O0/x86-64 llvm-svn: 129664	2011-04-17 17:47:38 +00:00
Chris Lattner	4832660b4d	fix an oversight which caused us to compile the testcase (and other less trivial things) into a dummy lea. Before we generated: _test: ## @test movq _G@GOTPCREL(%rip), %rax leaq (%rax), %rax ret now we produce: _test: ## @test movq _G@GOTPCREL(%rip), %rax ret This is part of rdar://9289558 llvm-svn: 129662	2011-04-17 17:12:08 +00:00
Chris Lattner	4b026b962a	tidy up and reduce indentation. llvm-svn: 129661	2011-04-17 17:05:12 +00:00
Eli Friedman	55f7bf3289	Remove working entry from README. llvm-svn: 129654	2011-04-17 02:36:27 +00:00
Rafael Espindola	a01cdb0e37	Add 129518 back with a fix for when we are producing eh just because of debug info. Change ELF systems to use CFI for producing the EH tables. This reduces the size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129571	2011-04-15 15:11:06 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
NAKAMURA Takumi	b5e3e9dd27	Revert r129518, "Change ELF systems to use CFI for producing the EH tables. This reduces the" It broke several builds. llvm-svn: 129557	2011-04-15 03:35:57 +00:00
Michael J. Spencer	30088ba110	Add 3DNow! intrinsics. llvm-svn: 129551	2011-04-15 00:32:41 +00:00
Chris Lattner	6f195469b1	move PR9661 out to here. llvm-svn: 129527	2011-04-14 18:47:18 +00:00
Rafael Espindola	aa2a7cd828	Change ELF systems to use CFI for producing the EH tables. This reduces the size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129518	2011-04-14 15:18:53 +00:00
Michael J. Spencer	b88784c185	Fix whitespace and tabs. llvm-svn: 129517	2011-04-14 14:33:36 +00:00
Bill Wendling	410ec4aad1	As Dan pointed out, movzbl, movsbl, and friends are nicer than their alias (movzx/movsx) because they give more information. Revert that part of the patch. llvm-svn: 129498	2011-04-14 01:46:37 +00:00
Bill Wendling	7e07d6fb69	Have the X86 back-end emit the alias instead of what's being aliased. In most cases, it's much nicer and more informative reading the alias. llvm-svn: 129497	2011-04-14 01:11:51 +00:00
Bill Wendling	6dd69d9241	Add an option to not print the alias of an instruction. It defaults to "print the alias". llvm-svn: 129485	2011-04-13 23:36:21 +00:00
Bill Wendling	b902f1dd88	Reapply r129401 with patch for clang. llvm-svn: 129419	2011-04-13 00:36:11 +00:00
Bill Wendling	dbfde42468	Revert r129401 for now. Clang is using the old way of doing things. llvm-svn: 129403	2011-04-12 22:59:27 +00:00
Bill Wendling	47c24875a1	Remove the unaligned load intrinsics in favor of using native unaligned loads. Now that we have a first-class way to represent unaligned loads, the unaligned load intrinsics are superfluous. First part of <rdar://problem/8460511>. llvm-svn: 129401	2011-04-12 22:46:31 +00:00
Jay Foad	7c14a558fe	Don't include Operator.h from InstrTypes.h. llvm-svn: 129271	2011-04-11 09:35:34 +00:00
Chris Lattner	fc4fe00a65	fix rdar://8735979 - "int 3" doesn't match to "int3". Unfortunately, InstAlias doesn't allow matching immediate operands, so we have to write C++ code to do this. llvm-svn: 129223	2011-04-09 19:41:05 +00:00
Bill Wendling	bc3f79044a	Replace the old algorithm that emitted the "print the alias for an instruction" with the newer, cleaner model. It uses the IAPrinter class to hold the information that is needed to match an instruction with its alias. This also takes into account the available features of the platform. There is one bit of ugliness. The way the logic determines if a pattern is unique is O(N^2), which is gross. But in reality, the number of items it's checking against isn't large. So while it's N^2, it shouldn't be a massive time sink. llvm-svn: 129110	2011-04-07 21:20:06 +00:00
Rafael Espindola	b4dd95b4f9	Add another case we are not optimizing. llvm-svn: 129012	2011-04-06 17:35:32 +00:00
Rafael Espindola	7a3b244d45	The original issue has been fixed by not doing unnecessary sign extensions. Change the test to force a sign extension and expose the problem again. llvm-svn: 129011	2011-04-06 17:19:35 +00:00
Joerg Sonnenberger	418f186a4b	Make OpcodeMask an unsigned long long literal to deal with overflow. llvm-svn: 128847	2011-04-04 21:38:17 +00:00
Joerg Sonnenberger	fc4789da4a	Add support for the VIA PadLock instructions. llvm-svn: 128826	2011-04-04 16:58:13 +00:00
Joerg Sonnenberger	cc53d9919f	Expand Op0Mask by one bit in preparation for the PadLock prefixes. Define most shift masks incrementally to reduce the redundant hard-coding. Introduce new shift for the VEX flags to replace the magic constant 32 in various places. llvm-svn: 128822	2011-04-04 15:58:30 +00:00
Evan Cheng	ee9d45dd55	Don't try to create zero-sized stack objects. llvm-svn: 128586	2011-03-30 23:44:13 +00:00
Benjamin Kramer	8d2227373d	Make helper static. llvm-svn: 128338	2011-03-26 12:38:19 +00:00
NAKAMURA Takumi	521eb7c11e	Target/X86: [PR8777][PR8778] Tweak alloca/chkstk for Windows targets. FIXME: Some cleanups would be needed. llvm-svn: 128206	2011-03-24 07:07:00 +00:00
Andrew Trick	4ab9a16569	Revert r128175. I'm backing this out for the second time. It was supposed to be fixed by r128164, but the mingw self-host must be defeating the fix. llvm-svn: 128181	2011-03-23 23:11:02 +00:00
Andrew Trick	4046a0de91	Reapply Eli's r127852 now that the pre-RA scheduler can spill EFLAGS. (target-specific branchless method for double-width relational comparisons on x86) llvm-svn: 128175	2011-03-23 22:16:02 +00:00
Dan Gohman	c1783b31a4	Fix fast-isel address mode folding to avoid folding instructions outside of the current basic block. This fixes PR9500, rdar://9156159. llvm-svn: 128041	2011-03-22 00:04:35 +00:00
Bill Wendling	00f0cddfd4	We need to pass the TargetMachine object to the InstPrinter if we are printing the alias of an InstAlias instead of the thing being aliased. Because we need to know the features that are valid for an InstAlias. This is part of a work-in-progress. llvm-svn: 127986	2011-03-21 04:13:46 +00:00
Evan Cheng	0663f23bd8	Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. llvm-svn: 127981	2011-03-21 01:19:09 +00:00
Daniel Dunbar	327cd36f74	Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR", it broke a lot of things. llvm-svn: 127954	2011-03-19 21:47:14 +00:00
Evan Cheng	824a711305	SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this: int f1(void); int f2(void); int f3(void); int f4(void); int f5(void); int f6(void); int foo(int x) { switch(x) { case 1: return f1(); case 2: return f2(); case 3: return f3(); case 4: return f4(); case 5: return f5(); case 6: return f6(); } } => LBB0_2: ## %sw.bb callq _f1 popq %rbp ret LBB0_3: ## %sw.bb1 callq _f2 popq %rbp ret LBB0_4: ## %sw.bb3 callq _f3 popq %rbp ret This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch: sw.bb7: ; preds = %entry %call8 = tail call i32 @f5() nounwind br label %return sw.bb9: ; preds = %entry %call10 = tail call i32 @f6() nounwind br label %return return: %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ] ret i32 %retval.0 This allows codegen to generate better code like this: LBB0_2: ## %sw.bb jmp _f1 ## TAILCALL LBB0_3: ## %sw.bb1 jmp _f2 ## TAILCALL LBB0_4: ## %sw.bb3 jmp _f3 ## TAILCALL rdar://9147433 llvm-svn: 127953	2011-03-19 17:17:39 +00:00
Nadav Rotem	e7a101ccab	Add support for legalizing UINT_TO_FP of vectors on platforms which do not have native support for this operation (such as X86). The legalized code uses two vector INT_TO_FP operations and is faster than scalarizing. llvm-svn: 127951	2011-03-19 13:09:10 +00:00
Eli Friedman	59721e3238	Revert r127852; it's apparently causing an ICE on mingw. llvm-svn: 127909	2011-03-18 21:12:29 +00:00
Joerg Sonnenberger	3fbfcc0e1e	Support explicit argument forms for the X86 string instructions. For now, only the default segments are supported. llvm-svn: 127875	2011-03-18 11:59:40 +00:00
Eli Friedman	1a916a3c0c	Add a target-specific branchless method for double-width relational comparisons on x86. Essentially, the way this works is that SUB+SBB sets the relevant flags the same way a double-width CMP would. This is a substantial improvement over the generic lowering in LLVM. The output is also shorter than the gcc-generated output; I haven't done any detailed benchmarking, though. llvm-svn: 127852	2011-03-18 02:34:11 +00:00
Cameron Zwarich	2ef0c69df1	Move more logic into getTypeForExtArgOrReturn. llvm-svn: 127809	2011-03-17 14:53:37 +00:00
Cameron Zwarich	34e7b3f77e	Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn(). llvm-svn: 127807	2011-03-17 14:21:56 +00:00
Eli Friedman	e8f2be0c10	A couple new README entries. llvm-svn: 127786	2011-03-17 01:22:09 +00:00
Cameron Zwarich	ac106273d4	The x86-64 ABI says that a bool is only guaranteed to be sign-extended to a byte rather than an int. Thankfully, this only causes LLVM to miss optimizations, not generate incorrect code. This just fixes the zext at the return. We still insert an i32 ZextAssert when reading a function's arguments, but it is followed by a truncate and another i8 ZextAssert so it is not optimized. llvm-svn: 127766	2011-03-16 22:20:18 +00:00
Sean Callanan	b60b0bc47e	Enabled disassembler support for AVX instructions in the instruction tables and fixed a few bugs that were causing decode conflicts. Rudimentary tests are coming up in the next patch. llvm-svn: 127646	2011-03-15 01:28:15 +00:00
Sean Callanan	c3fd523731	X86 table-generator and disassembler support for the AVX instruction set. This code adds support for the VEX prefix and for the YMM registers accessible on AVX-enabled architectures. Instruction table support that enables AVX instructions for the disassembler is in an upcoming patch. llvm-svn: 127644	2011-03-15 01:23:15 +00:00
Eric Christopher	cf56a5034f	Change the x86 32-bit scheduler to register pressure and fix up the corresponding testcases back to the previous versions. Fixes some performance regressions only seen on 32-bit. llvm-svn: 127441	2011-03-11 01:05:58 +00:00
Stuart Hastings	d17ae4e939	Revert 127359; it broke lencod. llvm-svn: 127382	2011-03-10 00:25:53 +00:00
Evan Cheng	b4c6a34415	Re-commit 127368 and 127371. They are exonerated. llvm-svn: 127380	2011-03-10 00:16:32 +00:00
Evan Cheng	d4b3f8e009	Revert 127368 and 127371 for now. llvm-svn: 127376	2011-03-09 23:53:17 +00:00
Evan Cheng	ca9a936332	Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more flexible. If it returns a register class that's different from the input, then that's the register class used for cross-register class copies. If it returns a register class that's the same as the input, then no cross- register class copies are needed (normal copies would do). If it returns null, then it's not at all possible to copy registers of the specified register class. llvm-svn: 127368	2011-03-09 22:47:38 +00:00
Benjamin Kramer	801c9afd94	Fix a pasto that broke all x86_64-elf targets. llvm-svn: 127365	2011-03-09 22:07:13 +00:00
Stuart Hastings	9955e2f912	X86 byval copies no longer always_inline. <rdar://problem/8706628> llvm-svn: 127359	2011-03-09 21:10:30 +00:00
Jan Sjödin	6348dc0566	Add createELFObjectTargetWriter method to TargetAsmBackend, which enables construction of non-standard ELFObjectWriters that can be used in MCJIT. llvm-svn: 127346	2011-03-09 18:44:41 +00:00
NAKAMURA Takumi	58d1f93b03	Target/X86: Tweak va_arg for Win64 not to miss taking va_start when number of fixed args > 4. llvm-svn: 127328	2011-03-09 11:33:15 +00:00
Benjamin Kramer	679cfb54ec	X86: Fix the (saddo/ssub x, 1) -> incl/decl selection to check the right operand for 1. Found by inspection. llvm-svn: 127247	2011-03-08 15:20:20 +00:00
Eric Christopher	eb19e9e9fc	Turn on list-ilp scheduling by default on x86 and x86-64, fix up testcases accordingly. Some are currently xfailed and will be filed as bugs to be fixed or understood. Performance results: roughly neutral on SPEC some micro benchmarks in the llvm suite are up between 100 and 150%, only a pair of regressions that are due to be investigated john-the-ripper saw: 10% improvement in traditional DES 8% improvement in BSDI DES 59% improvement in FreeBSD MD5 67% improvement in OpenBSD Blowfish 14% improvement in LM DES Small compile time impact. llvm-svn: 127208	2011-03-08 02:42:25 +00:00
Cameron Zwarich	df61694417	Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo. llvm-svn: 127175	2011-03-07 21:56:36 +00:00
Andrew Trick	641e2d4f8c	Increased the register pressure limit on x86_64 from 8 to 12 regs. This is the only change in this checkin that may affect the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much. Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp. Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067	2011-03-05 08:00:22 +00:00
Andrew Trick	27c079e1b0	whitespace llvm-svn: 127065	2011-03-05 06:31:54 +00:00
Eli Friedman	f63614a982	PR9377: Handle x86 str with register operand in a way consistent with gas. llvm-svn: 126970	2011-03-04 00:10:17 +00:00
Tilmann Scheller	3bc0bcf3ad	Use X86_thiscall calling convention for Win64 as well. llvm-svn: 126934	2011-03-03 07:49:07 +00:00
Tilmann Scheller	a3769f8021	Add Win64 thiscall calling convention. llvm-svn: 126862	2011-03-02 19:29:22 +00:00
David Greene	dd567b214b	[AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement missing patterns for them. Add a SIMD test subdirectory to hold tests for SIMD instruction selection correctness and quality. llvm-svn: 126845	2011-03-02 17:23:43 +00:00
Duncan Sands	c76ae9c8e0	Add datalayout information for the IEEE quad precision fp128 type. llvm-svn: 126780	2011-03-01 20:56:50 +00:00
Chris Lattner	c93d207e8c	fix a signed comparison warning. llvm-svn: 126682	2011-02-28 20:50:35 +00:00
David Greene	20a1cbefad	[AVX] Add decode support for VUNPCKLPS/D instructions, both 128-bit and 256-bit forms. Because the number of elements in a vector does not determine the vector type (4 elements could be v4f32 or v4f64), pass the full type of the vector to decode routines. llvm-svn: 126664	2011-02-28 19:06:56 +00:00
Benjamin Kramer	25bddae404	Silence enum conversion warnings. llvm-svn: 126578	2011-02-27 18:13:53 +00:00
NAKAMURA Takumi	d4e5003a3f	Target/X86: Always emit "push/pop GPRs" in prologue/epilogue and emit "spill/reload frames" for XMMs. It improves Win64's prologue/epilogue but it would not affect ia32 and amd64 (lack of nonvolatile XMMs). llvm-svn: 126568	2011-02-27 08:47:19 +00:00
Owen Anderson	b2c80da4ae	Allow targets to specify the type of the RHS of a shift parameterized on the type of the LHS. llvm-svn: 126518	2011-02-25 21:41:48 +00:00
Cameron Zwarich	fcf51fd298	Roll out r126425 and r126450 to see if it fixes the failures on the buildbots. llvm-svn: 126488	2011-02-25 16:30:32 +00:00
Chris Lattner	0152b7bc7c	remove command line option debugging hook. llvm-svn: 126441	2011-02-24 21:53:03 +00:00
Devang Patel	b037383a35	Enable DebugInfo support for COFF object files. Patch by Nathan Jeffords! llvm-svn: 126425	2011-02-24 21:04:00 +00:00
Evan Cheng	3923466e82	Fix bug in X86 folding / unfolding table. Int_CMPSDrm and Int_CMPSSrm memory operands starts at index 2, not 1. rdar://9045024 PR9305 llvm-svn: 126359	2011-02-24 02:36:52 +00:00
David Greene	9a6040dc86	[AVX] General VUNPCKL codegen support. llvm-svn: 126264	2011-02-22 23:31:46 +00:00
Joerg Sonnenberger	b7e635dcad	Use the same (%dx) hack for in[bwl] as for out[bwl]. llvm-svn: 126244	2011-02-22 20:40:09 +00:00


8585 Commits