llvm-project

Commit Graph

Author	SHA1	Message	Date
Jack Carter	9c1a027fe8	This patch sets the EmitAlias flag in td files and enables the instruction printer to print aliased instructions. Due to usage of RegisterOperands a change in common code (utils/TableGen/AsmWriterEmitter.cpp) is required to get the correct register value if it is a RegisterOperand. Contributor: Vladimir Medic llvm-svn: 174358	2013-02-05 08:32:10 +00:00
Jack Carter	10be6aef15	This patch changes a static_cast to dyn_cast for MipsELFStreamer objects. Contributor: Jack Carter llvm-svn: 174354	2013-02-05 07:47:41 +00:00
Jyotsna Verma	7ab68fbd1d	Hexagon: Add V4 combine instructions and some more Def Pats for V2. llvm-svn: 174331	2013-02-04 15:52:56 +00:00
Benjamin Kramer	c35d526489	Disable a couple more vector splat optimizations on PPC. I didn't see those because the test case used "not grep". FileCheck the test and XFAIL it, preserving the old optimization, so this can be fixed eventually. llvm-svn: 174330	2013-02-04 15:52:32 +00:00
Tim Northover	24937c12eb	Fix some abuses of StringRef We were taking a StringRef to a temporary result, which can go horribly wrong. llvm-svn: 174328	2013-02-04 15:44:38 +00:00
Benjamin Kramer	2c9da989c2	X86: Open up some opportunities for constant folding by postponing shift lowering. Fixes PR15141. llvm-svn: 174327	2013-02-04 15:19:33 +00:00
Benjamin Kramer	0611298446	X86: Simplify code. No functionality change. llvm-svn: 174326	2013-02-04 15:19:25 +00:00
Benjamin Kramer	548ffa274a	SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors. This required disabling a PowerPC optimization that did the following: input: x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16> lowered to: tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8> x = ADD tmp, tmp The add now gets folded immediately and we're back at the BUILD_VECTOR we started from. I don't see a way to fix this currently so I left it disabled for now. Fix some trivially foldable X86 tests too. llvm-svn: 174325	2013-02-04 15:19:18 +00:00
Tim Northover	aefc30f2a4	Give explicit suffix to integer constant over 32-bits. llvm-svn: 174324	2013-02-04 14:14:58 +00:00
Evgeniy Stepanov	1f5a71492d	More MSan/ASan annotations. This change lets us bootstrap LLVM/Clang under ASan and MSan. It contains fixes for 2 issues: - X86JIT reads return address from stack, which MSan does not know is initialized. - bugpoint tests run binaries with RLIMIT_AS. This does not work with certain Sanitizers. We are no longer including config.h in Compiler.h with this change. llvm-svn: 174306	2013-02-04 07:03:24 +00:00
Arnold Schwaighofer	98f1012f9b	ARM cost model: Penalize insertelement into D subregisters Swift has a renaming dependency if we load into D subregisters. We don't have a way of distinguishing between insertelement operations of values from loads and other values. Therefore, we are pessimistic for now (The performance problem showed up in example 14 of gcc-loops). radar://13096933 llvm-svn: 174300	2013-02-04 02:52:05 +00:00
NAKAMURA Takumi	80159432de	PPCDarwinAsmPrinter::EmitStartOfAsmFile(): Add checking range in CPUDirectives[]. llvm-svn: 174298	2013-02-04 00:47:38 +00:00
NAKAMURA Takumi	3d591ae0b9	PPCDarwinAsmPrinter::EmitStartOfAsmFile(): Add possible elements in CPUDirectives[]. llvm-svn: 174297	2013-02-04 00:47:33 +00:00
Reed Kotler	f8933f83f0	Start static relocation implementation for mips16. This checkin makes hello world work. llvm-svn: 174264	2013-02-02 04:07:35 +00:00
Bill Schmidt	cc99a2f61d	Add notes about future PowerPC features llvm-svn: 174232	2013-02-01 23:10:09 +00:00
Bill Schmidt	52742c25ae	LLVM enablement for some older PowerPC CPUs llvm-svn: 174230	2013-02-01 22:59:51 +00:00
David Sehr	8114a7a651	Two changes relevant to LEA and x32: 1) allows the use of RIP-relative addressing in 32-bit LEA instructions under x86-64 (ILP32 and LP64) 2) separates the size of address registers in 64-bit LEA instructions from control by ILP32/LP64. llvm-svn: 174208	2013-02-01 19:28:09 +00:00
Jyotsna Verma	2ceafa6684	Replace LDriu*[bhdw]_indexed_V4 instructions with "def Pats". llvm-svn: 174193	2013-02-01 16:36:16 +00:00
Jyotsna Verma	d6eda1c227	Add appropriate TSFlags to the instructions that must be always extended. llvm-svn: 174186	2013-02-01 15:54:43 +00:00
Tim Northover	111b6cb37b	Remove currently unused register decoder from AArch64. This should fix a warning when building this backend. llvm-svn: 174177	2013-02-01 14:55:05 +00:00
Chandler Carruth	e5d8d0d64b	Switch the code added in r173885 to use the new, shiny RTTI infrastructure on MCStreamer to test for whether there is an MCELFStreamer object available. This is just a cleanup on the AsmPrinter side of things, moving ad-hoc tests of random APIs to a direct type query. But the AsmParser was completely broken. There were no tests, it just blindly cast its streamer to an MCELFStreamer and started manipulating it. I don't have a test case -- this actually failed on LLVM's own regression test suite. Unfortunately the failure only appears when the stars, compilers, and runtime align to misbehave when we read a pointer to a formatted_raw_ostream as-if it were an MCAssembler. =/ UBSan would catch this immediately. Many thanks to Matt for doing about 80% of the debugging work here in GDB, Jim for helping to explain how exactly to fix this, and others for putting up with the hair pulling that ensued during debugging it. llvm-svn: 174118	2013-01-31 23:43:14 +00:00
Chandler Carruth	de093ef8d6	Give the MCStreamer class hierarchy LLVM RTTI facilities for use with isa<> and dyn_cast<>. In several places, code is already hacking around the absence of this, and there seem to be several interfaces that might be lifted and/or devirtualized using this. This change was based on a discussion with Jim Grosbach about how best to handle testing for specific MCStreamer subclasses. He said that this was the correct end state, and everything else was too hacky so I decided to just make it so. No functionality should be changed here, this is just threading the kind through all the constructors and setting up the classof overloads. llvm-svn: 174113	2013-01-31 23:29:57 +00:00
NAKAMURA Takumi	e1137a2058	Update AMDGPURegisterInfo::eliminateFrameIndex() corresponding to r174083. llvm-svn: 174106	2013-01-31 22:55:51 +00:00
Tom Stellard	4926921bd4	R600: Fold clamp, neg, abs Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174099	2013-01-31 22:11:54 +00:00
Tom Stellard	dd04c83a4d	R600: Consider bitcast when folding const_address node. Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174098	2013-01-31 22:11:53 +00:00
Tom Stellard	af1bce7d1d	R600: Make store_dummy intrinsic more general by passing export type Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174097	2013-01-31 22:11:46 +00:00
Chad Rosier	3145deac19	Remove unused variable, which should have been removed with r174083. llvm-svn: 174094	2013-01-31 21:23:44 +00:00
Tim Northover	c6d39314b2	Update AArch64 backend to changed eliminateFrameIndex interface. llvm-svn: 174086	2013-01-31 20:46:53 +00:00
Chad Rosier	df782d2225	[PEI] Pass the frame index operand number to the eliminateFrameIndex function. Each target implementation was needlessly recomputing the index. Part of rdar://13076458 llvm-svn: 174083	2013-01-31 20:02:54 +00:00
Tim Northover	e0e3aefdd3	Add AArch64 as an experimental target. This patch adds support for AArch64 (ARM's 64-bit architecture) to LLVM in the "experimental" category. Currently, it won't be built unless requested explicitly. This initial commit should have support for: + Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions (except the late addition CRC instructions). + CodeGen features required for C++03 and C99. + Compilation for the "small" memory model: code+static data < 4GB. + Absolute and position-independent code. + GNU-style (i.e. "__thread") TLS. + Debugging information. The principal omission, currently, is performance tuning. This patch excludes the NEON support also reviewed due to an outbreak of batshit insanity in our legal department. That will be committed soon bringing the changes to precisely what has been approved. Further reviews would be gratefully received. llvm-svn: 174054	2013-01-31 12:12:40 +00:00
Eric Christopher	258c867c0b	Whitespace. llvm-svn: 174009	2013-01-31 00:50:48 +00:00
Eric Christopher	4e3e94c13d	Check and allow floating point registers to select the size of the register for inline asm. This conforms to how gcc allows for effective casting of inputs into gprs (fprs is already handled). llvm-svn: 174008	2013-01-31 00:50:46 +00:00
Hal Finkel	e1df90958d	PPC QPX requires a 32-byte aligned stack On systems which support the QPX vector instructions, the stack must be 32-byte aligned. llvm-svn: 173993	2013-01-30 23:43:27 +00:00
Evan Cheng	d2ca4e2ed9	Restrict sin/cos optimization to 64-bit only for now. 32-bit is a bit messy and less critical. llvm-svn: 173987	2013-01-30 22:56:35 +00:00
Hal Finkel	b3fc509b23	Initialize hasQPX in PPCSubtarget This should have gone in with r173973. llvm-svn: 173984	2013-01-30 22:43:44 +00:00
Hal Finkel	efb305e54c	Add definitions for the PPC a2q core marked as having QPX available This is the first commit of a large series which will add support for the QPX vector instruction set to the PowerPC backend. This instruction set is used on the IBM Blue Gene/Q supercomputers. llvm-svn: 173973	2013-01-30 21:17:42 +00:00
Eli Bendersky	2e2ce49e59	Add a special ARM trap encoding for NaCl. More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html Patch by JF Bastien llvm-svn: 173943	2013-01-30 16:30:19 +00:00
Logan Chien	a436e4c7e4	Add missing header and test cases for r173939. llvm-svn: 173941	2013-01-30 15:48:50 +00:00
Logan Chien	2bcc42c730	Override virtual function for ARM EH directives. llvm-svn: 173939	2013-01-30 15:39:04 +00:00
David Blaikie	24f44ac53a	Removing initializer for the field removed in r173887 llvm-svn: 173888	2013-01-30 03:04:07 +00:00
David Blaikie	b7fa813373	Remove unused variable (introduced in r173884) to clear clang -Werror build llvm-svn: 173887	2013-01-30 02:56:02 +00:00
Jack Carter	b01a90ce40	Forgot to add new file to CMakeLists llvm-svn: 173886	2013-01-30 02:32:36 +00:00
Jack Carter	718da0b53b	This patch implements runtime ARM specific setting of ELF header e_flags. Contributor: Jack Carter llvm-svn: 173885	2013-01-30 02:24:33 +00:00
Jack Carter	7f378104b6	This patch implements runtime Mips specific setting of ELF header e_flags. Contributor: Jack Carter llvm-svn: 173884	2013-01-30 02:16:36 +00:00
Jack Carter	1bd90ff6cc	This patch reworks how llvm targets set and update ELF header e_flags. Currently gathering information such as symbol, section and data is done by collecting it in an MCAssembler object. From MCAssembler and MCAsmLayout objects ELFObjectWriter::WriteObject() forms and streams out the ELF object file. This patch just adds a few members to the MCAssembler class to store and access the e_flag settings. It allows for runtime additions to the e_flag by assembler directives. The standalone assembler can get to MCAssembler from getParser().getStreamer().getAssembler(). This patch is the generic infrastructure and will be followed by patches for ARM and Mips for their target specific use. Contributor: Jack Carter llvm-svn: 173882	2013-01-30 02:09:52 +00:00
Akira Hatanaka	c0b020690b	[mips] Lower EH_RETURN. Patch by Sasa Stankovic. llvm-svn: 173862	2013-01-30 00:26:49 +00:00
Renato Golin	5e9d55eca0	Adding simple cast cost to ARM Changing ARMBaseTargetMachine to return ARMTargetLowering instead of the generic one (similar to x86 code). Tests showing which instructions were added to cast when necessary or cost zero when not. Downcast to 16 bits are not lowered in NEON, so costs are not there yet. llvm-svn: 173849	2013-01-29 23:31:38 +00:00
Jyotsna Verma	b16a9cb132	Use multiclass for post-increment store instructions. llvm-svn: 173816	2013-01-29 18:42:41 +00:00
Jyotsna Verma	a609b1c89d	Add constant extender support for MInst type instructions. llvm-svn: 173813	2013-01-29 18:18:50 +00:00
Evan Cheng	27e41c9f70	Remove dead code. llvm-svn: 173812	2013-01-29 18:08:22 +00:00
NAKAMURA Takumi	978b5a0e02	R600/AMDILPeepholeOptimizer.cpp: Tweak std::make_pair to satisfy C++11. llvm-svn: 173807	2013-01-29 16:31:56 +00:00
Hans Wennborg	5deecd9043	Fix typo in X86BaseInfo.h that I introduced in r157818. llvm-svn: 173798	2013-01-29 14:05:57 +00:00
Tim Northover	a0edd3ee66	Fix 64-bit atomic operations in Thumb mode. The ARM and Thumb variants of LDREXD and STREXD have different constraints and take different operands. Previously the code expanding atomic operations didn't take this into account and asserted in Thumb mode. llvm-svn: 173780	2013-01-29 09:06:13 +00:00
Craig Topper	c048154b9b	Merge SSE and AVX shuffle instructions in the comment printer. llvm-svn: 173777	2013-01-29 07:54:31 +00:00
Evan Cheng	0e88c7d897	Teach SDISel to combine fsin / fcos into a fsincos node if the following conditions are met: 1. They share the same operand and are in the same BB. 2. Both outputs are used. 3. The target has a native instruction that maps to ISD::FSINCOS node or the target provides a sincos library call. Implemented the generic optimization in sdisel and enabled it for Mac OSX. Also added an additional optimization for x86_64 Mac OSX by using an alternative entry point __sincos_stret which returns the two results in xmm0 / xmm1. rdar://13087969 PR13204 llvm-svn: 173755	2013-01-29 02:32:37 +00:00
Hal Finkel	7f9e8d3eaa	Add isBGQ method to PPCSubtarget This function will be used in future commits. llvm-svn: 173729	2013-01-29 00:22:47 +00:00
Craig Topper	5c683972bc	Fix 256-bit PALIGNR comment decoding to understand that it works on independent 256-bit lanes. llvm-svn: 173674	2013-01-28 07:41:18 +00:00
Craig Topper	71d99ffe4a	Add missing break in 256-bit palignr comment printing. No test case yet because the comment itself is still wrong. llvm-svn: 173669	2013-01-28 07:19:11 +00:00
Craig Topper	8fb09f0abb	Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction. llvm-svn: 173667	2013-01-28 06:48:25 +00:00
Craig Topper	b3ede5e3b1	Remove addToNoHelperNeeded function that was left unused after r173649. Fixes a -Wunused warning. llvm-svn: 173664	2013-01-28 06:09:24 +00:00
Reed Kotler	97f8e2fa8f	Make some code a little simpler. llvm-svn: 173649	2013-01-28 02:46:49 +00:00
Richard Osborne	038d24f90c	[XCore] Add missing l2rus instructions. These instructions are not targeted by the compiler but they are needed for the MC layer. llvm-svn: 173634	2013-01-27 22:28:30 +00:00
Richard Osborne	f2ecd40929	[XCore] Add missing l2r instructions. These instructions are not targeted by the compiler but they are needed for the MC layer. llvm-svn: 173629	2013-01-27 21:26:02 +00:00
Richard Osborne	7fe8f63544	[XCore] Add missing 1r instructions. These instructions are not targeted by the compiler but they are needed for the MC layer. llvm-svn: 173624	2013-01-27 20:46:21 +00:00
Richard Osborne	8f56317287	[XCore] Add missing 0r instructions. These instructions are not targeted by the compiler but they are needed for the MC layer. llvm-svn: 173623	2013-01-27 20:42:57 +00:00
Bill Wendling	cc1fc9465b	Convert the CPP backend to use the AttributeSet instead of AttributeWithIndex. Further removal of the introspective AttributeWithIndex thing. Also fix the #includes. llvm-svn: 173599	2013-01-27 01:22:51 +00:00
Benjamin Kramer	6a93596538	X86: Decode PALIGN operands so I don't have to do it in my head. llvm-svn: 173572	2013-01-26 13:31:37 +00:00
Benjamin Kramer	99c68dd964	X86: Do splat promotion later, so the optimizer can chew on it first. This catches many cases where we can emit a more efficient shuffle for a specific mask or when the mask contains undefs. Once the splat is lowered to unpacks we can't do that anymore. There is a possibility of moving the promotion after pshufb matching, but I'm not sure if pshufb with a mask loaded from memory is faster than 3 shuffles, so I avoided that for now. llvm-svn: 173569	2013-01-26 11:44:21 +00:00
Reed Kotler	233cee2b5b	fix use of std::std. it's ordered set. llvm-svn: 173563	2013-01-26 06:58:35 +00:00
Dmitri Gribenko	c451bdf9ff	Remove unused variables, silences -Wunused-variable llvm-svn: 173526	2013-01-25 23:17:21 +00:00
Bill Wendling	57625a4966	Remove some introspection functions. The 'getSlot' function and its ilk allow introspection into the AttributeSet class. However, that class should be opaque. Allow access through accessor methods instead. llvm-svn: 173522	2013-01-25 23:09:36 +00:00
Hal Finkel	4e5ca9e578	Initial implementation of PPCTargetTransformInfo This provides a place to add customized operation cost information and control some other target-specific IR-level transformations. The only non-trivial logic in this checkin assigns a higher cost to unaligned loads and stores (covered by the included test case). llvm-svn: 173520	2013-01-25 23:05:59 +00:00
Eli Bendersky	597fc1233a	In this patch, we teach X86_64TargetMachine that it has a ILP32 (defined by the x32 ABI) mode, in which case its pointers are 32-bits in size. This knowledge is also added to X86RegisterInfo that now returns the appropriate registers in getPointerRegClass. There are many outcomes to this change. In order to keep the patches separate and manageable, we start by focusing on some simple testable cases. The patch adds a test with passing a pointer to a function - focusing on the difference between the two data models for x86-64. Another test is added for handling of 'sret' arguments (and functionality is added in X86ISelLowering to make it work). A note on naming: the "x32 ABI" document refers to the AMD64 architecture (in LLVM it's distinguished by being is64Bits() in the x86 subtarget) with two variations: the LP64 (default) data model, and the ILP32 data model. This patch adds predicates to the subtarget which are consistent with this naming scheme. llvm-svn: 173503	2013-01-25 22:07:43 +00:00
Richard Osborne	6b86eec819	Add instruction encodings / disassembly support for l4r instructions. llvm-svn: 173501	2013-01-25 21:55:32 +00:00
Bill Wendling	8649283e75	Use the new 'getSlotIndex' method to retrieve the attribute's slot index. llvm-svn: 173499	2013-01-25 21:46:52 +00:00
Richard Osborne	a520a7dcf3	Use the correct format in the STW / SETPSC instruction names. llvm-svn: 173494	2013-01-25 21:25:12 +00:00
Richard Osborne	9a228a13c6	Fix order of operands for crc8_l4r The order in which operands appear in the encoded instruction is different from the order in which they appear in assembly. This changes the XCore backend to use the instruction encoding order. llvm-svn: 173493	2013-01-25 21:20:28 +00:00
Richard Osborne	a19fa86a70	Add instruction encodings / disassembly support for l5r instructions. llvm-svn: 173479	2013-01-25 20:20:07 +00:00
Richard Osborne	8ae02d3cef	Fix order of operands for l5r instructions. With this change the operands order matches the order in which the operands are encoded in the instruction. llvm-svn: 173477	2013-01-25 20:16:00 +00:00
Richard Osborne	ea023fcde1	Use correct mnemonic / instruction name for ldivu. llvm-svn: 173476	2013-01-25 20:11:26 +00:00
Hal Finkel	53f4ba6ce3	More cleanup of PPC register definitions. Uses the new !add TableGen operator to do more cleanup of the PPC register definitions. llvm-svn: 173446	2013-01-25 14:49:10 +00:00
Silviu Baranga	3eb45a03af	Fixed the condition codes for the atomic64 min/umin code generation on ARM. If the subtraction of the higher 32 bit parts gives a 0 result, we need to do the store operation. llvm-svn: 173437	2013-01-25 10:39:49 +00:00
Andrew Trick	e2c3f5c982	MIsched: Improve the interface to SchedDFS analysis (subtrees). Allow the strategy to select SchedDFS. Allow the results of SchedDFS to affect initialization of the scheduler state. llvm-svn: 173425	2013-01-25 06:33:57 +00:00
Jack Carter	07c818d2da	This patch implements parsing the .word directive for the Mips assembler. Contributor: Vladimir Medic llvm-svn: 173407	2013-01-25 01:31:34 +00:00
Akira Hatanaka	28aed9ca85	[mips] Set neverHasSideEffects flag on some of the floating point instructions. llvm-svn: 173401	2013-01-25 00:20:39 +00:00
Renato Golin	d4c392e6ff	Moving Cost Tables up to share with other targets llvm-svn: 173382	2013-01-24 23:01:00 +00:00
Hal Finkel	41176f43c4	Start cleanup of PPC register definitions using foreach loops. No functionality change intended. This captures the first two cases GPR32/64. For the others, we need an addition operator (if we have one, I've not yet found it). Based on a suggestion made by Tom Stellard in the AArch64 review! llvm-svn: 173366	2013-01-24 20:43:18 +00:00
NAKAMURA Takumi	bf8f207519	MipsISelLowering.cpp: Fill unreachable paths to fix warnings. [-Wsometimes-uninitialized] FIXME: Could they, unreachable(s), be removed? FIXME: I could prefer the coding standards... llvm-svn: 173325	2013-01-24 06:08:06 +00:00
NAKAMURA Takumi	f25b7c6816	MipsISelLowering.cpp: Fix a warning, take two. [-Wunused-variable] ...and fix a typo, s/#ifdef/#ifndef/ llvm-svn: 173324	2013-01-24 05:54:23 +00:00
NAKAMURA Takumi	c77d028bfb	MipsISelLowering.cpp: Fix a warning. [-Wunused-variable] llvm-svn: 173323	2013-01-24 05:47:29 +00:00
Reed Kotler	a2d76bce1f	The next phase of Mips16 hard float implementation. Allow Mips16 routines to call Mips32 routines that have abi requirements that either arguments or return values are passed in floating point registers. This handles only the pic case. We have not done non pic for Mips16 yet in any form. The libm functions are Mips32, so with this addition we have a complete Mips16 hard float implementation. We still are not able to completely mix Mips16 and Mips32 with hard float. That will be the next phase which will have several steps. For Mips32 to freely call Mips16 some stub functions must be created. llvm-svn: 173320	2013-01-24 04:24:02 +00:00
Tom Stellard	6f1b8657f9	R600: Add a llvm.R600.store.swizzle intrinsic This intrinsic is translated to ALLOC_EXPORT_WORD1_SWIZ, hence its name. It is used to store vs/fs outputs Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173297	2013-01-23 21:39:49 +00:00
Tom Stellard	d8ac91d436	R600: Simplify stream outputs intrinsic Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173296	2013-01-23 21:39:47 +00:00
Richard Osborne	54e311821f	Add instruction encodings / disassembly support for l6r instructions. llvm-svn: 173288	2013-01-23 20:08:11 +00:00
Eli Bendersky	f759526983	Fix powerpc test failure - forgot to initialize stack slot size for PPCLinuxMCAsmInfo llvm-svn: 173275	2013-01-23 17:12:15 +00:00
Eli Bendersky	32aab2216d	Clean up assignment of CalleeSaveStackSlotSize: get rid of the default and explicitly set this in every target that needs to change it from the default. llvm-svn: 173270	2013-01-23 16:22:04 +00:00
Benjamin Kramer	c4231cc9b3	NVPTX: Stop leaking memory by using a managed constant instead of a new Argument. This is still an egregious hack since we don't have a nice interface for this kind of thing but should help the valgrind leak check buildbot to become green. llvm-svn: 173267	2013-01-23 15:21:44 +00:00
Bill Wendling	d154e283f2	Add the IR attribute 'sspstrong'. SSPStrong applies a heuristic to insert stack protectors in these situations: * A Protector is required for functions which contain an array, regardless of type or length. * A Protector is required for functions which contain a structure/union which contains an array, regardless of type or length. Note, there is no limit to the depth of nesting. * A protector is required when the address of a local variable (i.e., stack based variable) is exposed. (E.g., such as through a local whose address is taken as part of the RHS of an assignment or a local whose address is taken as part of a function argument.) This patch implements the SSPStrong attribute to be equivalent to SSPRequired. This will change in a subsequent patch. llvm-svn: 173230	2013-01-23 06:41:41 +00:00
Tom Stellard	365366f9ef	R600: rework handling of the constants Remove Cxxx registers, add new special register - "ALU_CONST" and new operand for each alu src - "sel". ALU_CONST is used to designate that the new operand contains the value to override src.sel, src.kc_bank, src.chan for constants in the driver. Patch by: Vadim Girlin Vincent Lejeune: - Use pointers for constants - Fold CONST_ADDRESS when possible Tom Stellard: - Give CONSTANT_BUFFER_0 its own address space - Use integer types for constant loads Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173222	2013-01-23 02:09:06 +00:00
Tom Stellard	ff62c35da0	R600: Add a CONST_ADDRESS node to model constant buf read Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173221	2013-01-23 02:09:03 +00:00
Tom Stellard	ab28e9a30a	R600: Factorise VTX_WORD0 and VTX_WORD1 in tblgen def Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 173220	2013-01-23 02:09:01 +00:00
Richard Osborne	1a06479f46	Add instruction encodings / disassembly support for u10 / lu10 instructions. llvm-svn: 173204	2013-01-22 22:55:04 +00:00
Michael Liao	3dffc5e2b7	Fix an issue of pseudo atomic instruction DAG schedule - Add list of physical registers clobbered in pseudo atomic insts Physical registers are clobbered when pseudo atomic instructions are expanded. Add them in clobber list to prevent the DAG scheduler from mis-scheduling them after these insns are declared side-effect free. - Add test case from Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 173200	2013-01-22 21:47:38 +00:00
Akira Hatanaka	88c0ec826c	[mips] Implement MipsRegisterInfo::getRegPressureLimit. llvm-svn: 173197	2013-01-22 21:34:25 +00:00
Akira Hatanaka	f7d16d0563	[mips] Clean up code in MipsTargetLowering::LowerCall. No functional change intended llvm-svn: 173189	2013-01-22 20:05:56 +00:00
Benjamin Kramer	fee7d21ae7	X86: Make sure we account for the FMA4 register immediate value, otherwise rip-rel relocations will be off by one byte. PR15040. llvm-svn: 173176	2013-01-22 18:05:59 +00:00
Eli Bendersky	0893e1079d	Initial patch for x32 ABI support. Add the x32 environment kind to the triple, and separate the concept of pointer size and callee save stack slot size, since they're not equal on x32. llvm-svn: 173175	2013-01-22 18:02:49 +00:00
Tim Northover	29178a348a	Make APFloat constructor require explicit semantics. Previously we tried to infer it from the bit width size, with an added IsIEEE argument for the PPC/IEEE 128-bit case, which had a default value. This default value allowed bugs to creep in, where it was inappropriate. llvm-svn: 173138	2013-01-22 09:46:31 +00:00
Richard Osborne	5d477751df	Fix some incorrectly named u10 / lu10 instructions. llvm-svn: 173090	2013-01-21 21:12:30 +00:00
Richard Osborne	38cff3ea7f	Remove unused multiclass. llvm-svn: 173087	2013-01-21 20:50:54 +00:00
Richard Osborne	9d3ec06ef8	Add instruction encodings / disassembly support for u6 / lu6 instructions. llvm-svn: 173086	2013-01-21 20:44:17 +00:00
Richard Osborne	6e58c6d86d	Add instruction encoding / disassembly support for ru6 / lru6 instructions. llvm-svn: 173085	2013-01-21 20:42:16 +00:00
Richard Osborne	0d68e21ca7	Use correct format for the LDAWCP instruction (u6). llvm-svn: 173083	2013-01-21 20:32:54 +00:00
Tom Stellard	c9b903138d	R600/SI: Use unnormalized coordinates for sampling with the RECT target. Patch by: Michel Dänzer Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 173053	2013-01-21 15:40:48 +00:00
Tom Stellard	14421a793f	R600/SI: Take target parameter for sample intrinsics. Patch by: Michel Dänzer Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 173052	2013-01-21 15:40:47 +00:00
Tom Stellard	74dda0da31	R600/SI: Derive all sample intrinsics from a single class. Patch by: Michel Dänzer Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 173051	2013-01-21 15:40:46 +00:00
NAKAMURA Takumi	c96fb1bd36	R600/SILowerControlFlow.cpp: Fix a warning. [-Wunused-variable] llvm-svn: 173040	2013-01-21 14:06:48 +00:00
Craig Topper	66163a35ee	Use <0 checks in place of ==-1 because it results in simpler code. llvm-svn: 173010	2013-01-21 07:25:16 +00:00
Craig Topper	9b29486f42	Use MVT instead of EVT in LowerVECTOR_SHUFFLEtoBlend. llvm-svn: 173009	2013-01-21 07:19:54 +00:00
Craig Topper	32c5406dcf	Remove trailing whitespace. llvm-svn: 173008	2013-01-21 06:57:59 +00:00
Craig Topper	5c84c25bf4	Fix some 80 column violations. llvm-svn: 173006	2013-01-21 06:21:54 +00:00
Craig Topper	2cd375896a	Make helper method static. llvm-svn: 173005	2013-01-21 06:13:28 +00:00
Craig Topper	cf93977920	Convert more EVT's to MVT's in the lowering methods. llvm-svn: 172995	2013-01-20 21:50:27 +00:00
Craig Topper	e65a08be64	Capitalize lowerTRUNCATE so that it matches the other lower functions in this file despite it not matching coding standards. llvm-svn: 172994	2013-01-20 21:34:37 +00:00
Renato Golin	e1fb059327	Revert CostTable algorithm, will re-write llvm-svn: 172992	2013-01-20 20:57:20 +00:00
Richard Osborne	4e69724869	Add instruction encodings / disassembly support for l2rus instructions. llvm-svn: 172987	2013-01-20 18:51:15 +00:00
Richard Osborne	9fbf57b26c	Add instruction encodings / disassembly support for l3r instructions. llvm-svn: 172986	2013-01-20 18:37:49 +00:00
Richard Osborne	f063fcee7a	Add instruction encodings / disassembler support for 2rus instructions. llvm-svn: 172985	2013-01-20 17:22:43 +00:00
Richard Osborne	3fb7395233	Add instruction encodings / disassembly support for 3r instructions. It is not possible to distinguish 3r instructions from 2r / rus instructions using only the fixed bits. Therefore if an instruction doesn't match the 2r / rus format try to decode it as a 3r instruction before returning Fail. llvm-svn: 172984	2013-01-20 17:18:47 +00:00
Craig Topper	ce61fdf0a3	Make LowerVSETCC a static function and use MVT instead of EVT. llvm-svn: 172969	2013-01-20 09:02:22 +00:00
Nadav Rotem	9450fcfff1	Revert 172708. The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends. This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical. Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume that there is only one SEXT node. The AVX mask optimization is one example. Additionally this optimization does not update the cost model. llvm-svn: 172968	2013-01-20 08:35:56 +00:00
Craig Topper	9976974cc6	Make some helper methods static. llvm-svn: 172936	2013-01-20 00:50:58 +00:00
Craig Topper	4ac87da529	Remove DebugLoc argument from static function. It can easily be obtained from the SVOp passed in. llvm-svn: 172935	2013-01-20 00:43:42 +00:00
Craig Topper	3da6507c41	Use MVT instead of EVT in more instruction lowering code. llvm-svn: 172933	2013-01-20 00:38:18 +00:00
Craig Topper	53c7fbabbf	Use MVT instead of EVT in more of the shuffle lowering code. llvm-svn: 172930	2013-01-19 23:36:09 +00:00
Craig Topper	bb772d27a7	Capitalize LowerVectorIntExtend to be consistent with all the other lower functions in this file. llvm-svn: 172927	2013-01-19 23:14:09 +00:00
Nadav Rotem	7b3120b9ae	On Sandybridge split unaligned 256bit stores into two xmm-sized stores. llvm-svn: 172894	2013-01-19 08:38:41 +00:00
Craig Topper	84b01120bc	Use MVT instead of EVT when computing shuffle immediates since they can only be for legal types. Keeps compiler from generating unneeded checks and handling for extended types. llvm-svn: 172893	2013-01-19 08:27:45 +00:00
Chandler Carruth	1fe21fc0b5	Sort all of the includes. Several files got checked in with mis-sorted includes. llvm-svn: 172891	2013-01-19 08:03:47 +00:00
Jack Carter	7ab15fafe3	This is a resubmittal. For some reason it broke the bots yesterday but I cannot reproduce the problem and have scrubbed my sources and even tested with llvm-lit -v --vg. Formatting fixes. Mostly long lines and blank spaces at end of lines. Contributor: Jack Carter llvm-svn: 172882	2013-01-19 02:00:40 +00:00
Nadav Rotem	7431211214	On Sandybridge loading unaligned 256bits using two XMM loads (vmovups and vinsertf128) is faster than using a single vmovups instruction. llvm-svn: 172868	2013-01-18 23:10:30 +00:00
Jack Carter	c1b17ed2e1	This is a resubmittal. For some reason it broke the bots yesterday but I cannot reproduce the problem and have scrubbed my sources and even tested with llvm-lit -v --vg. Support for Mips register information sections. Mips ELF object files have a section that is dedicated to register use info. Some of this information such as the assumed Global Pointer value is used by the linker in relocation resolution. The register info file is .reginfo in o32 and .MIPS.options in 64 and n32 abi files. This patch contains the changes needed to create the sections, but leaves the actual register accounting for a future patch. Contributor: Jack Carter llvm-svn: 172847	2013-01-18 21:20:38 +00:00
Tom Stellard	c4cabef782	R600: Properly insert S_WAITCNT instructions Some instructions like memory reads/writes are executed asynchronously, so we need to insert S_WAITCNT instructions to block before accessing their results. Previously we have just inserted S_WAITCNT instructions after each async instruction, this patch fixes this and adds a proper insertion pass. Patch by: Christian König Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 172846	2013-01-18 21:15:53 +00:00
Tom Stellard	be8ebeebf7	R600: Optimize and cleanup KILL on SI We shouldn't insert KILL optimization if we don't have a kill instruction at all. Patch by: Christian König Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 172845	2013-01-18 21:15:50 +00:00
Jack Carter	86c2c564ff	This is a resubmittal. For some reason it broke the bots yesterday but I cannot reproduce the problem and have scrubbed my sources and even tested with llvm-lit -v --vg. Removal of redundant code and formatting fixes. Contributors: Jack Carter/Vladimir Medic llvm-svn: 172842	2013-01-18 20:15:06 +00:00
Craig Topper	1cb8aa581b	Calculate vector element size more directly for VINSERTF128/VEXTRACTF128 immediate handling. Also use MVT since this only called on legal types during pattern matching. llvm-svn: 172797	2013-01-18 08:41:28 +00:00
Craig Topper	e938138daf	Minor formatting fix. No functional change. llvm-svn: 172795	2013-01-18 07:27:20 +00:00
Craig Topper	908f7d14b5	Spelling fix: extened->extended. Trailing whitespace in same function. llvm-svn: 172793	2013-01-18 06:50:59 +00:00
Craig Topper	01fcf2e2f2	Make more use of is128BitVector/is256BitVector in place of getSizeInBits() == 128/256. llvm-svn: 172792	2013-01-18 06:44:29 +00:00
Chad Rosier	1e8f053bd1	[ms-inline asm] Make the error message more generic now that we support the 'SIZE' and 'LENGTH' operators. llvm-svn: 172773	2013-01-18 00:50:59 +00:00
Bill Schmidt	dee1ef8f53	This patch fixes PR13626 by providing i128 support in the return calling convention. 128-bit integers are now properly returned in GPR3 and GPR4 on PowerPC. llvm-svn: 172745	2013-01-17 19:34:57 +00:00
Chad Rosier	d0ed73acb4	[ms-inline asm] Add support for the 'SIZE' and 'LENGTH' operators. Part of rdar://12576868 llvm-svn: 172743	2013-01-17 19:21:48 +00:00
Jyotsna Verma	9b60c1d171	Add indexed load/store instructions for offset validation check. This patch fixes bug 14902 - http://llvm.org/bugs/show_bug.cgi?id=14902 llvm-svn: 172737	2013-01-17 18:42:37 +00:00
Bill Schmidt	6b2940b01e	This patch fixes the PPC calling convention to handle returns of _Complex float and _Complex long double, by simply increasing the number of floating point registers available for return values. The test case verifies that the correct registers are loaded. llvm-svn: 172733	2013-01-17 17:45:19 +00:00
Elena Demikhovsky	f6a30e05d5	Optimization for the following SIGN_EXTEND pairs: v8i8 -> v8i64, v8i8 -> v8i32, v4i8 -> v4i64, v4i16 -> v4i64 for AVX and AVX2. Bug 14865. llvm-svn: 172708	2013-01-17 09:59:53 +00:00
Craig Topper	c7e6feee42	Combine AVX and SSE forms of MOVSS and MOVSD into the same multiclasses so they get instantiated together. llvm-svn: 172704	2013-01-17 06:59:42 +00:00
Jakob Stoklund Olesen	213a2f8b3f	Provide a place for targets to insert ILP optimization passes. Move the early if-conversion pass into this group. ILP optimizations usually need to find the right balance between register pressure and ILP using the MachineTraceMetrics analysis to identify critical paths and estimate other costs. Such passes should run together so they can share dominator tree and loop info analyses. Besides if-conversion, future passes to run here could include expression height reduction and ARM's MLxExpansion pass. llvm-svn: 172687	2013-01-17 00:58:38 +00:00
Jack Carter	2a74a87b71	This is a resubmittal. For some reason it broke the bots yesterday but I cannot reproduce the problem and have scrubbed my sources and even tested with llvm-lit -v --vg. The Mips RDHWR (Read Hardware Register) instruction was not tested for assembler or disassembler consumption. This patch adds that functionality. Contributor: Vladimir Medic llvm-svn: 172685	2013-01-17 00:28:20 +00:00
Renato Golin	f104c4c4ca	Change CostTable model to be global to all targets Moving the X86CostTable to a common place, so that other back-ends can share the code. Also simplifying it a bit and commoning up tables with one and two types on operations. llvm-svn: 172658	2013-01-16 21:29:55 +00:00
Jack Carter	5619f91bf7	reverting 172579 llvm-svn: 172594	2013-01-16 01:29:10 +00:00
Jack Carter	e0c1e1a47e	Akira, Hope you are feeling better. The Mips RDHWR (Read Hardware Register) instruction was not tested for assembler or disassembler consumption. This patch adds that functionality. Contributor: Vladimir Medic llvm-svn: 172579	2013-01-16 00:07:45 +00:00
Jack Carter	f238510c43	This patch fixes a Mips specific bug where we need to generate a N64 compound relocation R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE. The bug was exposed by the SingleSource test case DuffsDevice.c. Contributor: Jack Carter llvm-svn: 172496	2013-01-15 01:08:02 +00:00
Chad Rosier	5c118fd2ec	[ms-inline asm] Extend support for parsing Intel bracketed memory operands that have an arbitrary ordering of the base register, index register and displacement. rdar://12527141 llvm-svn: 172484	2013-01-14 22:31:35 +00:00
Dmitri Gribenko	f24e57f227	Improve r172468: const_cast is not needed here llvm-svn: 172483	2013-01-14 22:18:18 +00:00
Dmitri Gribenko	2e1df0e354	Improve r172471: avoid all those extra casts on the lines nearby llvm-svn: 172481	2013-01-14 22:08:37 +00:00
Quentin Colombet	77ca8b83a9	Follow up of commit r172472. Refactor the big if/else sequence into one string switch for ARM subtype selection. llvm-svn: 172475	2013-01-14 21:34:09 +00:00
Quentin Colombet	1a71168624	Complete the existing support of ARM v6m, v7m, and v7em, i.e., respectively cortex-m0, cortex-m3, and cortex-m4 on the backend side. Adds new subtype values for the MachO format and uses them when the related triples are set. llvm-svn: 172472	2013-01-14 21:07:43 +00:00
David Greene	cf7ae6c2fd	Fix Casting Fix a casting-away-const compiler warning. llvm-svn: 172471	2013-01-14 21:04:47 +00:00
David Greene	c311561708	Fix Another Cast Properly cast code to eliminate cast-away-const errors. llvm-svn: 172468	2013-01-14 21:04:42 +00:00
Craig Topper	0d2c29e807	Simplify nested strconcats in X86 td files since strconcat can take more than 2 arguments. llvm-svn: 172379	2013-01-14 07:46:34 +00:00
Craig Topper	4c69a05d2d	Create a single multiclass for SSE and AVX version of MOVL/MOVH. Prevents needing to specify everything twice. No functional change intended llvm-svn: 172378	2013-01-14 07:26:58 +00:00
Nick Lewycky	f41a80efd0	Fix typo in comment. llvm-svn: 172364	2013-01-13 19:03:55 +00:00
Dmitri Gribenko	226fea5bd6	Remove redundant 'llvm::' qualifications llvm-svn: 172358	2013-01-13 16:01:15 +00:00
Benjamin Kramer	bcd14a0f26	X86: Add patterns for X86ISD::VSEXT in registers. Those can occur when something between the sextload and the store is on the same chain and blocks isel. Fixes PR14887. llvm-svn: 172353	2013-01-13 11:37:04 +00:00
NAKAMURA Takumi	de45c3a485	MipsDisassembler.cpp: Prune DecodeHWRegs64RegisterClass() to suppress a warning. [-Wunused-function] llvm-svn: 172319	2013-01-12 15:37:00 +00:00
NAKAMURA Takumi	956c123ab6	MipsAsmParser: Try to unbreak tests to add extra check. llvm-svn: 172315	2013-01-12 15:19:10 +00:00
Jack Carter	873c724b4a	This patch tackles the problem of parsing Mips register names in the standalone assembler llvm-mc. Registers such as $A1 can represent either a 32 or 64 bit register based on the instruction using it. In addition, based on the abi, $T0 can represent different 32 bit registers. The problem is resolved by the Mips specific AsmParser td definitions changing to work together. Many cases of RegisterClass parameters are now RegisterOperand. Contributor: Vladimir Medic llvm-svn: 172284	2013-01-12 01:03:14 +00:00
Preston Gurd	99c6990457	Update patch for the pad short functions pass for Intel Atom (only). Adds a check for -Oz, changes the code to not re-visit BBs, and skips over DBG_VALUE instrs. Patch by Andy Zhang. llvm-svn: 172258	2013-01-11 22:06:56 +00:00
NAKAMURA Takumi	7f25427686	X86AsmParser.cpp: Fix up r172148, to add initializer in another CreateMem(). llvm-svn: 172157	2013-01-11 01:13:54 +00:00
Jakub Staszak	ab3d878f35	Remove heavy and unused #includes from X86TargetObjectFile.cpp. llvm-svn: 172151	2013-01-10 23:43:56 +00:00
Chad Rosier	8c2a9c744e	[ms-inline asm] Make sure we set a default value for AddressOf. Follow on to r172121. llvm-svn: 172148	2013-01-10 23:39:07 +00:00
Chad Rosier	a4bc9437a2	[ms-inline asm] Add support for calling functions from inline assembly. Part of rdar://12991541 llvm-svn: 172121	2013-01-10 22:10:27 +00:00
Joel Jones	5459754d33	Fix description of ARMOperand llvm-svn: 172011	2013-01-09 22:34:16 +00:00
Nadav Rotem	b1791a75cd	ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor. llvm-svn: 172010	2013-01-09 22:29:00 +00:00
Adhemerval Zanella	1ae2248e14	PowerPC: EH adjustments This patch adjusts r171506 to make all DWARF encoding pc-relative for PPC64. It also adds the R_PPC64_REL32 relocation handling in MCJIT (since the eh_frame will not generate PIC-relative relocation) and also adds the emission of stubs created by the TTypeEncoding. llvm-svn: 171979	2013-01-09 17:08:15 +00:00
Nadav Rotem	977e0be4a0	Efficient lowering of vector sdiv when the divisor is a splatted power of two constant. PR 14848. The lowered sequence is based on the existing sequence the target-independent DAG Combiner creates for the scalar case. Patch by Zvi Rackover. llvm-svn: 171953	2013-01-09 05:14:33 +00:00
Eric Christopher	bf7bc4966c	Last in the series of removing unnecessary '0' arguments for address space. Reordered the EmitULEB128IntValue arguments to make this easier. llvm-svn: 171949	2013-01-09 03:52:05 +00:00
Andrew Trick	9f0b95f260	MIsched: add an ILP window property to machine model. This was an experimental option, but needs to be defined per-target. e.g. PPC A2 needs to aggressively hide latency. I converted some in-order scheduling tests to A2. Hal is working on more test cases. llvm-svn: 171946	2013-01-09 03:36:49 +00:00
Eric Christopher	e3ab3d0e2c	These functions have default arguments of 0 for the last arg. Use them. llvm-svn: 171933	2013-01-09 01:57:54 +00:00
Nadav Rotem	b696c36fcd	Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM. llvm-svn: 171928	2013-01-09 01:15:42 +00:00
Jack Carter	c3dd91c4d7	This patch produces the correct addend value for an R_MIPS_GPREL16 relocation. Contributor: Jack Carter llvm-svn: 171882	2013-01-08 19:01:28 +00:00
Jack Carter	9e28cd3fad	This patch produces the correct pointer size value in the 64 bit .eh_frame section. It doesn't however allow exception handling to work yet since it depends on the correct relocation model being set in the ELF header flags. Contributor: Jack Carter llvm-svn: 171881	2013-01-08 18:53:20 +00:00
Preston Gurd	a01daace88	Pad Short Functions for Intel Atom The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments - Optimize only at >= O1 and don't do optimization if -Os is set - Stores MachineBasicBlock* instead of BBNum - Uses DenseMap instead of std::map - Fixes placement of braces Patch by Andy Zhang. llvm-svn: 171879	2013-01-08 18:27:24 +00:00
Eli Bendersky	4d9ada036c	Renamed MCInstFragment to MCRelaxableFragment and added some comments. No change in functionality. llvm-svn: 171822	2013-01-08 00:22:56 +00:00
Jim Grosbach	9dbf3ee9d0	ARM: Copy-paste error. llvm-svn: 171790	2013-01-07 21:24:35 +00:00
Jim Grosbach	553eb75663	ARM: Fix a few copy-paste errors. s/X86/ARM/ llvm-svn: 171789	2013-01-07 21:12:13 +00:00
Bill Schmidt	9b1e3e25dc	This patch addresses bug 14678 by fixing two problems in medium code model code generation. Variables addressed through a GlobalAlias were not being handled, and variables with available_externally linkage were treated incorrectly. The patch contains two new tests to verify the correct code generation for these cases. llvm-svn: 171778	2013-01-07 19:29:18 +00:00
Jordan Rose	e8f1eaea8a	Change SMRange to be half-open (exclusive end) instead of closed (inclusive) This is necessary not only for representing empty ranges, but for handling multibyte characters in the input. (If the end pointer in a range refers to a multibyte character, should it point to the beginning or the end of the character in a char array?) Some of the code in the asm parsers was already assuming this anyway. llvm-svn: 171765	2013-01-07 19:00:49 +00:00
NAKAMURA Takumi	458a8277cc	R600/SIISelLowering.cpp: Suppress a warning. [-Wunused-variable] llvm-svn: 171728	2013-01-07 11:14:44 +00:00
Tim Northover	2883da3b51	Add LICENSE.TXT covering contributions made by ARM. Absent a Contributor's License Agreement (CLA) with an LLVM legal entity and as reviewed and agreed with Chris Lattner, add a patent license covering future contributions from ARM until there is a CLA. This is to make explicit ARM's grant of patent rights to recipients of LLVM containing ARM-contributed material. llvm-svn: 171721	2013-01-07 10:04:49 +00:00
Craig Topper	ae65212a4b	Remove more unnecessary # operators with nothing to paste preceding them. llvm-svn: 171702	2013-01-07 06:14:20 +00:00
Craig Topper	a8c5ec09c7	Remove # from the beginning and end of def names. The # is a paste operator and should only be used with something to paste on either side. llvm-svn: 171697	2013-01-07 05:45:56 +00:00
Craig Topper	25cdf92b34	Remove # from the beginning and end of def names. llvm-svn: 171696	2013-01-07 05:26:58 +00:00
Craig Topper	bd62d64cbf	Remove unnecessary # tokens at the beginning and end of defm names. llvm-svn: 171694	2013-01-07 05:04:39 +00:00
Chandler Carruth	2109f47d97	Fix the enumerator names for ShuffleKind to match the coding standards, and make its comments doxygen comments. llvm-svn: 171688	2013-01-07 03:20:02 +00:00
Chandler Carruth	50a36cd148	Make the popcnt support enums and methods have more clear names and follow the coding conventions regarding enumerating a set of "kinds" of things. llvm-svn: 171687	2013-01-07 03:16:03 +00:00
Chandler Carruth	d3e73556d6	Move TargetTransformInfo to live under the Analysis library. This no longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686	2013-01-07 03:08:10 +00:00
Chandler Carruth	664e354de7	Switch TargetTransformInfo from an immutable analysis pass that requires a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTransformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least one implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis pass, allowing all but the NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes.
This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward.
First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it always being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681	2013-01-07 01:37:14 +00:00
Craig Topper	4f1c7256f9	Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior. cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix. cvtt2si/cvt2si should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix. Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein. llvm-svn: 171668	2013-01-06 20:39:29 +00:00
Evan Cheng	3fb03e23a4	Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch. llvm-svn: 171665	2013-01-06 19:00:15 +00:00
Chandler Carruth	539edf4ee0	Convert the TargetTransformInfo from an immutable pass with dynamic interfaces which could be extracted from it, and must be provided on construction, to a chained analysis group. The end goal here is that TTI works much like AA -- there is a baseline "no-op" and target independent pass which is in the group, and each target can expose a target-specific pass in the group. These passes will naturally chain allowing each target-specific pass to delegate to the generic pass as needed. In particular, this will allow a much simpler interface for passes that would like to use TTI -- they can have a hard dependency on TTI and it will just be satisfied by the stub implementation when that is all that is available. This patch is a WIP however. In particular, the "stub" pass is actually the one and only pass, and everything there is implemented by delegating to the target-provided interfaces. As a consequence the tools still have to explicitly construct the pass. Switching targets to provide custom passes and sinking the stub behavior into the NoTTI pass is the next step. llvm-svn: 171621	2013-01-05 11:43:11 +00:00
Craig Topper	92a70b1e65	Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. llvm-svn: 171608	2013-01-05 07:39:25 +00:00
Nadav Rotem	478b6a47ec	Revert revision 171524. Original message: URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171603	2013-01-05 05:42:48 +00:00
Chandler Carruth	4a7c311008	Refactor the ScalarTargetTransformInfo API for querying about the legality of an address mode to not use a struct of four values and instead to accept them as parameters. I'd love to have named parameters here as most callers only care about one or two of these, but the defaults aren't terribly scary to write out. That said, there is no real impact of this as the passes aren't yet using STTI for this and are still relying upon TargetLowering. llvm-svn: 171595	2013-01-05 03:36:17 +00:00
Akira Hatanaka	d35a263076	[mips] Fix data layout string. Add 64 to the list of native integer widths and add stack alignment information. llvm-svn: 171587	2013-01-05 02:00:56 +00:00
Jakub Staszak	43fafaf496	Move 'break' to the right place to prevent fallthru. There is no test-case because conditions in the next case prevented it from doing anything nasty. llvm-svn: 171549	2013-01-04 23:01:26 +00:00
Preston Gurd	e36b685a94	The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171524	2013-01-04 20:54:54 +00:00
Akira Hatanaka	b13b33359b	[mips] MipsTargetLowering::getSetCCResultType should return a vector type if vectors are being compared. llvm-svn: 171517	2013-01-04 20:06:01 +00:00
Akira Hatanaka	e067e5a13f	[mips] 80 columns. llvm-svn: 171515	2013-01-04 19:38:05 +00:00
Akira Hatanaka	f412e7501a	[mips] Reorder template parameters. Remove class shift_rotate_imm32 and shift_rotate_imm64. llvm-svn: 171513	2013-01-04 19:25:46 +00:00
Akira Hatanaka	a7a9fa1c16	[mips] Refactor conditional move instructions. llvm-svn: 171511	2013-01-04 19:16:38 +00:00
Akira Hatanaka	e36e2f6876	[mips] Refactor instructions which move data from or to coprocessors. llvm-svn: 171510	2013-01-04 19:13:49 +00:00
Adhemerval Zanella	9b0b781395	PowerPC: Fix eh_frame relocation for PIC This patch fixes the PPC eh_frame definitions for the personality and frame unwinding for PIC objects. It makes PIC builds correctly create relative relocations in the '.rela.eh_frame' segments, thus avoiding a text relocation that generates a DT_TEXTREL segment in the link phase. llvm-svn: 171506	2013-01-04 19:08:13 +00:00
Nadav Rotem	e6bb35435d	Change the default number of registers to prevent unrolling on targets that don't have this hook. llvm-svn: 171489	2013-01-04 18:40:39 +00:00
Nadav Rotem	e1d5c4b8b9	LoopVectorizer: 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469	2013-01-04 17:48:25 +00:00
Nadav Rotem	c616a5408a	Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message: Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171468	2013-01-04 17:35:21 +00:00
Elena Demikhovsky	5f2f06d2d9	Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171467	2013-01-03 08:48:33 +00:00
Michael Gottesman	820aac1c78	Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks." This reverts commit r171461 since it breaks the following tests: Clang :: Analysis/outofbound-notwork.c Clang :: Analysis/string-fail.c Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp Clang :: CXX/temp/temp.param/p14.cpp Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c Clang :: CodeGen/blocks-2.c Clang :: CodeGen/libcalls-d.c Clang :: CodeGen/libcalls-ld.c Clang :: CodeGenCXX/conversion-function.cpp Clang :: CodeGenCXX/debug-info-limit-type.cpp Clang :: CodeGenCXX/inheriting-constructor.cpp Clang :: FixIt/fixit-errors.c Clang :: FixIt/fixit-pmem.cpp Clang :: Modules/namespaces.cpp Clang :: PCH/changed-files.c Clang :: PCH/pr4489.c Clang :: PCH/source-manager-stack.c Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp Clang :: SemaTemplate/instantiate-function-1.mm llvm-svn: 171466	2013-01-03 08:18:30 +00:00
Craig Topper	7c27cc9fd0	Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. llvm-svn: 171461	2013-01-03 06:40:20 +00:00
Hal Finkel	95de3f3018	Add a subtype parameter to VTTI::getShuffleCost In order to cost subvector insertion and extraction, we need to know the type of the subvector being extracted. No functionality change. llvm-svn: 171453	2013-01-03 02:34:09 +00:00
Kevin Enderby	726e0ea6eb	Adds missing aliases for fcom and fcomp instructions without arguments. Patch by Michael M Kuperstein! llvm-svn: 171414	2013-01-02 21:20:15 +00:00
Nadav Rotem	761937a757	AVX: Fix a bug in WidenMaskArithmetic. llvm-svn: 171398	2013-01-02 17:41:03 +00:00
Chandler Carruth	9fb823bbd4	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Chandler Carruth	be81023d74	Resort the #include lines in include/... and lib/... with the utils/sort_includes.py script. Most of these are updating the new R600 target and fixing up a few regressions that have crept in since the last time I sorted the includes. llvm-svn: 171362	2013-01-02 10:22:59 +00:00
Craig Topper	9791afb182	Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP. llvm-svn: 171356	2013-01-02 08:00:39 +00:00
Craig Topper	4bc5c4e152	Merge SSE and AVX instruction definitions for PSHUFD/PSHUFHW/PSHUFLW. llvm-svn: 171355	2013-01-02 07:27:49 +00:00
Rafael Espindola	db1a84c84a	Revert 171351. It broke MC/X86/x86-32-avx.s. llvm-svn: 171352	2013-01-02 01:35:11 +00:00
Craig Topper	86d0cdb82f	Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP. llvm-svn: 171351	2013-01-01 20:53:20 +00:00
Craig Topper	12ed9cd6ae	Remove unused argument from a multiclass. llvm-svn: 171340	2013-01-01 03:42:44 +00:00
Craig Topper	2edafc059d	Merge intrinsic instruction definitions for SSE and AVX versions of RCPPS and RSQRTPS. llvm-svn: 171339	2013-01-01 03:30:21 +00:00
Craig Topper	d04dbec6c9	Remove 2 unused multiclasses. llvm-svn: 171338	2013-01-01 02:02:45 +00:00
Craig Topper	7cc4f322cf	Merge AVX/SSE instruction definitions for SQRTPS/PD, RSQRTPS, RCPPS. No functional change intended. llvm-svn: 171337	2013-01-01 00:11:07 +00:00
Craig Topper	c2521cd309	Use packed instead of scalar itineraries for SSE1/2 SQRTPS/PD, RCPPS, and RSQRTPS. VEX-encoded forms already use packed. llvm-svn: 171336	2012-12-31 23:49:05 +00:00
Bill Wendling	6e95ae803a	Remove the getAttributesAtIndex and getNumAttrs methods in favor of using the getAttrSomewhere predicate. This prevents the uses of 'Attribute' as a collection of attributes. llvm-svn: 171271	2012-12-31 00:49:59 +00:00
Nuno Lopes	b6ad98224a	convert a bunch of callers from DataLayout::getIndexedOffset() to GEP::accumulateConstantOffset(). The latter API is nicer than the former, and is correct regarding wrap-around offsets (if anyone cares). There are a few more places left with duplicated code, which I'll remove soon. llvm-svn: 171259	2012-12-30 16:25:48 +00:00
Bill Wendling	749a43d874	Use the predicate methods off of AttributeSet instead of Attribute. llvm-svn: 171257	2012-12-30 13:50:49 +00:00
Bill Wendling	74dba875e2	Remove the Function::getRetAttributes method in favor of using the AttributeSet accessor method. llvm-svn: 171256	2012-12-30 13:01:51 +00:00
Bill Wendling	94dcaf8e2b	Remove Function::getParamAttributes and use the AttributeSet accessor methods instead. llvm-svn: 171255	2012-12-30 12:45:13 +00:00
Bill Wendling	698e84fc4f	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253	2012-12-30 10:32:01 +00:00
Bill Wendling	6190254e0f	s/hasAttribute/contains/g to be more consistent with other method names. llvm-svn: 171252	2012-12-30 09:17:46 +00:00
Craig Topper	fe82eb6bcd	Remove intrinsic specific instructions for (V)SQRTPS/PD. Instead lower to target-independent ISD nodes and use the existing patterns for those. llvm-svn: 171237	2012-12-29 18:18:20 +00:00
Craig Topper	f4a9c6e21b	Merge similar functionality using a nested switch. llvm-svn: 171229	2012-12-29 17:19:06 +00:00
Craig Topper	6b27251a76	Remove intrinsic specific instructions for SSE/SSE2/AVX floating point max/min instructions. Lower them to target specific nodes and use those patterns instead. This also allows them to be commuted if UnsafeFPMath is enabled. llvm-svn: 171227	2012-12-29 16:44:25 +00:00
Jakub Staszak	215f94143c	Simplify code, no functionality change. llvm-svn: 171226	2012-12-29 15:57:26 +00:00
Jakub Staszak	afe8109fce	Delete executable bit on ./lib/Target/Hexagon/HexagonAsmPrinter.h. llvm-svn: 171225	2012-12-29 15:23:06 +00:00
Nadav Rotem	9785f519b4	CostModel: initial checkin for code that estimates the cost of special shuffles. llvm-svn: 171180	2012-12-28 08:19:03 +00:00
Nadav Rotem	c982a2dc25	wrap 80-col lines. llvm-svn: 171179	2012-12-28 07:28:43 +00:00
Nadav Rotem	3da9ac72fa	AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend. llvm-svn: 171178	2012-12-28 05:45:24 +00:00
Nadav Rotem	68441914a5	Reverse the 'if' condition and reduce the indentation. llvm-svn: 171172	2012-12-27 23:08:05 +00:00
Craig Topper	ab2e6842cc	Merge basic_sse12_fp_binop_p_int and basic_sse12_fp_binop_p_y_int multiclasses. llvm-svn: 171171	2012-12-27 22:53:47 +00:00
Nadav Rotem	3b34190100	AVX/AVX2: Move the SEXT lowering code from a target specific DAGco to a lowering function. llvm-svn: 171170	2012-12-27 22:47:16 +00:00
Craig Topper	e2eec3c52b	Merge basic_sse12_fp_binop_p and basic_sse12_fp_binop_p_y multiclasses. llvm-svn: 171166	2012-12-27 18:51:50 +00:00
Nadav Rotem	2a054b4475	On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM sized register. In most cases we actually compare or select YMM-sized registers and mixing the two types creates horrible code. This commit optimizes some of the transition sequences. PR14657. llvm-svn: 171148	2012-12-27 08:15:45 +00:00
Nadav Rotem	8e5d80eba3	AVX/AVX2: Move the code that lowers vector-trunc from a DAGCo-hook to custom lowering hook. The vector truncs were scalarized during LegalizeVectorOps, later vectorized again by some DAGCombine optimization and finally, lowered by a DAGCombine optimization. Now, they are properly lowered during LegalizeVectorOps. No new testcase because the original testcases still work. llvm-svn: 171146	2012-12-27 07:45:10 +00:00
Craig Topper	757f3fc394	Add hasSideEffects=0 to some forms of ROUND, RCP, and RSQRT. llvm-svn: 171143	2012-12-27 07:16:08 +00:00
Craig Topper	09ce4b9efe	Move single letter 'P' prefix out of multiclass now that tablegen allows defm to start with #NAME. This makes instruction names more searchable again. llvm-svn: 171141	2012-12-27 06:34:54 +00:00
Craig Topper	396cb795bc	Add hasSideEffects=0 to some shift and rotate instructions. None of which are currently used by code generation. llvm-svn: 171137	2012-12-27 03:35:44 +00:00
Craig Topper	c7910828e4	Mark the divide instructions as hasSideEffects=0. llvm-svn: 171136	2012-12-27 03:01:18 +00:00
Craig Topper	5b807aaa38	Add hasSideEffects=0 to CMP*rr_REV. llvm-svn: 171130	2012-12-27 02:08:46 +00:00
Craig Topper	89e8607755	Add mayLoad, mayStore, and hasSideEffects tags to BT/BTS/BTR/BTC instructions. Shouldn't change any functionality since they don't have patterns to select them. llvm-svn: 171128	2012-12-27 02:01:33 +00:00
Craig Topper	c557343956	Fix operands and encoding form for ARPL instruction. Register form had and reversed. Memory form writes memory, but was marked as MRMSrcMem. llvm-svn: 171123	2012-12-26 23:27:57 +00:00
Craig Topper	d47a70de9f	Add hasSideEffects=0 to some atomic instructions. llvm-svn: 171122	2012-12-26 23:08:12 +00:00
Craig Topper	af2372087b	Mark the AL/AX/EAX forms of the basic arithmetic operations as never having side effects. llvm-svn: 171121	2012-12-26 22:19:23 +00:00
Craig Topper	1b8c0750ee	Mark all the _REV instructions as not having side effects. They aren't really emitted by the backend, but it reduces the number of instructions in the output files with unmodelled side effects to make auditing easier. llvm-svn: 171118	2012-12-26 21:30:22 +00:00
Craig Topper	18f2675e9b	Remove a special conditional setting of neverHasSideEffects if the instruction didn't have a pattern. This was leftover from when tablegen used to complain if things were already inferred from patterns. llvm-svn: 171117	2012-12-26 21:04:30 +00:00
Craig Topper	24f316e4db	Merge still more SSE/AVX instruction definitions. llvm-svn: 171103	2012-12-26 07:54:43 +00:00
Craig Topper	af629e2700	Merge more SSE/AVX instruction definitions. llvm-svn: 171102	2012-12-26 07:20:35 +00:00
Craig Topper	65fe30450d	Fix 80 column violation. llvm-svn: 171097	2012-12-26 06:15:53 +00:00
Craig Topper	f4d0fe8fcd	Fix class name in comment. llvm-svn: 171096	2012-12-26 06:15:09 +00:00
Craig Topper	59747c4dbd	Merge SSE/AVX PCMPEQ/PCMPGT instruction definitions. llvm-svn: 171095	2012-12-26 06:14:15 +00:00
Craig Topper	8a48677586	Remove 'v' from mnemonic to fix asm matching failures. llvm-svn: 171093	2012-12-26 06:02:15 +00:00
Craig Topper	b4ef0fa3a1	Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for a bunch of SSE2 integer arithmetic instructions. llvm-svn: 171092	2012-12-26 05:49:15 +00:00
Nadav Rotem	5267bb71b8	Reformat the docs. llvm-svn: 171091	2012-12-26 04:59:20 +00:00
Craig Topper	a2594dd5f0	Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for PAND/POR/PXOR/PANDN llvm-svn: 171087	2012-12-26 04:36:03 +00:00
Craig Topper	97730a0d6a	Merge an AVX/SSE 256-bit and 128-bit multiclass. llvm-svn: 171086	2012-12-26 03:56:47 +00:00
Craig Topper	8b59746390	Mark VANDNPD/VANDNPDS as not commutable. llvm-svn: 171085	2012-12-26 03:48:10 +00:00
Craig Topper	81d1e596bb	Remove alignment from a bunch more VEX encoded operations in the folding tables. llvm-svn: 171082	2012-12-26 02:44:47 +00:00
Craig Topper	b2922164f0	Remove alignment from folding table for VMOVUPD as an unaligned instruction it shouldn't require alignment... llvm-svn: 171081	2012-12-26 02:14:19 +00:00
Craig Topper	d09a9af9b6	Remove alignment requirements from (V)EXTRACTPS. This instruction does 32-bit stores which aren't required to be aligned on SSE or AVX. llvm-svn: 171080	2012-12-26 01:47:12 +00:00
Craig Topper	caef1c5d86	Remove alignment requirement from VCVTSS2SD in folding tables. Reverting r171049. This instruction doesn't require alignment. llvm-svn: 171078	2012-12-26 00:35:47 +00:00
Hal Finkel	1b5ff08d43	Expand PPC64 atomic load and store Use of store or load with the atomic specifier on 64-bit types would cause instruction-selection failures. As with the 32-bit case, these can use the default expansion in terms of cmp-and-swap. llvm-svn: 171072	2012-12-25 17:22:53 +00:00
Benjamin Kramer	81b5a8fd2e	X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of and commutativity. llvm-svn: 171064	2012-12-25 13:09:08 +00:00
Benjamin Kramer	df4af41b9b	X86: Custom lower <2 x i64> eq and ne when SSE41 is not available. pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack. Small speedup on loop-vectorized viterbi (-march=core2). llvm-svn: 171063	2012-12-25 12:54:19 +00:00
Nadav Rotem	00410ae625	VCVTSS2SD requires a strict alignment. Thanks Elena. llvm-svn: 171049	2012-12-25 03:29:18 +00:00
Nick Lewycky	521e0d59f3	Quiet gcc's -Wparentheses warning. No functionality change. llvm-svn: 171044	2012-12-24 19:58:45 +00:00
Benjamin Kramer	9d46110ff1	Use a std::string rather than a dynamically allocated char* buffer. This affords us to use std::string's allocation routines and use the destructor for the memory management. Switching to that also means that we can use operator==(const std::string&, const char *) to perform the string comparison rather than resorting to libc functionality (i.e. strcmp). Patch by Saleem Abdulrasool! Differential Revision: http://llvm-reviews.chandlerc.com/D230 llvm-svn: 171042	2012-12-24 19:23:30 +00:00
Nadav Rotem	3ee6b10dd4	CostModel: We have API for checking the costs of known shuffles. This patch adds support for the insert-subvector and extract-subvector kinds. llvm-svn: 171027	2012-12-24 10:04:03 +00:00
Nadav Rotem	dc0ad92b64	Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. llvm-svn: 171024	2012-12-24 09:40:33 +00:00
Nadav Rotem	7e1599e100	Change the codegen Cost Model API for shuffles. This patch removes the API for broadcast and adds a more general API that accepts an enum of known shuffles. llvm-svn: 171022	2012-12-24 08:57:47 +00:00
Nadav Rotem	cf9999d9d5	CostModel: Change the default target-independent implementation for finding the cost of arithmetic functions. We now assume that the cost of arithmetic operations that are marked as Legal or Promote is low, but ops that are marked as custom are higher. llvm-svn: 171002	2012-12-23 17:31:23 +00:00
Nadav Rotem	b15c69a725	whitespace llvm-svn: 170997	2012-12-23 07:33:44 +00:00
Nadav Rotem	1bef5a0509	Rename a function. llvm-svn: 170996	2012-12-23 07:30:09 +00:00
Nadav Rotem	2cade68025	Loop Vectorizer: Update the cost model of scatter/gather operations and make them more expensive. llvm-svn: 170995	2012-12-23 07:23:55 +00:00
Benjamin Kramer	76268ac682	X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available. pmuludq is slow, but it turns out that all the unpacking and packing of the scalarized mul is even slower. 10% speedup on loop-vectorized paq8p. llvm-svn: 170985	2012-12-22 16:07:56 +00:00
Benjamin Kramer	b2f0a2bd4b	X86: Emit vector sext as shuffle + sra if vpmovsx is not available. Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better than scalarized loads. Fixes PR14590. llvm-svn: 170984	2012-12-22 11:34:28 +00:00
Nadav Rotem	d5aae980cb	In some cases, due to scheduling constraints we copy the EFLAGS. The only way to read the eflags is using push and pop. If we don't adjust the stack then we run over the first frame index. This is not something that we want to do, so we have to make sure that our machine function does not copy the flags. If it does then we have to emit the prolog that adjusts the stack. rdar://12896831 llvm-svn: 170961	2012-12-21 23:48:49 +00:00
Akira Hatanaka	6ac2fc4976	[mips] Refactor subword-swap, EXT/INS, load-effective-address and read-hardware instructions. llvm-svn: 170956	2012-12-21 23:21:32 +00:00
Akira Hatanaka	beea8a34c3	[mips] Refactor SYNC and multiply/divide instructions. llvm-svn: 170955	2012-12-21 23:17:36 +00:00
Akira Hatanaka	31ddec5887	[mips] Refactor BAL instructions. llvm-svn: 170954	2012-12-21 23:15:59 +00:00
Akira Hatanaka	d6b694f036	[mips] Fix encoding of BAL instruction. Also, fix assembler test case which was not catching the error. llvm-svn: 170953	2012-12-21 23:13:59 +00:00
Akira Hatanaka	a158042a56	[mips] Refactor jump, jump register, jump-and-link and nop instructions. llvm-svn: 170952	2012-12-21 23:03:50 +00:00
Akira Hatanaka	e1826d7464	[mips] Refactor load/store left/right and load-link and store-conditional instructions. llvm-svn: 170950	2012-12-21 23:01:24 +00:00
Akira Hatanaka	d9bf8424e5	[mips] Refactor load/store instructions. llvm-svn: 170948	2012-12-21 22:58:55 +00:00
Akira Hatanaka	b59b047fbe	[mips] Remove unnecessary isPseudo parameter. llvm-svn: 170947	2012-12-21 22:57:26 +00:00
Akira Hatanaka	e738efc95b	[mips] Refactor LUI instruction. llvm-svn: 170944	2012-12-21 22:46:07 +00:00
Akira Hatanaka	895e1cb2aa	[mips] Refactor count leading zero or one instructions. llvm-svn: 170942	2012-12-21 22:43:58 +00:00
Akira Hatanaka	4f4c4aa05e	[mips] Refactor sign-extension-in-register instructions. llvm-svn: 170940	2012-12-21 22:41:52 +00:00
Akira Hatanaka	b14c6e4e5f	[mips] Refactor instructions which copy from and to HI/LO registers. llvm-svn: 170939	2012-12-21 22:39:17 +00:00
Akira Hatanaka	9e89195dce	[mips] Refactor logical NOR instructions. llvm-svn: 170937	2012-12-21 22:35:47 +00:00
Akira Hatanaka	ac10697207	[mips] Move instruction definitions in MipsInstrInfo.td. llvm-svn: 170936	2012-12-21 22:33:43 +00:00
Tom Stellard	09ef8425e9	R600: Coding style - remove empty spaces from the beginning of functions No functionality change. llvm-svn: 170923	2012-12-21 20:12:02 +00:00
Tom Stellard	41398026e7	R600: Fix MAX_UINT definition Patch by: Vadim Girlin Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170922	2012-12-21 20:12:01 +00:00
Tom Stellard	4fa7ac29f1	R600: Add SHADOWCUBE to TEX_SHADOW pattern Patch by: Vadim Girlin Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170921	2012-12-21 20:11:59 +00:00
Benjamin Kramer	5521b94b07	Cleanup compiler warnings on discarding type qualifiers in casts. Switch to C++ style casts. Patch by Saleem Abdulrasool! Differential Revision: http://llvm-reviews.chandlerc.com/D204 llvm-svn: 170917	2012-12-21 19:09:53 +00:00
Benjamin Kramer	82d1c371e2	X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization. Part of PR14667. llvm-svn: 170908	2012-12-21 17:46:58 +00:00
Roman Divacky	a229186a82	Remove duplicate includes. llvm-svn: 170902	2012-12-21 17:06:44 +00:00
Tom Stellard	a8b0351720	R600: Expand vec4 INT <-> FP conversions llvm-svn: 170901	2012-12-21 16:33:24 +00:00
Benjamin Kramer	4669d18893	X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics This is very mechanical, no functionality change. Preparation for PR14667. llvm-svn: 170898	2012-12-21 14:04:55 +00:00
Nadav Rotem	eacbb731d3	Add a missing "virtual" keyword. llvm-svn: 170842	2012-12-21 05:02:12 +00:00
Quentin Colombet	b1b66e7a25	Add ARM cortex-r5 subtarget. llvm-svn: 170840	2012-12-21 04:35:05 +00:00
Nadav Rotem	6d4fdd6d2c	Improve the X86 cost model for loads and stores. llvm-svn: 170830	2012-12-21 01:33:59 +00:00
Nadav Rotem	a4b53f20a3	BB-Vectorizer: Check the cost of the store pointer type and not the return type, which is void. A number of test cases fail after adding the assertion in TTImpl. llvm-svn: 170828	2012-12-21 01:24:36 +00:00
Reed Kotler	9bff1ead0e	Call llvm_unreachable instead of assert. llvm-svn: 170822	2012-12-21 00:44:59 +00:00
Jakob Stoklund Olesen	33f5d1492d	Add an MF argument to MI::copyImplicitOps(). This function is often used to decorate dangling instructions, so a context reference is required to allocate memory for the operands. Also add a corresponding MachineInstrBuilder method. llvm-svn: 170797	2012-12-20 22:54:02 +00:00
Jakob Stoklund Olesen	2ea203694d	MachineInstrBuilderize ARM. llvm-svn: 170795	2012-12-20 22:53:55 +00:00
Jakob Stoklund Olesen	4255c96aed	MachineInstrBuilderize NVPTX. llvm-svn: 170794	2012-12-20 22:53:53 +00:00
Bob Wilson	7bba4f8957	Revert "Adding support for llvm.arm.neon.vaddl[su].* and" This reverts r170694. The operations can be represented in IR without adding any new intrinsics. llvm-svn: 170765	2012-12-20 21:09:38 +00:00
Evan Cheng	ddc0cb6dc5	On some ARM cpus, flags setting movs with shifter operand, i.e. lsl, lsr, asr, are more expensive than the non-flag setting variant. Teach thumb2 size reduction pass to avoid generating them unless we are optimizing for size. rdar://12892707 llvm-svn: 170728	2012-12-20 19:59:30 +00:00
Roman Divacky	ff95a1dc12	Remove MCTargetAsmLexer and its derived classes now that edis, its only user, is gone. llvm-svn: 170699	2012-12-20 14:43:30 +00:00
Renato Golin	6b2ea4a48f	Adding support for llvm.arm.neon.vaddl[su].* and llvm.arm.neon.vsub[su].* intrinsics. Patch by Pete Couperus <pjcoup@gmail.com> llvm-svn: 170694	2012-12-20 13:52:11 +00:00
Reed Kotler	d11acc7dc0	Implement cfi_def_cfa_offset. "Make check" test case for this coming in the next few days but it's already tested a lot from test-suite and works fine. This patch completes almost 100% pass of test-suite for mips 16. llvm-svn: 170674	2012-12-20 06:59:37 +00:00
Reed Kotler	8965d24a2a	There is one more patch to finish large frames. Make sure we assert on code that has large frames which will not yet compile correctly. llvm-svn: 170673	2012-12-20 06:57:00 +00:00
Jyotsna Verma	56605448f2	Add constant extender support to GP-relative load/store instructions. llvm-svn: 170672	2012-12-20 06:52:46 +00:00
Jyotsna Verma	bf75aaf53e	Add TSFlags to ALU32 type instructions for constant-extender/Relationship maps. llvm-svn: 170671	2012-12-20 06:45:39 +00:00
Reed Kotler	7bff8f1d7a	set register class properly for mips16 here llvm-svn: 170669	2012-12-20 06:06:35 +00:00
Rafael Espindola	fb8ac2df09	Undefine PPC harder. This was causing a build failure while trying to build on ppc ubuntu 12.10 with cmake. llvm-svn: 170668	2012-12-20 05:13:09 +00:00
Reed Kotler	92fc33bc97	This assert is overly restrictive and does not work for mips16. llvm-svn: 170667	2012-12-20 05:09:15 +00:00
Reed Kotler	fd633229f7	Turn on register scavenger for Mips 16 We use an unused Mips 32 register for the emergency slot instead of using the stack. llvm-svn: 170665	2012-12-20 04:44:58 +00:00
Akira Hatanaka	e7f1acc7c0	[mips] Refactor SLT (set on less than) instructions. Separate encoding information from the rest. llvm-svn: 170664	2012-12-20 04:27:52 +00:00
Akira Hatanaka	bbd197e9c4	[mips] Refactor unconditional branch instruction. Separate encoding information from the rest. llvm-svn: 170663	2012-12-20 04:22:39 +00:00
Akira Hatanaka	b1527b7505	[mips] Remove asm string parameter from pseudo instructions. Add InstrItinClass parameter. llvm-svn: 170661	2012-12-20 04:20:09 +00:00
Akira Hatanaka	14f9ce0f83	[mips] Delete definition of CPRESTORE instruction. llvm-svn: 170660	2012-12-20 04:15:30 +00:00
Akira Hatanaka	c0ea0bb99b	[mips] Refactor conditional branch instructions with one register operand. Separate encoding information from the rest. llvm-svn: 170659	2012-12-20 04:13:23 +00:00
Akira Hatanaka	f71ffd29d9	[mips] Refactor conditional branch instructions with two register operands. Separate encoding information from the rest. llvm-svn: 170657	2012-12-20 04:10:13 +00:00
Reed Kotler	d019dbf75e	fix most of remaining issues with large frames. these patches are tested a lot by test-suite but make check tests are forthcoming once the next few patches that complete this are committed. with the next few patches the pass rate for mips16 is near 100% llvm-svn: 170656	2012-12-20 04:07:42 +00:00
Akira Hatanaka	f423672117	[mips] Use "or $r0, $r1, $zero" instead of "addu $r0, $zero, $r1" to copy physical register $r1 to $r0. GNU disassembler recognizes an "or" instruction as a "move", and this change makes the disassembled code easier to read. Original patch by Reed Kotler. llvm-svn: 170655	2012-12-20 04:06:06 +00:00
Richard Smith	15b1e3727b	Fix use-before-construction of X86TargetLowering. llvm-svn: 170654	2012-12-20 04:04:17 +00:00
Akira Hatanaka	7d75f9e3d3	[mips] Change the order of template parameters. Move the default parameters to the end. llvm-svn: 170651	2012-12-20 03:52:08 +00:00
Akira Hatanaka	244f9e874c	[mips] Refactor shift instructions with register operands. Separate encoding information from the rest. llvm-svn: 170650	2012-12-20 03:48:24 +00:00
Akira Hatanaka	7f96ad325f	[mips] Refactor shift immediate instructions. Separate encoding information from the rest. llvm-svn: 170649	2012-12-20 03:44:41 +00:00
Akira Hatanaka	ab1b715bf2	[mips] Refactor arithmetic and logic instructions with immediate operands. Separate encoding information from the rest. llvm-svn: 170648	2012-12-20 03:40:03 +00:00
Akira Hatanaka	1b37c4af01	[mips] Refactor arithmetic and logic instructions. Separate encoding information from the rest. llvm-svn: 170647	2012-12-20 03:34:05 +00:00
Akira Hatanaka	73495897b1	[mips] Delete ArithOverflowR and ArithOverflow and use ArithLogicR and ArithLogicI as the instruction base classes. llvm-svn: 170642	2012-12-20 03:00:16 +00:00
NAKAMURA Takumi	2a0b40f584	Target/R600: Update MIB according to r170588. llvm-svn: 170620	2012-12-20 00:22:11 +00:00
Jim Grosbach	6df94846ec	MC: Add MCInstrDesc::mayAffectControlFlow() method. MC disassembler clients (LLDB) are interested in querying if an instruction may affect control flow other than by virtue of being an explicit branch instruction. For example, instructions which write directly to the PC on some architectures. llvm-svn: 170610	2012-12-19 23:38:53 +00:00
Tom Stellard	1c315d5411	R600: Remove unnecessary VREG alignment. Unlike SGPRs, VGPRs don't need to be aligned. Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170593	2012-12-19 22:10:34 +00:00
Tom Stellard	e7b907d85c	R600: control flow optimization Branch if we have enough instructions so that it makes sense. Also remove branches if they don't make sense. Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170592	2012-12-19 22:10:33 +00:00
Tom Stellard	f8794354b2	R600: New control flow for SI v2 This patch replaces the control flow handling with a new pass which structurizes the graph before transforming it to machine instructions. This has a couple of different advantages and currently fixes 20 piglit tests without a single regression. It is now a general purpose transformation that could be used not only for SI/R6xx, but also for other hardware implementations that use a form of structurized control flow. v2: further cleanup, fixes and documentation Patch by: Christian König Signed-off-by: Christian König <deathsimple@vodafone.de> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170591	2012-12-19 22:10:31 +00:00
Jakob Stoklund Olesen	b159b5ff0d	Remove the explicit MachineInstrBuilder(MI) constructor. Use the version that also takes an MF reference instead. It would technically be possible to extract an MF reference from the MI as MI->getParent()->getParent(), but that would not work for MIs that are not inserted into any basic block. Given the reasonably small number of places this constructor was used at all, I preferred the compile time check to a run time assertion. llvm-svn: 170588	2012-12-19 21:31:56 +00:00
Evan Cheng	eae6d2ccea	LLVM sdisel normalizes bit extraction of the form: ((x & 0xff00) >> 8) << 2 to (x >> 6) & 0x3fc This is general goodness since it folds a left shift into the mask. However, the trailing zeros in the mask prevent the ARM backend from using the bit extraction instructions. And worse, since the mask materialization may require an additional instruction. This comes up fairly frequently when the result of the bit twiddling is used as memory address. e.g. = ptr[(x & 0xFF0000) >> 16] We want to generate: ubfx r3, r1, #16, #8 ldr.w r3, [r0, r3, lsl #2] vs. mov.w r9, #1020 and.w r2, r9, r1, lsr #14 ldr r2, [r0, r2] Add a late ARM specific isel optimization to ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the 'base + offset' address computation; change the mask to one which doesn't have trailing zeros and enable the use of ubfx. Note the optimization has to be done late since it's target specific and we don't want to change the DAG normalization. It's also fairly restrictive as shifter operands are not always free. It's only done for lsh 1 / 2. It's known to be free on some cpus and they are most common for address computation. This is a slight win for blowfish, rijndael, etc. rdar://12870177 llvm-svn: 170581	2012-12-19 20:16:09 +00:00
Roman Divacky	e3d323052f	Remove edis - the enhanced disassembler. Fixes PR14654. llvm-svn: 170578	2012-12-19 19:55:47 +00:00
Paul Redmond	5917f4c715	Transform (x&C)>V into (x&C)!=0 where possible When the least bit of C is greater than V, (x&C) must be greater than V if it is not zero, so the comparison can be simplified. Although this was suggested in Target/X86/README.txt, it benefits any architecture with a directly testable form of AND. Patch by Kevin Schoedel llvm-svn: 170576	2012-12-19 19:47:13 +00:00
Benjamin Kramer	c5071466d4	PowerPC: Expand VSELECT nodes. There's probably a better expansion for those nodes than the default for altivec, but this is better than crashing. VSELECTs occur in loop vectorizer output. llvm-svn: 170551	2012-12-19 15:49:14 +00:00
Patrik Hagglund	e09cac9a67	Change TargetLowering::getTypeForExtArgOrReturn to take and return MVTs, instead of EVTs. llvm-svn: 170537	2012-12-19 12:02:25 +00:00
Patrik Hagglund	bad545ccba	Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of EVTs. llvm-svn: 170535	2012-12-19 11:48:16 +00:00
Patrik Hagglund	f9eb168ef4	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. llvm-svn: 170532	2012-12-19 11:30:36 +00:00
NAKAMURA Takumi	89209462fe	X86ISelLowering.cpp: Fix warnings. [-Wlogical-op-parentheses] llvm-svn: 170523	2012-12-19 10:12:48 +00:00
Elena Demikhovsky	14a4af0e66	Optimized load + SIGN_EXTEND patterns in the X86 backend. llvm-svn: 170506	2012-12-19 07:50:20 +00:00
Bill Wendling	3d7b0b8ac7	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. llvm-svn: 170502	2012-12-19 07:18:57 +00:00
Reed Kotler	3aad762d1d	Add some missing Defs and Uses. llvm-svn: 170493	2012-12-19 04:06:15 +00:00
Jakub Staszak	338863a546	Reverse order of checking SSE level when calculating compare cost, so we check AVX2 before AVX. llvm-svn: 170464	2012-12-18 22:57:56 +00:00
Quentin Colombet	23b404d5ad	Disable ARM partial flag dependency optimization at -Oz To not over constrain the scheduler for ARM in thumb mode, some optimizations for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency). Disables this check when code size is the priority, i.e., code is compiled with -Oz. llvm-svn: 170462	2012-12-18 22:47:16 +00:00
Eli Bendersky	39e7c6e370	Get rid of the pesky -Woverloaded-virtual warning. No change in functionality. llvm-svn: 170438	2012-12-18 18:21:29 +00:00
Jakob Stoklund Olesen	41bbf9c256	Repair bundles that were broken by removing and reinserting the first instruction. This isn't strictly necessary at the moment because Thumb2SizeReduction also copies all MI flags from the old instruction to the new. However, a future patch will make that kind of direct flag tampering illegal. llvm-svn: 170395	2012-12-18 00:46:39 +00:00
Jakob Stoklund Olesen	43b1e13386	Extract a method, no functional change intended. Sadly, this costs us a perfectly good opportunity to use 'goto'. llvm-svn: 170385	2012-12-18 00:13:11 +00:00
Chad Rosier	150d35bc1d	[arm fast-isel] Minor cleanup. No functional change intended. llvm-svn: 170379	2012-12-17 22:35:29 +00:00
Chad Rosier	62a144f099	[arm fast-isel] Fast-isel only handles simple VTs, so make sure the necessary checks are in place. Some minor cleanup as well. llvm-svn: 170360	2012-12-17 19:59:43 +00:00
Richard Osborne	459e35c261	Add instruction encodings / disassembly support for l2r instructions. llvm-svn: 170345	2012-12-17 16:28:02 +00:00
Tom Stellard	5a6879466a	R600: enable S_N2_ instructions They seem to work fine. Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170343	2012-12-17 15:14:56 +00:00
Tom Stellard	9e90b5895d	R600: BB operand support for SI Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170342	2012-12-17 15:14:54 +00:00
Tom Stellard	16a17c6d3e	R600: remove nonsense setPrefLoopAlignment The Align parameter is a power of two, so 16 results in 64K alignment. Additional to that even 16 byte alignment doesn't make any sense, so just remove it. Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170341	2012-12-17 15:14:53 +00:00
Patrik Hagglund	c494d24a68	Revert/correct some FastISel changes in r170104 (EVT->MVT for TargetLowering::getRegClassFor). Some isSimple() guards were missing, or getSimpleVT() were hoisted too far, resulting in asserts on valid LLVM assembly input. llvm-svn: 170336	2012-12-17 14:30:06 +00:00
Richard Osborne	51bf1b269a	Add instruction encodings for PEEK and ENDIN. Previously these were marked with the wrong format. llvm-svn: 170334	2012-12-17 14:23:54 +00:00
Richard Osborne	c104bf2769	Fix parameter name in prototypes in XCoreDisassembler. llvm-svn: 170332	2012-12-17 13:55:49 +00:00
Richard Osborne	041071c558	Add instruction encodings / disassembly support for rus instructions. llvm-svn: 170330	2012-12-17 13:50:04 +00:00
Richard Osborne	e405e58639	Add instruction encodings for ZEXT and SEXT. Previously these were marked with the wrong format. llvm-svn: 170327	2012-12-17 13:20:37 +00:00
Richard Osborne	3a0d5cc314	Add instruction encodings / disassembly support for 2r instructions. llvm-svn: 170323	2012-12-17 12:29:31 +00:00
Richard Osborne	016967e4ff	Add instruction encodings / disassembly support for 0r instructions. llvm-svn: 170322	2012-12-17 12:26:29 +00:00
Richard Osborne	1cc2b68ad6	Simplify assertion in XCoreInstPrinter. llvm-svn: 170321	2012-12-17 12:13:46 +00:00
Richard Osborne	4e1e14bccd	Update comments to match recommended doxygen style. llvm-svn: 170320	2012-12-17 12:13:41 +00:00
Richard Osborne	eb31fa483e	Remove unnecessary include. llvm-svn: 170319	2012-12-17 12:13:32 +00:00
Craig Topper	354ed773b8	Remove EFLAGS from the BLSI/BLSMSK/BLSR patterns. The nodes created by DAG combine don't contain an EFLAGS def. llvm-svn: 170308	2012-12-17 06:13:48 +00:00
Craig Topper	f3ff6ae066	Simplify BMI ANDN matching to use patterns instead of a DAG combine. Also add ANDN to isDefConvertible. llvm-svn: 170305	2012-12-17 05:12:30 +00:00
Craig Topper	f924a58af1	Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt. llvm-svn: 170304	2012-12-17 05:02:29 +00:00
Craig Topper	5b08cf7736	Remove store forms of DEC/INC from isDefConvertible. Since they are stores they don't have a register def. llvm-svn: 170303	2012-12-17 04:55:07 +00:00
Richard Osborne	1b5562ad8e	Add instruction encodings and disassembly for 1r instructions. llvm-svn: 170293	2012-12-16 17:37:34 +00:00
Richard Osborne	e31735a52b	Add XCore disassembler. Currently there is no instruction encoding info and XCoreDisassembler::getInstruction() always returns Fail. I intend to add instruction encodings and tests in follow on commits. llvm-svn: 170292	2012-12-16 17:29:14 +00:00
Richard Osborne	872f51e301	Remove invalid instruction encodings. llvm-svn: 170291	2012-12-16 16:46:31 +00:00
Richard Osborne	e298556706	Mark anything deriving from PseudoInstXCore as a pseudo instruction. llvm-svn: 170290	2012-12-16 16:46:28 +00:00
Richard Osborne	f12cb9ef27	Set instruction size correctly in XCoreInstrFormats.td llvm-svn: 170289	2012-12-16 16:46:24 +00:00
Richard Osborne	3c31e21837	Change XCoreAsmPrinter to lower MachineInstrs to MCInsts before emission. This change adds XCoreMCInstLower to do the lowering to MCInst and XCoreInstPrinter to print the MCInsts. llvm-svn: 170288	2012-12-16 16:20:48 +00:00
Richard Osborne	b1de9f7e07	Replace ${:comment} with the comment symbol. llvm-svn: 170286	2012-12-16 15:59:02 +00:00
Reed Kotler	aee4d5d194	This patch is needed to make c++ exceptions work for mips16. Mips16 is really a processor decoding mode (ala thumb 1) and in the same program, mips16 and mips32 functions can exist and can call each other. If a jal type instruction encounters an address with the lower bit set, then the processor switches to mips16 mode (if it is not already in it). If the lower bit is not set, then it switches to mips32 mode. The linker knows which functions are mips16 and which are mips32. When relocation is performed on code labels, this lower order bit is set if the code label is a mips16 code label. In general this works just fine, however when creating exception handling tables and dwarf, there are cases where you don't want this lower order bit added in. This has been traditionally distinguished in gas assembly source by using a different syntax for the label. lab1: ; this will cause the lower order bit to be added lab2=. ; this will not cause the lower order bit to be added In some cases, it does not matter because in dwarf and debug tables the difference of two labels is used and in that case the lower order bits subtract each other out. To fix this, I have added to mcstreamer the notion of a debuglabel. The default is for label and debug label to be the same. So calling EmitLabel and EmitDebugLabel produce the same result. For various reasons, there is only one set of labels that needs to be modified for the mips exceptions to work. These are the "$eh_func_beginXXX" labels. Mips overrides the debug label suffix from ":" to "=." . This initial patch fixes exceptions. More changes most likely will be needed to DwarfCFException to make all of this work for actual debugging. These changes will be to emit debug labels in some places where a simple label is emitted now. 
Some historical discussion on this from gcc can be found at: http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.html http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html llvm-svn: 170279	2012-12-16 04:00:45 +00:00
Benjamin Kramer	b16ccde7a4	X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible. We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases if y is a constant. DAGCombiner canonicalizes those so we first have to undo the canonicalization for those cases. The pattern occurs in gzip when the loop vectorizer is enabled. Part of PR14613. llvm-svn: 170273	2012-12-15 16:47:44 +00:00
Chandler Carruth	7a28f95419	Make '-mtune=x86_64' assume fast unaligned memory accesses. Not all chips targeted by x86_64 have this feature, but a dramatically increasing number do. Specifying a chip-specific tuning parameter will continue to turn the feature on or off as appropriate for that particular chip, but the generic flag should try to achieve the best performance on the most widely available hardware. Today, the number of chips with fast UA access dwarfs those without in the x86-64 space. Note that this also brings LLVM's code generation for this '-march' flag more in line with that of modern GCCs. Reviewed by Dan Gohman. llvm-svn: 170269	2012-12-15 09:01:13 +00:00
Reed Kotler	5fdeb21249	This code implements most of mips16 hardfloat as it is done by gcc. In this case, essentially it is soft float with different library routines. The next step will be to make this fully interoperational with mips32 floating point and that requires creating stubs for functions with signatures that contain floating point types. I have a more sophisticated design for mips16 hardfloat which I hope to implement at a later time that directly does floating point without the need for function calls. The mips16 encoding has no floating point instructions so one needs to switch to mips32 mode to execute floating point instructions. llvm-svn: 170259	2012-12-15 00:20:05 +00:00
Kevin Enderby	06aa3eb8ce	Make sure the alternate PC+imm syntax of LDR instruction with a small immediate generates the narrow version. Needed when doing round-trip assemble/disassemble testing using the alternate syntax that specifies 'pc' directly. llvm-svn: 170255	2012-12-14 23:04:25 +00:00
Nadav Rotem	8487537bdb	TypeLegalizer: Do not generate target specific nodes with illegal types, because we can't type-legalize them. llvm-svn: 170245	2012-12-14 21:20:37 +00:00
Bill Schmidt	a4f898448c	This patch removes some nondeterminism from direct object file output for TLS dynamic models on 64-bit PowerPC ELF. The default sort routine for relocations only sorts on the r_offset field; but with TLS, there can be two relocations with the same r_offset. For PowerPC, this patch sorts secondarily on descending r_type, which matches the behavior expected by the linker. llvm-svn: 170237	2012-12-14 20:28:38 +00:00
Bill Schmidt	9f0b4ec0f5	This patch improves the 64-bit PowerPC InitialExec TLS support by providing for a wider range of GOT entries that can hold thread-relative offsets. This matches the behavior of GCC, which was not documented in the PPC64 TLS ABI. The ABI will be updated with the new code sequence. Former sequence: ld 9,x@got@tprel(2) add 9,9,x@tls New sequence: addis 9,2,x@got@tprel@ha ld 9,x@got@tprel@l(9) add 9,9,x@tls Note that a linker optimization exists to transform the new sequence into the shorter sequence when appropriate, by replacing the addis with a nop and modifying the base register and relocation type of the ld. llvm-svn: 170209	2012-12-14 17:02:38 +00:00
Shuxin Yang	97e07bf211	Remove two popcount patterns which we are already able to recognize. llvm-svn: 170158	2012-12-13 23:16:19 +00:00
Bill Schmidt	9ed4dbcb75	This is another cleanup patch for 64-bit PowerPC TLS processing. I had some hackery in place that hid my poor use of TblGen, which I've now sorted out and cleaned up. No change in observable behavior, so no new test cases. llvm-svn: 170149	2012-12-13 20:57:10 +00:00
Tom Stellard	6975d35979	Fix warnings with -DNDEBUG Patch by: NAKAMURA Takumi llvm-svn: 170142	2012-12-13 19:38:52 +00:00
Bill Schmidt	732eb91f05	This is just a clean-up patch that simplifies the initial-exec TLS logic by avoiding use of machine operand flags. No change in observable behavior, so no new test cases. llvm-svn: 170141	2012-12-13 18:45:54 +00:00
Patrik Hagglund	5e6c361bc0	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper functions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first in a series of patches. This is the second attempt. In the first attempt (r169837), a few getSimpleVT() calls were hoisted too far, as detected by bootstrap failures. llvm-svn: 170104	2012-12-13 06:34:11 +00:00
Akira Hatanaka	cf9a61b6ee	[mips] Do not copy GOT address to register $gp if the function being called has internal linkage. llvm-svn: 170092	2012-12-13 03:17:29 +00:00
Eric Christopher	80882db88f	Add a way of printing out an arbitrary label name for a section given the section. llvm-svn: 170087	2012-12-13 03:00:35 +00:00
Akira Hatanaka	b2cc8a756f	[mips] Delete all floating point instruction classes that are no longer used. No functionality change. llvm-svn: 170084	2012-12-13 02:05:02 +00:00
Akira Hatanaka	6262bbf819	[mips] Modify definitions of floating point conditional move instructions. No functionality change. llvm-svn: 170080	2012-12-13 01:41:15 +00:00
Akira Hatanaka	79e1cdb00b	[mips] Modify definitions of floating point comparison instructions. No functionality change. llvm-svn: 170077	2012-12-13 01:34:09 +00:00
Akira Hatanaka	fd9163b74c	[mips] Modify definitions of floating point branch instructions. No functionality change. llvm-svn: 170076	2012-12-13 01:32:36 +00:00
Akira Hatanaka	cd3dfd238e	[mips] Modify definitions of floating point indexed load and store instructions. No functionality change. llvm-svn: 170075	2012-12-13 01:30:49 +00:00
Akira Hatanaka	b0d4acbc65	[mips] Modify definitions of floating point multiply-add/sub instructions. No functionality change. llvm-svn: 170073	2012-12-13 01:27:48 +00:00
Akira Hatanaka	92994f4846	[mips] Modify definitions of floating point load and store instructions. No functionality change. llvm-svn: 170072	2012-12-13 01:24:00 +00:00
Akira Hatanaka	2b75dde5fa	[mips] Modify definitions of move from/to coprocessor instructions. No functionality change. llvm-svn: 170071	2012-12-13 01:16:49 +00:00
Akira Hatanaka	dea8f61ae0	[mips] Modify definitions of two register operand floating point instructions. No functionality change. llvm-svn: 170069	2012-12-13 01:14:07 +00:00
Akira Hatanaka	29b513871a	[mips] Modify definitions of three register operand floating point instructions and separate encoding information from the rest. llvm-svn: 170066	2012-12-13 01:07:37 +00:00
Jakob Stoklund Olesen	436eea9833	Avoid setIsInsideBundle in Target/R600. This function is going to be removed. llvm-svn: 170064	2012-12-13 00:59:38 +00:00
Akira Hatanaka	84693d5606	[mips] Move classes that do not belong in MipsInstrFormats.td into MipsInstrFPU.td. llvm-svn: 170061	2012-12-13 00:49:23 +00:00
Akira Hatanaka	db49b39200	[mips] Set isCommutable flag in a more explicit way. llvm-svn: 170060	2012-12-13 00:46:23 +00:00
Akira Hatanaka	193e1f738a	[mips] Remove fmt from the parameter list of classes FMADDSUB and FNMADDSUB. llvm-svn: 170057	2012-12-13 00:38:59 +00:00
Akira Hatanaka	caaf4dd516	[mips] Remove single-precision floating point instruction from multiclass FFR2P_M. llvm-svn: 170055	2012-12-13 00:35:54 +00:00
Akira Hatanaka	02ec5516f8	[mips] Move class IsCommutable into MipsInstrInfo.td. llvm-svn: 170054	2012-12-13 00:32:01 +00:00
Akira Hatanaka	e986a59ad9	[mips] Remove single-precision floating point instructions from multiclasses FFR1_W_M and FFR1P_M. The new instruction definitions have one-to-one correspondence with the instructions in the ISA manual. llvm-svn: 170053	2012-12-13 00:29:29 +00:00
Eli Bendersky	b2022f3a5a	Fix a bogus comment llvm-svn: 170052	2012-12-13 00:24:56 +00:00
Akira Hatanaka	7bc144c366	[mips] Fix a memory leak bug reported by NAKAMURA Takumi. llvm-svn: 170012	2012-12-12 20:09:58 +00:00
Bill Schmidt	24b8dd6eb7	This patch implements local-dynamic TLS model support for the 64-bit PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill llvm-svn: 170003	2012-12-12 19:29:35 +00:00
Logan Chien	4dd14fb5eb	Add ARM NONE and PREL31 relocation types. Add R_ARM_NONE and R_ARM_PREL31 relocation types to MCExpr. Both of them will be used while generating .ARM.extab and .ARM.exidx sections. llvm-svn: 169965	2012-12-12 07:14:46 +00:00
NAKAMURA Takumi	85292a1338	[CMake] Fixup R600. llvm-svn: 169962	2012-12-12 03:34:26 +00:00
Evan Cheng	962711ee71	Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I mention the inline memcpy / memset expansion code is a mess? This patch splits the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset. The first indicates whether it is expanding a memset or a memcpy / memmove. The latter is whether the memset is a memset of zero. It's totally possible (likely even) that targets may want to do different things for memcpy and memset of zero. llvm-svn: 169959	2012-12-12 02:34:41 +00:00
Evan Cheng	c3d1aca657	- Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloaded term. Also added more comments to explain why it is generally ok to return true. - Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to be true for loaded source (memcpy) or zero constants (memset). The poor name choice is probably some kind of legacy issue. llvm-svn: 169954	2012-12-12 01:32:07 +00:00
Evan Cheng	04e5518783	Avoid using lossy load / stores for memcpy / memset expansion. e.g. f64 load / store on non-SSE2 x86 targets. llvm-svn: 169944	2012-12-12 00:42:09 +00:00
Jim Grosbach	647c702780	Trim unneeded header #include. llvm-svn: 169933	2012-12-11 23:39:51 +00:00
Jim Grosbach	0ddedcc560	ARM: Remove old testing option. Pre-regalloc frame allocation and referencing has been on by default for ages. No need for the testing option that disables it. llvm-svn: 169931	2012-12-11 23:31:12 +00:00
Jim Grosbach	1197889c44	ARM: Remove old testing options. Base pointer referencing has been enabled for ages. llvm-svn: 169930	2012-12-11 23:31:10 +00:00
Evan Cheng	eb54240dc2	Replace TargetLowering::isIntImmLegal() with ScalarTargetTransformInfo::getIntImmCost(). "Legal" is a poorly defined term for something like integer immediate materialization. It is always possible to materialize an integer immediate. Whether to use it for memcpy expansion is more a "cost" concern. llvm-svn: 169929	2012-12-11 23:26:14 +00:00
Tom Stellard	75aadc2813	Add R600 backend A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX llvm-svn: 169915	2012-12-11 21:25:42 +00:00
Bill Schmidt	c56f1d34bc	This patch implements the general dynamic TLS model for 64-bit PowerPC. Given a thread-local symbol x with global-dynamic access, the generated code to obtain x's address is: Instruction Relocation Symbol addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x R_PPC64_REL24 __tls_get_addr nop <use address in r3> The implementation borrows from the medium code model work for introducing special forms of ADDIS and ADDI into the DAG representation. This is made slightly more complicated by having to introduce a call to the external function __tls_get_addr. Using the full call machinery is overkill and, more importantly, makes it difficult to add a special relocation. So I've introduced another opcode GET_TLS_ADDR to represent the function call, and surrounded it with register copies to set up the parameter and return value. Most of the code is pretty straightforward. I ran into one peculiarity when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like BL8_NOP_ELF except that it takes another parameter to represent the symbol ("x" above) that requires a relocation on the call. Something in the TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated identically during the emit phase, so this second operand was never visited to generate relocations. This is the reason for the slightly messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding(). Two new tests are included to demonstrate correct external assembly and correct generation of relocations using the integrated assembler. Comments welcome! Thanks, Bill llvm-svn: 169910	2012-12-11 20:30:11 +00:00
Patrik Hagglund	e98b7a0389	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. llvm-svn: 169854	2012-12-11 11:14:33 +00:00
Patrik Hagglund	ad432a8e70	Change TargetLowering::getTypeForExtArgOrReturn to take and return MVTs, instead of EVTs. Accordingly, add bitsLT (and similar) to MVT. llvm-svn: 169850	2012-12-11 10:20:51 +00:00
Patrik Hagglund	03e9628cfa	Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of EVTs. llvm-svn: 169848	2012-12-11 10:09:23 +00:00
Patrik Hagglund	8d2e7cf561	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. llvm-svn: 169845	2012-12-11 09:57:18 +00:00
Patrik Hagglund	3708e548f8	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper functions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first in a series of patches. llvm-svn: 169837	2012-12-11 09:10:33 +00:00
NAKAMURA Takumi	99feb75cb8	[CMake] Remove dependencies to intrinsics_gen I introduced in r169724. llvm-svn: 169819	2012-12-11 05:53:54 +00:00
Jyotsna Verma	92e71918b1	Use multiclass for new-value store instructions with MEMri operand. llvm-svn: 169814	2012-12-11 05:12:25 +00:00
Evan Cheng	c2bd620fac	Stylistic tweak. llvm-svn: 169811	2012-12-11 02:31:57 +00:00
Chad Rosier	df42cf39ab	Fall back to the selection dag isel to select tail calls. This shouldn't affect codegen for -O0 compiles as tail call markers are not emitted in unoptimized compiles. Testing with the external/internal nightly test suite reveals no change in compile time performance. Testing with -O1, -O2 and -O3 with fast-isel enabled did not cause any compile-time or execution-time failures. All tests were performed on my x86 machine. I'll monitor our arm testers to ensure no regressions occur there. In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While it's theoretically true that this is just an optimization, it's an optimization that we very much want to happen even at -O0, or else ARC applications become substantially harder to debug. Part of rdar://12553082 llvm-svn: 169796	2012-12-11 00:18:02 +00:00
Evan Cheng	79e2ca90bc	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On x86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicate they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 llvm-svn: 169791	2012-12-10 23:21:26 +00:00
Akira Hatanaka	5d6faed1f0	[mips] Set HWEncoding field of registers. Delete function getMipsRegisterNumbering and use MCRegisterInfo::getEncodingValue instead. llvm-svn: 169760	2012-12-10 20:04:40 +00:00
Chandler Carruth	867c7bff9a	Revert "Make '-mtune=x86_64' assume fast unaligned memory accesses." Accidental commit... git svn betrayed me. Sorry for the noise. llvm-svn: 169741	2012-12-10 18:23:52 +00:00
Chandler Carruth	7eaa45c738	Make '-mtune=x86_64' assume fast unaligned memory accesses. Summary: Not all chips targeted by x86_64 have this feature, but a dramatically increasing number do. Specifying a chip-specific tuning parameter will continue to turn the feature on or off as appropriate for that particular chip, but the generic flag should try to achieve the best performance on the most widely available hardware. Today, the number of chips with fast UA access dwarfs those without in the x86-64 space. Note that this also brings LLVM's code generation for this '-march' flag more in line with that of modern GCCs. CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D195 llvm-svn: 169740	2012-12-10 18:22:42 +00:00
Chandler Carruth	17f25c4e0d	Fix a typo in my previous commit -- bloomfield is 0x1A not 0x2A. Thanks to the PaX folks for noticing in review! We need some tests here, any suggestions welcome... llvm-svn: 169739	2012-12-10 18:22:40 +00:00
Chandler Carruth	0f58558101	Address a FIXME and update the fast unaligned memory feature for newer Intel chips. The model number rules were determined by inspecting Intel's documentation for their newer chip model numbers. My understanding is that all of the newer Intel chips have fast unaligned memory access, but if anyone is concerned about a particular chip, just shout. No tests updated; it's not clear we have dedicated tests for the chips' various features, but if anyone would like tests (or can point me at some existing ones), I'm happy to oblige. llvm-svn: 169730	2012-12-10 09:18:44 +00:00
NAKAMURA Takumi	6b819c5fb1	[CMake] Update dependencies to intrinsics_gen corresponding to r169711. llvm-svn: 169724	2012-12-10 05:27:15 +00:00
Paul Redmond	2adb13c100	LoopVectorize: support vectorizing intrinsic calls - added function to VectorTargetTransformInfo to query cost of intrinsics - vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc. Reviewed by: Nadav llvm-svn: 169711	2012-12-09 20:42:17 +00:00
Shuxin Yang	95de7c37e2	- Re-enable population count loop idiom recognition - fix a bug which caused a segfault. - add two test cases which were causing crashes llvm-svn: 169687	2012-12-09 03:12:46 +00:00
Chandler Carruth	91e47532fe	Revert the patches adding a popcount loop idiom recognition pass. There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread. This reverts the commits: r169671: Fix a logic error. r169604: Move the popcnt tests to an X86 subdirectory. r168931: Initial commit adding the pass. llvm-svn: 169683	2012-12-08 22:18:29 +00:00
Benjamin Kramer	f242d8c31b	Simplify code. Sort includes. No functionality change. llvm-svn: 169676	2012-12-08 10:45:24 +00:00
Chandler Carruth	1d94e932bc	Fix a use-after-free bug found by ASan. You can't assign a temporary std::string to a StringRef. Moreover, the method being called accepts a Twine to simplify these patterns. Fixes this ASan failure: ==6312== ERROR: AddressSanitizer: heap-use-after-free on address 0x7fd558b1af58 at pc 0xcb7529 bp 0x7fffff572080 sp 0x7fffff572078 READ of size 1 at 0x7fd558b1af58 thread T0 #0 0xcb7528 .../llvm/include/llvm/ADT/StringRef.h:192 llvm::StringRef::operator[]() #1 0x1d53c0a .../llvm/include/llvm/ADT/StringExtras.h:128 llvm::HashString() #2 0x1d53878 .../llvm/lib/Support/StringMap.cpp:64 llvm::StringMapImpl::LookupBucketFor() #3 0x1b6872f .../llvm/include/llvm/ADT/StringMap.h:352 llvm::StringMap<>::GetOrCreateValue<>() #4 0x1b61836 .../llvm/lib/MC/MCContext.cpp:109 llvm::MCContext::GetOrCreateSymbol() #5 0xe9fd47 .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:154 (anonymous namespace)::ARMELFStreamer::EmitMappingSymbol() #6 0xea01dd .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:133 (anonymous namespace)::ARMELFStreamer::EmitDataMappingSymbol() #7 0xe9f78b .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:91 (anonymous namespace)::ARMELFStreamer::EmitBytes() #8 0x1b15d82 .../llvm/lib/MC/MCStreamer.cpp:89 llvm::MCStreamer::EmitIntValue() #9 0xcc0f9b .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:713 llvm::ARMAsmPrinter::emitAttributes() #10 0xcc0d44 .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:632 llvm::ARMAsmPrinter::EmitStartOfAsmFile() #11 0x14692ad .../llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:162 llvm::AsmPrinter::doInitialization() #12 0x1bc4677 .../llvm/lib/VMCore/PassManager.cpp:1561 llvm::FPPassManager::doInitialization() #13 0x1bc4990 .../llvm/lib/VMCore/PassManager.cpp:1595 llvm::MPPassManager::runOnModule() #14 0x1bc55e5 .../llvm/lib/VMCore/PassManager.cpp:1705 llvm::PassManagerImpl::run() #15 0x1bc5878 .../llvm/lib/VMCore/PassManager.cpp:1740 llvm::PassManager::run() #16 0xc3954d .../llvm/tools/llc/llc.cpp:378 
compileModule() #17 0xc38001 .../llvm/tools/llc/llc.cpp:194 main #18 0x7fd557d6a11c __libc_start_main 0x7fd558b1af58 is located 24 bytes inside of 29-byte region [0x7fd558b1af40,0x7fd558b1af5d) freed by thread T0 here: #0 0xc337da .../llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:56 operator delete() #1 0x1ee9cef .../libstdc++-v3/include/bits/basic_string.h:535 std::string::~string() #2 0xea01dd .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:133 (anonymous namespace)::ARMELFStreamer::EmitDataMappingSymbol() #3 0xe9f78b .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:91 (anonymous namespace)::ARMELFStreamer::EmitBytes() #4 0x1b15d82 .../llvm/lib/MC/MCStreamer.cpp:89 llvm::MCStreamer::EmitIntValue() #5 0xcc0f9b .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:713 llvm::ARMAsmPrinter::emitAttributes() #6 0xcc0d44 .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:632 llvm::ARMAsmPrinter::EmitStartOfAsmFile() #7 0x14692ad .../llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:162 llvm::AsmPrinter::doInitialization() #8 0x1bc4677 .../llvm/lib/VMCore/PassManager.cpp:1561 llvm::FPPassManager::doInitialization() #9 0x1bc4990 .../llvm/lib/VMCore/PassManager.cpp:1595 llvm::MPPassManager::runOnModule() #10 0x1bc55e5 .../llvm/lib/VMCore/PassManager.cpp:1705 llvm::PassManagerImpl::run() #11 0x1bc5878 .../llvm/lib/VMCore/PassManager.cpp:1740 llvm::PassManager::run() #12 0xc3954d .../llvm/tools/llc/llc.cpp:378 compileModule() #13 0xc38001 .../llvm/tools/llc/llc.cpp:194 main #14 0x7fd557d6a11c __libc_start_main llvm-svn: 169668	2012-12-08 03:10:14 +00:00
Bill Wendling	e94d843e43	s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future. llvm-svn: 169651	2012-12-07 23:16:57 +00:00
Nadav Rotem	ad0b5fbe8c	When we use the BLEND instruction that uses the MSB as a mask, we can remove the VSRI instruction before it since it does not affect the MSB. Thanks Craig Topper for suggesting this. llvm-svn: 169638	2012-12-07 21:43:11 +00:00
Matthew Curtis	7a93811e8b	In hexagon convertToHardwareLoop, don't deref end() iterator In particular, check if MachineBasicBlock::iterator is end() before using it to call getDebugLoc(); See also this thread on llvm-commits: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155914.html llvm-svn: 169634	2012-12-07 21:03:15 +00:00
Nadav Rotem	481e50efe0	X86: Prefer using VPSHUFD over VPERMIL because it has better throughput. llvm-svn: 169624	2012-12-07 19:01:13 +00:00
Tim Northover	5cc3dc86bb	Added Mapping Symbols for ARM ELF Before this patch, when you objdump an LLVM-compiled file, objdump tried to decode data-in-code sections as if they were code. This patch adds the missing Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D). Patch based on work by Greg Fitzgerald. llvm-svn: 169609	2012-12-07 16:50:23 +00:00
Jakob Stoklund Olesen	97030e0c0e	Use the new MIBundleBuilder class in the Mips target. This is the preferred way of creating bundled machine instructions. llvm-svn: 169585	2012-12-07 04:23:40 +00:00
Akira Hatanaka	efdce0fb09	[mips] Delete nodes and instructions for dynamic alloca that are no longer in use. llvm-svn: 169580	2012-12-07 03:10:18 +00:00
Akira Hatanaka	97e179f9e4	[mips] Shorten predicate name. llvm-svn: 169579	2012-12-07 03:06:09 +00:00
Akira Hatanaka	c5dc055922	[mips] Delete unused sub-target features. llvm-svn: 169578	2012-12-07 03:04:05 +00:00
Akira Hatanaka	02a346d11f	[mips] Remove unnecessary predicates. llvm-svn: 169577	2012-12-07 03:01:24 +00:00
Matt Beaumont-Gay	4a04c92001	Add a 'using' declaration to suppress GCC's -Woverloaded-virtual while we decide what pattern we want to follow in the future. llvm-svn: 169561	2012-12-06 23:15:36 +00:00
Evan Cheng	9ec512d768	Replace r169459 with something safer. Rather than having computeMaskedBits to understand target implementation of any_extend / extload, just generate zero_extend in place of any_extend for liveouts when the target knows the zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz). rdar://12771555 llvm-svn: 169536	2012-12-06 19:13:27 +00:00
Jakub Staszak	40ee5674cd	Remove unneeded function, since PR8156 was fixed over a year ago. llvm-svn: 169534	2012-12-06 19:05:46 +00:00
Jakub Staszak	65ca2fb9e6	Simplify code. llvm-svn: 169521	2012-12-06 18:22:59 +00:00
Craig Topper	216bcd522b	Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions. llvm-svn: 169482	2012-12-06 07:31:16 +00:00
Craig Topper	922f10aec4	Mark MOVDQ(A/U)rm as ReMaterializable. Mark all MOVDQ(A/U) instructions as neverHasSideEffects. llvm-svn: 169477	2012-12-06 06:49:16 +00:00
Chad Rosier	9f5c68af4c	[arm fast-isel] Make the fast-isel implementation of memcpy respect alignment. rdar://12821569 llvm-svn: 169460	2012-12-06 01:34:31 +00:00
Evan Cheng	5213139f48	Let targets provide hooks that compute known zero and ones for any_extend and extloads. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the following before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extends the high bits. With this change it's eliminated. rdar://12771555 llvm-svn: 169459	2012-12-06 01:28:01 +00:00
Jyotsna Verma	d3746e6895	Define new-value store instructions with base+immediate addressing mode using multiclass. llvm-svn: 169432	2012-12-05 22:02:56 +00:00
Nadav Rotem	0a471ea66c	Cost Model: change the default cost of control flow instructions (br / ret / ...) to zero. llvm-svn: 169423	2012-12-05 21:21:26 +00:00
David Sehr	05176cad21	Correct ARM NOP encoding The encoding of NOP in ARMAsmBackend.cpp is missing a trailing zero, which causes the emission of a coprocessor instruction rather than "mov r0, r0" as indicated in the comment. The test also checks for the wrong encoding. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121203/157919.html llvm-svn: 169420	2012-12-05 21:01:27 +00:00
Justin Holewinski	fb711156ae	[NVPTX] Fix crash with unnamed struct arguments Patch by Eric Holk llvm-svn: 169418	2012-12-05 20:50:28 +00:00
Jyotsna Verma	90295156d8	Use multiclass to define store instructions with base+immediate offset addressing mode and immediate stored value. llvm-svn: 169408	2012-12-05 19:32:03 +00:00
Matthew Curtis	cd8c881c9f	Fix misplaced closing brace. llvm-svn: 169404	2012-12-05 19:00:34 +00:00
Kevin Enderby	168ffb36a5	Added an option to the disassembler to print immediates as hex. This is for the lldb team, so most but not all of the values are printed as hex with this option. Some small values, like the scale in an X86 address, were requested to be printed in decimal without the leading 0x. Some places that are still printed in decimal may need tweaks if they are wanted in hex, especially for ARM. I made my best guess. Any tweaks from here should be simple. I also did the best I know now, with help from the C++ gurus, creating the cleanest formatImm() utility function and containing the changes. But if someone has a better idea to make something cleaner I'm all ears and game for changing the implementation. rdar://8109283 llvm-svn: 169393	2012-12-05 18:13:19 +00:00
Elena Demikhovsky	cd3c1c4a16	Simplified BLEND pattern matching for shuffles. Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2. llvm-svn: 169366	2012-12-05 09:24:57 +00:00
Evan Cheng	d31802c1f6	Add x86 isel lowering logic to form bit test with inverted condition. e.g. x ^ -1. Patch by David Majnemer. rdar://12755626 llvm-svn: 169339	2012-12-05 00:10:38 +00:00
Matt Beaumont-Gay	50f61b662f	Appease GCC's -Wparentheses. (TIL that Clang's -Wparentheses ignores 'x || y && "foo"' on purpose. Neat.) llvm-svn: 169337	2012-12-04 23:54:02 +00:00
Evan Cheng	b4eae1361c	ARM custom lower ctpop for vector types. Patch by Pete Couperus. llvm-svn: 169325	2012-12-04 22:41:50 +00:00
Jyotsna Verma	4da904c8f8	Define store instructions with base+register offset addressing mode using multiclass. llvm-svn: 169314	2012-12-04 21:58:25 +00:00
Eli Bendersky	abe546368b	Make NaCl naming consistent. The triple OSType is called NaCl and is represented textually as NativeClient. Also added a link to the native client project for readers unfamiliar with it. A Clang patch will follow shortly. llvm-svn: 169291	2012-12-04 18:37:26 +00:00
Jyotsna Verma	dfd779e108	Add patterns to define 'combine', 'tstbit', 'ct0/cl0' (count trailing/leading zeros) instructions. llvm-svn: 169287	2012-12-04 18:05:01 +00:00
Jyotsna Verma	22d61dd4ce	Add constant extender support to ALU32 instructions for V2. llvm-svn: 169284	2012-12-04 17:12:00 +00:00
Bill Schmidt	ca4a0c9dbd	This patch introduces initial-exec model support for thread-local storage on 64-bit PowerPC ELF. The patch includes code to handle external assembly and MC output with the integrated assembler. It intentionally does not support the "old" JIT. For the initial-exec TLS model, the ABI requires the following to calculate the address of external thread-local variable x: Code sequence Relocation Symbol ld 9,x@got@tprel(2) R_PPC64_GOT_TPREL16_DS x add 9,9,x@tls R_PPC64_TLS x The register 9 is arbitrary here. The linker will replace x@got@tprel with the offset relative to the thread pointer to the generated GOT entry for symbol x. It will replace x@tls with the thread-pointer register (13). The two test cases verify correct assembly output and relocation output as just described. PowerPC-specific selection node variants are added for the two instructions above: LD_GOT_TPREL and ADD_TLS. These are inserted when an initial-exec global variable is encountered by PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to machine instructions LDgotTPREL and ADD8TLS. LDgotTPREL is a pseudo that uses the same LDrs support added for medium code model's LDtocL, with a different relocation type. The rest of the processing is straightforward. llvm-svn: 169281	2012-12-04 16:18:08 +00:00
Chandler Carruth	802d755533	Sort includes for all of the .h files under the 'lib' tree. These were missed in the first pass because the script didn't yet handle include guards. Note that the script is now able to handle all of these headers without manual edits. =] llvm-svn: 169224	2012-12-04 07:12:27 +00:00
Jyotsna Verma	5929cfc534	Move all operand definitions into HexagonOperands.td llvm-svn: 169213	2012-12-04 05:00:31 +00:00
Jyotsna Verma	efe4f559b1	Move generic Hexagon subtarget information into Hexagon.td llvm-svn: 169212	2012-12-04 04:29:16 +00:00
Jakob Stoklund Olesen	a32d85b39d	Remove the old TRI::ResolveRegAllocHint() and getRawAllocationOrder() hooks. These functions have been replaced by TRI::getRegAllocationHints() which provides the same capabilities. llvm-svn: 169192	2012-12-04 00:46:13 +00:00
Akira Hatanaka	4c128509a5	Classic JIT is still being supported by MIPS, along with MCJIT. This change adds endian-awareness to MipsJITInfo and emitWordLE in MipsCodeEmitter has become emitWord now to support both endianness. Patch by Petar Jovanovic. llvm-svn: 169177	2012-12-03 23:11:12 +00:00
Akira Hatanaka	60c2837e8d	Functions in MipsCodeEmitter.cpp that expand unaligned loads/stores are dead code. Removing it. Patch by Petar Jovanovic. llvm-svn: 169174	2012-12-03 22:51:22 +00:00
Jakob Stoklund Olesen	742f201e30	Implement ARMBaseRegisterInfo::getRegAllocationHints(). This provides the same functionality as getRawAllocationOrder() for the even/odd hints, but without the many constant register arrays. llvm-svn: 169169	2012-12-03 22:35:35 +00:00
Jyotsna Verma	6f3bd03e50	Define store instructions with base+immediate offset addressing mode using multiclass. llvm-svn: 169168	2012-12-03 22:26:28 +00:00
Jyotsna Verma	4d8686cc42	Define load instructions with base+immediate offset addressing mode using multiclass. llvm-svn: 169153	2012-12-03 21:13:13 +00:00
Jyotsna Verma	c86b3e1b26	Define unsigned const-ext predicates. llvm-svn: 169149	2012-12-03 20:39:45 +00:00
Jyotsna Verma	6aba56e9d4	Removing unnecessary 'else' statement from the predicates defined in HexagonOperands.td. llvm-svn: 169148	2012-12-03 20:14:38 +00:00
Chandler Carruth	ed0881b2a6	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Jyotsna Verma	014dfe4de0	Define signed const-ext predicates. llvm-svn: 169117	2012-12-03 06:54:50 +00:00
Sebastian Pop	a204f72237	Codegen failure for vmull with small vectors Codegen was failing with an assertion because of unexpected vector operands when legalizing the selection DAG for a MUL instruction. The asserting code was legalizing multiplies for vectors of size 128 bits. It uses a custom lowering to try and detect cases where it can use a VMULL instruction instead of a VMOVL + VMUL. The code was looking for input operands to the MUL that had been sign or zero extended. If it found the extended operands it would drop the sign/zero extension and use the original vector size as input to a VMULL instruction. The code assumed that the original input vector was 64 bits so that after dropping the extension it would fit directly into a D register and could be used as an operand of a VMULL instruction. The input code that triggered the failure used a vector of <4 x i8> that was sign extended to <4 x i32>. It was not safe to drop the sign extension in this case because the original vector is only 32 bits wide. The fix is to insert a sign extension for the vector to reach the required 64 bit size. In this particular example, the vector would need to be sign extended to a <4 x i16>. llvm-svn: 169024	2012-11-30 19:08:04 +00:00
Jyotsna Verma	a77c054e85	Use multiclass for the load instructions with MEMri operand. llvm-svn: 169018	2012-11-30 17:31:52 +00:00
Adhemerval Zanella	812410f2d1	This patch fixes the Altivec addend construction for the fused multiply-add instruction (vmaddfp) to conform with IEEE to ensure the sign of a zero result when the resulting product is -0.0. The -0.0 vector addend to vmaddfp is generated by creating a vector with all bits set and then shifting each element by 31 bits to the left, resulting in a vector of 0x80000000 (or -0.0 as float). The 'buildvec_canonicalize.ll' was adjusted to reflect this change and the 'vec_mul.ll' was complemented with the float vector multiplication test. llvm-svn: 168998	2012-11-30 13:05:44 +00:00
Chandler Carruth	f12e3a67db	Switch LLVM_USE_RVALUE_REFERENCES to LLVM_HAS_RVALUE_REFERENCES. Rationale: 1) This was the name in the comment block. ;] 2) It matches Clang's __has_feature naming convention. 3) It matches other compiler-feature-test conventions. Sorry for the noise. =] I've also switched the comment block to use a \brief tag and not duplicate the name. llvm-svn: 168996	2012-11-30 11:45:22 +00:00
Jyotsna Verma	b950ea61fc	Use multiclass for the store instructions with MEMri operand. llvm-svn: 168983	2012-11-30 06:10:22 +00:00
Jyotsna Verma	ede608cce0	Use multiclass for the load instructions with 'base + register offset' addressing mode. llvm-svn: 168976	2012-11-30 04:19:09 +00:00
Kevin Enderby	136d6746c5	Fixed the arm disassembly of invalid BFI instructions to not build a bad MCInst which would then cause an assert when printed. rdar://11437956 llvm-svn: 168960	2012-11-29 23:47:11 +00:00
Quentin Colombet	13cd521b24	Add cortex-a5 subtarget to the supported ARM architectures llvm-svn: 168933	2012-11-29 19:48:01 +00:00
Shuxin Yang	abcc370423	rdar://12100355 (part 1) This revision attempts to recognize the following population-count pattern: while(a) { c++; ... ; a &= a - 1; ... }, where <c> and <a> could be used multiple times in the loop body. TODO: On X8664 and ARM, __buildin_ctpop() are not expanded to an efficient instruction sequence, which needs to be improved in the following commits. Reviewed by Nadav, really appreciate! llvm-svn: 168931	2012-11-29 19:38:54 +00:00
Jyotsna Verma	e95559fc16	Use multiclass for 'transfer' instructions. llvm-svn: 168929	2012-11-29 19:35:44 +00:00
Silviu Baranga	93aefa5f2c	Added atomic 64 min/max/umin/umax intrinsics support in the ARM backend. llvm-svn: 168886	2012-11-29 14:41:25 +00:00
Justin Holewinski	bc45119b44	Allow targets to prefer TypeSplitVector over TypePromoteInteger when computing the legalization method for vectors For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>. llvm-svn: 168882	2012-11-29 14:26:24 +00:00
Elena Demikhovsky	eace43bff7	I changed hasAVX() to hasFp256() and hasAVX2() to hasInt256() in X86IselLowering.cpp. The logic was not changed, only names. llvm-svn: 168875	2012-11-29 12:44:59 +00:00
Jyotsna Verma	519b3856dd	Define signed const-ext immediate operands and their predicates. llvm-svn: 168810	2012-11-28 20:58:14 +00:00
Benjamin Kramer	b1996da782	ARM: Implement CanLowerReturn so large vectors get expanded into sret. Fixes 14337. llvm-svn: 168809	2012-11-28 20:55:10 +00:00
Ulrich Weigand	9e159fd7a0	Fix initial frame state on powerpc64. The createPPCMCAsmInfo routine used PPC::R1 as the initial frame pointer register, but on PPC64 the 32-bit R1 register does not have a corresponding DWARF number, causing invalid CIE initial frame state to be emitted. Fix by using PPC::X1 instead. llvm-svn: 168799	2012-11-28 18:21:03 +00:00
Jakob Stoklund Olesen	9de596e650	Remove all references to TargetInstrInfoImpl. This class has been merged into its super-class TargetInstrInfo. llvm-svn: 168760	2012-11-28 02:35:17 +00:00
Jakob Stoklund Olesen	fcf14e8436	Move Target{Instr,Register}Info.cpp into lib/CodeGen. The Target library is not allowed to depend on the large CodeGen library, but the TRI and TII classes provide abstract interfaces that require both caller and callee to link to CodeGen. The implementation files for these classes provide default implementations of some of the hooks. These methods may need to reference CodeGen, so they belong in that library. We already have a number of methods implemented in the TargetInstrInfoImpl sub-class because of that. I will merge that class into the parent next. llvm-svn: 168758	2012-11-28 02:35:09 +00:00
Bill Schmidt	e0a68a562b	This patch makes medium code model the default for 64-bit PowerPC ELF. When the CodeGenInfo is to be created for the PPC64 target machine, a default code-model selection is converted to CodeModel::Medium provided we are not targeting the Darwin OS. Defaults for Darwin are unaffected. llvm-svn: 168747	2012-11-27 23:36:26 +00:00
Chad Rosier	b4ac423ed4	[arm fast-isel] Appease the machine verifier by using the proper register classes. The vast majority of the remaining issues are due to uses of invalid registers, which are defined by getRegForValue(). Those will be a little more challenging to cleanup. rdar://12719844 llvm-svn: 168735	2012-11-27 22:29:43 +00:00
Chad Rosier	0c00758065	[arm fast-isel] Appease the machine verifier by using the proper register classes. rdar://12719844 llvm-svn: 168733	2012-11-27 22:12:11 +00:00
Chad Rosier	2ec7db0968	[arm fast-isel] Appease the machine verifier by using the proper register classes. Also a bit of cleanup. rdar://12719844 llvm-svn: 168728	2012-11-27 21:46:46 +00:00
Manman Ren	5b4628201f	X86: do not fold load instructions such as [V]MOVS[S|D] to other instructions when the destination register is wider than the memory load. These load instructions load from m32 or m64 and set the upper bits to zero, while the folded instructions may accept m128. rdar://12721174 llvm-svn: 168710	2012-11-27 18:09:26 +00:00
Bill Schmidt	34627e3434	This patch implements medium code model support for 64-bit PowerPC. The default for 64-bit PowerPC is small code model, in which TOC entries must be addressable using a 16-bit offset from the TOC pointer. Additionally, only TOC entries are addressed via the TOC pointer. With medium code model, TOC entries and data sections can all be addressed via the TOC pointer using a 32-bit offset. Cooperation with the linker allows 16-bit offsets to be used when these are sufficient, reducing the number of extra instructions that need to be executed. Medium code model also does not generate explicit TOC entries in ".section toc" for variables that are wholly internal to the compilation unit. Consider a load of an external 4-byte integer. With small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei With medium model, it instead generates: addis 3, 2, .LC1@toc@ha ld 3, .LC1@toc@l(3) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the 32-bit offset of ei's TOC entry from the TOC base pointer. Similarly, .LC1@toc@l is a relocation requesting the lower 16 bits. Note that if the linker determines that ei's TOC entry is within a 16-bit offset of the TOC base pointer, it will replace the "addis" with a "nop", and replace the "ld" with the identical "ld" instruction from the small code model example. Consider next a load of a function-scope static integer. 
For small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc test_fn_static.si[TC],test_fn_static.si .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 For medium code model, the compiler generates: addis 3, 2, test_fn_static.si@toc@ha addi 3, 3, test_fn_static.si@toc@l lwz 4, 0(3) .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 Again, the linker may replace the "addis" with a "nop", calculating only a 16-bit offset when this is sufficient. Note that it would be more efficient for the compiler to generate: addis 3, 2, test_fn_static.si@toc@ha lwz 4, test_fn_static.si@toc@l(3) The current patch does not perform this optimization yet. This will be addressed as a peephole optimization in a later patch. For the moment, the default code model for 64-bit PowerPC will remain the small code model. We plan to eventually change the default to medium code model, which matches current upstream GCC behavior. Note that the different code models are ABI-compatible, so code compiled with different models will be linked and execute correctly. I've tested the regression suite and the application/benchmark test suite in two ways: Once with the patch as submitted here, and once with additional logic to force medium code model as the default. The tests all compile cleanly, with one exception. The mandel-2 application test fails due to an unrelated ABI compatibility with passing complex numbers. It just so happens that small code model was incredibly lucky, in that temporary values in floating-point registers held the expected values needed by the external library routine that was called incorrectly. My current thought is to correct the ABI problems with _Complex before making medium code model the default, to avoid introducing this "regression." 
Here are a few comments on how the patch works, since the selection code can be difficult to follow: The existing logic for small code model defines three pseudo-instructions: LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for constant pool addresses. These are expanded by SelectCodeCommon(). The pseudo-instruction approach doesn't work for medium code model, because we need to generate two instructions when we match the same pattern. Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY node for medium code model, and generates an ADDIStocHA followed by either a LDtocL or an ADDItocL. These new node types correspond naturally to the sequences described above. The addis/ld sequence is generated for the following cases: * Jump table addresses * Function addresses * External global variables * Tentative definitions of global variables (common linkage) The addis/addi sequence is generated for the following cases: * Constant pool entries * File-scope static global variables * Function-scope static variables Expanding to the two-instruction sequences at select time exposes the instructions to subsequent optimization, particularly scheduling. The rest of the processing occurs at assembly time, in PPCAsmPrinter::EmitInstruction. Each of the instructions is converted to a "real" PowerPC instruction. When a TOC entry needs to be created, this is done here in the same manner as for the existing LDtoc, LDtocJTI, and LDtocCPT pseudo-instructions (I factored out a new routine to handle this). I had originally thought that if a TOC entry was needed for LDtocL or ADDItocL, it would already have been generated for the previous ADDIStocHA. However, at higher optimization levels, the ADDIStocHA may appear in a different block, which may be assembled textually following the block containing the LDtocL or ADDItocL. So it is necessary to include the possibility of creating a new TOC entry for those two instructions. 
Note that for LDtocL, we generate a new form of LD called LDrs. This allows specifying the @toc@l relocation for the offset field of the LD instruction (i.e., the offset is replaced by a SymbolLo relocation). When the peephole optimization described above is added, we will need to do similar things for all immediate-form load and store operations. The seven "mcm-n.ll" test cases are kept separate because otherwise the intermingling of various TOC entries and so forth makes the tests fragile and hard to understand. The above assumes use of an external assembler. For use of the integrated assembler, new relocations are added and used by PPCELFObjectWriter. Testing is done with "mcm-obj.ll", which tests for proper generation of the various relocations for the same sequences tested with the external assembler. llvm-svn: 168708	2012-11-27 17:35:46 +00:00
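
The @toc@ha / @toc@l split described in the medium code model commit above can be illustrated with a little arithmetic. This is only a sketch, not code from the patch: the helper names are hypothetical, and it assumes the usual PowerPC ELF convention that @ha is the "high adjusted" half, which compensates for the low half being sign-extended by the following ld or addi.

```python
# Illustrative sketch only; helper names are hypothetical, not LLVM or linker APIs.

def split_ha_lo(offset):
    """Split a TOC-relative offset into (@ha, @l) 16-bit halves."""
    lo = offset & 0xFFFF                     # @l: low 16 bits
    ha = ((offset + 0x8000) >> 16) & 0xFFFF  # @ha: high 16 bits, adjusted for sign of @l
    return ha, lo

def sign_extend16(value):
    """Interpret a 16-bit field as a signed immediate, as addis/ld/addi do."""
    return value - 0x10000 if value & 0x8000 else value

def rebuild(ha, lo):
    """Offset produced by 'addis rT, r2, ha' followed by an access at lo(rT)."""
    return (sign_extend16(ha) << 16) + sign_extend16(lo)

def fits_in_16_bits(offset):
    """True when a single 16-bit signed displacement is sufficient."""
    return -0x8000 <= offset <= 0x7FFF

if __name__ == "__main__":
    for off in (0x1234, -4, 0x8000, 0x12348000):
        ha, lo = split_ha_lo(off)
        print(hex(off), hex(ha), hex(lo), fits_in_16_bits(off))
```

When the offset fits in 16 signed bits, the @ha half comes out as zero, which is consistent with the commit's remark that the linker can replace the "addis" with a "nop" in that case.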

23654 Commits