llvm-project

Commit Graph

Author	SHA1	Message	Date
Jim Grosbach	553eb75663	ARM: Fix a few copy-paste errors. s/X86/ARM/ llvm-svn: 171789	2013-01-07 21:12:13 +00:00
Bill Schmidt	9b1e3e25dc	This patch addresses bug 14678 by fixing two problems in medium code model code generation. Variables addressed through a GlobalAlias were not being handled, and variables with available_externally linkage were treated incorrectly. The patch contains two new tests to verify the correct code generation for these cases. llvm-svn: 171778	2013-01-07 19:29:18 +00:00
Jordan Rose	e8f1eaea8a	Change SMRange to be half-open (exclusive end) instead of closed (inclusive) This is necessary not only for representing empty ranges, but for handling multibyte characters in the input. (If the end pointer in a range refers to a multibyte character, should it point to the beginning or the end of the character in a char array?) Some of the code in the asm parsers was already assuming this anyway. llvm-svn: 171765	2013-01-07 19:00:49 +00:00
NAKAMURA Takumi	458a8277cc	R600/SIISelLowering.cpp: Suppress a warning. [-Wunused-variable] llvm-svn: 171728	2013-01-07 11:14:44 +00:00
Tim Northover	2883da3b51	Add LICENSE.TXT covering contributions made by ARM. Absent a Contributor's License Agreement (CLA) with an LLVM legal entity and as reviewed and agreed with Chris Lattner, add a patent license covering future contributions from ARM until there is a CLA. This is to make explicit ARM's grant of patent rights to recipients of LLVM containing ARM-contributed material. llvm-svn: 171721	2013-01-07 10:04:49 +00:00
Craig Topper	ae65212a4b	Remove more unnecessary # operators with nothing to paste proceeding them. llvm-svn: 171702	2013-01-07 06:14:20 +00:00
Craig Topper	a8c5ec09c7	Remove # from the beginning and end of def names. The # is a paste operator and should only be used with something to paste on either side. llvm-svn: 171697	2013-01-07 05:45:56 +00:00
Craig Topper	25cdf92b34	Remove # from the beginning and end of def names. llvm-svn: 171696	2013-01-07 05:26:58 +00:00
Craig Topper	bd62d64cbf	Remove unnecessary # tokens at the beginning and end of defm names. llvm-svn: 171694	2013-01-07 05:04:39 +00:00
Chandler Carruth	2109f47d97	Fix the enumerator names for ShuffleKind to match tho coding standards, and make its comments doxygen comments. llvm-svn: 171688	2013-01-07 03:20:02 +00:00
Chandler Carruth	50a36cd148	Make the popcnt support enums and methods have more clear names and follow the conding conventions regarding enumerating a set of "kinds" of things. llvm-svn: 171687	2013-01-07 03:16:03 +00:00
Chandler Carruth	d3e73556d6	Move TargetTransformInfo to live under the Analysis library. This no longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686	2013-01-07 03:08:10 +00:00
Chandler Carruth	664e354de7	Switch TargetTransformInfo from an immutable analysis pass that requires a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least on implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis pass, allowing all but tho NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, a helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it always being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681	2013-01-07 01:37:14 +00:00
Craig Topper	4f1c7256f9	Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior. cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix. cvtt2si/cvt2si should parse with an 'l' or 'q' suffix or not suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix. Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein. llvm-svn: 171668	2013-01-06 20:39:29 +00:00
Evan Cheng	3fb03e23a4	Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch. llvm-svn: 171665	2013-01-06 19:00:15 +00:00
Chandler Carruth	539edf4ee0	Convert the TargetTransformInfo from an immutable pass with dynamic interfaces which could be extracted from it, and must be provided on construction, to a chained analysis group. The end goal here is that TTI works much like AA -- there is a baseline "no-op" and target independent pass which is in the group, and each target can expose a target-specific pass in the group. These passes will naturally chain allowing each target-specific pass to delegate to the generic pass as needed. In particular, this will allow a much simpler interface for passes that would like to use TTI -- they can have a hard dependency on TTI and it will just be satisfied by the stub implementation when that is all that is available. This patch is a WIP however. In particular, the "stub" pass is actually the one and only pass, and everything there is implemented by delegating to the target-provided interfaces. As a consequence the tools still have to explicitly construct the pass. Switching targets to provide custom passes and sinking the stub behavior into the NoTTI pass is the next step. llvm-svn: 171621	2013-01-05 11:43:11 +00:00
Craig Topper	92a70b1e65	Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. llvm-svn: 171608	2013-01-05 07:39:25 +00:00
Nadav Rotem	478b6a47ec	Revert revision 171524. Original message: URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171603	2013-01-05 05:42:48 +00:00
Chandler Carruth	4a7c311008	Refactor the ScalarTargetTransformInfo API for querying about the legality of an address mode to not use a struct of four values and instead to accept them as parameters. I'd love to have named parameters here as most callers only care about one or two of these, but the defaults aren't terribly scary to write out. That said, there is no real impact of this as the passes aren't yet using STTI for this and are still relying upon TargetLowering. llvm-svn: 171595	2013-01-05 03:36:17 +00:00
Akira Hatanaka	d35a263076	[mips] Fix data layout string. Add 64 to the list of native integer widths and add stack alignment information. llvm-svn: 171587	2013-01-05 02:00:56 +00:00
Jakub Staszak	43fafaf496	Move 'break' to the right place to prevent fallthru. There is no test-case because conditions in the next case prevented from doing anything nasty. llvm-svn: 171549	2013-01-04 23:01:26 +00:00
Preston Gurd	e36b685a94	The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171524	2013-01-04 20:54:54 +00:00
Akira Hatanaka	b13b33359b	[mips] MipsTargetLowering::getSetCCResultType should return a vector type if vectors are being compared. llvm-svn: 171517	2013-01-04 20:06:01 +00:00
Akira Hatanaka	e067e5a13f	[mips] 80 columns. llvm-svn: 171515	2013-01-04 19:38:05 +00:00
Akira Hatanaka	f412e7501a	[mips] Reorder template parameters. Remove class shift_rotate_imm32 and shift_rotate_imm64. llvm-svn: 171513	2013-01-04 19:25:46 +00:00
Akira Hatanaka	a7a9fa1c16	[mips] Refactor conditional move instructions. llvm-svn: 171511	2013-01-04 19:16:38 +00:00
Akira Hatanaka	e36e2f6876	[mips] Refactor instructions which move data from or to coprocessors. llvm-svn: 171510	2013-01-04 19:13:49 +00:00
Adhemerval Zanella	9b0b781395	PowerPC: Fix eh_frame relocation for PIC This patch fixes the PPC eh_frame definitions for the personality and frame unwinding for PIC objects. It makes PIC build correctly creates relative relocations in the '.rela.eh_frame' segments and thus avoiding a text relocation that generates a DT_TEXTREL segments in link phase. llvm-svn: 171506	2013-01-04 19:08:13 +00:00
Nadav Rotem	e6bb35435d	Change the default number of registers to prevent unrolling on targets that dont have this hook. llvm-svn: 171489	2013-01-04 18:40:39 +00:00
Nadav Rotem	e1d5c4b8b9	LoopVectorizer: 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469	2013-01-04 17:48:25 +00:00
Nadav Rotem	c616a5408a	Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message: Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171468	2013-01-04 17:35:21 +00:00
Elena Demikhovsky	5f2f06d2d9	Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171467	2013-01-03 08:48:33 +00:00
Michael Gottesman	820aac1c78	Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks." This reverts commit r171461 since it breaks the following tests: Clang :: Analysis/outofbound-notwork.c Clang :: Analysis/string-fail.c Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp Clang :: CXX/temp/temp.param/p14.cpp Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c Clang :: CodeGen/blocks-2.c Clang :: CodeGen/libcalls-d.c Clang :: CodeGen/libcalls-ld.c Clang :: CodeGenCXX/conversion-function.cpp Clang :: CodeGenCXX/debug-info-limit-type.cpp Clang :: CodeGenCXX/inheriting-constructor.cpp Clang :: FixIt/fixit-errors.c Clang :: FixIt/fixit-pmem.cpp Clang :: Modules/namespaces.cpp Clang :: PCH/changed-files.c Clang :: PCH/pr4489.c Clang :: PCH/source-manager-stack.c Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp Clang :: SemaTemplate/instantiate-function-1.mm llvm-svn: 171466	2013-01-03 08:18:30 +00:00
Craig Topper	7c27cc9fd0	Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. llvm-svn: 171461	2013-01-03 06:40:20 +00:00
Hal Finkel	95de3f3018	Add a subtype parameter to VTTI::getShuffleCost In order to cost subvector insertion and extraction, we need to know the type of the subvector being extracted. No functionality change. llvm-svn: 171453	2013-01-03 02:34:09 +00:00
Kevin Enderby	726e0ea6eb	Adds missing aliases for fcom and fcomp instructions without arguments. Patch by Michael M Kuperstein! llvm-svn: 171414	2013-01-02 21:20:15 +00:00
Nadav Rotem	761937a757	AVX: Fix a bug in WidenMaskArithmetic. llvm-svn: 171398	2013-01-02 17:41:03 +00:00
Chandler Carruth	9fb823bbd4	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Chandler Carruth	be81023d74	Resort the #include lines in include/... and lib/... with the utils/sort_includes.py script. Most of these are updating the new R600 target and fixing up a few regressions that have creeped in since the last time I sorted the includes. llvm-svn: 171362	2013-01-02 10:22:59 +00:00
Craig Topper	9791afb182	Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP. llvm-svn: 171356	2013-01-02 08:00:39 +00:00
Craig Topper	4bc5c4e152	Merge SSE and AVX instruction definitions for PSHUFD/PSHUFHW/PSHUFLW. llvm-svn: 171355	2013-01-02 07:27:49 +00:00
Rafael Espindola	db1a84c84a	Revert 171351. It broke MC/X86/x86-32-avx.s. llvm-svn: 171352	2013-01-02 01:35:11 +00:00
Craig Topper	86d0cdb82f	Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP. llvm-svn: 171351	2013-01-01 20:53:20 +00:00
Craig Topper	12ed9cd6ae	Remove unused argument from a multiclass. llvm-svn: 171340	2013-01-01 03:42:44 +00:00
Craig Topper	2edafc059d	Merge intrinsic instruction definitions for SSE and AVX versions of RCPPS and RSQRTPS. llvm-svn: 171339	2013-01-01 03:30:21 +00:00
Craig Topper	d04dbec6c9	Remove 2 unused multiclasses. llvm-svn: 171338	2013-01-01 02:02:45 +00:00
Craig Topper	7cc4f322cf	Merge AVX/SSE instruction definitions for SQRTPS/PD, RSQRTPS, RCPPS. No funcitonal change intended. llvm-svn: 171337	2013-01-01 00:11:07 +00:00
Craig Topper	c2521cd309	Use packed instead of scalar itineraries for SSE1/2 SQRTPS/PD, RCPPS, and RSQRTPS. VEX-encoded forms already use packed. llvm-svn: 171336	2012-12-31 23:49:05 +00:00
Bill Wendling	6e95ae803a	Remove the getAttributesAtIndex and getNumAttrs methods in favor of using the getAttrSomewhere predicate. This prevents the uses of 'Attribute' as a collection of attributes. llvm-svn: 171271	2012-12-31 00:49:59 +00:00
Nuno Lopes	b6ad98224a	convert a bunch of callers from DataLayout::getIndexedOffset() to GEP::accumulateConstantOffset(). The later API is nicer than the former, and is correct regarding wrap-around offsets (if anyone cares). There are a few more places left with duplicated code, which I'll remove soon. llvm-svn: 171259	2012-12-30 16:25:48 +00:00
Bill Wendling	749a43d874	Use the predicate methods off of AttributeSet instead of Attribute. llvm-svn: 171257	2012-12-30 13:50:49 +00:00
Bill Wendling	74dba875e2	Remove the Function::getRetAttributes method in favor of using the AttributeSet accessor method. llvm-svn: 171256	2012-12-30 13:01:51 +00:00
Bill Wendling	94dcaf8e2b	Remove Function::getParamAttributes and use the AttributeSet accessor methods instead. llvm-svn: 171255	2012-12-30 12:45:13 +00:00
Bill Wendling	698e84fc4f	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253	2012-12-30 10:32:01 +00:00
Bill Wendling	6190254e0f	s/hasAttribute/contains/g to be more consistent with other method names. llvm-svn: 171252	2012-12-30 09:17:46 +00:00
Craig Topper	fe82eb6bcd	Remove intrinsic specific instructions for (V)SQRTPS/PD. Instead lower to target-independent ISD nodes and use the existing patterns for those. llvm-svn: 171237	2012-12-29 18:18:20 +00:00
Craig Topper	f4a9c6e21b	Merge similar functionality using a nested switch. llvm-svn: 171229	2012-12-29 17:19:06 +00:00
Craig Topper	6b27251a76	Remove intrinsic specific instructions for SSE/SSE2/AVX floating point max/min instructions. Lower them to target specific nodes and use those patterns instead. This also allows them to be commuted if UnsafeFPMath is enabled. llvm-svn: 171227	2012-12-29 16:44:25 +00:00
Jakub Staszak	215f94143c	Simplify code, no functionality change. llvm-svn: 171226	2012-12-29 15:57:26 +00:00
Jakub Staszak	afe8109fce	Delete executive bit on ./lib/Target/Hexagon/HexagonAsmPrinter.h. llvm-svn: 171225	2012-12-29 15:23:06 +00:00
Nadav Rotem	9785f519b4	CostModel: initial checkin for code that estimates the cost of special shuffles. llvm-svn: 171180	2012-12-28 08:19:03 +00:00
Nadav Rotem	c982a2dc25	wrap 80-col lines. llvm-svn: 171179	2012-12-28 07:28:43 +00:00
Nadav Rotem	3da9ac72fa	AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend. llvm-svn: 171178	2012-12-28 05:45:24 +00:00
Nadav Rotem	68441914a5	Reverse the 'if' condition and reduce the indentation. llvm-svn: 171172	2012-12-27 23:08:05 +00:00
Craig Topper	ab2e6842cc	Merge basic_sse12_fp_binop_p_int and basic_sse12_fp_binop_p_y_int multiclasses. llvm-svn: 171171	2012-12-27 22:53:47 +00:00
Nadav Rotem	3b34190100	AVX/AVX2: Move the SEXT lowering code from a target specific DAGco to a lowering function. llvm-svn: 171170	2012-12-27 22:47:16 +00:00
Craig Topper	e2eec3c52b	Merge basic_sse12_fp_binop_p and basic_sse12_fp_binop_p_y multiclasses. llvm-svn: 171166	2012-12-27 18:51:50 +00:00
Nadav Rotem	2a054b4475	On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM sized register. In most cases we actually compare or select YMM-sized registers and mixing the two types creates horrible code. This commit optimizes some of the transition sequences. PR14657. llvm-svn: 171148	2012-12-27 08:15:45 +00:00
Nadav Rotem	8e5d80eba3	AVX/AVX2: Move the code that lowers vector-trunc from a DAGCo-hook to custom lowering hook. The vector truncs were scalarized during LegalizeVectorOps, later vectorized again by some DAGCombine optimization and finally, lowered by a dagcombing optimization. Now, they are properly lowered during LegalizeVectorOps. No new testcase because the original testcases still work. llvm-svn: 171146	2012-12-27 07:45:10 +00:00
Craig Topper	757f3fc394	Add hasSideEffects=0 to some forms of ROUND, RCP, and RSQRT. llvm-svn: 171143	2012-12-27 07:16:08 +00:00
Craig Topper	09ce4b9efe	Move single letter 'P' prefix out of multiclass now that tablegen allows defm to start with #NAME. This makes instruction names more searchable again. llvm-svn: 171141	2012-12-27 06:34:54 +00:00
Craig Topper	396cb795bc	Add hasSideEffects=0 to some shift and rotate instructions. None of which are currently used by code generation. llvm-svn: 171137	2012-12-27 03:35:44 +00:00
Craig Topper	c7910828e4	Mark the divide instructions as hasSideEffects=0. llvm-svn: 171136	2012-12-27 03:01:18 +00:00
Craig Topper	5b807aaa38	Add hasSideEffects=0 to CMP*rr_REV. llvm-svn: 171130	2012-12-27 02:08:46 +00:00
Craig Topper	89e8607755	Add mayLoad, mayStore, and hasSideEffects tags to BT/BTS/BTR/BTC instructions. Shouldn't change any functionality since they don't have patterns to select them. llvm-svn: 171128	2012-12-27 02:01:33 +00:00
Craig Topper	c557343956	Fix operands and encoding form for ARPL instruction. Register form had and reversed. Memory form writes memory, but was marked as MRMSrcMem. llvm-svn: 171123	2012-12-26 23:27:57 +00:00
Craig Topper	d47a70de9f	Add hasSideEffects=0 to some atomic instructions. llvm-svn: 171122	2012-12-26 23:08:12 +00:00
Craig Topper	af2372087b	Mark the AL/AX/EAX forms of the basic arithmetic operations has never having side effects. llvm-svn: 171121	2012-12-26 22:19:23 +00:00
Craig Topper	1b8c0750ee	Mark all the _REV instructions as not having side effects. They aren't really emitted by the backend, but it reduces the number of instructions in the output files with unmodelled side effects to make auditing easier. llvm-svn: 171118	2012-12-26 21:30:22 +00:00
Craig Topper	18f2675e9b	Remove a special conditional setting of neverHasSideEffects if the instruction didn't have a pattern. This was leftover from when tablegen used to complain if things were already inferred from patterns. llvm-svn: 171117	2012-12-26 21:04:30 +00:00
Craig Topper	24f316e4db	Merge still more SSE/AVX instruction definitions. llvm-svn: 171103	2012-12-26 07:54:43 +00:00
Craig Topper	af629e2700	Merge more SSE/AVX instruction definitions. llvm-svn: 171102	2012-12-26 07:20:35 +00:00
Craig Topper	65fe30450d	Fix 80 column violation. llvm-svn: 171097	2012-12-26 06:15:53 +00:00
Craig Topper	f4d0fe8fcd	Fix class name in comment. llvm-svn: 171096	2012-12-26 06:15:09 +00:00
Craig Topper	59747c4dbd	Merge SSE/AVX PCMPEQ/PCMPGT instruction definitions. llvm-svn: 171095	2012-12-26 06:14:15 +00:00
Craig Topper	8a48677586	Remove 'v' from mnemonic to fix asm matching failures. llvm-svn: 171093	2012-12-26 06:02:15 +00:00
Craig Topper	b4ef0fa3a1	Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for a bunch of SSE2 integer arithmetic instructions. llvm-svn: 171092	2012-12-26 05:49:15 +00:00
Nadav Rotem	5267bb71b8	Reformat the docs. llvm-svn: 171091	2012-12-26 04:59:20 +00:00
Craig Topper	a2594dd5f0	Use an additional multiclass to merge the 128/256-bit SSE/AVX instruction definitions for PAND/POR/PXOR/PANDN llvm-svn: 171087	2012-12-26 04:36:03 +00:00
Craig Topper	97730a0d6a	Merge an AVX/SSE 256-bit and 128-bit multiclass. llvm-svn: 171086	2012-12-26 03:56:47 +00:00
Craig Topper	8b59746390	Mark VANDNPD/VANDNPDS as not commutable. llvm-svn: 171085	2012-12-26 03:48:10 +00:00
Craig Topper	81d1e596bb	Remove alignment from a bunch more VEX encoded operations in the folding tables. llvm-svn: 171082	2012-12-26 02:44:47 +00:00
Craig Topper	b2922164f0	Remove alignment from folding table for VMOVUPD as an unaligned instruction it shouldn't require alignment... llvm-svn: 171081	2012-12-26 02:14:19 +00:00
Craig Topper	d09a9af9b6	Remove alignment requirements from (V)EXTRACTPS. This instruction does 32-bit stores which aren't required to be aligned on SSE or AVX. llvm-svn: 171080	2012-12-26 01:47:12 +00:00
Craig Topper	caef1c5d86	Remove alignment requirement from VCVTSS2SD in folding tables. Reverting r171049. This instruction doesn't require alignment. llvm-svn: 171078	2012-12-26 00:35:47 +00:00
Hal Finkel	1b5ff08d43	Expand PPC64 atomic load and store Use of store or load with the atomic specifier on 64-bit types would cause instruction-selection failures. As with the 32-bit case, these can use the default expansion in terms of cmp-and-swap. llvm-svn: 171072	2012-12-25 17:22:53 +00:00
Benjamin Kramer	81b5a8fd2e	X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of and commutativity. llvm-svn: 171064	2012-12-25 13:09:08 +00:00
Benjamin Kramer	df4af41b9b	X86: Custom lower <2 x i64> eq and ne when SSE41 is not available. pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack. Small speedup on loop-vectorized viterbi (-march=core2). llvm-svn: 171063	2012-12-25 12:54:19 +00:00
Nadav Rotem	00410ae625	VCVTSS2SD requires a strict alignment. Thanks Elena. llvm-svn: 171049	2012-12-25 03:29:18 +00:00
Nick Lewycky	521e0d59f3	Quiet gcc's -Wparenthesis warning. No functionality change. llvm-svn: 171044	2012-12-24 19:58:45 +00:00
Benjamin Kramer	9d46110ff1	Use a std::string rather than a dynamically allocated char* buffer. This affords us to use std::string's allocation routines and use the destructor for the memory management. Switching to that also means that we can use operator==(const std::string&, const char *) to perform the string comparison rather than resorting to libc functionality (i.e. strcmp). Patch by Saleem Abdulrasool! Differential Revision: http://llvm-reviews.chandlerc.com/D230 llvm-svn: 171042	2012-12-24 19:23:30 +00:00
Nadav Rotem	3ee6b10dd4	CostModel: We have API for checking the costs of known shuffles. This patch adds support for the insert-subvector and extract-subvector kinds. llvm-svn: 171027	2012-12-24 10:04:03 +00:00
Nadav Rotem	dc0ad92b64	Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. llvm-svn: 171024	2012-12-24 09:40:33 +00:00
Nadav Rotem	7e1599e100	Change the codegen Cost Model API for shuffeles. This patch removes the API for broadcast and adds a more general API that accepts an enum of known shuffles. llvm-svn: 171022	2012-12-24 08:57:47 +00:00
Nadav Rotem	cf9999d9d5	CostModel: Change the default target-independent implementation for finding the cost of arithmetic functions. We now assume that the cost of arithmetic operations that are marked as Legal or Promote is low, but ops that are marked as custom are higher. llvm-svn: 171002	2012-12-23 17:31:23 +00:00
Nadav Rotem	b15c69a725	whitespace llvm-svn: 170997	2012-12-23 07:33:44 +00:00
Nadav Rotem	1bef5a0509	Rename a function. llvm-svn: 170996	2012-12-23 07:30:09 +00:00
Nadav Rotem	2cade68025	Loop Vectorizer: Update the cost model of scatter/gather operations and make them more expensive. llvm-svn: 170995	2012-12-23 07:23:55 +00:00
Benjamin Kramer	76268ac682	X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available. pmuludq is slow, but it turns out that all the unpacking and packing of the scalarized mul is even slower. 10% speedup on loop-vectorized paq8p. llvm-svn: 170985	2012-12-22 16:07:56 +00:00
Benjamin Kramer	b2f0a2bd4b	X86: Emit vector sext as shuffle + sra if vpmovsx is not available. Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better than scalarized loads. Fixes PR14590. llvm-svn: 170984	2012-12-22 11:34:28 +00:00
Nadav Rotem	d5aae980cb	In some cases, due to scheduling constraints we copy the EFLAGS. The only way to read the eflags is using push and pop. If we don't adjust the stack then we run over the first frame index. This is not something that we want to do, so we have to make sure that our machine function does not copy the flags. If it does then we have to emit the prolog that adjusts the stack. rdar://12896831 llvm-svn: 170961	2012-12-21 23:48:49 +00:00
Akira Hatanaka	6ac2fc4976	[mips] Refactor subword-swap, EXT/INS, load-effective-address and read-hardware instructions. llvm-svn: 170956	2012-12-21 23:21:32 +00:00
Akira Hatanaka	beea8a34c3	[mips] Refactor SYNC and multiply/divide instructions. llvm-svn: 170955	2012-12-21 23:17:36 +00:00
Akira Hatanaka	31ddec5887	[mips] Refactor BAL instructions. llvm-svn: 170954	2012-12-21 23:15:59 +00:00
Akira Hatanaka	d6b694f036	[mips] Fix encoding of BAL instruction. Also, fix assembler test case which was not catching the error. llvm-svn: 170953	2012-12-21 23:13:59 +00:00
Akira Hatanaka	a158042a56	[mips] Refactor jump, jump register, jump-and-link and nop instructions. llvm-svn: 170952	2012-12-21 23:03:50 +00:00
Akira Hatanaka	e1826d7464	[mips] Refactor load/store left/right and load-link and store-conditional instructions. llvm-svn: 170950	2012-12-21 23:01:24 +00:00
Akira Hatanaka	d9bf8424e5	[mips] Refactor load/store instructions. llvm-svn: 170948	2012-12-21 22:58:55 +00:00
Akira Hatanaka	b59b047fbe	[mips] Remove unnecessary isPseudo parameter. llvm-svn: 170947	2012-12-21 22:57:26 +00:00
Akira Hatanaka	e738efc95b	[mips] Refactor LUI instruction. llvm-svn: 170944	2012-12-21 22:46:07 +00:00
Akira Hatanaka	895e1cb2aa	[mips] Refactor count leading zero or one instructions. llvm-svn: 170942	2012-12-21 22:43:58 +00:00
Akira Hatanaka	4f4c4aa05e	[mips] Refactor sign-extension-in-register instructions. llvm-svn: 170940	2012-12-21 22:41:52 +00:00
Akira Hatanaka	b14c6e4e5f	[mips] Refactor instructions which copy from and to HI/LO registers. llvm-svn: 170939	2012-12-21 22:39:17 +00:00
Akira Hatanaka	9e89195dce	[mips] Refactor logical NOR instructions. llvm-svn: 170937	2012-12-21 22:35:47 +00:00
Akira Hatanaka	ac10697207	[mips] Move instruction definitions in MipsInstrInfo.td. llvm-svn: 170936	2012-12-21 22:33:43 +00:00
Tom Stellard	09ef8425e9	R600: Coding style - remove empty spaces from the beginning of functions No functionality change. llvm-svn: 170923	2012-12-21 20:12:02 +00:00
Tom Stellard	41398026e7	R600: Fix MAX_UINT definition Patch by: Vadim Girlin Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170922	2012-12-21 20:12:01 +00:00
Tom Stellard	4fa7ac29f1	R600: Add SHADOWCUBE to TEX_SHADOW pattern Patch by: Vadim Girlin Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170921	2012-12-21 20:11:59 +00:00
Benjamin Kramer	5521b94b07	Cleanup compiler warnings on discarding type qualifiers in casts. Switch to C++ style casts. Patch by Saleem Abdulrasool! Differential Revision: http://llvm-reviews.chandlerc.com/D204 llvm-svn: 170917	2012-12-21 19:09:53 +00:00
Benjamin Kramer	82d1c371e2	X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization. Part of PR14667. llvm-svn: 170908	2012-12-21 17:46:58 +00:00
Roman Divacky	a229186a82	Remove duplicate includes. llvm-svn: 170902	2012-12-21 17:06:44 +00:00
Tom Stellard	a8b0351720	R600: Expand vec4 INT <-> FP conversions llvm-svn: 170901	2012-12-21 16:33:24 +00:00
Benjamin Kramer	4669d18893	X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics This is very mechanical, no functionality change. Preparation for PR14667. llvm-svn: 170898	2012-12-21 14:04:55 +00:00
Nadav Rotem	eacbb731d3	Add a missing "virtual" keyword. llvm-svn: 170842	2012-12-21 05:02:12 +00:00
Quentin Colombet	b1b66e7a25	Add ARM cortex-r5 subtarget. llvm-svn: 170840	2012-12-21 04:35:05 +00:00
Nadav Rotem	6d4fdd6d2c	Improve the X86 cost model for loads and stores. llvm-svn: 170830	2012-12-21 01:33:59 +00:00
Nadav Rotem	a4b53f20a3	BB-Vectorizer: Check the cost of the store pointer type and not the return type, which is void. A number of test cases fail after adding the assertion in TTImpl. llvm-svn: 170828	2012-12-21 01:24:36 +00:00
Reed Kotler	9bff1ead0e	Call llvm_unreachable instead of assert. llvm-svn: 170822	2012-12-21 00:44:59 +00:00
Jakob Stoklund Olesen	33f5d1492d	Add an MF argument to MI::copyImplicitOps(). This function is often used to decorate dangling instructions, so a context reference is required to allocate memory for the operands. Also add a corresponding MachineInstrBuilder method. llvm-svn: 170797	2012-12-20 22:54:02 +00:00
Jakob Stoklund Olesen	2ea203694d	MachineInstrBuilderize ARM. llvm-svn: 170795	2012-12-20 22:53:55 +00:00
Jakob Stoklund Olesen	4255c96aed	MachineInstrBuilderize NVPTX. llvm-svn: 170794	2012-12-20 22:53:53 +00:00
Bob Wilson	7bba4f8957	Revert "Adding support for llvm.arm.neon.vaddl[su].* and" This reverts r170694. The operations can be represented in IR without adding any new intrinsics. llvm-svn: 170765	2012-12-20 21:09:38 +00:00
Evan Cheng	ddc0cb6dc5	On some ARM cpus, flags setting movs with shifter operand, i.e. lsl, lsr, asr, are more expensive than the non-flag setting variant. Teach thumb2 size reduction pass to avoid generating them unless we are optimizing for size. rdar://12892707 llvm-svn: 170728	2012-12-20 19:59:30 +00:00
Roman Divacky	ff95a1dc12	Remove MCTargetAsmLexer and its derived classes now that edis, its only user, is gone. llvm-svn: 170699	2012-12-20 14:43:30 +00:00
Renato Golin	6b2ea4a48f	Adding support for llvm.arm.neon.vaddl[su].* and llvm.arm.neon.vsub[su].* intrinsics. Patch by Pete Couperus <pjcoup@gmail.com> llvm-svn: 170694	2012-12-20 13:52:11 +00:00
Reed Kotler	d11acc7dc0	Implement cfi_def_cfa_offset. "Make check" test case for this comming in the next few days but it's already tested a lot from test-suite and works fine. This patch completes almost 100% pass of test-suite for mips 16. llvm-svn: 170674	2012-12-20 06:59:37 +00:00
Reed Kotler	8965d24a2a	There is one more patch to finish large frames. Make sure we assert on code that has large frames which will not yet compile correctly. llvm-svn: 170673	2012-12-20 06:57:00 +00:00
Jyotsna Verma	56605448f2	Add constant extender support to GP-relative load/store instructions. llvm-svn: 170672	2012-12-20 06:52:46 +00:00
Jyotsna Verma	bf75aaf53e	Add TSFlags to ALU32 type instructions for constant-extender/Relationship maps. llvm-svn: 170671	2012-12-20 06:45:39 +00:00
Reed Kotler	7bff8f1d7a	set register class properly for mips16 here llvm-svn: 170669	2012-12-20 06:06:35 +00:00
Rafael Espindola	fb8ac2df09	Undefine PPC harder. This was causing a build failure while trying to build on ppc ubuntu 12.10 with cmake. llvm-svn: 170668	2012-12-20 05:13:09 +00:00
Reed Kotler	92fc33bc97	This assert is overly restrictive and does not work for mips16. llvm-svn: 170667	2012-12-20 05:09:15 +00:00
Reed Kotler	fd633229f7	Turn on register scavenger for Mips 16 We use an unused Mips 32 register for the emergency slot instead of using the stack. llvm-svn: 170665	2012-12-20 04:44:58 +00:00
Akira Hatanaka	e7f1acc7c0	[mips] Refactor SLT (set on less than) instructions. Separate encoding information from the rest. llvm-svn: 170664	2012-12-20 04:27:52 +00:00
Akira Hatanaka	bbd197e9c4	[mips] Refactor unconditional branch instruction. Separate encoding information from the rest. llvm-svn: 170663	2012-12-20 04:22:39 +00:00
Akira Hatanaka	b1527b7505	[mips] Remove asm string parameter from pseudo instructions. Add InstrItinClass parameter. llvm-svn: 170661	2012-12-20 04:20:09 +00:00
Akira Hatanaka	14f9ce0f83	[mips] Delete definition of CPRESTORE instruction. llvm-svn: 170660	2012-12-20 04:15:30 +00:00
Akira Hatanaka	c0ea0bb99b	[mips] Refactor conditional branch instructions with one register operand. Separate encoding information from the rest. llvm-svn: 170659	2012-12-20 04:13:23 +00:00
Akira Hatanaka	f71ffd29d9	[mips] Refactor conditional branch instructions with two register operands. Separate encoding information from the rest. llvm-svn: 170657	2012-12-20 04:10:13 +00:00
Reed Kotler	d019dbf75e	fix most of remaining issues with large frames. these patches are tested a lot by test-suite but make check tests are forthcoming once the next few patches that complete this are committed. with the next few patches the pass rate for mips16 is near 100% llvm-svn: 170656	2012-12-20 04:07:42 +00:00
Akira Hatanaka	f423672117	[mips] Use "or $r0, $r1, $zero" instead of "addu $r0, $zero, $r1" to copy physical register $r1 to $r0. GNU disassembler recognizes an "or" instruction as a "move", and this change makes the disassembled code easier to read. Original patch by Reed Kotler. llvm-svn: 170655	2012-12-20 04:06:06 +00:00
Richard Smith	15b1e3727b	Fix use-before-construction of X86TargetLowering. llvm-svn: 170654	2012-12-20 04:04:17 +00:00
Akira Hatanaka	7d75f9e3d3	[mips] Change the order of template parameters. Move the default parameters to the end. llvm-svn: 170651	2012-12-20 03:52:08 +00:00
Akira Hatanaka	244f9e874c	[mips] Refactor shift instructions with register operands. Separate encoding information from the rest. llvm-svn: 170650	2012-12-20 03:48:24 +00:00
Akira Hatanaka	7f96ad325f	[mips] Refactor shift immediate instructions. Separate encoding information from the rest. llvm-svn: 170649	2012-12-20 03:44:41 +00:00
Akira Hatanaka	ab1b715bf2	[mips] Refactor arithmetic and logic instructions with immediate operands. Separate encoding information from the rest. llvm-svn: 170648	2012-12-20 03:40:03 +00:00
Akira Hatanaka	1b37c4af01	[mips] Refactor arithmetic and logic instructions. Separate encoding information from the rest. llvm-svn: 170647	2012-12-20 03:34:05 +00:00
Akira Hatanaka	73495897b1	[mips] Delete ArithOverflowR and ArithOverflow and use ArithLogicR and ArithLogicI as the instruction base classes. llvm-svn: 170642	2012-12-20 03:00:16 +00:00
NAKAMURA Takumi	2a0b40f584	Target/R600: Update MIB according to r170588. llvm-svn: 170620	2012-12-20 00:22:11 +00:00
Jim Grosbach	6df94846ec	MC: Add MCInstrDesc::mayAffectControlFlow() method. MC disassembler clients (LLDB) are interested in querying if an instruction may affect control flow other than by virtue of being an explicit branch instruction. For example, instructions which write directly to the PC on some architectures. llvm-svn: 170610	2012-12-19 23:38:53 +00:00
Tom Stellard	1c315d5411	R600: Remove unecessary VREG alignment. Unlike SGPRs VGPRs doesn't need to be aligned. Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170593	2012-12-19 22:10:34 +00:00
Tom Stellard	e7b907d85c	R600: control flow optimization Branch if we have enough instructions so that it makes sense. Also remove branches if they don't make sense. Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170592	2012-12-19 22:10:33 +00:00
Tom Stellard	f8794354b2	R600: New control flow for SI v2 This patch replaces the control flow handling with a new pass which structurize the graph before transforming it to machine instruction. This has a couple of different advantages and currently fixes 20 piglit tests without a single regression. It is now a general purpose transformation that could be not only be used for SI/R6xx, but also for other hardware implementations that use a form of structurized control flow. v2: further cleanup, fixes and documentation Patch by: Christian König Signed-off-by: Christian König <deathsimple@vodafone.de> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 170591	2012-12-19 22:10:31 +00:00
Jakob Stoklund Olesen	b159b5ff0d	Remove the explicit MachineInstrBuilder(MI) constructor. Use the version that also takes an MF reference instead. It would technically be possible to extract an MF reference from the MI as MI->getParent()->getParent(), but that would not work for MIs that are not inserted into any basic block. Given the reasonably small number of places this constructor was used at all, I preferred the compile time check to a run time assertion. llvm-svn: 170588	2012-12-19 21:31:56 +00:00
Evan Cheng	eae6d2ccea	LLVM sdisel normalize bit extraction of the form: ((x & 0xff00) >> 8) << 2 to (x >> 6) & 0x3fc This is general goodness since it folds a left shift into the mask. However, the trailing zeros in the mask prevents the ARM backend from using the bit extraction instructions. And worse since the mask materialization may require an addition instruction. This comes up fairly frequently when the result of the bit twiddling is used as memory address. e.g. = ptr[(x & 0xFF0000) >> 16] We want to generate: ubfx r3, r1, #16, #8 ldr.w r3, [r0, r3, lsl #2] vs. mov.w r9, #1020 and.w r2, r9, r1, lsr #14 ldr r2, [r0, r2] Add a late ARM specific isel optimization to ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the 'base + offset' address computation; change the mask to one which doesn't have trailing zeros and enable the use of ubfx. Note the optimization has to be done late since it's target specific and we don't want to change the DAG normalization. It's also fairly restrictive as shifter operands are not always free. It's only done for lsh 1 / 2. It's known to be free on some cpus and they are most common for address computation. This is a slight win for blowfish, rijndael, etc. rdar://12870177 llvm-svn: 170581	2012-12-19 20:16:09 +00:00
Roman Divacky	e3d323052f	Remove edis - the enhanced disassembler. Fixes PR14654. llvm-svn: 170578	2012-12-19 19:55:47 +00:00
Paul Redmond	5917f4c715	Transform (x&C)>V into (x&C)!=0 where possible When the least bit of C is greater than V, (x&C) must be greater than V if it is not zero, so the comparison can be simplified. Although this was suggested in Target/X86/README.txt, it benefits any architecture with a directly testable form of AND. Patch by Kevin Schoedel llvm-svn: 170576	2012-12-19 19:47:13 +00:00
Benjamin Kramer	c5071466d4	PowerPC: Expand VSELECT nodes. There's probably a better expansion for those nodes than the default for altivec, but this is better than crashing. VSELECTs occur in loop vectorizer output. llvm-svn: 170551	2012-12-19 15:49:14 +00:00
Patrik Hagglund	e09cac9a67	Change TargetLowering::getTypeForExtArgOrReturn to take and return MVTs, instead of EVTs. llvm-svn: 170537	2012-12-19 12:02:25 +00:00
Patrik Hagglund	bad545ccba	Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of EVTs. llvm-svn: 170535	2012-12-19 11:48:16 +00:00
Patrik Hagglund	f9eb168ef4	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. llvm-svn: 170532	2012-12-19 11:30:36 +00:00
NAKAMURA Takumi	89209462fe	X86ISelLowering.cpp: Fix warnings. [-Wlogical-op-parentheses] llvm-svn: 170523	2012-12-19 10:12:48 +00:00
Elena Demikhovsky	14a4af0e66	Optimized load + SIGN_EXTEND patterns in the X86 backend. llvm-svn: 170506	2012-12-19 07:50:20 +00:00
Bill Wendling	3d7b0b8ac7	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. llvm-svn: 170502	2012-12-19 07:18:57 +00:00
Reed Kotler	3aad762d1d	Add some missing Defs and Uses. llvm-svn: 170493	2012-12-19 04:06:15 +00:00
Jakub Staszak	338863a546	Reverse order of checking SSE level when calculating compare cost, so we check AVX2 before AVX. llvm-svn: 170464	2012-12-18 22:57:56 +00:00
Quentin Colombet	23b404d5ad	Disable ARM partial flag dependency optimization at -Oz To not over constrain the scheduler for ARM in thumb mode, some optimizations for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency). Disables this check when code size is the priority, i.e., code is compiled with -Oz. llvm-svn: 170462	2012-12-18 22:47:16 +00:00
Eli Bendersky	39e7c6e370	Get rid of the pesky -Woverloaded-virtual warning. No change in functionality. llvm-svn: 170438	2012-12-18 18:21:29 +00:00
Jakob Stoklund Olesen	41bbf9c256	Repair bundles that were broken by removing and reinserting the first instruction. This isn't strictly necessary at the moment because Thumb2SizeReduction also copies all MI flags from the old instruction to the new. However, a future patch will make that kind of direct flag tampering illegal. llvm-svn: 170395	2012-12-18 00:46:39 +00:00
Jakob Stoklund Olesen	43b1e13386	Extract a method, no functional change intended. Sadly, this costs us a perfectly good opportunity to use 'goto'. llvm-svn: 170385	2012-12-18 00:13:11 +00:00
Chad Rosier	150d35bc1d	[arm fast-isel] Minor cleanup. No functional change intended. llvm-svn: 170379	2012-12-17 22:35:29 +00:00
Chad Rosier	62a144f099	[arm fast-isel] Fast-isel only handles simple VTs, so make sure the necessary checks are in place. Some minor cleanup as well. llvm-svn: 170360	2012-12-17 19:59:43 +00:00
Richard Osborne	459e35c261	Add instruction encodings / disassembly support for l2r instructions. llvm-svn: 170345	2012-12-17 16:28:02 +00:00
Tom Stellard	5a6879466a	R600: enable S_N2_ instructions They seem to work fine. Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170343	2012-12-17 15:14:56 +00:00
Tom Stellard	9e90b5895d	R600: BB operand support for SI Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170342	2012-12-17 15:14:54 +00:00
Tom Stellard	16a17c6d3e	R600: remove nonsense setPrefLoopAlignment The Align parameter is a power of two, so 16 results in 64K alignment. Additional to that even 16 byte alignment doesn't make any sense, so just remove it. Patch by: Christian König Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Christian König <deathsimple@vodafone.de> llvm-svn: 170341	2012-12-17 15:14:53 +00:00
Patrik Hagglund	c494d24a68	Revert/correct some FastISel changes in r170104 (EVT->MVT for TargetLowering::getRegClassFor). Some isSimple() guards were missing, or getSimpleVT() were hoisted too far, resulting in asserts on valid LLVM assembly input. llvm-svn: 170336	2012-12-17 14:30:06 +00:00
Richard Osborne	51bf1b269a	Add instruction encodings for PEEK and ENDIN. Previously these were marked with the wrong format. llvm-svn: 170334	2012-12-17 14:23:54 +00:00
Richard Osborne	c104bf2769	Fix parameter name in prototypes in XCoreDisassembler. llvm-svn: 170332	2012-12-17 13:55:49 +00:00
Richard Osborne	041071c558	Add instruction encodings / disassembly support for rus instructions. llvm-svn: 170330	2012-12-17 13:50:04 +00:00
Richard Osborne	e405e58639	Add instruction encodings for ZEXT and SEXT. Previously these were marked with the wrong format. llvm-svn: 170327	2012-12-17 13:20:37 +00:00
Richard Osborne	3a0d5cc314	Add instruction encodings / disassembly support for 2r instructions. llvm-svn: 170323	2012-12-17 12:29:31 +00:00
Richard Osborne	016967e4ff	Add instruction encodings / disassembly support for 0r instructions. llvm-svn: 170322	2012-12-17 12:26:29 +00:00
Richard Osborne	1cc2b68ad6	Simplify assertion in XCoreInstPrinter. llvm-svn: 170321	2012-12-17 12:13:46 +00:00
Richard Osborne	4e1e14bccd	Update comments to match recommended doxygen style. llvm-svn: 170320	2012-12-17 12:13:41 +00:00
Richard Osborne	eb31fa483e	Remove unnecessary include. llvm-svn: 170319	2012-12-17 12:13:32 +00:00
Craig Topper	354ed773b8	Remove EFLAGS from the BLSI/BLSMSK/BLSR patterns. The nodes created by DAG combine don't contain an EFLAGS def. llvm-svn: 170308	2012-12-17 06:13:48 +00:00
Craig Topper	f3ff6ae066	Simplify BMI ANDN matching to use patterns instead of a DAG combine. Also add ANDN to isDefConvertible. llvm-svn: 170305	2012-12-17 05:12:30 +00:00
Craig Topper	f924a58af1	Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt. llvm-svn: 170304	2012-12-17 05:02:29 +00:00
Craig Topper	5b08cf7736	Remove store forms of DEC/INC from isDefConvertible. Since they are stores they don't have a register def. llvm-svn: 170303	2012-12-17 04:55:07 +00:00
Richard Osborne	1b5562ad8e	Add instruction encodings and disassembly for 1r instructions. llvm-svn: 170293	2012-12-16 17:37:34 +00:00
Richard Osborne	e31735a52b	Add XCore disassembler. Currently there is no instruction encoding info and XCoreDisassembler::getInstruction() always returns Fail. I intend to add instruction encodings and tests in follow on commits. llvm-svn: 170292	2012-12-16 17:29:14 +00:00
Richard Osborne	872f51e301	Remove invalid instruction encodings. llvm-svn: 170291	2012-12-16 16:46:31 +00:00
Richard Osborne	e298556706	Mark anything deriving from PseudoInstXCore as a pseudo instruction. llvm-svn: 170290	2012-12-16 16:46:28 +00:00
Richard Osborne	f12cb9ef27	Set instruction size correctly in XCoreInstrFormats.td llvm-svn: 170289	2012-12-16 16:46:24 +00:00
Richard Osborne	3c31e21837	Change XCoreAsmPrinter to lower MachineInstrs to MCInsts before emission. This change adds XCoreMCInstLower to do the lowering to MCInst and XCoreInstPrinter to print the MCInsts. llvm-svn: 170288	2012-12-16 16:20:48 +00:00
Richard Osborne	b1de9f7e07	Replace ${:comment} with the comment symbol. llvm-svn: 170286	2012-12-16 15:59:02 +00:00
Reed Kotler	aee4d5d194	This patch is needed to make c++ exceptions work for mips16. Mips16 is really a processor decoding mode (ala thumb 1) and in the same program, mips16 and mips32 functions can exist and can call each other. If a jal type instruction encounters an address with the lower bit set, then the processor switches to mips16 mode (if it is not already in it). If the lower bit is not set, then it switches to mips32 mode. The linker knows which functions are mips16 and which are mips32. When relocation is performed on code labels, this lower order bit is set if the code label is a mips16 code label. In general this works just fine, however when creating exception handling tables and dwarf, there are cases where you don't want this lower order bit added in. This has been traditionally distinguished in gas assembly source by using a different syntax for the label. lab1: ; this will cause the lower order bit to be added lab2=. ; this will not cause the lower order bit to be added In some cases, it does not matter because in dwarf and debug tables the difference of two labels is used and in that case the lower order bits subtract each other out. To fix this, I have added to mcstreamer the notion of a debuglabel. The default is for label and debug label to be the same. So calling EmitLabel and EmitDebugLabel produce the same result. For various reasons, there is only one set of labels that needs to be modified for the mips exceptions to work. These are the "$eh_func_beginXXX" labels. Mips overrides the debug label suffix from ":" to "=." . This initial patch fixes exceptions. More changes most likely will be needed to DwarfCFException to make all of this work for actual debugging. These changes will be to emit debug labels in some places where a simple label is emitted now. Some historical discussion on this from gcc can be found at: http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.html http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html llvm-svn: 170279	2012-12-16 04:00:45 +00:00
Benjamin Kramer	b16ccde7a4	X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible. We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases if y is a constant. DAGCombiner canonicalizes those so we first have to undo the canonicalization for those cases. The pattern occurs in gzip when the loop vectorizer is enabled. Part of PR14613. llvm-svn: 170273	2012-12-15 16:47:44 +00:00
Chandler Carruth	7a28f95419	Make '-mtune=x86_64' assume fast unaligned memory accesses. Not all chips targeted by x86_64 have this feature, but a dramatically increasing number do. Specifying a chip-specific tuning parameter will continue to turn the feature on or off as appropriate for that particular chip, but the generic flag should try to achieve the best performance on the most widely available hardware. Today, the number of chips with fast UA access dwarfs those without in the x86-64 space. Note that this also brings LLVM's code generation for this '-march' flag more in line with that of modern GCCs. Reviewed by Dan Gohman. llvm-svn: 170269	2012-12-15 09:01:13 +00:00
Reed Kotler	5fdeb21249	This code implements most of mips16 hardfloat as it is done by gcc. In this case, essentially it is soft float with different library routines. The next step will be to make this fully interoperational with mips32 floating point and that requires creating stubs for functions with signatures that contain floating point types. I have a more sophisticated design for mips16 hardfloat which I hope to implement at a later time that directly does floating point without the need for function calls. The mips16 encoding has no floating point instructions so one needs to switch to mips32 mode to execute floating point instructions. llvm-svn: 170259	2012-12-15 00:20:05 +00:00
Kevin Enderby	06aa3eb8ce	Make sure the alternate PC+imm syntax of LDR instruction with a small immediate generates the narrow version. Needed when doing round-trip assemble/disassemble testing using the alternate syntax that specifies 'pc' directly. llvm-svn: 170255	2012-12-14 23:04:25 +00:00
Nadav Rotem	8487537bdb	TypeLegalizer: Do not generate target specific nodes with illegal types, because we cant type-legalize them. llvm-svn: 170245	2012-12-14 21:20:37 +00:00
Bill Schmidt	a4f898448c	This patch removes some nondeterminism from direct object file output for TLS dynamic models on 64-bit PowerPC ELF. The default sort routine for relocations only sorts on the r_offset field; but with TLS, there can be two relocations with the same r_offset. For PowerPC, this patch sorts secondarily on descending r_type, which matches the behavior expected by the linker. llvm-svn: 170237	2012-12-14 20:28:38 +00:00
Bill Schmidt	9f0b4ec0f5	This patch improves the 64-bit PowerPC InitialExec TLS support by providing for a wider range of GOT entries that can hold thread-relative offsets. This matches the behavior of GCC, which was not documented in the PPC64 TLS ABI. The ABI will be updated with the new code sequence. Former sequence: ld 9,x@got@tprel(2) add 9,9,x@tls New sequence: addis 9,2,x@got@tprel@ha ld 9,x@got@tprel@l(9) add 9,9,x@tls Note that a linker optimization exists to transform the new sequence into the shorter sequence when appropriate, by replacing the addis with a nop and modifying the base register and relocation type of the ld. llvm-svn: 170209	2012-12-14 17:02:38 +00:00
Shuxin Yang	97e07bf211	Remove two popcount patterns which we are already able to recognize. llvm-svn: 170158	2012-12-13 23:16:19 +00:00
Bill Schmidt	9ed4dbcb75	This is another cleanup patch for 64-bit PowerPC TLS processing. I had some hackery in place that hid my poor use of TblGen, which I've now sorted out and cleaned up. No change in observable behavior, so no new test cases. llvm-svn: 170149	2012-12-13 20:57:10 +00:00
Tom Stellard	6975d35979	Fix warnings with -DNDEBUG Patch by: NAKAMURA Takumi llvm-svn: 170142	2012-12-13 19:38:52 +00:00
Bill Schmidt	732eb91f05	This is just a clean-up patch that simplifies the initial-exec TLS logic by avoiding use of machine operand flags. No change in observable behavior, so no new test cases. llvm-svn: 170141	2012-12-13 18:45:54 +00:00
Patrik Hagglund	5e6c361bc0	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper funtions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. This is the second attempt. In the first attempt (r169837), a few getSimpleVT() were hoisted too far, detected by bootstrap failures. llvm-svn: 170104	2012-12-13 06:34:11 +00:00
Akira Hatanaka	cf9a61b6ee	[mips] Do not copy GOT address to register $gp if the function being called has internal linkage. llvm-svn: 170092	2012-12-13 03:17:29 +00:00
Eric Christopher	80882db88f	Add a way of printing out an arbitrary label name for a section given the section. llvm-svn: 170087	2012-12-13 03:00:35 +00:00
Akira Hatanaka	b2cc8a756f	[mips] Delete all floating point instruction classes that are no longer used. No functionality change. llvm-svn: 170084	2012-12-13 02:05:02 +00:00
Akira Hatanaka	6262bbf819	[mips] Modify definitions of floating point conditional move instructions. No functionality change. llvm-svn: 170080	2012-12-13 01:41:15 +00:00
Akira Hatanaka	79e1cdb00b	[mips] Modify definitions of floating point comparison instructions. No functionality change. llvm-svn: 170077	2012-12-13 01:34:09 +00:00
Akira Hatanaka	fd9163b74c	[mips] Modify definitions of floating point branch instructions. No functionality change. llvm-svn: 170076	2012-12-13 01:32:36 +00:00
Akira Hatanaka	cd3dfd238e	[mips] Modify definitions of floating point indexed load and store instructions. No functionality change. llvm-svn: 170075	2012-12-13 01:30:49 +00:00
Akira Hatanaka	b0d4acbc65	[mips] Modify definitions of floating point multiply-add/sub instructions. No functionality change. llvm-svn: 170073	2012-12-13 01:27:48 +00:00
Akira Hatanaka	92994f4846	[mips] Modify definitions of floating point load and store instructions. No functionality change. llvm-svn: 170072	2012-12-13 01:24:00 +00:00
Akira Hatanaka	2b75dde5fa	[mips] Modify definitions of move from/to coprocessor instructions. No functionality change. llvm-svn: 170071	2012-12-13 01:16:49 +00:00
Akira Hatanaka	dea8f61ae0	[mips] Modify definitions of two register operand floating point instructions. No functionality change. llvm-svn: 170069	2012-12-13 01:14:07 +00:00
Akira Hatanaka	29b513871a	[mips] Modify definitions of three register operand floating point instructions and separate encoding information from the rest. llvm-svn: 170066	2012-12-13 01:07:37 +00:00
Jakob Stoklund Olesen	436eea9833	Avoid setIsInsideBundle in Target/R600. This function is going to be removed. llvm-svn: 170064	2012-12-13 00:59:38 +00:00
Akira Hatanaka	84693d5606	[mips] Move classes that do not belong in MipsInstrFormats.td into MipsInstrFPU.td. llvm-svn: 170061	2012-12-13 00:49:23 +00:00
Akira Hatanaka	db49b39200	[mips] Set isCommutable flag in a more explicit way. llvm-svn: 170060	2012-12-13 00:46:23 +00:00
Akira Hatanaka	193e1f738a	[mips] Remove fmt from the parameter list of classes FMADDSUB and FNMADDSUB. llvm-svn: 170057	2012-12-13 00:38:59 +00:00
Akira Hatanaka	caaf4dd516	[mips] Remove single-precision floating point instruction from multiclass FFR2P_M. llvm-svn: 170055	2012-12-13 00:35:54 +00:00
Akira Hatanaka	02ec5516f8	[mips] Move class IsCommutable into MipsInstrInfo.td. llvm-svn: 170054	2012-12-13 00:32:01 +00:00
Akira Hatanaka	e986a59ad9	[mips] Remove single-precision floating point instructions from multiclasses FFR1_W_M and FFR1P_M. The new instruction definitions have one-to-one correspondence with the instructions in the ISA manual. llvm-svn: 170053	2012-12-13 00:29:29 +00:00
Eli Bendersky	b2022f3a5a	Fix a bogus comment llvm-svn: 170052	2012-12-13 00:24:56 +00:00
Akira Hatanaka	7bc144c366	[mips] Fix a memory leak bug report by NAKAMURA Takumi. llvm-svn: 170012	2012-12-12 20:09:58 +00:00
Bill Schmidt	24b8dd6eb7	This patch implements local-dynamic TLS model support for the 64-bit PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill llvm-svn: 170003	2012-12-12 19:29:35 +00:00
Logan Chien	4dd14fb5eb	Add ARM NONE and PREL31 relocation types. Add R_ARM_NONE and R_ARM_PREL31 relocation types to MCExpr. Both of them will be used while generating .ARM.extab and .ARM.exidx sections. llvm-svn: 169965	2012-12-12 07:14:46 +00:00
NAKAMURA Takumi	85292a1338	[CMake] Fixup R600. llvm-svn: 169962	2012-12-12 03:34:26 +00:00
Evan Cheng	962711ee71	Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I mention the inline memcpy / memset expansion code is a mess? This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset. The first indicates whether it is expanding a memset or a memcpy / memmove. The later is whether the memset is a memset of zero. It's totally possible (likely even) that targets may want to do different things for memcpy and memset of zero. llvm-svn: 169959	2012-12-12 02:34:41 +00:00
Evan Cheng	c3d1aca657	- Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term. Also added more comments to explain why it is generally ok to return true. - Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to be true for loaded source (memcpy) or zero constants (memset). The poor name choice is probably some kind of legacy issue. llvm-svn: 169954	2012-12-12 01:32:07 +00:00
Evan Cheng	04e5518783	Avoid using lossy load / stores for memcpy / memset expansion. e.g. f64 load / store on non-SSE2 x86 targets. llvm-svn: 169944	2012-12-12 00:42:09 +00:00
Jim Grosbach	647c702780	Trim unneeded header #include. llvm-svn: 169933	2012-12-11 23:39:51 +00:00
Jim Grosbach	0ddedcc560	ARM: Remove old testing option. Pre-regalloc frame allocation and referencing has been on by default for ages. No need for the testing option that disables it. llvm-svn: 169931	2012-12-11 23:31:12 +00:00
Jim Grosbach	1197889c44	ARM: Remove old testing options. Base pointer referencing has been enabled for ages. llvm-svn: 169930	2012-12-11 23:31:10 +00:00
Evan Cheng	eb54240dc2	Replace TargetLowering::isIntImmLegal() with ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined term for something like integer immediate materialization. It is always possible to materialize an integer immediate. Whether to use it for memcpy expansion is more a "cost" conceern. llvm-svn: 169929	2012-12-11 23:26:14 +00:00
Tom Stellard	75aadc2813	Add R600 backend A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX llvm-svn: 169915	2012-12-11 21:25:42 +00:00
Bill Schmidt	c56f1d34bc	This patch implements the general dynamic TLS model for 64-bit PowerPC. Given a thread-local symbol x with global-dynamic access, the generated code to obtain x's address is: Instruction Relocation Symbol addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x R_PPC64_REL24 __tls_get_addr nop <use address in r3> The implementation borrows from the medium code model work for introducing special forms of ADDIS and ADDI into the DAG representation. This is made slightly more complicated by having to introduce a call to the external function __tls_get_addr. Using the full call machinery is overkill and, more importantly, makes it difficult to add a special relocation. So I've introduced another opcode GET_TLS_ADDR to represent the function call, and surrounded it with register copies to set up the parameter and return value. Most of the code is pretty straightforward. I ran into one peculiarity when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like BL8_NOP_ELF except that it takes another parameter to represent the symbol ("x" above) that requires a relocation on the call. Something in the TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated identically during the emit phase, so this second operand was never visited to generate relocations. This is the reason for the slightly messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding(). Two new tests are included to demonstrate correct external assembly and correct generation of relocations using the integrated assembler. Comments welcome! Thanks, Bill llvm-svn: 169910	2012-12-11 20:30:11 +00:00
Patrik Hagglund	e98b7a0389	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. llvm-svn: 169854	2012-12-11 11:14:33 +00:00
Patrik Hagglund	ad432a8e70	Change TargetLowering::getTypeForExtArgOrReturn to take and return MVTs, instead of EVTs. Accordingly, add bitsLT (and similar) to MVT. llvm-svn: 169850	2012-12-11 10:20:51 +00:00
Patrik Hagglund	03e9628cfa	Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of EVTs. llvm-svn: 169848	2012-12-11 10:09:23 +00:00
Patrik Hagglund	8d2e7cf561	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. llvm-svn: 169845	2012-12-11 09:57:18 +00:00
Patrik Hagglund	3708e548f8	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper funtions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. llvm-svn: 169837	2012-12-11 09:10:33 +00:00
NAKAMURA Takumi	99feb75cb8	[CMake] Remove dependencies to intrinsics_gen I introduced in r169724. llvm-svn: 169819	2012-12-11 05:53:54 +00:00
Jyotsna Verma	92e71918b1	Use multiclass for new-value store instructions with MEMri operand. llvm-svn: 169814	2012-12-11 05:12:25 +00:00
Evan Cheng	c2bd620fac	Stylistic tweak. llvm-svn: 169811	2012-12-11 02:31:57 +00:00
Chad Rosier	df42cf39ab	Fall back to the selection dag isel to select tail calls. This shouldn't affect codegen for -O0 compiles as tail call markers are not emitted in unoptimized compiles. Testing with the external/internal nightly test suite reveals no change in compile time performance. Testing with -O1, -O2 and -O3 with fast-isel enabled did not cause any compile-time or execution-time failures. All tests were performed on my x86 machine. I'll monitor our arm testers to ensure no regressions occur there. In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While it's theoretically true that this is just an optimization, it's an optimization that we very much want to happen even at -O0, or else ARC applications become substantially harder to debug. Part of rdar://12553082 llvm-svn: 169796	2012-12-11 00:18:02 +00:00
Evan Cheng	79e2ca90bc	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicates they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 llvm-svn: 169791	2012-12-10 23:21:26 +00:00
Akira Hatanaka	5d6faed1f0	[mips] Set HWEncoding field of registers. Use delete function getMipsRegisterNumbering and use MCRegisterInfo::getEncodingValue instead. llvm-svn: 169760	2012-12-10 20:04:40 +00:00
Chandler Carruth	867c7bff9a	Revert "Make '-mtune=x86_64' assume fast unaligned memory accesses." Accidental commit... git svn betrayed me. Sorry for the noise. llvm-svn: 169741	2012-12-10 18:23:52 +00:00
Chandler Carruth	7eaa45c738	Make '-mtune=x86_64' assume fast unaligned memory accesses. Summary: Not all chips targeted by x86_64 have this feature, but a dramatically increasing number do. Specifying a chip-specific tuning parameter will continue to turn the feature on or off as appropriate for that particular chip, but the generic flag should try to achieve the best performance on the most widely available hardware. Today, the number of chips with fast UA access dwarfs those without in the x86-64 space. Note that this also brings LLVM's code generation for this '-march' flag more in line with that of modern GCCs. CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D195 llvm-svn: 169740	2012-12-10 18:22:42 +00:00
Chandler Carruth	17f25c4e0d	Fix a typo in my previous commit -- bloomfield is 0x1A not 0x2A. Thanks to the PaX folks for noticing in review! We need some tests here, any sugestions welcome... llvm-svn: 169739	2012-12-10 18:22:40 +00:00
Chandler Carruth	0f58558101	Address a FIXME and update the fast unaligned memory feature for newer Intel chips. The model number rules were determined by inspecting Intel's documentation for their newer chip model numbers. My understanding is that all of the newer Intel chips have fast unaligned memory access, but if anyone is concerned about a particular chip, just shout. No tests updated; it's not clear we have dedicated tests for the chips' various features, but if anyone would like tests (or can point me at some existing ones), I'm happy to oblige. llvm-svn: 169730	2012-12-10 09:18:44 +00:00
NAKAMURA Takumi	6b819c5fb1	[CMake] Update dependencies to intrinsics_gen corresponding to r169711. llvm-svn: 169724	2012-12-10 05:27:15 +00:00
Paul Redmond	2adb13c100	LoopVectorize: support vectorizing intrinsic calls - added function to VectorTargetTransformInfo to query cost of intrinsics - vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc. Reviewed by: Nadav llvm-svn: 169711	2012-12-09 20:42:17 +00:00
Shuxin Yang	95de7c37e2	- Re-enable population count loop idiom recognization - fix a bug which cause sigfault. - add two testing cases which was causing crash llvm-svn: 169687	2012-12-09 03:12:46 +00:00
Chandler Carruth	91e47532fe	Revert the patches adding a popcount loop idiom recognition pass. There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread. This reverts the commits: r169671: Fix a logic error. r169604: Move the popcnt tests to an X86 subdirectory. r168931: Initial commit adding the pass. llvm-svn: 169683	2012-12-08 22:18:29 +00:00
Benjamin Kramer	f242d8c31b	Simplify code. Sort includes. No functionality change. llvm-svn: 169676	2012-12-08 10:45:24 +00:00
Chandler Carruth	1d94e932bc	Fix a use-after-free bug found by ASan. You can't assign a temporary std::string to a StringRef. Moreover, the method being called accepts a Twine to simplify these patterns. Fixes this ASan failure: ==6312== ERROR: AddressSanitizer: heap-use-after-free on address 0x7fd558b1af58 at pc 0xcb7529 bp 0x7fffff572080 sp 0x7fffff572078 READ of size 1 at 0x7fd558b1af58 thread T0 #0 0xcb7528 .../llvm/include/llvm/ADT/StringRef.h:192 llvm::StringRef::operator[]() #1 0x1d53c0a .../llvm/include/llvm/ADT/StringExtras.h:128 llvm::HashString() #2 0x1d53878 .../llvm/lib/Support/StringMap.cpp:64 llvm::StringMapImpl::LookupBucketFor() #3 0x1b6872f .../llvm/include/llvm/ADT/StringMap.h:352 llvm::StringMap<>::GetOrCreateValue<>() #4 0x1b61836 .../llvm/lib/MC/MCContext.cpp:109 llvm::MCContext::GetOrCreateSymbol() #5 0xe9fd47 .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:154 (anonymous namespace)::ARMELFStreamer::EmitMappingSymbol() #6 0xea01dd .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:133 (anonymous namespace)::ARMELFStreamer::EmitDataMappingSymbol() #7 0xe9f78b .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:91 (anonymous namespace)::ARMELFStreamer::EmitBytes() #8 0x1b15d82 .../llvm/lib/MC/MCStreamer.cpp:89 llvm::MCStreamer::EmitIntValue() #9 0xcc0f9b .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:713 llvm::ARMAsmPrinter::emitAttributes() #10 0xcc0d44 .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:632 llvm::ARMAsmPrinter::EmitStartOfAsmFile() #11 0x14692ad .../llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:162 llvm::AsmPrinter::doInitialization() #12 0x1bc4677 .../llvm/lib/VMCore/PassManager.cpp:1561 llvm::FPPassManager::doInitialization() #13 0x1bc4990 .../llvm/lib/VMCore/PassManager.cpp:1595 llvm::MPPassManager::runOnModule() #14 0x1bc55e5 .../llvm/lib/VMCore/PassManager.cpp:1705 llvm::PassManagerImpl::run() #15 0x1bc5878 .../llvm/lib/VMCore/PassManager.cpp:1740 llvm::PassManager::run() #16 0xc3954d .../llvm/tools/llc/llc.cpp:378 compileModule() #17 0xc38001 .../llvm/tools/llc/llc.cpp:194 main #18 0x7fd557d6a11c __libc_start_main 0x7fd558b1af58 is located 24 bytes inside of 29-byte region [0x7fd558b1af40,0x7fd558b1af5d) freed by thread T0 here: #0 0xc337da .../llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:56 operator delete() #1 0x1ee9cef .../libstdc++-v3/include/bits/basic_string.h:535 std::string::~string() #2 0xea01dd .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:133 (anonymous namespace)::ARMELFStreamer::EmitDataMappingSymbol() #3 0xe9f78b .../llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp:91 (anonymous namespace)::ARMELFStreamer::EmitBytes() #4 0x1b15d82 .../llvm/lib/MC/MCStreamer.cpp:89 llvm::MCStreamer::EmitIntValue() #5 0xcc0f9b .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:713 llvm::ARMAsmPrinter::emitAttributes() #6 0xcc0d44 .../llvm/lib/Target/ARM/ARMAsmPrinter.cpp:632 llvm::ARMAsmPrinter::EmitStartOfAsmFile() #7 0x14692ad .../llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:162 llvm::AsmPrinter::doInitialization() #8 0x1bc4677 .../llvm/lib/VMCore/PassManager.cpp:1561 llvm::FPPassManager::doInitialization() #9 0x1bc4990 .../llvm/lib/VMCore/PassManager.cpp:1595 llvm::MPPassManager::runOnModule() #10 0x1bc55e5 .../llvm/lib/VMCore/PassManager.cpp:1705 llvm::PassManagerImpl::run() #11 0x1bc5878 .../llvm/lib/VMCore/PassManager.cpp:1740 llvm::PassManager::run() #12 0xc3954d .../llvm/tools/llc/llc.cpp:378 compileModule() #13 0xc38001 .../llvm/tools/llc/llc.cpp:194 main #14 0x7fd557d6a11c __libc_start_main llvm-svn: 169668	2012-12-08 03:10:14 +00:00
Bill Wendling	e94d843e43	s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future. llvm-svn: 169651	2012-12-07 23:16:57 +00:00
Nadav Rotem	ad0b5fbe8c	When we use the BLEND instruction that uses the MSB as a mask, we can remove the VSRI instruction before it since it does not affect the MSB. Thanks Craig Topper for suggesting this. llvm-svn: 169638	2012-12-07 21:43:11 +00:00
Matthew Curtis	7a93811e8b	In hexagon convertToHardwareLoop, don't deref end() iterator In particular, check if MachineBasicBlock::iterator is end() before using it to call getDebugLoc(); See also this thread on llvm-commits: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155914.html llvm-svn: 169634	2012-12-07 21:03:15 +00:00
Nadav Rotem	481e50efe0	X86: Prefer using VPSHUFD over VPERMIL because it has better throughput. llvm-svn: 169624	2012-12-07 19:01:13 +00:00
Tim Northover	5cc3dc86bb	Added Mapping Symbols for ARM ELF Before this patch, when you objdump an LLVM-compiled file, objdump tried to decode data-in-code sections as if they were code. This patch adds the missing Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D). Patch based on work by Greg Fitzgerald. llvm-svn: 169609	2012-12-07 16:50:23 +00:00
Jakob Stoklund Olesen	97030e0c0e	Use the new MIBundleBuilder class in the Mips target. This is the preferred way of creating bundled machine instructions. llvm-svn: 169585	2012-12-07 04:23:40 +00:00
Akira Hatanaka	efdce0fb09	[mips] Delete nodes and instructions for dynamic alloca that are no longer in use. llvm-svn: 169580	2012-12-07 03:10:18 +00:00
Akira Hatanaka	97e179f9e4	[mips] Shorten predicate name. llvm-svn: 169579	2012-12-07 03:06:09 +00:00
Akira Hatanaka	c5dc055922	[mips] Delete unused sub-target features. llvm-svn: 169578	2012-12-07 03:04:05 +00:00
Akira Hatanaka	02a346d11f	[mips] Remove unnecessary predicates. llvm-svn: 169577	2012-12-07 03:01:24 +00:00
Matt Beaumont-Gay	4a04c92001	Add a 'using' declaration to suppress GCC's -Woverloaded-virtual while we decide what pattern we want to follow in the future. llvm-svn: 169561	2012-12-06 23:15:36 +00:00
Evan Cheng	9ec512d768	Replace r169459 with something safer. Rather than having computeMaskedBits to understand target implementation of any_extend / extload, just generate zero_extend in place of any_extend for liveouts when the target knows the zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz). rdar://12771555 llvm-svn: 169536	2012-12-06 19:13:27 +00:00
Jakub Staszak	40ee5674cd	Remove unneeded function, since PR8156 was fixed over a year ago. llvm-svn: 169534	2012-12-06 19:05:46 +00:00
Jakub Staszak	65ca2fb9e6	Simplify code. llvm-svn: 169521	2012-12-06 18:22:59 +00:00
Craig Topper	216bcd522b	Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions. llvm-svn: 169482	2012-12-06 07:31:16 +00:00
Craig Topper	922f10aec4	Mark MOVDQ(A/U)rm as ReMaterializable. Mark all MOVDQ(A/U) instructions as neverHasSideEffects. llvm-svn: 169477	2012-12-06 06:49:16 +00:00

... 4 5 6 7 8 ...

23209 Commits