llvm-project

Commit Graph

Author	SHA1	Message	Date
Jim Grosbach	cb540f5cff	ARM: Define generic HINT instruction. The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/ a different immediate value in bits [7,0]. Define a generic HINT instruction and refactor NOP, WFE, WFI, SEV and YIELD to be assembly aliases of that. rdar://11600518 llvm-svn: 158674	2012-06-18 19:45:50 +00:00
Nuno Lopes	b7c941bad9	add the 'alloc' metadata node to represent the size and offset of buffers pointed to by pointers. This metadata can be attached to any instruction returning a pointer llvm-svn: 158660	2012-06-18 16:04:04 +00:00
Joel Jones	3237ce737e	This change handles another case for generating the bic instruction when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't need to be handled, as the 8 bit constants are encoded directly, thereby not needing a separate load instruction to form the constant into a register. <rdar://problem/11481151> llvm-svn: 158659	2012-06-18 14:51:32 +00:00
Chandler Carruth	a1da0bf5ef	Add a regression test for the bug exposed by r158087, which has been temporarily reverted. This test is annoyingly overspecified, but I don't know of another way to thoroughly test the saving and restoring of the registers. While this will have to be adjusted even with the issue fixed in order to re-apply r158087, those adjustments should very clearly indicate that it is still correct (%esp getting restored prior to pops), whereas without it, this case can easily slip under the radar. Still, any suggestions for improvements are very welcome. All credit to Matt Beaumont-Gay for reducing this out of an insane Address Sanitizer crash to a reasonably small seg-faulting C program when built with -mstackrealign. I just reduced it to IR, which was much simpler. =] llvm-svn: 158656	2012-06-18 09:15:04 +00:00
Chandler Carruth	2cc11fd8c7	Temporarily revert r158087. This patch causes problems when both dynamic stack realignment and dynamic allocas combine in the same function. With this patch, we no longer build the epilog correctly, and silently restore registers from the wrong position in the stack. Thanks to Matt for tracking this down, and getting at least an initial test case to Chad. I'm going to try to check a variation of that test case in so we can easily track the fixes required. llvm-svn: 158654	2012-06-18 07:03:12 +00:00
Pete Cooper	33ee6c9bf1	Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts llvm-svn: 158623	2012-06-17 03:58:26 +00:00
Hal Finkel	6261c2dc28	Cleanup trip-count finding for PPC CTR loops (and some bug fixes). This cleans up the method used to find trip counts in order to form CTR loops on PPC. This refactoring allows the pass to find loops which have a constant trip count but also happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different classes of loops that are currently ignored. In addition, we now search through all potential induction operations instead of just the first. Also, we check the predicate code on the conditional branch and abort the transformation if the code is not EQ or NE, and we then make sure that the branch to be transformed matches the condition register defined by the comparison (multiple possible comparisons will be considered). llvm-svn: 158607	2012-06-16 20:34:07 +00:00
Hal Finkel	fa103d3fc7	Teach BBVectorize to combine, when possible, or discard metadata when fusing instructions. The present implementation handles only TBAA and FP metadata, discarding everything else. For debug metadata, the current behavior is maintained (the debug metadata associated with one of the instructions will be kept, discarding that attached to the other). This should address PR 13040. llvm-svn: 158606	2012-06-16 20:34:06 +00:00
Rafael Espindola	f70bea93e2	Implement irpc. Extracted from a patch by the PaX team. I just added the test. llvm-svn: 158604	2012-06-16 18:03:25 +00:00
Pete Cooper	818e9f4a26	Fix crash from r158529 on Bullet. Dynamic GEPs created by SROA needed to insert extra "i32 0" operands to index through structs and arrays to get to the vector being indexed. llvm-svn: 158590	2012-06-16 01:43:26 +00:00
Andrew Trick	e67a30c77f	Unit test for LSR kind=Special fix: r158536. llvm-svn: 158570	2012-06-15 22:46:31 +00:00
Kevin Enderby	6c7279ec2e	Fix the encoding of the armv7m (MClass) for MSR registers other than aspr, iaspr, espr and xpsr which also needed to have 0b10 in their mask encoding bits. llvm-svn: 158560	2012-06-15 22:14:44 +00:00
Manman Ren	e0763c7472	ARM: optimization for sub+abs. This patch will optimize abs(x-y) FROM sub, movs, rsbmi TO subs, rsbmi For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 llvm-svn: 158551	2012-06-15 21:32:12 +00:00
Pete Cooper	e24d6a19e3	Allow SROA to split up an array of vectors into multiple vectors, even when the vectors are dynamically indexed llvm-svn: 158529	2012-06-15 18:07:29 +00:00
Rafael Espindola	1821c6c3b0	Some optimizations done by globalopt are safe only for internal linkage, not linkonce linkage. For example, it is not valid to add unnamed_addr. This also fixes a crash in g++.dg/opt/static5.C. llvm-svn: 158528	2012-06-15 18:00:24 +00:00
Jakob Stoklund Olesen	a15a224db0	Preserve <undef> flags in ARMExpandPseudo. This probably mostly shows up in bugpoint-generated code. llvm-svn: 158527	2012-06-15 17:46:54 +00:00
Rafael Espindola	768b41c17a	Factor macro argument parsing into helper methods and add support for .irp. Patch extracted from a larger one by the PaX team. I added the testcases and tightened error handling a bit. llvm-svn: 158523	2012-06-15 14:02:34 +00:00
Duncan Sands	7838603ffc	Fix issues (infinite loop and/or crash) with self-referential instructions, for example degenerate phi nodes and binops that use themselves in unreachable code. Thanks to Charles Davis for the testcase that uncovered this can of worms. llvm-svn: 158508	2012-06-15 08:37:50 +00:00
Pete Cooper	1d1fa72837	Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct llvm-svn: 158479	2012-06-14 23:53:53 +00:00
Rafael Espindola	def1b09be2	Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and globaldce. Globaldce was already removing linkonce globals, but globalopt was not. llvm-svn: 158476	2012-06-14 22:48:13 +00:00
Akira Hatanaka	d8ab16b86f	1. introduce MipsPat in place of Pat in order to exclude those from being used by Mips16 or Micro Mips 2. clean up a few lines too long encountered Patch by Reed Kotler. llvm-svn: 158470	2012-06-14 21:03:23 +00:00
Akira Hatanaka	1b420ac4c8	Make machine verifier check the first instruction of the last bundle instead of the last instruction of a basic block. llvm-svn: 158468	2012-06-14 20:51:13 +00:00
Pete Cooper	5d19452f3f	Revert r158454: Allow SROA to look at a vector type... It's breaking the vectorise buildbot This reverts commit 12c1f86ffa731e2952c80d2cc577000c96b8962c. llvm-svn: 158462	2012-06-14 18:32:52 +00:00
Pete Cooper	a7e6d58a87	Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct llvm-svn: 158454	2012-06-14 16:38:13 +00:00
Richard Barton	b0ec375b96	Replace assertion failure for badly formatted CPS instruction with error message. llvm-svn: 158445	2012-06-14 10:48:04 +00:00
Manman Ren	2764301a77	Revert: test/CodeGen/ARM/iabs.ll in r158441 Sorry that I accidentally checked in this file with my previous commit. llvm-svn: 158442	2012-06-14 06:04:02 +00:00
Manman Ren	c2bc2d106b	InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y). uno && ueq was converted to ueq, it should be converted to uno. llvm-svn: 158441	2012-06-14 05:57:42 +00:00
Akira Hatanaka	c6496e2cb6	Test case for MIPS long branch pass. llvm-svn: 158438	2012-06-14 02:12:21 +00:00
Akira Hatanaka	843aca9328	Fix test cases. llvm-svn: 158435	2012-06-14 01:21:00 +00:00
Akira Hatanaka	df5205ef3d	Implement a DAGCombine in MipsISelLowering.cpp which transforms the following pattern: (add v0, (add v1, abs_lo(tjt))) => (add (add v0, v1), abs_lo(tjt)) "tjt" is a TargetJumpTable node. llvm-svn: 158419	2012-06-13 20:33:18 +00:00
Akira Hatanaka	1daf8c2a16	Set a higher value for maxStoresPerMemcpy in MipsISelLowering.cpp. llvm-svn: 158414	2012-06-13 19:33:32 +00:00
Akira Hatanaka	f0273603f5	Implement fastcc calling convention for MIPS. llvm-svn: 158410	2012-06-13 18:06:00 +00:00
Richard Osborne	ab7d788eb5	Fix pattern for MKMSK instruction. llvm-svn: 158409	2012-06-13 17:59:12 +00:00
Pete Cooper	e2fe809772	Revert "Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access" This reverts commit 51786e0aaec76b973205066bd44f7f427b21969f. llvm-svn: 158408	2012-06-13 17:55:22 +00:00
Pete Cooper	e1d4e8b563	Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access llvm-svn: 158407	2012-06-13 17:30:34 +00:00
Duncan Sands	409d8ae165	It is possible for several constants which aren't individually absorbing to combine to the absorbing element. Thanks to nbjoerg on IRC for pointing this out. llvm-svn: 158399	2012-06-13 12:15:56 +00:00
Craig Topper	71dc02d659	Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them. llvm-svn: 158396	2012-06-13 07:18:53 +00:00
Manman Ren	d33f4efbfd	SimplifyCFG: fold unconditional branch to its predecessor if profitable. This patch extends FoldBranchToCommonDest to fold unconditional branches. For unconditional branches, we fold them if it is easy to update the phi nodes in the common successors. rdar://10554090 llvm-svn: 158392	2012-06-13 05:43:29 +00:00
Akira Hatanaka	5fa541231b	disable use of directive .set nomicromips until this directive is pushed in gas to open source fsf Patch by Reed Kotler. llvm-svn: 158381	2012-06-13 02:41:14 +00:00
Andrew Trick	344fb64fa3	sched: fix latency of memory dependence chain edges for consistency. For store->load dependencies that may alias, we should always use TrueMemOrderLatency, which may eventually become a subtarget hook. In effect, we should guarantee at least TrueMemOrderLatency on at least one DAG path from a store to a may-alias load. This should fix the standard mode as well as -enable-aa-sched-mi. llvm-svn: 158380	2012-06-13 02:39:03 +00:00
Duncan Sands	67cd591989	Use std::map rather than SmallMap because SmallMap assumes that the value has POD type, causing memory corruption when mapping to APInts with bitwidth > 64. Merge another crash testcase into crash.ll while there. llvm-svn: 158369	2012-06-12 20:16:51 +00:00
Chad Rosier	c6916f88a8	[arm-fast-isel] Add support for -arm-long-calls. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 158368	2012-06-12 19:25:13 +00:00
Duncan Sands	d7aeefebd6	Now that Reassociate's LinearizeExprTree can look through arbitrary expression topologies, it is quite possible for a leaf node to have huge multiplicity, for example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x raised to a vast power (the multiplicity, or weight, of x). This patch fixes the computation of weights by correctly computing them no matter how big they are, rather than just overflowing and getting a wrong value. It turns out that the weight for a value never needs more bits to represent than the value itself, so it is enough to represent weights as APInts of the same bitwidth and do the right overflow-avoiding dance steps when computing weights. As a side-effect it reduces the number of multiplies needed in some cases of large powers. While there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree static, pushing the rank computation out into users. This is progress towards fixing PR13021. llvm-svn: 158358	2012-06-12 14:33:56 +00:00
Jakob Stoklund Olesen	e782fa649f	Fix test that depends on register allocation. The test is really checking the prolog/epilog load/store multiple formation. llvm-svn: 158328	2012-06-11 21:14:28 +00:00
Jakob Stoklund Olesen	4e28777465	Fix test case to work on ARM. Patch by James Benton! llvm-svn: 158316	2012-06-11 16:01:14 +00:00
Bill Wendling	4b79647a6e	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Benjamin Kramer	8b8a76974f	InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare. This saves a cast, and zext is more expensive on platforms with subreg support than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750. On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the same performance now when not inlining either function. stupid_memchr: 323.0us bsd_memchr: 321.0us memchr: 479.0us where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time, I haven't fully understood the issue yet, something is grossly mangling the loop after inlining. llvm-svn: 158297	2012-06-10 20:35:00 +00:00
Hal Finkel	4e9f1a859f	Enable ILP scheduling for all nodes by default on PPC. Over the entire test-suite, this has an insignificantly negative average performance impact, but reduces some of the worst slowdowns from the anti-dep. change (r158294). Largest speedups: SingleSource/Benchmarks/Stanford/Quicksort - 28% SingleSource/Benchmarks/Stanford/Towers - 24% SingleSource/Benchmarks/Shootout-C++/matrix - 23% MultiSource/Benchmarks/SciMark2-C/scimark2 - 19% MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15% (matrix and automotive-bitcount were both in the top-5 slowdown list from the anti-dep. change) Largest slowdowns: MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28% MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26% MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21% SingleSource/Benchmarks/CoyoteBench/lpbench - 20% MultiSource/Applications/d/make_dparser - 16% llvm-svn: 158296	2012-06-10 19:32:29 +00:00
Nadav Rotem	17ee58a792	Add AutoUpgrade support for the SSE4 ptest intrinsics. Patch by Michael Kuperstein. llvm-svn: 158295	2012-06-10 18:42:51 +00:00
Hal Finkel	2edfbddcf0	Improve ext/trunc patterns on PPC64. The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that would leave self-moves in the final assembly. Replacing those patterns with ones based on the SUBREG builtins yields better-looking code. Thanks to Jakob and Owen for their suggestions in this matter. llvm-svn: 158283	2012-06-09 22:10:19 +00:00
Craig Topper	3352ba55b9	Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument. llvm-svn: 158278	2012-06-09 16:46:13 +00:00
Hal Finkel	eb50c2d4a4	Enable tail merging on PPC. Tail merging had been disabled on PPC because it would disturb bundling decisions made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions are made during post-RA scheduling, and tail merging is generally beneficial (the average test-suite speedup is insignificantly positive). Largest test-suite speedups: MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30% MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23% SingleSource/Benchmarks/Shootout-C++/ary - 21% SingleSource/Benchmarks/Stanford/Queens - 17% Largest slowdowns: MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24% MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22% MultiSource/Applications/JM/ldecod/ldecod - 14% MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9% This is improved by using full (instead of just critical) anti-dependency breaking, but doing so still causes miscompiles and so cannot yet be enabled by default. llvm-svn: 158259	2012-06-09 03:14:50 +00:00
Jakob Stoklund Olesen	33a1b416ac	Don't run RAFast in the optimizing regalloc pipeline. The fast register allocator is not supposed to work in the optimizing pipeline. It doesn't make sense to compute live intervals, run full copy coalescing, and then run RAFast. Fast register allocation in the optimizing pipeline is better done by RABasic. llvm-svn: 158242	2012-06-08 23:15:12 +00:00
Nuno Lopes	2710f1b049	canonicalize: -%a + 42 into 42 - %a previously we were emitting: -(%a + 42) This fixes the infinite loop in PR12338. The generated code is still not perfect, though. Will work on that next llvm-svn: 158237	2012-06-08 22:30:05 +00:00
Hal Finkel	c6b5debb40	Enable PPC CTR loop formation by default. Thanks to Jakob's help, this now causes no new test suite failures! Over the entire test suite, this gives an average 1% speedup. The largest speedups are: SingleSource/Benchmarks/Misc/pi - 108% SingleSource/Benchmarks/CoyoteBench/lpbench - 54% MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50% SingleSource/Benchmarks/Shootout/ary3 - 32% SingleSource/Benchmarks/Shootout-C++/matrix - 30% The largest slowdowns are: MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30% MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25% MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22% MultiSource/Applications/d/make_dparser - -14% SingleSource/Benchmarks/Shootout-C++/ary - -13% In light of these slowdowns, additional profiling work is obviously needed! llvm-svn: 158223	2012-06-08 19:19:53 +00:00
Manman Ren	bf86b295bb	Test case for r158160 llvm-svn: 158218	2012-06-08 18:42:37 +00:00
Chad Rosier	3d464d8068	Fix a crash in APInt::lshr when shiftAmt > BitWidth. Patch by James Benton <jbenton@vmware.com>. llvm-svn: 158213	2012-06-08 18:04:52 +00:00
NAKAMURA Takumi	5412cef77d	test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911. llvm-svn: 158209	2012-06-08 16:28:06 +00:00
Hal Finkel	821e00121c	Disable the PPC CTR-Loops pass by default. The pass itself works well, but something in the Machine* infrastructure does not understand terminators which define registers. Without the ability to use the block-placement pass, etc. this causes performance regressions (and so is turned off by default). Turning off the analysis turns off the problems with the Machine* infrastructure. llvm-svn: 158206	2012-06-08 15:38:25 +00:00
Hal Finkel	8b01503ee5	Fix a bug in the new PPC CTR-Loops pass. The code which tests for an induction operation cannot assume that any ADDI instruction will have a register operand because the operand could also be a frame index; for example: %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16 llvm-svn: 158205	2012-06-08 15:38:23 +00:00
Hal Finkel	96c2d4d945	Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. llvm-svn: 158204	2012-06-08 15:38:21 +00:00
Duncan Sands	9a5cf92250	Revert commit 158073 while waiting for a fix. The issue is that reassociate can move instructions within the instruction list. If the instruction just happens to be the one the basic block iterator is pointing to, and it is moved to a different basic block, then we get into an infinite loop due to the iterator running off the end of the basic block (for some reason this doesn't fire any assertions). Original commit message: Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. llvm-svn: 158199	2012-06-08 13:37:30 +00:00
Manman Ren	2cdc8afccf	X86: optimize generated code for integer ABS This patch will generate the following for integer ABS: movl %edi, %eax negl %eax cmovll %edi, %eax INSTEAD OF movl %edi, %ecx sarl $31, %ecx leal (%rdi,%rcx), %eax xorl %ecx, %eax There exists a target-independent DAG combine for integer ABS, which converts integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. This is implemented in PerformXorCombine. rdar://10695237 llvm-svn: 158175	2012-06-07 22:39:10 +00:00
Nadav Rotem	4e50efead6	Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector elements, which may disagree with the select condition type. llvm-svn: 158166	2012-06-07 20:28:57 +00:00
Rafael Espindola	55d1145bd5	Use a base register instead of an index register with the local dynamic model. Fixes pr13048. llvm-svn: 158158	2012-06-07 18:39:19 +00:00
Meador Inge	161d5bb6f7	Adding a missing -S to the opt invocation. llvm-svn: 158128	2012-06-07 01:02:13 +00:00
Manman Ren	ae02c5a93e	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 158126	2012-06-07 00:42:47 +00:00
Bill Wendling	3e8cf2be79	Spell optimization name correctly. llvm-svn: 158123	2012-06-06 23:53:23 +00:00
Manman Ren	9c9641812c	Revert r157755. The commit is intended to fix rdar://11540023. It is implemented as part of peephole optimization. We can actually implement this in the SelectionDAG lowering phase. llvm-svn: 158122	2012-06-06 23:53:03 +00:00
Bill Wendling	618547a9c6	Another testcase for r156548. <rdar://problem/10889741> llvm-svn: 158121	2012-06-06 23:36:22 +00:00
Chad Rosier	5d6f01ad77	Add support for dynamic stack realignment in the presence of dynamic allocas on X86. rdar://11496434 llvm-svn: 158087	2012-06-06 17:37:40 +00:00
Chad Rosier	faa3894628	Fix combine of uno && ord -> false so that the ordering of the fcmps doesn't matter. rdar://11579835 llvm-svn: 158084	2012-06-06 17:22:40 +00:00
Duncan Sands	763da45e9e	Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. llvm-svn: 158073	2012-06-06 14:53:10 +00:00
Richard Barton	f1ef87ddbb	Correct decoder for T1 conditional B encoding llvm-svn: 158055	2012-06-06 09:12:53 +00:00
Chad Rosier	280e5df2ac	Remove extraneous CHECK-NOTs from previous commit and add a new test case. llvm-svn: 158045	2012-06-06 02:12:17 +00:00
Chad Rosier	1de1b54e72	FileCheckize this test. llvm-svn: 158044	2012-06-06 01:38:32 +00:00
Joel Jones	7f2ac7a2c8	Revert commit r157966 llvm-svn: 157972	2012-06-05 00:47:21 +00:00
Joel Jones	d08534f82e	This change handles another case for generating the bic instruction when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. <rdar://problem/11481151> llvm-svn: 157966	2012-06-04 23:38:57 +00:00
Rafael Espindola	47d988c54c	When gvn decides to replace an instruction with another, we have to patch the replacement to make it at least as generic as the instruction being replaced. This includes: * dropping nsw/nuw flags * getting the least restrictive tbaa and fpmath metadata * merging ranges Fixes PR12979. llvm-svn: 157958	2012-06-04 22:44:21 +00:00
Akira Hatanaka	3ee0405231	Add a test case for mips64 unaligned load/store instructions. llvm-svn: 157939	2012-06-04 17:57:06 +00:00
Akira Hatanaka	b964932e70	Rename test/CodeGen/Mips/load-shift-left-right.ll. llvm-svn: 157938	2012-06-04 17:50:36 +00:00
Roman Divacky	e3f15c98d1	Implement local-exec TLS on PowerPC. llvm-svn: 157935	2012-06-04 17:36:38 +00:00
Nadav Rotem	b7bb72e4f3	Remove the "-promote-elements" flag. This flag is now enabled by default. llvm-svn: 157925	2012-06-04 11:27:21 +00:00
Hal Finkel	595817eebe	Enable generating PPC pre-increment (r+imm) instructions by default. It seems that this no longer causes test suite failures on PPC64 (after r157159), and often gives a performance benefit, so it can be enabled by default. llvm-svn: 157911	2012-06-04 02:21:00 +00:00
Craig Topper	79dbb0c6e4	Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang. llvm-svn: 157903	2012-06-03 18:58:46 +00:00
Craig Topper	fd53b80219	Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit. llvm-svn: 157898	2012-06-03 07:26:46 +00:00
Manman Ren	5097e4f38a	Revert r157831 llvm-svn: 157896	2012-06-03 03:14:24 +00:00
Craig Topper	29eafea292	Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior. llvm-svn: 157895	2012-06-03 01:40:43 +00:00
Manman Ren	be10421c17	ARM: add testing case for struct byval rdar://9877866 llvm-svn: 157876	2012-06-02 05:37:44 +00:00
Akira Hatanaka	27512b167b	Add another test case which tests Mips' unaligned load/store instructions. llvm-svn: 157874	2012-06-02 01:13:10 +00:00
Akira Hatanaka	63c0e2c58c	Fix test cases in test/CodeGen/Mips. llvm-svn: 157868	2012-06-02 00:05:45 +00:00
Rafael Espindola	103c2cfbbd	Use dominates(Instruction, Use) in the verifier. This removes a bit of context from the verifier errors, but reduces code duplication in a fairly critical part of LLVM and makes dominates easier to test. llvm-svn: 157845	2012-06-01 21:56:26 +00:00
Manman Ren	879ca9d47d	X86: peephole optimization to remove cmp instruction This patch will optimize the following: sub r1, r3 cmp r3, r1 or cmp r1, r3 bge L1 TO sub r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can eliminate the "cmp" instruction. llvm-svn: 157831	2012-06-01 19:49:33 +00:00
Rafael Espindola	2c3f63cbda	Add some tests checking that the verifier rejects cases where a definition doesn't dominate a use. llvm-svn: 157829	2012-06-01 19:24:57 +00:00
Chris Lattner	b1359894f3	testcase for PR13006, thanks to Duncan for filing it. llvm-svn: 157824	2012-06-01 18:19:46 +00:00
Nuno Lopes	adf1c859dd	BoundsChecking: fix a bug when the handling of recursive PHIs failed and could leave dangling references in the cache add regression tests for this problem. Can already compile & run: PHP, PCRE, and ICU (i.e., all the software I tried) llvm-svn: 157822	2012-06-01 17:43:31 +00:00
Hans Wennborg	789acfb63d	Implement the local-dynamic TLS model for x86 (PR3985) This implements codegen support for accesses to thread-local variables using the local-dynamic model, and adds a clean-up pass so that the base address for the TLS block can be re-used between local-dynamic access on an execution path. llvm-svn: 157818	2012-06-01 16:27:21 +00:00
Craig Topper	00649d5111	Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enable intrinsic use at least though. llvm-svn: 157804	2012-06-01 06:07:48 +00:00
Chris Lattner	466076b95f	enhance the logic for looking through tailcalls to look through transparent casts in multiple-return value scenarios, like what happens on X86-64 when returning small structs. llvm-svn: 157800	2012-06-01 05:29:15 +00:00
Chris Lattner	182fe3eef1	enhance getNoopInput to know about vector<->vector bitcasts of legal types, as well as int<->ptr casts. This allows us to tailcall functions with some trivial casts between the call and return (i.e. because the return types disagree). llvm-svn: 157798	2012-06-01 05:16:33 +00:00
Chris Lattner	22afea7689	add some simple 64-bit tail call tests. llvm-svn: 157797	2012-06-01 05:03:31 +00:00
Chris Lattner	21b1e6bbdc	merge some tests. llvm-svn: 157795	2012-06-01 05:00:54 +00:00
Chris Lattner	d82ae12d8c	rename test llvm-svn: 157794	2012-06-01 04:58:50 +00:00
Eric Christopher	1cf3338bb4	Add support for enum forward declarations. Part of rdar://11570854 llvm-svn: 157786	2012-06-01 00:22:32 +00:00
Nuno Lopes	288e86ff6b	add -bounds-checking-multiple-traps option to make one trap BB per check disabled by default for now; we can discuss the default value (& name) later llvm-svn: 157777	2012-05-31 22:58:48 +00:00
Nuno Lopes	7d00061d87	revamp BoundsChecking considerably: - compute size & offset at the same time. The side-effects of this are that we now support negative GEPs. It's now approaching a phase that it can be reused by other passes (e.g., lowering of the objectsize intrinsic) - use APInt throughout to handle wrap-arounds - add support for PHI instrumentation - add a cache (required for recursive PHIs anyway) - remove hoisting support for now, since it was wrong in a few cases sorry for the churn here.. tests will follow soon. llvm-svn: 157775	2012-05-31 22:45:40 +00:00
Owen Anderson	ff458f89aa	Make this testcase independent of register allocation. llvm-svn: 157761	2012-05-31 18:07:02 +00:00
Manman Ren	9bccb64e56	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 157755	2012-05-31 17:20:29 +00:00
Rafael Espindola	e3c5f3e5b1	Fix typos noticed by Benjamin Kramer. Also make the checks stronger and test that we reject ranges that overlap a previous wrapped range. llvm-svn: 157749	2012-05-31 16:04:26 +00:00
Rafael Espindola	97d7787788	Require intervals in the range metadata to be in a canonical form: They must be non contiguous, non overlapping and sorted by the lower end. While this is technically a backward incompatibility, every frontend currently produces range metadata with a single interval and we don't have any pass that merges intervals yet, so no existing bitcode files should be rejected by this. llvm-svn: 157741	2012-05-31 13:45:46 +00:00
Elena Demikhovsky	602f3a26d6	Added FMA3 Intel instructions. I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks. I added tests for CodeGen and intrinsics. I did not change llvm.fma.f32/64 - it may be done later. llvm-svn: 157737	2012-05-31 09:20:20 +00:00
Duncan Sands	339bb61e32	Enhance the sinking code to handle diamond patterns. Patch by Carlo Alberto Ferraris. llvm-svn: 157736	2012-05-31 08:09:49 +00:00
Craig Topper	c1ac05dad5	Add intrinsic for pclmulqdq instruction. llvm-svn: 157731	2012-05-31 04:37:40 +00:00
Akira Hatanaka	c13ed945aa	Add lit.local.cfg to run the tests in test/MC/Disassembler/Mips. llvm-svn: 157725	2012-05-31 00:49:56 +00:00
Jakob Stoklund Olesen	05e2245fc6	Prioritize smaller register classes for urgent evictions. It helps compile exotic inline asm. In the test case, normal GR32 virtual registers use up eax-edx so the final GR32_ABCD live range has no registers left. Since all the live ranges were tiny, we had no way of prioritizing the smaller register class. This patch allows tiny unspillable live ranges to be evicted by tiny unspillable live ranges from a smaller register class. <rdar://problem/11542429> llvm-svn: 157715	2012-05-30 21:46:58 +00:00
Eric Christopher	f481ab3877	Add support for the mips inline asm 'm' output modifier. Patch by Jack Carter. llvm-svn: 157709	2012-05-30 19:05:19 +00:00
Owen Anderson	0eda3e1de6	Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention. llvm-svn: 157708	2012-05-30 18:54:50 +00:00
Owen Anderson	c7aaf523e1	Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node. llvm-svn: 157707	2012-05-30 18:50:39 +00:00
Benjamin Kramer	50b26ebb2b	Teach SCEV's icmp simplification logic that a-b == 0 is equivalent to a == b. This also required making recursive simplifications until nothing changes or a hard limit (currently 3) is hit. With the simplification in place indvars can canonicalize loops of the form for (unsigned i = 0; i < a-b; ++i) into for (unsigned i = 0; i != a-b; ++i) which used to fail because SCEV created a weird umax expr for the backedge taken count. llvm-svn: 157701	2012-05-30 18:32:23 +00:00
Chris Lattner	1622a99e58	it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more. llvm-svn: 157699	2012-05-30 18:08:02 +00:00
Chris Lattner	04d722a68d	Extend the (abi-irrelevant) return convention to be able to return more than two values in integer registers. This is already supported by the fastcc convention, but it doesn't hurt to support it in the standard conventions as well. In cases where we can cheat at the calling convention, this allows us to avoid returning things through memory in more cases. llvm-svn: 157698	2012-05-30 17:50:14 +00:00
Chad Rosier	820d248c4d	[arm-fast-isel] Add support for the llvm.frameaddress() intrinsic. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 157696	2012-05-30 17:23:22 +00:00
Kostya Serebryany	9024160439	[asan] instrument cmpxchg and atomicrmw llvm-svn: 157683	2012-05-30 09:04:06 +00:00
Andrew Trick	a3f9043196	SCEV: Handle a corner case reducing AddRecExpr * AddRecExpr If integer overflow causes one of the terms to reach zero, that can force the entire expression to zero. Fixes PR12929: cast<Ty>() argument of incompatible type llvm-svn: 157673	2012-05-30 03:35:20 +00:00
Evan Cheng	bc2453dd3d	Teach taildup to update livein set. rdar://11538365 llvm-svn: 157663	2012-05-30 00:42:39 +00:00
Bob Wilson	33e5188c27	Add an insertPass API to TargetPassConfig. <rdar://problem/11498613> Besides adding the new insertPass function, this patch uses it to enhance the existing -print-machineinstrs so that the MachineInstrs after a specific pass can be printed. Patch by Bin Zeng! llvm-svn: 157655	2012-05-30 00:17:12 +00:00
Benjamin Kramer	ef479ea854	Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions. This required light surgery on the assembler and disassembler because the instructions use an uncommon encoding. They are the only two instructions in x86 that use register operands and two immediates. llvm-svn: 157634	2012-05-29 19:05:25 +00:00
Peter Collingbourne	913869be45	Add llvm.fabs intrinsic. llvm-svn: 157594	2012-05-28 21:48:37 +00:00
Benjamin Kramer	b8743a9150	InstCombine: Fix infinite loop when encountering switch on trivial icmp. The test case feeds the following into InstCombine's visitSelect: %tobool8 = icmp ne i32 0, 0 %phitmp = select i1 %tobool8, i32 3, i32 0 Then instcombine replaces the right side of the switch with 0, doesn't notice that nothing changes and tries again indefinitely. This fixes PR12897. llvm-svn: 157587	2012-05-28 19:18:16 +00:00
Meador Inge	e17b69a373	PR12696: Attribute bits above 1<<30 are not encoded in bitcode Attribute bits above 1<<30 are now encoded correctly. Additionally, the encoding/decoding functionality has been hoisted to helper functions in Attributes.h in an effort to help the encoding/decoding to stay in sync with the Attribute bitcode definitions. llvm-svn: 157581	2012-05-28 15:45:43 +00:00
Chris Lattner	ff9e08baf9	rdar://11542750 - llvm.trap should be marked no return. llvm-svn: 157551	2012-05-27 23:20:41 +00:00
Benjamin Kramer	152f106e5f	PR12967: Don't crash when trying to fold a shift that's larger than the type's size. llvm-svn: 157548	2012-05-27 22:03:32 +00:00
Chris Lattner	f7f59b15aa	These tests used intrinsics with the wrong prototype. They weren't caught because the old verifier just checked that something "was a pointer", but not that the pointee was correct. llvm-svn: 157544	2012-05-27 19:35:41 +00:00
Chris Lattner	4cca620c18	remove two (useless) tests that use incorrect intrinsic prototypes, detected by the new intrinsic verifier. llvm-svn: 157543	2012-05-27 19:31:00 +00:00
Peter Collingbourne	4d358b55fa	Have getOrCreateSubprogramDIE store the DIE for a subprogram definition in the map before calling itself to retrieve the DIE for the declaration. Without this change, if this causes getOrCreateSubprogramDIE to be recursively called on the definition, it will create multiple DIEs for that definition. Fixes PR12831. llvm-svn: 157541	2012-05-27 18:36:44 +00:00
Benjamin Kramer	f2beccf6b4	SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights. SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move the most likely condition to the front so it is checked first and the others can be skipped. This is currently not as effective as it could be because SimplifyCFG destroys profiling metadata when merging branches and switches. Merging branch weight metadata is tricky though. This code touches at most 3 cases so I didn't use a proper sorting algorithm. llvm-svn: 157521	2012-05-26 20:01:32 +00:00
Duncan Sands	3c05cd3ea8	Since commit 157467, if reassociate isn't actually going to change an expression then it doesn't alter the instructions composing it, however it would continue to move the instructions to just before the expression root. Ensure it doesn't move them either, so now it really does nothing if there is nothing to do. That commit also ensured that nsw etc flags weren't cleared if the expression was not being changed. Tweak this a bit so that it doesn't clear flags on the initial part of a computation either if that part didn't change but later bits did. llvm-svn: 157518	2012-05-26 16:42:52 +00:00
Nuno Lopes	e9b0bdf804	bounds checking: add support for byval arguments llvm-svn: 157498	2012-05-25 21:15:17 +00:00
Justin Holewinski	c98041d4d9	[NVPTX] Add a new test case for the newly-enabled call handling NV_CONTRIB llvm-svn: 157485	2012-05-25 17:20:38 +00:00
Nuno Lopes	a6da3ff896	boundschecking: add support for select add experimental support for alloc_size metadata llvm-svn: 157481	2012-05-25 16:54:04 +00:00
NAKAMURA Takumi	3eca973bf8	test/CodeGen/X86/bigstructret.ll: Suppress one test. It is msvc-incompatible. (compatible to mingw32 and netbsd, though) llvm-svn: 157474	2012-05-25 15:40:54 +00:00
NAKAMURA Takumi	501dbd06ae	test/CodeGen/X86/bigstructret.ll: Relax stack offsets for hosts of stack-align=8, eg. win32 and netbsd. llvm-svn: 157471	2012-05-25 15:12:21 +00:00
Duncan Sands	bddfb2f96b	Make the reassociation pass more powerful so that it can handle expressions with arbitrary topologies (previously it would give up when hitting a diamond in the use graph for example). The testcase from PR12764 is now reduced from a pile of additions to the optimal 1617*%x0+208. In doing this I changed the previous strategy of dropping all uses for expression leaves to one of dropping all but one use. This works out more neatly (but required a bunch of tweaks) and is also safer: some recently fixed bugs during recursive linearization were because the linearization code thinks it completely owns a node if it has no uses outside the expression it is linearizing. But if the node was also in another expression that had been linearized (and thus all uses of the node from that expression dropped) then the conclusion that it is completely owned by the expression currently being linearized is wrong. Keeping one use from within each linearized expression avoids this kind of mistake. llvm-svn: 157467	2012-05-25 12:03:02 +00:00
Eli Friedman	315a0c79f3	Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process. llvm-svn: 157446	2012-05-25 00:09:29 +00:00
Jakob Stoklund Olesen	36a5c8e550	Add support for range expressions in TableGen foreach loops. Like this: foreach i = 0-127 in ... Use braces for composite ranges: foreach i = {0-3,9-7} in ... llvm-svn: 157432	2012-05-24 22:17:39 +00:00
Jakob Stoklund Olesen	74fd80e8fc	Don't put TGParser scratch results in the output. Only fully expanded Records should go into RecordKeeper. llvm-svn: 157431	2012-05-24 22:17:36 +00:00
David Blaikie	c575c80c3b	Fix for CHECK-NOT misspelling. Patch by Nicklas Bo Jensen. llvm-svn: 157421	2012-05-24 22:08:29 +00:00
Justin Holewinski	907f7606f2	Remove the PTX back-end and all of its artifacts (triple, etc.) This back-end was deprecated in favor of the NVPTX back-end. NV_CONTRIB llvm-svn: 157417	2012-05-24 21:38:21 +00:00
Owen Anderson	921082b883	Teach tblgen's set theory "sequence" operator to support an optional stride operand. llvm-svn: 157416	2012-05-24 21:37:08 +00:00
Akira Hatanaka	a649cc75b3	Turn on mips16 pseudo op when compiling for mips16. Expand test case for this. Patch by Reed Kotler. llvm-svn: 157410	2012-05-24 18:37:43 +00:00
Akira Hatanaka	df98a7a34d	Enable Mips16 compiler to compile a null program. First code from the Mips16 compiler. Includes trivial test program. Patch by Reed Kotler. llvm-svn: 157408	2012-05-24 18:32:33 +00:00
Tobias Grosser	6b31d170a4	Add half support to LLVM (for OpenCL) Submitted by: Anton Lokhmotov <Anton.Lokhmotov@arm.com> Approved by: o Anton Korobeynikov o Micah Villmow o David Neto llvm-svn: 157393	2012-05-24 15:59:06 +00:00
Stepan Dyatkovskiy	183d18aa5a	PR1255 related changes (case ranges): LowerSwitch::Clusterify : main functionality was replaced with CRSBuilder::optimize, so a big part of Clusterify's code was reduced. test/Transform/LowerSwitch/feature.ll - this test was refactored: grep + count was replaced with FileCheck usage. llvm-svn: 157384	2012-05-24 09:33:20 +00:00
Jakob Stoklund Olesen	41ebcda8f4	Add a test case for global live range splitting. llvm-svn: 157357	2012-05-23 23:42:23 +00:00
Jakob Stoklund Olesen	0ce90494e6	Add a last resort tryInstructionSplit() to RAGreedy. Live ranges with a constrained register class may benefit from splitting around individual uses. It allows the remaining live range to use a larger register class where it may allocate. This is like spilling to a different register class. This is only attempted on constrained register classes. <rdar://problem/11438902> llvm-svn: 157354	2012-05-23 22:37:27 +00:00
Kaelyn Uhrain	4dbe0cd6dc	Fix typo in flag to opt, and also a CHECK-NEXT that doesn't follow a CHECK. The latter error was hidden by the former, and the test harness used by e.g. "make check" silently ignored that opt was printing an error message about an unknown flag instead of running on the test file. llvm-svn: 157341	2012-05-23 20:21:36 +00:00
Jakob Stoklund Olesen	5b8f476037	Correctly deal with identity copies in RegisterCoalescer. Now that the coalescer keeps live intervals and machine code in sync at all times, it needs to deal with identity copies differently. When merging two virtual registers, all identity copies are removed right away. This means that other identity copies must come from somewhere else, and they are going to have a value number. Deal with such copies by merging the value numbers before erasing the copy instruction. Otherwise, we leave dangling value numbers in the live interval. This fixes PR12927. llvm-svn: 157340	2012-05-23 20:21:06 +00:00
Chad Rosier	223faf719c	[arm-fast-isel] Add support for non-global callee. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 157336	2012-05-23 18:38:57 +00:00
Nuno Lopes	10287d839f	BoundsChecking: add a couple of simple tests and fix a bug in branch emission llvm-svn: 157329	2012-05-23 16:24:52 +00:00
Patrik Hägglund	8a1e316c15	Fix the inliner so that the optsize function attribute doesn't alter the inline threshold if the global inline threshold is lower (as for -Oz). Reviewed by Chandler Carruth and Bill Wendling. llvm-svn: 157323	2012-05-23 13:42:57 +00:00
Eric Christopher	c49643586b	Add support for C++11 enum classes in llvm. Part of rdar://11496790 llvm-svn: 157303	2012-05-23 00:09:20 +00:00
Andrew Trick	a7a3de1bcf	LSR fix: add a missing phi check during IV hoisting. Fixes PR12898: SCEVExpander crash. llvm-svn: 157263	2012-05-22 17:39:59 +00:00
Nuno Lopes	ad40c0a425	revert my previous patches that introduced an additional parameter to the objectsize intrinsic. After a lot of discussion, we realized it's not the best option for run-time bounds checking llvm-svn: 157255	2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen	924279ca0e	Only erase virtregs with no uses left. Also make sure registers aren't erased twice if the dead def mentions the register twice. This fixes PR12911. llvm-svn: 157254	2012-05-22 14:52:12 +00:00
Duncan Sands	4df5e96d3a	Fix PR12858, a crash due to GVN's PRE not fully removing an instruction from the leader table. That's because it wasn't expecting instructions to turn up as leader for a value number that is not its own, but equality propagation could create this situation. One solution is to have the leader table use a WeakVH but this slows down GVN by about 5%. Instead just have equality propagation not add instructions to the leader table, only constants and arguments. In theory this might cause GVN to run more (each time it changes something it runs again) but it doesn't seem to occur enough to cause a slow down. llvm-svn: 157251	2012-05-22 14:17:53 +00:00
Jim Grosbach	da04fa0d02	FileCheck'ize test, and add a bit to test for r157221. llvm-svn: 157222	2012-05-21 23:50:00 +00:00
Craig Topper	e88f2fd4f7	Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces. llvm-svn: 157175	2012-05-21 06:40:16 +00:00
Peter Collingbourne	8eb05fd093	When legalising shifts, do not pre-build a list of operands which may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve the other operands when calling UpdateNodeOperands. Fixes PR12889. llvm-svn: 157162	2012-05-20 18:36:15 +00:00
Hal Finkel	601f555eee	Add a missing PPC 64-bit stwu pattern. This seems to fix the remaining compile-time failures on PPC64 when compiling with -enable-ppc-preinc. llvm-svn: 157159	2012-05-20 17:11:24 +00:00
Jakob Stoklund Olesen	691ae3388f	Use the right register class for LDRrs. llvm-svn: 157152	2012-05-20 06:38:47 +00:00
Jakob Stoklund Olesen	4fd0e4f415	Transfer memory operands to the right instruction. They need to go on the PICLDR as the verifier points out. llvm-svn: 157151	2012-05-20 06:38:42 +00:00
Jakob Stoklund Olesen	1f1c6add10	Properly constrain register classes for sub-registers. Not all GR64 registers have sub_8bit sub-registers. llvm-svn: 157150	2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen	a103a516c6	Properly constrain register classes in 2-addr. X86 has 2-addr instructions with different constraints on the tied def and use operands. One is GR32, one is GR32_NOSP. llvm-svn: 157149	2012-05-20 06:38:32 +00:00
Peter Collingbourne	9a03c73297	Do not pass an invalid domtree to SimplifyInstruction from LoopUnswitch. Fixes PR12887. llvm-svn: 157140	2012-05-20 01:32:09 +00:00
Jakob Stoklund Olesen	a34a69ce0c	Fix 12892. Dead code elimination during coalescing could cause a virtual register to be split into connected components. The following rewriting would be confused about the already joined copies present in the code, but without a corresponding value number in the live range. Erase all joined copies instantly when joining intervals such that the MI and LiveInterval representations are always in sync. llvm-svn: 157135	2012-05-19 23:34:59 +00:00
Peter Collingbourne	97b1076435	Do not eliminate allocas whose alignment exceeds that of the copied-in constant, as a subsequent user may rely on over alignment. Fixes PR12885. llvm-svn: 157134	2012-05-19 22:52:10 +00:00
Jakob Stoklund Olesen	25ced18407	Erase joined copies immediately. The late dead code elimination is no longer necessary. The test changes are caused by a register hint that can be either %rdi or %rax. The choice depends on the use list order, which this patch changes. llvm-svn: 157131	2012-05-19 20:54:07 +00:00
Nadav Rotem	c93e91da27	On Haswell, prefer storing YMM registers using a single instruction. llvm-svn: 157129	2012-05-19 20:30:08 +00:00
Nadav Rotem	900c7cb7ce	Add support for additional in-reg vbroadcast patterns llvm-svn: 157127	2012-05-19 19:57:37 +00:00
Eric Christopher	b5cf66cda2	Actually support DW_TAG_rvalue_reference_type that we were trying to generate out of the front end. rdar://11479676 llvm-svn: 157094	2012-05-19 01:36:37 +00:00
Eric Christopher	bc5d24999c	Add support for the 'd' mips inline asm output modifier. Patch by Jack Carter. llvm-svn: 157093	2012-05-19 00:51:56 +00:00
Andrew Trick	7fa4e0fea6	SCEV: Add MarkPendingLoopPredicates to avoid recursive isImpliedCond. getUDivExpr attempts to simplify by checking for overflow. isLoopEntryGuardedByCond then evaluates the loop predicate which may lead to the same getUDivExpr causing endless recursion. Fixes PR12868: clang 3.2 segmentation fault. llvm-svn: 157092	2012-05-19 00:48:25 +00:00
Dan Gohman	14862c3141	Fix replacing all the users of objc weak runtime routines when deleting them. rdar://11434915. llvm-svn: 157080	2012-05-18 22:17:29 +00:00
Nuno Lopes	ac59380dfd	allow LazyValueInfo::getEdgeValue() to reason about multiple edges from the same switch instruction by doing union of ranges (which may still be conservative, but it's more aggressive than before) llvm-svn: 157071	2012-05-18 21:02:10 +00:00
Jim Grosbach	4b63d2ae1d	Refactor data-in-code annotations. Use a dedicated MachO load command to annotate data-in-code regions. This is the same format the linker produces for final executable images, allowing consistency of representation and use of introspection tools for both object and executable files. Data-in-code regions are annotated via ".data_region"/".end_data_region" directive pairs, with an optional region type. data_region_directive := ".data_region" { region_type } region_type := "jt8" \| "jt16" \| "jt32" \| "jta32" end_data_region_directive := ".end_data_region" The previous handling of ARM-style "$d.*" labels was broken and has been removed. Specifically, it didn't handle ARM vs. Thumb mode when marking the end of the section. rdar://11459456 llvm-svn: 157062	2012-05-18 19:12:01 +00:00
Nuno Lopes	b63d6cdf79	add test case for bugfix in r157032 llvm-svn: 157058	2012-05-18 17:44:58 +00:00
Eric Christopher	9ca26cfb5f	Add support for the mips 'x' inline asm modifier. Patch by Jack Carter. llvm-svn: 157057	2012-05-18 17:39:35 +00:00
Joel Jones	f1c120e9ef	FileCheck-ify, apropos of nothing llvm-svn: 157051	2012-05-18 16:24:01 +00:00
Craig Topper	92db928ee9	Simplify handling of v16i8 shuffles and fix a missed optimization. llvm-svn: 157043	2012-05-18 06:42:06 +00:00
Evan Cheng	22d405f57b	Teach two-address pass to update the "source" map so it doesn't perform a non-profitable commute using outdated info. The test case would still fail because of poor pre-RA schedule. That will be fixed by MI scheduler. rdar://11472010 llvm-svn: 157038	2012-05-18 01:33:51 +00:00
Danil Malyshev	cd492b0a98	Temporarily disabled the MCJIT tests for Darwin, because RuntimeDyldMachO has problems with relocations for 32-bit x86. llvm-svn: 157035	2012-05-18 00:30:58 +00:00
Kevin Enderby	badd100c26	Fixed a bug in llvm-objdump when disassembling using -macho option for a binary containing no symbols. Fixed the crash and fixed it not disassembling anything. llvm-svn: 157031	2012-05-18 00:13:56 +00:00
Jakob Stoklund Olesen	874e401382	Remove a test that was only testing for physreg joining. This is the same as the other tests: Clever tricks are required to make the arguments and return value line up in a single-instruction function. It rarely happens in real life. We have plenty other examples of this behavior. llvm-svn: 157030	2012-05-18 00:07:14 +00:00
Jakob Stoklund Olesen	589c6eb95c	Remove -join-physregs from the test suite. This option has been disabled for a while, and it is going away so I can clean up the coalescer code. The tests that required physreg joining to be enabled were almost all of the form "tiny function with interference between arguments and return value". Such functions are usually inlined in the real world. The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly rare. llvm-svn: 157027	2012-05-17 23:44:19 +00:00
Kevin Enderby	f1b225d0e0	Fix the encoding of the armv7m (MClass) for MSR APSR writes which was missing the 0b10 mask encoding bits. Make MSR APSR writes without a _<bits> qualifier an alias for MSR APSR_nzcvq even though ARM has deprecated its use. Also add support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions. Some FIXMEs in the code for better error checking when versions shouldn't be used. rdar://11457025 llvm-svn: 157019	2012-05-17 22:18:01 +00:00
Danil Malyshev	7c5db45350	- Added ExecutionEngine/MCJIT tests - Added HOST_ARCH to Makefile.config.in The HOST_ARCH will be used by the MCJIT tests filter, because MCJIT currently supports only the x86 and ARM architectures. llvm-svn: 157015	2012-05-17 21:07:47 +00:00
Tim Northover	af501a29d3	Remove incorrect pattern for ARM SMML instruction. Patch by Meador Inge. llvm-svn: 156989	2012-05-17 13:12:13 +00:00
Chandler Carruth	d8c08c2111	Teach the 'opt' tool about '-Os' and '-Oz', corresponding to the Clang options, to enable easier testing of the innards of LLVM that are enabled by such optimization strategies. Note that this doesn't provide the (much needed) function attribute support for -Oz (as opposed to -Os), but still seems like a positive step to better test the logic that Clang currently relies on. Patch by Patrik Hägglund. llvm-svn: 156913	2012-05-16 08:32:49 +00:00
Evan Cheng	58a95f0c8a	Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474 llvm-svn: 156896	2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen	984997b3a0	Enable sub-sub-register copy coalescing. It is now possible to coalesce weird skewed sub-register copies by picking a super-register class larger than both original registers. The included test case produces code like this: vld2.32 {d16, d17, d18, d19}, [r0]! vst2.32 {d18, d19, d20, d21}, [r0] We still perform interference checking as if it were a normal full copy join, so this is still quite conservative. In particular, the f1 and f2 functions in the included test case still have remaining copies because of false interference. llvm-svn: 156878	2012-05-15 23:31:35 +00:00
Kevin Enderby	a414bcc0e3	Add a test case for r156840, a fix to llvm-objdump when disassembling using -macho to disassemble the last symbol to the end of the section. llvm-svn: 156850	2012-05-15 20:20:50 +00:00
Sirish Pande	91856a1f15	Enable all Hexagon tests. llvm-svn: 156824	2012-05-15 16:13:12 +00:00
David Majnemer	a9330fe553	Teach SimplifyLibCalls about stpcpy. llvm-svn: 156815	2012-05-15 11:46:21 +00:00
Jakob Stoklund Olesen	dc2e0cd44a	Fix PR12821. RAFast must add an <imp-def> operand when it is rewriting a sub-register def that isn't a read-modify-write. llvm-svn: 156777	2012-05-14 21:10:25 +00:00
Chad Rosier	a968caf8e0	Move the capture analysis from MemoryDependencyAnalysis to a more general place so that it can be reused in MemCpyOptimizer. This analysis is needed to remove an unnecessary memcpy when returning a struct into a local variable. rdar://11341081 PR12686 llvm-svn: 156776	2012-05-14 20:35:04 +00:00
Brendon Cahoon	f6b687e5d1	Revert 156634 upon request until code improvement changes are made. llvm-svn: 156775	2012-05-14 19:35:42 +00:00
Dan Gohman	164fe18cfe	Rename @llvm.debugger to @llvm.debugtrap. llvm-svn: 156774	2012-05-14 18:58:10 +00:00
Rafael Espindola	47b7dac220	Add support for the .rept directive. Patch by Vladmir Sorokin. I added support for nesting. llvm-svn: 156714	2012-05-12 16:31:10 +00:00
Benjamin Kramer	6bee7f750d	ELF: Add support for the asm .version directive. llvm-svn: 156712	2012-05-12 14:30:47 +00:00
Benjamin Kramer	95d31bcba5	AsmParser: Add support for the .purgem directive. Based on a patch by Team PaX. llvm-svn: 156709	2012-05-12 11:21:46 +00:00
Benjamin Kramer	66b8d4d28f	AsmParser: ignore the .extern directive. llvm-svn: 156707	2012-05-12 11:18:59 +00:00
Benjamin Kramer	e297b9f506	AsmParser: Add support for .ifc and .ifnc directives. Based on a patch from PaX Team. llvm-svn: 156706	2012-05-12 11:18:51 +00:00
Benjamin Kramer	62c18b0881	AsmParser: Add support for .ifb and .ifnb directives. Based on a patch from PaX Team. llvm-svn: 156705	2012-05-12 11:18:42 +00:00
Stepan Dyatkovskiy	0beab5e1cd	Recommitted r156374 with critical fixes in BitcodeReader/Writer: Ordinary patch for PR1255. Added new case-ranges oriented methods for adding/removing cases in SwitchInst. After this patch cases will be internally represented as ConstantArray-s instead of ConstantInt; externally, cases are wrapped within the ConstantRangesSet object. Old methods of SwitchInst also work well, but are marked as deprecated. So at this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer; of course, a test for Bitcode is also added. The old "switch" format is also supported. llvm-svn: 156704	2012-05-12 10:48:17 +00:00
Jay Foad	ca0c499609	Teach Function::hasAddressTaken that BlockAddress doesn't really take the address of a function. llvm-svn: 156703	2012-05-12 08:30:16 +00:00
Sirish Pande	4bd20c50eb	Support for Hexagon feature, New Value Jump. llvm-svn: 156698	2012-05-12 05:10:30 +00:00
Akira Hatanaka	763ab85690	Fix test cases. llvm-svn: 156697	2012-05-12 03:25:16 +00:00
Akira Hatanaka	8f3573034b	Make the following changes in MipsAsmPrinter.cpp: - Remove code which lowers pseudo SETGP01. - Fix LowerSETGP01. The first two of the three instructions that are emitted to initialize the global pointer register now use register $2. - Stop emitting .cpload directive. llvm-svn: 156689	2012-05-12 00:48:43 +00:00
Akira Hatanaka	d918f77ba3	Insert instructions to the entry basic block which initializes the global pointer register. This is the first of the series of patches which clean up the way global pointer register is used. The patches will make the following improvements: - Make $gp an allocatable temporary register rather than reserving it. - Use a virtual register as the global pointer register and let the register allocator decide which register to assign to it or whether spill/reloads are needed. - Make sure $gp is valid at the entry of a called function, which is necessary for functions using lazy binding. - Remove the need for emitting .cprestore and .cpload directives. llvm-svn: 156671	2012-05-12 00:17:17 +00:00
Akira Hatanaka	0661b81bca	Do not replace operands of pseudo instructions with register $zero. llvm-svn: 156663	2012-05-11 23:22:18 +00:00
Akira Hatanaka	5d60c36f37	Use regular expression to match register names. llvm-svn: 156656	2012-05-11 23:00:40 +00:00
Chad Rosier	aa9cb9df59	[fast-isel] Add support for selecting @llvm.trap(). llvm-svn: 156646	2012-05-11 21:33:49 +00:00
Brendon Cahoon	31f8723ef3	Hexagon constant extender support. Patch by Jyotsna Verma. llvm-svn: 156634	2012-05-11 19:56:59 +00:00
Chad Rosier	3268692aa8	[fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup. llvm-svn: 156632	2012-05-11 19:40:25 +00:00
Chad Rosier	90f9afe659	[fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg retval. Hoists check before emitting the call to avoid unnecessary work. rdar://11430407 PR12796 llvm-svn: 156628	2012-05-11 18:51:55 +00:00
Nuno Lopes	e2cfd3ce95	objectsize: add a few more tests and fix a bug llvm-svn: 156625	2012-05-11 18:25:29 +00:00
Hans Wennborg	addad7388d	Fix test/CodeGen/X86/tls-pie.ll. llvm-svn: 156612	2012-05-11 10:19:54 +00:00
Hans Wennborg	f9d0e44b82	Implement initial-exec TLS model for 32-bit PIC x86 This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong code here (see the update to test/CodeGen/X86/tls-pie.ll). llvm-svn: 156611	2012-05-11 10:11:01 +00:00
Silviu Baranga	ddc67a7655	Added the missing bit definition for the 4th bit of the STR (post reg) instruction. It is now set to 0. The patch also sets the unpredictable mask for SEL and SXTB-type instructions. llvm-svn: 156609	2012-05-11 09:28:27 +00:00
Silviu Baranga	5a719f9b9a	Fixed the LLVM ARM v7 assembler and instruction printer for 8-bit immediate offset addressing. The assembler and instruction printer were not properly handling the #-0 immediate. llvm-svn: 156608	2012-05-11 09:10:54 +00:00
Eli Friedman	e0a64d83fc	Fix a minor logic mistake transforming compares in instcombine. PR12514. llvm-svn: 156600	2012-05-11 01:32:59 +00:00
Manman Ren	dc8ad0058f	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 \| sub r1, imm cmp r3, r1 or cmp r1, r3 \| cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156599	2012-05-11 01:30:47 +00:00
Dan Gohman	dfab443ae8	Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(), but it generates int3 on x86 instead of ud2. llvm-svn: 156593	2012-05-11 00:19:32 +00:00
Nuno Lopes	f573030391	objectsize: add support for GEPs with non-constant indexes add an additional parameter to InstCombiner::EmitGEPOffset() to force it to not emit operations with NUW flag llvm-svn: 156585	2012-05-10 23:17:35 +00:00
Eric Christopher	ed51b9ec0b	Add support for the 'X' inline asm operand modifier. Patch by Jack Carter. llvm-svn: 156577	2012-05-10 21:48:22 +00:00
Sirish Pande	69295b8963	Hexagon V5 FP Support. llvm-svn: 156568	2012-05-10 20:20:25 +00:00
Dan Gohman	ed7c24e2d9	Teach DeadStoreElimination to eliminate exit-block stores with phi addresses. llvm-svn: 156558	2012-05-10 18:57:38 +00:00
Manman Ren	b555b382bd	Revert: 156550 "ARM: peephole optimization to remove cmp instruction" This commit broke an external linux bot and gave a compile-time warning. llvm-svn: 156556	2012-05-10 18:49:43 +00:00
Nuno Lopes	300d629924	teach DSE and isInstructionTriviallyDead() about calloc llvm-svn: 156553	2012-05-10 17:14:00 +00:00
Joel Jones	7f04344b8b	formatting change: strip debug info from test llvm-svn: 156551	2012-05-10 16:55:31 +00:00
Manman Ren	c860887b2d	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 \| sub r1, imm cmp r3, r1 or cmp r1, r3 \| cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156550	2012-05-10 16:48:21 +00:00
Joel Jones	3d90a9ae65	Fix a problem with incomplete equality testing of PHINodes in Instruction::IsIdenticalToWhenDefined. This manifested itself when inlining two calls to the same function. The inlined function had a switch statement that returned one of a set of global variables. Without this modification, the two phi instructions that chose values from the branches of the switch instruction inlined from the callee were considered equivalent and jump-threading replaced a load for the first switch value with a phi selecting from the second switch, thereby producing incorrect code. This patch has been tested with "make check-all", "lnt runtest nt", and llvm self-hosted, and on the original program that had this problem, wireshark. <rdar://problem/11025519> llvm-svn: 156548	2012-05-10 15:59:41 +00:00
Nadav Rotem	15946e50c1	AVX2: Add an additional broadcast idiom. llvm-svn: 156540	2012-05-10 12:39:13 +00:00
Nadav Rotem	b86a3fb8d0	Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program. Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users. Fix PR11900. llvm-svn: 156539	2012-05-10 12:22:05 +00:00
Dan Gohman	f8b19d09ba	Fix the objc_storeStrong recognizer to stop before walking off the end of a basic block if there's no store. llvm-svn: 156520	2012-05-09 23:08:33 +00:00
Nuno Lopes	7100f463b0	objectsize: refactor code a bit to enable future changes to support run-time information add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs) llvm-svn: 156515	2012-05-09 21:30:57 +00:00
Danil Malyshev	47aba39004	Added a regression test for bug #9964 before closing it. This bug was fixed by Jim Grosbach in #138879, thanks Jim! llvm-svn: 156505	2012-05-09 19:07:04 +00:00
Nuno Lopes	01547b3ad2	change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept. This commit only adds the parameter. Code taking advantage of it will follow. llvm-svn: 156473	2012-05-09 15:52:43 +00:00
Filipe Cabecinhas	5c43305383	Fixed a typo llvm-svn: 156471	2012-05-09 14:43:50 +00:00
Akira Hatanaka	ca41d13bbd	Add another peephole pattern for conditional moves. llvm-svn: 156460	2012-05-09 02:29:29 +00:00


16618 Commits