llvm-project

Commit Graph

Author	SHA1	Message	Date
Rafael Espindola	2c06448360	Fix macros arguments with an underscore, dot or dollar in them. This is based on a patch by Andy/PaX. I added the support for dot and dollar. llvm-svn: 162298	2012-08-21 18:29:30 +00:00
Rafael Espindola	af6da83a2c	Make the wording in of the "expected identifier" error in the .macro directive consistent with the other "expected identifier" errors. Extracted from the Andy/PaX patch. I added the test. llvm-svn: 162291	2012-08-21 17:12:05 +00:00
Tim Northover	f39c1a3f72	Add correct set of regression tests for r162094 commit. llvm-svn: 162276	2012-08-21 12:43:03 +00:00
Chandler Carruth	c908ca1766	Port the global copy optimization from the SROA pass to InstCombine. This optimization is really just replacing allocas wholesale with globals, there is no scalarization. The underlying motivation for this patch is to simplify the SROA pass and focus it on splitting and promoting allocas. llvm-svn: 162271	2012-08-21 08:39:44 +00:00
Kostya Serebryany	f4be019fba	[asan] add code to detect global initialization fiasco in C/C++. The sub-pass is off by default for now. Patch by Reid Watson. Note: this patch changes the interface between LLVM and compiler-rt parts of asan. The corresponding patch to compiler-rt will follow. llvm-svn: 162268	2012-08-21 08:24:25 +00:00
Jakob Stoklund Olesen	74e6f9fc65	Add a missing def flag. * Bad machine code: Explicit definition marked as use * - function: test_cos - basic block: BB#0 L.entry (0x7ff2a2024fd0) - instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def> - operand 0: %D11 llvm-svn: 162247	2012-08-21 00:34:53 +00:00
Jakob Stoklund Olesen	7d33c5739f	Don't add CFG edges for redundant conditional branches. IR that hasn't been through SimplifyCFG can look like this: br i1 %b, label %r, label %r Make sure we don't create duplicate Machine CFG edges in this case. Fix the machine code verifier to accept conditional branches with a single CFG edge. llvm-svn: 162230	2012-08-20 21:39:52 +00:00
Michael Liao	10ff96ce8c	fix a case where all operands of BUILD_VECTOR are undefined llvm-svn: 162214	2012-08-20 17:59:18 +00:00
Stepan Dyatkovskiy	6ee89aafc8	Forget to add testcase for r162195. Sorry. llvm-svn: 162196	2012-08-20 08:03:18 +00:00
Nadav Rotem	178250ad87	When unsafe math is used, we can use commutative FMAX and FMIN. In some cases this allows for better code generation. Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and FMINC, which are commutative. For example: movaps %xmm0, %xmm1 movsd LC(%rip), %xmm0 minsd %xmm1, %xmm0 becomes: minsd LC(%rip), %xmm0 llvm-svn: 162187	2012-08-19 13:06:16 +00:00
Benjamin Kramer	9d03242fcf	InstCombine: Fix a crasher when encountering a function pointer. llvm-svn: 162180	2012-08-18 22:04:34 +00:00
Jakob Stoklund Olesen	dded061f85	Also combine zext/sext into selects for ARM. This turns common i1 patterns into predicated instructions: (add (zext cc), x) -> (select cc (add x, 1), x) (add (sext cc), x) -> (select cc (add x, -1), x) For a function like: unsigned f(unsigned s, int x) { return s + (x>0); } We now produce: cmp r1, #0 it gt addgt.w r0, r0, #1 Instead of: movs r2, #0 cmp r1, #0 it gt movgt r2, #1 add r0, r2 llvm-svn: 162177	2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen	aab43dbfbb	Also pass logical ops to combineSelectAndUse. Add these transformations to the existing add/sub ones: (and (select cc, -1, c), x) -> (select cc, x, (and, x, c)) (or (select cc, 0, c), x) -> (select cc, x, (or, x, c)) (xor (select cc, 0, c), x) -> (select cc, x, (xor, x, c)) The selects can then be transformed to a single predicated instruction by peephole. This transformation will make it possible to eliminate the ISD::CAND, COR, and CXOR custom DAG nodes. llvm-svn: 162176	2012-08-18 21:25:16 +00:00
Benjamin Kramer	8c2a733c55	InstCombine: Add a couple of fabs identities for comparing with 0.0. llvm-svn: 162174	2012-08-18 20:06:47 +00:00
Benjamin Kramer	000132454c	SimplifyLibcalls: Add fabs and trunc to the list of libcalls that are safe to shrink from double to float. llvm-svn: 162173	2012-08-18 19:27:32 +00:00
Nadav Rotem	a136939fa9	Reapply r162160 with a fix: Optimize Arith->Trunc->SETCC sequence to allow better compare/branch code. llvm-svn: 162172	2012-08-18 17:53:03 +00:00
Nadav Rotem	c324af609e	Revert r162160 because it made a few buildbots fail. llvm-svn: 162164	2012-08-18 05:02:36 +00:00
Nadav Rotem	2cb14a5c4b	The X86 backend has a number of optimizations for SETCC nodes which use arithmetic instructions. However, when small data types are used, a truncate node appears between the SETCC node and the arithmetic operation. This patch adds support for this pattern. Before: xorl %esi, %edi testb %dil, %dil setne %al ret After: xorb %dil, %sil setne %al ret rdar://12081007 llvm-svn: 162160	2012-08-18 02:43:28 +00:00
Eli Friedman	79a6b30d8a	Make atomic load and store of pointers work. Tighten verification of atomic operations so other unexpected operations don't slip through. Based on patch by Logan Chien. PR11786/PR13186. llvm-svn: 162146	2012-08-17 23:24:29 +00:00
Jakob Stoklund Olesen	7b1a2e8f02	Avoid folding ADD instructions with FI operands. PEI can't handle the pseudo-instructions. This can be removed when the pseudo-instructions are replaced by normal predicated instructions. Fixes PR13628. llvm-svn: 162130	2012-08-17 20:55:34 +00:00
Benjamin Kramer	34764fe2e4	MemoryBuiltins: Properly guard ObjectSizeOffsetVisitor against cycles in the IR. The previous fix only checked for simple cycles, use a set to catch longer cycles too. Drop the broken check from the ObjectSizeOffsetEvaluator. The BoundsChecking pass doesn't have to deal with invalid IR like InstCombine does. llvm-svn: 162120	2012-08-17 19:26:41 +00:00
Bill Wendling	34bc34ecae	Change the `linker_private_weak_def_auto' linkage to `linkonce_odr_auto_hide' to make it more consistent with its intended semantics. The `linker_private_weak_def_auto' linkage type was meant to automatically hide globals which never had their addresses taken. It has nothing to do with the `linker_private' linkage type, which outputs the symbols with a `l' (ell) prefix among other things. The intended semantic is more like the `linkonce_odr' linkage type. Change the name of the linkage type to `linkonce_odr_auto_hide'. And therefore changing the semantics so that it produces the correct output for the linker. Note: The old linkage name `linker_private_weak_def_auto' will still parse but is not a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0. <rdar://problem/11754934> llvm-svn: 162114	2012-08-17 18:33:14 +00:00
Rafael Espindola	9a16735e22	Assert that dominates is not given a multiple edge. Finding out if we have multiple edges between two blocks is linear. If the caller is iterating all edges leaving a BB that would be a square time algorithm. It is more efficient to have the callers handle that case. Currently the only callers are: * GVN: already avoids the multiple edge case. * Verifier: could only hit this assert when looking at an invalid invoke. Since it already rejects the invoke, just avoid computing the dominance for it. llvm-svn: 162113	2012-08-17 18:21:28 +00:00
Benjamin Kramer	ca7ca4f6c6	TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type. llvm-svn: 162101	2012-08-17 15:54:21 +00:00
Benjamin Kramer	4901f0d2a2	Guard MemoryBuiltins against self-looping GEPs, which can occur in unreachable code due to constant propagation. Fixes PR13621. llvm-svn: 162098	2012-08-17 14:16:37 +00:00
Benjamin Kramer	2f47a3fb07	Fix broken check lines. I really need to find a way to automate this, but I can't come up with a regex that has no false positives while handling tricky cases like custom check prefixes. llvm-svn: 162097	2012-08-17 12:28:26 +00:00
Tim Northover	f66181530f	Implement NEON domain switching for scalar <-> S-register vmovs on ARM llvm-svn: 162094	2012-08-17 11:32:52 +00:00
Jakob Stoklund Olesen	0ea1fce6b4	Add ADD and SUB to the predicable ARM instructions. It is not my plan to duplicate the entire ARM instruction set with predicated versions. We need a way of representing predicated instructions in SSA form without requiring a separate opcode. Then the pseudo-instructions can go away. llvm-svn: 162061	2012-08-16 23:21:55 +00:00
Rafael Espindola	cc80cdebb9	Teach GVN to reason about edges dominating uses. This allows it to handle cases where some fact lake a=b dominates a use in a phi, but doesn't dominate the basic block itself. This feature could also be implemented by splitting critical edges, but at least with the current algorithm reasoning about the dominance directly is faster. The time for running "opt -O2" in the testcase in pr10584 is 1.003 times slower and on gcc as a single file it is 1.0007 times faster. llvm-svn: 162023	2012-08-16 15:09:43 +00:00
Jush Lu	26088cb30e	[arm-fast-isel] Add support for fastcc. Without fastcc support, the caller just falls through to CallingConv::C for fastcc, but callee still uses fastcc, this inconsistency of calling convention is a problem, and fastcc support can fix it. llvm-svn: 162013	2012-08-16 05:15:53 +00:00
Akira Hatanaka	269d3fd101	Test case for r162008. llvm-svn: 162009	2012-08-16 03:48:41 +00:00
Jakob Stoklund Olesen	6cb96120f1	Fold predicable instructions into MOVCC / t2MOVCC. The ARM select instructions are just predicated moves. If the select is the only use of an operand, the instruction defining the operand can be predicated instead, saving one instruction and decreasing register pressure. This implementation can turn AND/ORR/EOR instructions into their corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to predicate any instruction, but we don't yet support predicated instructions in SSA form. llvm-svn: 161994	2012-08-15 22:16:39 +00:00
Bill Wendling	2c8685e327	Rework test so that it reproduces the error without the horrible flag. llvm-svn: 161989	2012-08-15 21:10:18 +00:00
Bill Wendling	d63f1f5a9c	Remove invalid test. This test requires that dead basic blocks be kept around. That's not how we do things. Besides, the commit message tells us that it is covered by the GCC test suite. ------------------------------------------------------------------------ r127497 \| zwarich \| 2011-03-11 13:51:56 -0800 (Fri, 11 Mar 2011) \| 3 lines Fix the GCC test suite issue exposed by r127477, which was caused by stack protector insertion not working correctly with unreachable code. Since that revision was rolled out, this test doesn't actual fail before this fix. ------------------------------------------------------------------------ llvm-svn: 161985	2012-08-15 20:54:09 +00:00
Evan Cheng	eec6bc6270	Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029 llvm-svn: 161962	2012-08-15 17:44:53 +00:00
Michael Liao	69e172a6f0	fix infinite loop in instcombine with more than 4GB memcpy - memcpy size is wrongly truncated into 32-bit and treat 8GB memcpy is 0-sized memcpy - as 0-sized memcpy/memset is already removed before SimplifyMemTransfer and SimplifyMemSet in visitCallInst, replace 0 checking with assertions. - replace getZExtValue() with getLimitedValue() according to Eli Friedman llvm-svn: 161923	2012-08-15 03:49:59 +00:00
Anton Korobeynikov	c6d945b11a	The names of VFP variants of half-to-float conversion instructions were reversed. This leads to wrong codegen for float-to-half conversion intrinsics which are used to support storage-only fp16 type. NEON variants of same instructions are fine. llvm-svn: 161907	2012-08-14 23:36:01 +00:00
Michael Liao	34107b9177	fix PR11334 - FP_EXTEND only support extending from vectors with matching elements. This results in the scalarization of extending to v2f64 from v2f32, which will be legalized to v4f32 not matching with v2f64. - add X86-specific VFPEXT supproting extending from v4f32 to v2f64. - add BUILD_VECTOR lowering helper to recover back the original extending from v4f32 to v2f64. - test case is enhanced to include different vector width. llvm-svn: 161894	2012-08-14 21:24:47 +00:00
Kostya Serebryany	bf479714f9	[asan] insert crash basic blocks inline as opposed to inserting them at the end of the function. This doesn't seem to fix or break anything, but is considered to be more friendly to downstream passes (test change) llvm-svn: 161871	2012-08-14 14:05:50 +00:00
Craig Topper	2a40418a99	Change greater than to greater than or equal so that an identical sized store to the same offset is treated as completing overwriting. llvm-svn: 161857	2012-08-14 07:32:05 +00:00
Nadav Rotem	70409991bc	During the CodeGenPrepare we often lower intrinsics (such as objsize) and allow some optimizations to turn conditional branches into unconditional. This commit adds a simple control-flow optimization which merges two consecutive basic blocks which are connected by a single edge. This allows the codegen to operate on larger basic blocks. rdar://11973998 llvm-svn: 161852	2012-08-14 05:19:07 +00:00
NAKAMURA Takumi	245920463d	llvm/test/CodeGen/ARM/floorf.ll: Add explicit -mtriple=arm-unknown-unknown, or it fails on msvc. llvm-svn: 161825	2012-08-14 00:56:06 +00:00
Owen Anderson	a40319b7f1	Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC. llvm-svn: 161807	2012-08-13 23:32:49 +00:00
Nadav Rotem	5d4e205874	MemoryDependenceAnalysis attempts to find the first memory dependency for function calls. Currently, if GetLocation reports that it did not find a valid pointer (this is the case for volatile load/stores), we ignore the result. This patch adds code to handle the cases where we did not obtain a valid pointer. rdar://11872864 PR12899 llvm-svn: 161802	2012-08-13 23:03:43 +00:00
Jim Grosbach	b4e1ba7191	ARM: Move Thumb2 tests to Thumb2 test file and fix CHECK lines. These tests weren't actually being run before (missing ':' after CHECK). llvm-svn: 161800	2012-08-13 22:25:44 +00:00
Bill Wendling	72baa6eeae	Rename test since it's not linux-specific. llvm-svn: 161792	2012-08-13 21:32:42 +00:00
Jakob Stoklund Olesen	83a927d84a	Handle extra Tail predecessors in if-conversion. It is still possible to if-convert if the tail block has extra predecessors, but the tail phis must be rewritten instead of being removed. llvm-svn: 161781	2012-08-13 20:49:04 +00:00
Arnold Schwaighofer	0bb7f23cfc	[Hexagon] Don't mark callee saved registers as clobbered by a tail call This was causing unnecessary spills/restores of callee saved registers. Fixes PR13572. Patch by Pranav Bhandarkar! llvm-svn: 161778	2012-08-13 19:54:01 +00:00
Manman Ren	9746b33e26	Fix failure on Atom bot due to r161769 llvm-svn: 161777	2012-08-13 19:34:29 +00:00
Nadav Rotem	3a94c545cf	Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user. rdar://11876519 llvm-svn: 161775	2012-08-13 18:52:44 +00:00
Manman Ren	959acb106b	X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed to a memory operand. PR13576 llvm-svn: 161769	2012-08-13 18:29:41 +00:00
Eric Christopher	7d8b53c1f8	Add support for the %H output modifier. Patch by Weiming Zhao. llvm-svn: 161768	2012-08-13 18:18:52 +00:00
Tim Northover	b4abb84d9c	Add test for previous commit correcting NEON load patterns. llvm-svn: 161750	2012-08-13 10:38:45 +00:00
Nick Lewycky	1dcfe449ef	Give this test an explicit triple. llvm-svn: 161740	2012-08-12 08:21:27 +00:00
Nick Lewycky	333449cd65	When emitting the PC range in an FDE, use the same data encoding for both ends of the range. Fixes PR13581! llvm-svn: 161739	2012-08-12 08:09:45 +00:00
Arnold Schwaighofer	b73da9453c	Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM architecture It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7 thumb O3. llvm-svn: 161736	2012-08-12 05:11:56 +00:00
Michael Liao	e7e828fd64	fix PR13577, an issue introduced by r161687 - FCMOV only supports a subset of X86 conditions. Skip boolean simplification if X86 condition is not valid for FCMOV. - add a minimal test case for PR13577. llvm-svn: 161732	2012-08-11 23:47:06 +00:00
Benjamin Kramer	ef6494f24d	PR13578: Teach MachineCSE that instructions that use a constant register can be CSE'd safely. This is common e.g. when doing rip-relative addressing on x86_64. llvm-svn: 161728	2012-08-11 19:05:13 +00:00
Manman Ren	1acb6707cd	X86: when we are auto-detecting the subtarget features, make sure we turn on FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge. FeatureFastUAMem is already on if we pass in nehalem or westmere as a command argument. rdar: 7252306 llvm-svn: 161717	2012-08-10 23:43:32 +00:00
Eli Friedman	4c923b3b3f	The normal edge of an invoke is not allowed to branch to a block with a landingpad. Enforce it in the verifier, and fix the regression tests to match. llvm-svn: 161697	2012-08-10 20:55:20 +00:00
Michael Liao	5248e9913f	add X86-specific DAG optimization to simplify boolean test - if a boolean test (X86ISD::CMP or X86ISD:SUB) checks a boolean value generated from X86ISD::SETCC, try to simplify the boolean value generation and checking by reusing the original EFLAGS with proper condition code - add hooks to X86 specific SETCC/BRCOND/CMOV, the major 3 places consuming EFLAGS part of patches fixing PR12312 llvm-svn: 161687	2012-08-10 19:58:13 +00:00
Pete Cooper	0deca6be79	Fix crash when when do lto on Bullet. Dynamic GEPs in SROA were incorrectly being applied to all accesses to an alloca, not just the ones which read from the GEP. Thanks to Evan for reducing the test. rdar://11861001 llvm-svn: 161654	2012-08-10 03:26:36 +00:00
Jakob Stoklund Olesen	8c28ac9ec9	Update edge weights correctly in replaceSuccessor(). When replacing Old with New, it can happen that New is already a successor. Add the old and new edge weights instead of creating a duplicate edge. llvm-svn: 161653	2012-08-10 03:23:27 +00:00
Jakob Stoklund Olesen	d9b66506a3	Reapply r161633-161634 "Partition use lists so defs always come before uses."" No changes to these patches, MRI needed to be notified when changing uses into defs and vice versa. llvm-svn: 161644	2012-08-10 00:21:30 +00:00
Jakob Stoklund Olesen	acd27c9279	Revert r161633-161634 "Partition use lists so defs always come before uses." These commits broke a number of buildbots. llvm-svn: 161640	2012-08-09 23:31:36 +00:00
Jakob Stoklund Olesen	df01e00710	Partition use lists so defs always come before uses. This makes it possible to speed up def_iterator by stopping at the first use. This makes def_empty() and getUniqueVRegDef() much faster when there are many uses. In a +Asserts build, LiveVariables is 100x faster in one case because getVRegDef() has an assertion that would scan to the end of a def_iterator chain. Spill weight calculation is significantly faster (300x in one case) because isTriviallyReMaterializable() calls MRI->isConstantPhysReg(%RIP) which calls def_empty(%RIP). llvm-svn: 161634	2012-08-09 22:49:46 +00:00
Jakob Stoklund Olesen	7d7051ca3c	Don't use pointer-pointers for the register use lists. Use a more conventional doubly linked list where the Prev pointers form a cycle. This means it is no longer necessary to adjust the Prev pointers when reallocating the VRegInfo array. The test changes are required because the register allocation hint is using the use-list order to break ties. llvm-svn: 161633	2012-08-09 22:49:42 +00:00
Jakob Stoklund Olesen	4238a89db8	Don't modify MO while use_iterator is still pointing to it. llvm-svn: 161626	2012-08-09 22:08:24 +00:00
Chandler Carruth	4cbe653ace	Teach the LLVM test makefile to run the extra Clang tools' test suites as part of check-all. llvm-svn: 161610	2012-08-09 20:26:41 +00:00
Jack Carter	120a30a732	Another 32 to 64 bit sign extension bug. The fields in the td definition were switched. llvm-svn: 161607	2012-08-09 19:43:18 +00:00
Arnold Schwaighofer	81b2eec1ab	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 161581	2012-08-09 15:25:52 +00:00
Nadav Rotem	e0f84d31c8	Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly handle the cases where the memory value type was illegal. PR 13111. llvm-svn: 161565	2012-08-09 01:56:44 +00:00
Bob Wilson	4c65c505e0	Add test triples to fix win32 failures. Revert workaround from r161292. I don't have a win32 system to test, so hopefully I got them all fixed here. llvm-svn: 161519	2012-08-08 20:31:37 +00:00
NAKAMURA Takumi	ec934cd64d	llvm/test/MC/COFF/seh.s: Fixup corresponding to r161487. llvm-svn: 161489	2012-08-08 13:27:04 +00:00
Bill Wendling	ce4fe41d21	Add `.pushsection', `.popsection', and `.previous' directives to Darwin ASM. There are situations where inline ASM may want to change the section -- for instance, to create a variable in the .data section. However, it cannot do this without (potentially) restoring to the wrong section. E.g.: asm volatile (".section __DATA, __data\n\t" ".globl _fnord\n\t" "_fnord: .quad 1f\n\t" ".text\n\t" "1:" :::); This may be wrong if this is inlined into a function that has a "section" attribute. The user should use `.pushsection' and `.popsection' here instead. The addition of `.previous' is added for completeness. <rdar://problem/12048387> llvm-svn: 161477	2012-08-08 06:30:30 +00:00
Eli Friedman	08ec0a8122	isAllocLikeFn is allowed to return true for functions which read memory; make sure we account for that correctly in DeadStoreElimination. Fixes a regression from r158919. PR13547. llvm-svn: 161468	2012-08-08 02:17:32 +00:00
Manman Ren	1be131ba27	X86: enable CSE between CMP and SUB We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462	2012-08-08 00:51:41 +00:00
Dan Gohman	b948736002	Avoid recomputing the unique exit blocks and their insert points when doing multiple scalar promotions on a single loop. This also has the effect of preserving the order of stores sunk out of loops, which is aesthetically pleasing, and it happens to fix the testcase in PR13542, though it doesn't fix the underlying problem. llvm-svn: 161459	2012-08-08 00:00:26 +00:00
Bob Wilson	61f3ad5759	Fix a serious typo in InstCombine's optimization of comparisons. An unsigned value converted to floating-point will always be greater than a negative constant. Unfortunately InstCombine reversed the check so that unsigned values were being optimized to always be greater than all positive floating-point constants. <rdar://problem/12029145> llvm-svn: 161452	2012-08-07 22:35:16 +00:00
Evan Cheng	fbdd25c135	X86 cmp lowering is looking past truncate on the condition node. It should only do so when the high bits are known zero. This caused a subtle miscompilation. rdar://12027825 llvm-svn: 161451	2012-08-07 22:21:00 +00:00
Alexey Samsonov	947228c4f7	Fix the representation of debug line table in DebugInfo LLVM library, and "instruction address -> file/line" lookup. Instead of plain collection of rows, debug line table for compilation unit is now treated as the number of row ranges, describing sequences (series of contiguous machine instructions). The sequences are not always listed in the order of increasing address, so previously used std::lower_bound() sometimes produced wrong results. Now the instruction address lookup consists of two stages: finding the correct sequence, and searching for address in range of rows for this sequence. llvm-svn: 161414	2012-08-07 11:46:57 +00:00
Benjamin Kramer	c99d0e9186	PR13095: Give an inline cost bonus to functions using byval arguments. We give a bonus for every argument because the argument setup is not needed anymore when the function is inlined. With this patch we interpret byval arguments as a compact representation of many arguments. The byval argument setup is implemented in the backend as an inline memcpy, so to model the cost as accurately as possible we take the number of pointer-sized elements in the byval argument and give a bonus of 2 instructions for every one of those. The bonus is capped at 8 elements, which is the number of stores at which the x86 backend switches from an expanded inline memcpy to a real memcpy. It would be better to use the real memcpy threshold from the backend, but it's not available via TargetData. This change brings the performance of c-ray in line with gcc 4.7. The included test case tries to reproduce the c-ray problem to catch regressions for this benchmark early, its performance is dominated by the inline decision of a specific call. This only has a small impact on most code, more on x86 and arm than on x86_64 due to the way the ABI works. When building LLVM for x86 it gives a small inline cost boost to virtually any function using StringRef or STL allocators, but only a 0.01% increase in overall binary size. The size of gcc compiled by clang actually shrunk by a couple bytes with this patch applied, but not significantly. llvm-svn: 161413	2012-08-07 11:13:19 +00:00
Chandler Carruth	2f6cf4884c	Fix PR13412, a nasty miscompile due to the interleaved instsimplify+inline strategy. The crux of the problem is that instsimplify was reasonably relying on an invariant that is true within any single function, but is no longer true mid-inline the way we use it. This invariant is that an argument pointer != a local (alloca) pointer. The fix is really light weight though, and allows instsimplify to be resiliant to these situations: when checking the relation ships to function arguments, ensure that the argumets come from the same function. If they come from different functions, then none of these assumptions hold. All credit to Benjamin Kramer for coming up with this clever solution to the problem. llvm-svn: 161410	2012-08-07 10:59:59 +00:00
Chandler Carruth	881d0a7966	Add a much more conservative strategy for aligning branch targets. Previously, MBP essentially aligned every branch target it could. This bloats code quite a bit, especially non-looping code which has no real reason to prefer aligned branch targets so heavily. As Andy said in review, it's still a bit odd to do this without a real cost model, but this at least has much more plausible heuristics. Fixes PR13265. llvm-svn: 161409	2012-08-07 09:45:24 +00:00
Manman Ren	cb36b8c2e6	MachineCSE: Update the heuristics for isProfitableToCSE. If the result of a common subexpression is used at all uses of the candidate expression, CSE should not increase the live range of the common subexpression. rdar://11393714 and rdar://11819721 llvm-svn: 161396	2012-08-07 06:16:46 +00:00
Jack Carter	f4946cfbb9	The define for 64 bit sign extension neglected to initialize fields of the class that it used. The result was nonsense code. Before: 0000000000000000 <foo>: 0: 00441100 0x441100 4: 03e00008 jr ra 8: 00000000 nop After: 0000000000000000 <foo>: 0: 00041000 sll v0,a0,0x0 4: 03e00008 jr ra 8: 00000000 nop llvm-svn: 161377	2012-08-07 00:35:22 +00:00
Jack Carter	612c66314c	The Mips64InstrInfo.td definitions DynAlloc64 LEA_ADDiu64 were using a class defined for 32 bit instructions and thus the instruction was for addiu instead of daddiu. This was corrected by adding the instruction opcode as a field in the base class to be filled in by the defs. llvm-svn: 161359	2012-08-06 23:29:06 +00:00
Jack Carter	84491abb20	Mips relocations R_MIPS_HIGHER and R_MIPS_HIGHEST. These 2 relocations gain access to the highest and the second highest 16 bits of a 64 bit object. R_MIPS_HIGHER %higher(A+S) The %higher(x) function is [ (((long long) x + 0x80008000LL) >> 32) & 0xffff ]. R_MIPS_HIGHEST %highest(A+S) The %highest(x) function is [ (((long long) x + 0x800080008000LL) >> 48) & 0xffff ]. llvm-svn: 161348	2012-08-06 21:26:03 +00:00
Hal Finkel	33e529d56b	MFTB on PPC64 should really be encoded using MFSPR. The MFTB instruction itself is being phased out, and its functionality is provided by MFSPR. According to the ISA docs, using MFSPR works on all known chips except for the 601 (which did not have a timebase register anyway) and the POWER3. Thanks to Adhemerval Zanella for pointing this out! llvm-svn: 161346	2012-08-06 21:21:44 +00:00
Craig Topper	ab47fe4e16	Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318	2012-08-06 06:22:36 +00:00
Craig Topper	812005e562	Update test to check for r161305 llvm-svn: 161307	2012-08-05 09:06:28 +00:00
Hal Finkel	70381a7b18	Add readcyclecounter lowering on PPC64. On PPC64, this can be done with a simple TableGen pattern. To enable this, I've added the (otherwise missing) readcyclecounter SDNode definition to TargetSelectionDAG.td. llvm-svn: 161302	2012-08-04 14:10:46 +00:00
Anton Korobeynikov	218aaf6d04	Add stack spill / reload instructions for DTriple and DQuad register classes, which were missed for no reason. This fixes PR13377 llvm-svn: 161299	2012-08-04 13:16:12 +00:00
Bob Wilson	874886cd66	Refactor and check "onlyReadsMemory" before optimizing builtins. This patch is mostly just refactoring a bunch of copy-and-pasted code, but it also adds a check that the call instructions are readnone or readonly. That check was already present for sin, cos, sqrt, log2, and exp2 calls, but it was missing for the rest of the builtins being handled in this code. llvm-svn: 161282	2012-08-03 23:29:17 +00:00
Akira Hatanaka	22bec282e9	1. Redo mips16 instructions to avoid multiple opcodes for same instruction. Change these to patterns. 2. Add another 16 instructions. Patch by Reed Kotler. llvm-svn: 161272	2012-08-03 22:57:02 +00:00
Bob Wilson	fa59485b94	Fix memcmp code-gen to honor -fno-builtin. I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp in TargetLibraryInfo, so that it would use custom code for memcmp calls even with -fno-builtin. I also had to add a new -disable-simplify-libcalls option to llc so that I could write a test for this. llvm-svn: 161262	2012-08-03 21:26:18 +00:00
Bob Wilson	3e6fa462f3	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Jush Lu	4705da9020	[arm-fast-isel] Add support for shl, lshr, and ashr. llvm-svn: 161230	2012-08-03 02:37:48 +00:00
NAKAMURA Takumi	edac6ee6aa	[CMake] Add yaml2obj to check-llvm. llvm-svn: 161229	2012-08-03 00:45:32 +00:00
Matt Beaumont-Gay	99fc2e19a6	Move test yaml files under Inputs until they are converted to be the actual test files. llvm-svn: 161219	2012-08-02 21:52:49 +00:00
Manman Ren	ba8122cc25	X86 Peephole: fold loads to the source register operand if possible. Add more comments and use early returns to reduce nesting in isLoadFoldable. Also disable folding for V_SET0 to avoid introducing a const pool entry and a const pool load. rdar://10554090 and rdar://11873276 llvm-svn: 161207	2012-08-02 19:37:32 +00:00
Michael J. Spencer	1ffd9de4f1	Add yaml2obj. A utility to convert YAML to binaries. yaml2obj takes a textual description of an object file in YAML format and outputs the binary equivalent. This greatly simplifies writing tests that take binary object files as input. llvm-svn: 161205	2012-08-02 19:16:56 +00:00
Akira Hatanaka	fffad897f2	Set transient stack alignment in constructor of MipsFrameLowering and re-enable test o32_cc_vararg.ll. llvm-svn: 161189	2012-08-02 18:15:13 +00:00
Jiangning Liu	fa18005a4c	Support fpv4 for ARM Cortex-M4. llvm-svn: 161163	2012-08-02 08:35:55 +00:00
Jiangning Liu	6a43bf7d74	Fix #13035 , a bug around Thumb instruction LDRD/STRD with negative #0 offset index issue. llvm-svn: 161162	2012-08-02 08:29:50 +00:00
Jiangning Liu	288e1af8c8	Fix #13138 , a bug around ARM instruction DSB encoding and decoding issue. llvm-svn: 161161	2012-08-02 08:21:27 +00:00
Jiangning Liu	10dd40e42d	Fix #13241 , a bug around shift immediate operand for ARM instruction ADR. llvm-svn: 161159	2012-08-02 08:13:13 +00:00
NAKAMURA Takumi	7020f51622	llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Make sure this is testing without +avx. FIXME: Could +avx be checked here too? llvm-svn: 161156	2012-08-02 06:36:56 +00:00
NAKAMURA Takumi	aaca1e690d	llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Rewrite expressions to pass regardless of PR11031. - Relax to match even if epilogue (pop %ebp) were emitted. - Assume the return value is stored to %xmm0. llvm-svn: 161155	2012-08-02 06:33:58 +00:00
Manman Ren	5759d01230	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. This patch is a rework of r160919 and was tested on clang self-host on my local machine. rdar://10554090 and rdar://11873276 llvm-svn: 161152	2012-08-02 00:56:42 +00:00
Eric Christopher	b1b9451337	Temporarily revert c23b933d5f8be9b51a1d22e717c0311f65f87dcd. It's causing failures in the debug testsuite and possibly PR13486. llvm-svn: 161121	2012-08-01 18:19:01 +00:00
Matt Beaumont-Gay	7947aecaf1	Line endings. llvm-svn: 161117	2012-08-01 16:42:35 +00:00
Nuno Lopes	b1c473998d	fix 'make check' when ocamlopt returns the compiler path with CFLAGS (and there's a cflag with a = char) llvm-svn: 161114	2012-08-01 15:50:34 +00:00
Elena Demikhovsky	3cb3b0045c	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Nick Lewycky	fb78083b1c	Stay rational; don't assert trying to take the square root of a negative value. If it's negative, the loop is already proven to be infinite. Fixes PR13489! llvm-svn: 161107	2012-08-01 09:14:36 +00:00
Akira Hatanaka	d1c43cee24	Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and MipsSEFrameLowering. Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be reserved if there is a call with a large call frame or there are variable sized objects on the stack. llvm-svn: 161090	2012-07-31 22:50:19 +00:00
Akira Hatanaka	02de0e4425	Let PEI::calculateFrameObjectOffsets compute the final stack size rather than computing it in MipsFrameLowering::emitPrologue. llvm-svn: 161078	2012-07-31 21:28:49 +00:00
Akira Hatanaka	33a25af5a8	Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering. The frame object which points to the dynamically allocated area will not be needed after changes are made to cease reserving call frames. llvm-svn: 161076	2012-07-31 20:54:48 +00:00
Akira Hatanaka	beda2241a4	When store nodes or memcpy nodes are created to copy the function call arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and integer offset operands rather than frame object operands. llvm-svn: 161068	2012-07-31 18:46:41 +00:00
Chad Rosier	710be7df71	[x86 frame lowering] In 32-bit mode, use ESI as the base pointer. Previously, we were using EBX, but PIC requires the GOT to be in EBX before function calls via PLT GOT pointer. llvm-svn: 161066	2012-07-31 18:29:21 +00:00
Akira Hatanaka	4ce7c4060d	Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as single-precision load and store. Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect to map unaligned floating point load/store nodes to these instructions. llvm-svn: 161063	2012-07-31 18:16:49 +00:00
Manman Ren	8c549b586c	MachineSink: Sort the successors before trying to find SuccToSinkTo. One motivating example is to sink an instruction from a basic block which has two successors: one outside the loop, the other inside the loop. We should try to sink the instruction outside the loop. rdar://11980766 llvm-svn: 161062	2012-07-31 18:10:39 +00:00
Jakob Stoklund Olesen	0c807dfae2	Clear kill flags in removeCopyByCommutingDef(). We are extending live ranges, so kill flags are not accurate. They aren't needed until they are recomputed after RA anyway. <rdar://problem/11950722> llvm-svn: 161023	2012-07-31 02:47:24 +00:00
Manman Ren	2b6a0dfd4c	Reverse order of the two branches at end of a basic block if it is profitable. We branch to the successor with higher edge weight first. Convert from je LBB4_8 --> to outer loop jmp LBB4_14 --> to inner loop to jne LBB4_14 jmp LBB4_8 PR12750 rdar: 11393714 llvm-svn: 161018	2012-07-31 01:11:07 +00:00
Jim Grosbach	20666162e9	Keep empty assembly macro argument values in the middle of the list. Empty macro arguments at the end of the list should be as-if not specified at all, but those in the middle of the list need to be kept so as not to screw up the positional numbering. E.g.: .macro foo foo_-bash___: nop .endm foo 1, 2, 3, 4 foo 1, , 3, 4 Should create two labels, "foo_1_2_3_4" and "foo_1__3_4". rdar://11948769 llvm-svn: 161002	2012-07-30 22:44:17 +00:00
Pete Cooper	91244268d7	Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd llvm-svn: 160987	2012-07-30 20:23:19 +00:00
Kevin Enderby	5c490f1b8f	Fix a bug in ARMMachObjectWriter::RecordRelocation() in ARMMachObjectWriter.cpp where the other_half of the movt and movw relocation entries needs to get set and only with the 16 bits of the other half. rdar://10038370 llvm-svn: 160978	2012-07-30 18:46:15 +00:00
Nadav Rotem	77f1b9c477	When constant folding GEP expressions, keep the address space information of pointers. Together with Ran Chachick <ran.chachick@intel.com> llvm-svn: 160954	2012-07-30 07:25:20 +00:00
Manman Ren	f87dd7c01b	Revert r160920 and r160919 due to dragonegg and clang selfhost failure llvm-svn: 160927	2012-07-29 02:44:09 +00:00
Nick Lewycky	d2c3bdd269	Add testcases for GlobalOpt changes in r160693 and r160757. llvm-svn: 160925	2012-07-29 01:15:37 +00:00
Manman Ren	9de95e779c	X86 Peephole: fold loads to the source register operand if possible. Trying to fix the bot by specifying a triple in the failing testing cases. llvm-svn: 160920	2012-07-28 17:51:24 +00:00
Manman Ren	0fa3ab88ba	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919	2012-07-28 16:48:01 +00:00
Manman Ren	32367c063b	X86 Peephole: fix PR13475 in optimizeCompare. It is possible that an instruction can use and update EFLAGS. When checking the safety, we should check the usage of EFLAGS first before declaring it is safe to optimize due to the update. llvm-svn: 160912	2012-07-28 03:15:46 +00:00
Eric Christopher	86ca9f9e11	Add a DW_AT_high_pc for CUs that are a single address range. Update all tests accordingly. Fixes PR13351. Patch by shinichiro hamaji! llvm-svn: 160899	2012-07-27 22:00:05 +00:00
Evan Cheng	249716e8ae	Teach CodeGenPrep to look past bitcast when it's duplicating return instruction into predecessor blocks to enable tail call optimization. rdar://11958338 llvm-svn: 160894	2012-07-27 21:21:26 +00:00
Jakob Stoklund Olesen	bc65e8f94e	Add <imp-def> of super-register when lowering SUBREG_TO_REG. Patch by Tyler Nowicki! llvm-svn: 160888	2012-07-27 20:19:49 +00:00
Nuno Lopes	85591f899d	fix PR13390: do not loop forever with self-referencing self instructions llvm-svn: 160876	2012-07-27 18:21:15 +00:00
Nuno Lopes	20c7eb3549	fix infinite loop in instcombine in the presence of a (malformed) self-referencing select inst. This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW llvm-svn: 160874	2012-07-27 18:03:57 +00:00
Pete Cooper	abc13af9c6	Simplify demanded bits of select sources where the condition is a constant vector llvm-svn: 160835	2012-07-26 23:10:24 +00:00
Pete Cooper	e807e45bff	Teach SimplifyDemandedBits how to look through fpext and fptrunc to simplify their operand llvm-svn: 160823	2012-07-26 22:37:04 +00:00
Jakob Stoklund Olesen	ceee4a9d0c	Eliminate a batch of uses of sub_ss and sub_sd in the X86 target. These idempotent sub-register indices don't do anything --- They simply map XMM registers to themselves. They no longer affect register classes either since the SubRegClasses field has been removed from Target.td. This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns with COPY_TO_REGCLASS patterns which simply become COPY instructions. The number of IMPLICIT_DEF instructions before register allocation is reduced, and that is the cause of the test case changes. llvm-svn: 160816	2012-07-26 21:40:42 +00:00
Duncan Sands	5651452076	Stop reassociate from looking through expressions of arbitrary complexity. This is a temporary measure until my fix for PR13021 is ready. llvm-svn: 160778	2012-07-26 09:26:40 +00:00
Craig Topper	c7690ac7ac	Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms. llvm-svn: 160775	2012-07-26 07:48:28 +00:00
Akira Hatanaka	64626fc20f	Fix call setup for PIC. Patch by Reed Kotler. llvm-svn: 160774	2012-07-26 02:24:43 +00:00
Manman Ren	e8c6b15137	Update testing case for Atom when disabling rematerialization in TwoAddressInstructionPass. The generated code for Atom has a different code sequence. This is realted to commit r160749. llvm-svn: 160755	2012-07-25 20:17:14 +00:00
Nuno Lopes	f0626f2205	revert r160742: it's breaking CMake build original commit msg: MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings llvm-svn: 160751	2012-07-25 18:49:28 +00:00
Manman Ren	cc1dc6dc11	Disable rematerialization in TwoAddressInstructionPass. It is redundant; RegisterCoalescer will do the remat if it can't eliminate the copy. Collected instruction counts before and after this. A few extra instructions are generated due to spilling but it is normal to see these kinds of changes with almost any small codegen change, according to Jakob. This also fixed rdar://11830760 where xor is expected instead of movi0. llvm-svn: 160749	2012-07-25 18:28:13 +00:00
Nuno Lopes	f0441e04bd	MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings llvm-svn: 160742	2012-07-25 17:29:22 +00:00
Rafael Espindola	11c38b9657	When a return struct pointer is passed in registers, the called has nothing to pop. llvm-svn: 160725	2012-07-25 13:41:10 +00:00
Duncan Sands	77a1f3b564	Don't perform an overaligned load in this test, since that's undefined behaviour that might be exploited one day. llvm-svn: 160714	2012-07-25 09:45:37 +00:00
Duncan Sands	0b875a0c29	When folding a load from a global constant, if the load started in the middle of an array element (rather than at the beginning of the element) and extended into the next element, then the load from the second element was being handled wrong due to incorrect updating of the notion of which byte to load next. This fixes PR13442. Thanks to Chris Smowton for reporting the problem, analyzing it and providing a fix. llvm-svn: 160711	2012-07-25 09:14:54 +00:00
Akira Hatanaka	5a69c235ae	Eliminate the stack slot used to save the global base register. The long branch pass (fixed in r160601) no longer uses the global base register to compute addresses of branch destinations, so it is not necessary to reserve a slot on the stack. llvm-svn: 160703	2012-07-25 03:16:47 +00:00
Rafael Espindola	a92cf29f0d	Add a cpu to the test. Should fix the atom bot. llvm-svn: 160701	2012-07-24 22:56:06 +00:00
Rafael Espindola	f30e9bfb90	Add a triple to the test. llvm-svn: 160698	2012-07-24 21:55:04 +00:00
Rafael Espindola	a44e193a11	In order to correctly compile struct s { double x1; float x2; }; __attribute__((regparm(3))) struct s f(int a, int b, int c); void g(void) { f(41, 42, 43); } We need to be able to represent passing the address of s to f (sret) in a register (inreg). Turns out that all that is needed is to not mark them as mutually incompatible. llvm-svn: 160695	2012-07-24 21:40:17 +00:00
David Chisnall	5b8c1680de	ELF does not imply GNU/Linux. Do not assume GNU conventions just because we are targeting an ELF platform. Only fold gs-relative (and fs-relative) loads if it is actually sensible to do so for the target platform. This fixes PR13438. llvm-svn: 160687	2012-07-24 20:04:16 +00:00
Nuno Lopes	2a4b09c9de	teach objectsize about strdup() and strndup() llvm-svn: 160676	2012-07-24 16:28:13 +00:00
Nick Lewycky	faa9c3b035	Teach globalopt to not nuke all stores to globals. Keep them around of they might be deliberate "one time" leaks, so that leak checkers can find them. This is a reapply of r160602 with the fix that this time I'm committing the code I thought I was committing last time; the I->eraseFromParent() goes after the break out of the loop. llvm-svn: 160664	2012-07-24 07:21:08 +00:00
Akira Hatanaka	26e9ecb7a3	Add basic ability to setup call frame, and make procedure calls. Hello world will compile and execute with this patch. Patch by Reed Kotler. llvm-svn: 160651	2012-07-23 23:45:54 +00:00
Dan Gohman	f64ff8ed3a	An objc_retain can serve as a may-use for a different pointer. rdar://11931823. llvm-svn: 160637	2012-07-23 19:27:31 +00:00
Sylvestre Ledru	35521e2310	Fix a typo (the the => the) llvm-svn: 160621	2012-07-23 08:51:15 +00:00
Nadav Rotem	9056076cab	Fixed DAGCombine optimizations which generate select_cc for targets that do not support it (X86 does not lower select_cc). PR: 13428 Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160619	2012-07-23 07:59:50 +00:00
Nick Lewycky	9669c198ba	Revert r160602. llvm-svn: 160603	2012-07-21 09:03:15 +00:00
Nick Lewycky	72b83e5eaa	Teach globalopt to play nice with leak checkers. This is a reapplication of r160529 that was subsequently reverted. The fix was to not call GV->eraseFromParent() right before the caller does the same. The existing testcases already caught this bug if run under valgrind. llvm-svn: 160602	2012-07-21 08:29:45 +00:00
Akira Hatanaka	f72efdb62f	Fix Mips long branch pass. This pass no longer requires that the global pointer value be saved to the stack or register since it uses bal instruction to compute branch distance. llvm-svn: 160601	2012-07-21 03:30:44 +00:00
Nuno Lopes	705141d4df	baby steps toward fixing some problems with inbound GEPs that overflow, as discussed 2 months ago or so. Make sure we do not emit index computations with NSW flags so that we dont get an undef value if the GEP overflows llvm-svn: 160589	2012-07-20 23:07:40 +00:00
Nuno Lopes	20ea62527a	move the bounds checking pass to the instrumentation folder, where it belongs. I dunno why in the world I dropped it in the Scalar folder in the first place. No functionality change. llvm-svn: 160587	2012-07-20 22:39:33 +00:00
Jakob Stoklund Olesen	e2cfd0d45a	Avoid folding loads that are unsafe to move. LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load into its only use. Only do that when the load is safe to move, and it won't extend any live ranges. This fixes PR13414. llvm-svn: 160575	2012-07-20 21:29:31 +00:00
Jakob Stoklund Olesen	f62c07f147	Split loop exiting edges more aggressively. PHIElimination splits critical edges when it predicts it can resolve interference and eliminate copies. It doesn't split the edge if the interference wouldn't be resolved anyway because the phi-use register is live in the critical edge anyway. Teach PHIElimination to split loop exiting edges with interference, even if it wouldn't resolve the interference. This removes the necessary copies from the loop, which is still an improvement from injecting the copies into the loop. The test case demonstrates the improvement. Before: LBB0_1: cmpb $0, (%rdx) leaq 1(%rdx), %rdx movl %esi, %eax je LBB0_1 After: LBB0_1: cmpb $0, (%rdx) leaq 1(%rdx), %rdx je LBB0_1 movl %esi, %eax llvm-svn: 160571	2012-07-20 20:49:53 +00:00
Richard Osborne	0ab2b0df82	Fix assertion in jump threading (PR13405). GetBestDestForJumpOnUndef() assumes there is at least 1 successor, which isn't true if the block ends in an indirect branch with no successors. Fix this by bailing out earlier in this case. llvm-svn: 160546	2012-07-20 10:36:17 +00:00
Kostya Serebryany	f02c6069ac	[asan] make sure that the crash callbacks do not get merged (Chandler's idea: insert an empty InlineAsm). Change the order in which the new BBs are inserted: the slow path BB is insert between old BBs, the crash BB is inserted at the end. Don't create an empty BB (introduced by recent commits). Update the test. The experimental code that does manual crash callback merge will most likely be deleted later. llvm-svn: 160544	2012-07-20 09:54:50 +00:00
Nick Lewycky	7707e23429	Revert r160529 due to crashes. llvm-svn: 160532	2012-07-19 23:59:21 +00:00
Nick Lewycky	0fa6a28141	Don't wipe out global variables that are probably storing pointers to heap memory. This makes clang play nice with leak checkers. llvm-svn: 160529	2012-07-19 22:35:28 +00:00
Preston Gurd	f2ea70ae4a	Fix remaining lit tests which were failing when run on an Atom processor. Patches by Tyler Nowicki, Andy Zhang, and Preston Gurd! llvm-svn: 160520	2012-07-19 18:53:21 +00:00
NAKAMURA Takumi	67ce1930c1	test/DebugInfo/dwarfdump-test.test: Tweak expressions for Win32 to match backslashes. They are still odd, though. For example, Paths are printed on Win32 as below; /tmp/dbginfo\def2.cc:4:0 /tmp/dbginfo\include\decl2.h:1:0 /tmp/include\decl.h:5:0 llvm-svn: 160505	2012-07-19 13:40:09 +00:00
Jush Lu	e67e07b901	[arm-fast-isel] Add support for vararg function calls. llvm-svn: 160500	2012-07-19 09:49:00 +00:00
Alexey Samsonov	e16e16add6	DebugInfo library: add support for fetching absolute paths to source files (instead of basenames) from DWARF. Use this behavior in llvm-dwarfdump tool. Reviewed by Benjamin Kramer. llvm-svn: 160496	2012-07-19 07:03:58 +00:00
Manman Ren	d0a4ee8427	X86: remove redundant cmp against zero. Updated OptimizeCompare in peephole to remove redundant cmp against zero. We only remove Compare if CF and OF are not used. rdar://11855129 llvm-svn: 160454	2012-07-18 21:40:01 +00:00
Preston Gurd	f0a48ec8f1	This patch fixes 8 out of 20 unexpected failures in "make check" when run on an Intel Atom processor. The failures have arisen due to changes elsewhere in the trunk over the past 8 weeks or so. These failures were not detected by the Atom buildbot because the CPU on the Atom buildbot was not being detected as an Atom CPU. The fix for this problem is in Host.cpp and X86Subtarget.cpp, but shall remain commented out until the current set of Atom test failures are fixed. Patch by Andy Zhang and Tyler Nowicki! llvm-svn: 160451	2012-07-18 20:49:17 +00:00
Chandler Carruth	985454e0ac	Fix a somewhat nasty crasher in PR13378. This crashes inside of LiveIntervals due to the two-addr pass generating bogus MI code. The crux of the issue was a loop nesting problem. The intent of the code which attempts to transform instructions before converting them to two-addr form is to defer and reprocess any transformed instructions as the second processing is likely to have more opportunities to coalesce copies, etc. Unfortunately, there was one section of processing that was not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this rewriting proceeded, not only did it occur early, it removed the bits of information needed for the deferred processing to correctly generate the necessary two address form (specifically inserting a copy), but didn't trigger any immediate assertions and produced what appeared to be already valid two-address from code. Thus, the assertion only fired much later in the pipeline. The fix is to hoist the transformation logic up layer to where it can more firmly defer all further processing, and to teach the normal processing to handle an edge case previously handled as part of the transformation logic. This edge case (already matched tied register operands) needs to not defer any steps. As has been brought up repeatedly in the process: wow does this code need refactoring. I may squeeze in some time to at least bring sanity to this loop... but wow... =] Thanks to Jakob for helpful hints on the way here, and the review. llvm-svn: 160443	2012-07-18 18:58:22 +00:00
Andrew Trick	e002fb5da3	Added unit test for PR13361: LSR + SCEV "hangs" on reasonably sized test. llvm-svn: 160439	2012-07-18 18:07:52 +00:00
Victor Oliveira	a1de408aa7	test commit llvm-svn: 160438	2012-07-18 17:53:05 +00:00
Jack Carter	a62ba82825	Mips specific inline asm operand modifier 'M': Print the high order register of a double word register operand. In 32 bit mode, a 64 bit double word integer will be represented by 2 32 bit registers. This modifier causes the high order register to be used in the asm expression. It is useful if you are using doubles in assembler and continue to control register to variable relationships. This patch also fixes a related bug in a previous patch: case 'D': // Second part of a double word register operand case 'L': // Low order register of a double word register operand case 'M': // High order register of a double word register operand I got 'D' and 'M' confused. The second part of a double word operand will only match 'M' for one of the endianesses. I had 'L' and 'D' be the opposite twins when 'L' and 'M' are. llvm-svn: 160429	2012-07-18 06:41:36 +00:00
Andrew Trick	c08726627c	indvars: Linear function test replace should avoid reusing undef. Fixes PR13371: indvars pass incorrectly substitutes 'undef' values. I do not like this fix. It's needed until/unless the meaning of undef changes. It attempts to be complete according to the IR spec, but I don't have much confidence in the implementation given the difficulty testing undefined behavior. Worse, this invalidates some of my hard-fought work on indvars and LSR to optimize pointer induction variables. It results benchmark regressions, which I'll track internally. On x86_64 no LTO I see: -3% huffbench -3% 400.perlbench -8% fhourstones My only suggestion for recovering is to change the meaning of undef. If we could trust an arbitrary instruction to produce a some real value that can be manipulated (e.g. incremented) according to non-undef rules, then this case could be easily handled with SCEV. llvm-svn: 160421	2012-07-18 04:35:10 +00:00
Craig Topper	01deb5f2df	Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas. llvm-svn: 160420	2012-07-18 04:11:12 +00:00
Joel Jones	b84f7bea09	More replacing of target-dependent intrinsics with target-indepdent intrinsics. The second instruction(s) to be handled are the vector versions of count set bits (ctpop). The changes here are to clang so that it generates a target independent vector ctpop when it sees an ARM dependent vector bits set count. The changes in llvm are to match the target independent vector ctpop and in VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM dependent vector pop counts with target-independent ctpops. There are also changes to an existing test case in llvm for ARM vector count instructions and to a test for the bitcode upgrade. <rdar://problem/11892519> There is deliberately no test for the change to clang, as so far as I know, no consensus has been reached regarding how to test neon instructions in clang; q.v. <rdar://problem/8762292> llvm-svn: 160410	2012-07-18 00:02:16 +00:00
Evan Cheng	f73d7553cc	Add test case for r160387 llvm-svn: 160389	2012-07-17 19:40:05 +00:00
Evan Cheng	e6a3b03ee0	Back out r160101 and instead implement a dag combine to recover from instcombine transformation. llvm-svn: 160387	2012-07-17 18:54:11 +00:00
NAKAMURA Takumi	7364c32b33	llvm/test/Transforms/LoopRotate/PhiRename-1.ll: FileCheck-ize. It fixes PR13301. It began choking since Chandler's r159547, possibly due to improper expression on grep from TclParser to ShParser. llvm-svn: 160367	2012-07-17 15:43:17 +00:00
Alexey Samsonov	b604ff2a07	Improve behavior of DebugInfoEntryMinimal::getSubprogramName() introduced in r159512. To fetch a subprogram name we should not only inspect the DIE for this subprogram, but optionally inspect its specification, or its abstract origin (even if there is no inlining), or even specification of an abstract origin. Reviewed by Benjamin Kramer. llvm-svn: 160365	2012-07-17 15:28:35 +00:00
Nadav Rotem	277a40bc0a	Fix a crash in the legalization of large vectors. When truncating a result of a vector that is split we need to use the result of the split vector, and not re-split the dead node. llvm-svn: 160357	2012-07-17 09:07:37 +00:00
Evan Cheng	780f9b5f92	Implement r160312 as target indepedenet dag combine. llvm-svn: 160354	2012-07-17 08:31:11 +00:00
Evan Cheng	f579beca6d	This is another case where instcombine demanded bits optimization created large immediates. Add dag combine logic to recover in case the large immediates doesn't fit in cmp immediate operand field. int foo(unsigned long l) { return (l>> 47) == 1; } we produce %shr.mask = and i64 %l, -140737488355328 %cmp = icmp eq i64 %shr.mask, 140737488355328 %conv = zext i1 %cmp to i32 ret i32 %conv which codegens to movq $0xffff800000000000,%rax andq %rdi,%rax movq $0x0000800000000000,%rcx cmpq %rcx,%rax sete %al movzbl %al,%eax ret TargetLowering::SimplifySetCC would transform (X & -256) == 256 -> (X >> 8) == 1 if the immediate fails the isLegalICmpImmediate() test. For x86, that's immediates which are not a signed 32-bit immediate. Based on a patch by Eli Friedman. PR10328 rdar://9758774 llvm-svn: 160346	2012-07-17 06:53:39 +00:00
Akira Hatanaka	046744467d	Fix function select_cc_f32 in test/CodeGen/Mips/selectcc.ll. llvm-svn: 160329	2012-07-16 23:56:51 +00:00
Nuno Lopes	482fb19fd5	fix PR13339 (remove the predecessor from the unwind BB when removing an invoke) llvm-svn: 160325	2012-07-16 22:49:40 +00:00
Evan Cheng	75315b877c	For something like uint32_t hi(uint64_t res) { uint_32t hi = res >> 32; return !hi; } llvm IR looks like this: define i32 @hi(i64 %res) nounwind uwtable ssp { entry: %lnot = icmp ult i64 %res, 4294967296 %lnot.ext = zext i1 %lnot to i32 ret i32 %lnot.ext } The optimizer has optimize away the right shift and truncate but the resulting constant is too large to fit in the 32-bit immediate field. The resulting x86 code is worse as a result: movabsq $4294967296, %rax ## imm = 0x100000000 cmpq %rax, %rdi sbbl %eax, %eax andl $1, %eax This patch teaches the x86 lowering code to handle ult against a large immediate with trailing zeros. It will issue a right shift and a truncate followed by a comparison against a shifted immediate. shrq $32, %rdi testl %edi, %edi sete %al movzbl %al, %eax It also handles a ugt comparison against a large immediate with trailing bits set. i.e. X > 0x0ffffffff -> (X >> 32) >= 1 rdar://11866926 llvm-svn: 160312	2012-07-16 19:35:43 +00:00
Nadav Rotem	839a06e9d7	Make ComputeDemandedBits return a deterministic result when computing an AssertZext value. In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits reported that some of the bits were both known to be one and known to be zero. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160305	2012-07-16 18:34:53 +00:00
Tom Stellard	fc3db614c0	Revert "test/CodeGen/R600: Add some basic tests v6" This reverts commit 11d3457afcda7848448dd7f11b2ede6552ffb9ea. llvm-svn: 160300	2012-07-16 18:19:43 +00:00
Kostya Serebryany	874dae6119	[asan] refactor instrumentation to allow merging the crash callbacks (not fully implemented yet, no functionality change except the BB order) llvm-svn: 160284	2012-07-16 16:15:40 +00:00
Jack Carter	f649043aa5	Doubleword Shift Left Logical Plus 32 Mips shift instructions DSLL, DSRL and DSRA are transformed into DSLL32, DSRL32 and DSRA32 respectively if the shift amount is between 32 and 63 Here is a description of DSLL: Purpose: Doubleword Shift Left Logical Plus 32 To execute a left-shift of a doubleword by a fixed amount--32 to 63 bits Description: GPR[rd] <- GPR[rt] << (sa+32) The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit-shift amount in the range 0 to 31 is specified by sa. This patch implements the direct object output of these instructions. llvm-svn: 160277	2012-07-16 15:14:51 +00:00

... 2 3 4 5 6 ...

16968 Commits