llvm-project

Commit Graph

Author	SHA1	Message	Date
Quentin Colombet	689623009b	[PeepholeOptimizer] Take advantage of the isInsertSubreg property in the advanced copy optimization. This is the final step patch toward transforming: udiv r0, r0, r2 udiv r1, r1, r3 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr into: udiv r0, r0, r2 udiv r1, r1, r3 bx lr Indeed, thanks to this patch, this optimization is able to look through vmov.32 d16[0], r0 vmov.32 d16[1], r1 and is able to rewrite the following sequence: vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 into simple generic GPR copies that the coalescer managed to remove. <rdar://problem/12702965> llvm-svn: 216144	2014-08-21 00:19:16 +00:00
Quentin Colombet	7e3da6677a	Add isInsertSubreg property. This patch adds a new property: isInsertSubreg and the related target hooks: TargetInstrInfo::getInsertSubregInputs and TargetInstrInfo::getInsertSubregLikeInputs to specify that a target specific instruction is a (kind of) INSERT_SUBREG. The approach is similar to r215394. <rdar://problem/12702965> llvm-svn: 216139	2014-08-20 23:49:36 +00:00
Quentin Colombet	67639df146	[PeepholeOptimizer] Take advantage of the isExtractSubreg property in the advanced copy optimization. This patch is a step toward transforming: udiv r0, r0, r2 udiv r1, r1, r3 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr into: udiv r0, r0, r2 udiv r1, r1, r3 bx lr Indeed, thanks to this patch, this optimization is able to look through vmov r0, r1, d16 but it does not yet understand vmov.32 d16[0], r0 vmov.32 d16[1], r1 Coming patches will fix that and update the related test case. <rdar://problem/12702965> llvm-svn: 216136	2014-08-20 23:13:02 +00:00
Quentin Colombet	7e75cbaf47	Add isExtractSubreg property. This patch adds a new property: isExtractSubreg and the related target hooks: TargetInstrInfo::getExtractSubregInputs and TargetInstrInfo::getExtractSubregLikeInputs to specify that a target specific instruction is a (kind of) EXTRACT_SUBREG. The approach is similar to r215394. <rdar://problem/12702965> llvm-svn: 216130	2014-08-20 21:51:26 +00:00
Alexey Samsonov	e229ec5bfc	Fix null reference creation in SelectionDAG constructor. Store TargetSelectionDAGInfo as a pointer instead of a reference: getSelectionDAGInfo() may not be implemented for certain backends (e.g. it's not currently implemented for R600). This bug is reported by UBSan. llvm-svn: 216129	2014-08-20 21:40:15 +00:00
Alexey Samsonov	ea0aee622e	Cleanup: Delete seemingly unused reference to MachineDominatorTree from ScheduleDAGInstrs. llvm-svn: 216124	2014-08-20 20:57:26 +00:00
Alexey Samsonov	8968e6d1b0	Fix null reference creation in ScheduleDAGInstrs constructor call. Both MachineLoopInfo and MachineDominatorTree may be null in ScheduleDAGMI constructor call. It is undefined behavior to take references to these values. This bug is reported by UBSan. llvm-svn: 216118	2014-08-20 19:36:05 +00:00
Sanjay Patel	f3cfeef2e9	critical-anti-dependency breaker: don't use reg def info from kill insts (PR20308) In PR20308 ( http://llvm.org/bugs/show_bug.cgi?id=20308 ), the critical-anti-dependency breaker caused a miscompile because it broke a WAR hazard using a register that it thinks is available based on info from a kill inst. Until PR18663 is solved, we shouldn't use any def/use info from a kill because they are really just nops. This patch adds guard checks for kills around calls to ScanInstruction() where the DefIndices array is set. For good measure, add an assert in ScanInstruction() so we don't hit this bug again. The test case is a reduced version of the code from the bug report. Differential Revision: http://reviews.llvm.org/D4977 llvm-svn: 216114	2014-08-20 18:03:00 +00:00
Quentin Colombet	03e43f8e68	[PeepholeOptimizer] Refactor the advanced copy optimization to take advantage of the isRegSequence property. This is a follow-up of r215394 and r215404, which respectively introduce the isRegSequence property and use it for ARM. Thanks to the property introduced by the previous commits, this patch is able to optimize the following sequence: vmov d0, r2, r3 vmov d1, r0, r1 vmov r0, s0 vmov r1, s2 udiv r0, r1, r0 vmov r1, s1 vmov r2, s3 udiv r1, r2, r1 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr into: udiv r0, r0, r2 udiv r1, r1, r3 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr This patch refactors how the copy optimizations are done in the peephole optimizer. Prior to this patch, we had one copy-related optimization that replaced a copy or bitcast by a generic, more suitable (in terms of register file), copy. With this patch, the peephole optimizer features two copy-related optimizations: 1. One for rewriting generic copies to generic copies: PeepholeOptimizer::optimizeCoalescableCopy. 2. One for replacing non-generic copies with generic copies: PeepholeOptimizer::optimizeUncoalescableCopy. The goals of these two optimizations are slightly different: one rewrites the operand of the instruction (#1), the other kills off the non-generic instruction and replaces it with a (sequence of) generic instruction(s). Both optimizations rely on the ValueTracker introduced in r212100. The ValueTracker has been refactored to use the information from the TargetInstrInfo for non-generic instructions. As part of the refactoring, we switched the tracking from the index of the definition to the actual register (virtual or physical). This one change is to provide better consistency with register related APIs and to ease the use of the TargetInstrInfo. Moreover, this patch introduces a new helper class CopyRewriter used to ease the rewriting of generic copies (i.e., #1).
Finally, this patch adds a dead code elimination pass right after the peephole optimizer to get rid of dead code that may appear after rewriting. This is related to <rdar://problem/12702965>. Review: http://reviews.llvm.org/D4874 llvm-svn: 216088	2014-08-20 17:41:48 +00:00
Jiangning Liu	f841b3b79e	Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type legalization stage. With those two optimizations, fewer signed/zero extension instructions can be inserted, and then we can expose more opportunities to Machine CSE pass in back-end. llvm-svn: 216066	2014-08-20 12:05:15 +00:00
Juergen Ributzka	4bf6c01cdb	Reapply [FastISel] Let the target decide first if it wants to materialize a constant (215588). Note: This was originally reverted to track down a buildbot error. This commit exposed a latent bug that was fixed in r215753. Therefore it is reapplied without any modifications. I ran it through SPEC2k and SPEC2k6 for AArch64 and it didn't introduce any new regressions. Original commit message: This changes the order in which FastISel tries to materialize a constant. Originally it would try to use a simple target-independent approach, which can lead to the generation of inefficient code. On X86 this would result in the use of movabsq to materialize any 64bit integer constant - even for simple and small values such as 0 and 1. Also some very funny floating-point materialization could be observed too. On AArch64 it would materialize the constant 0 in a register even though the architecture has an actual "zero" register. On ARM it would generate unnecessary mov instructions or not use mvn. This change simply changes the order and always asks the target first if it wants to materialize the constant. This doesn't fix all the issues mentioned above, but it enables the targets to implement such optimizations. Related to <rdar://problem/17420988>. llvm-svn: 216006	2014-08-19 19:05:24 +00:00
Oliver Stannard	f5469bec97	Teach the AArch64 backend to handle f16 This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv operations on f16 (half-precision) types by promoting to f32. llvm-svn: 215891	2014-08-18 14:22:39 +00:00
Craig Topper	6230691c91	Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870	2014-08-18 00:24:38 +00:00
Craig Topper	5229cfd163	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 215868	2014-08-17 23:47:00 +00:00
Matt Arsenault	6cc00429ff	Fix fmul combines with constant splat vectors Fixes things like fmul x, 2 -> fadd x, x llvm-svn: 215820	2014-08-16 10:14:19 +00:00
Andrea Di Biagio	b23bad11e7	[DAGCombiner] Improve the folding of target independent shuffles to Undef. When combining a pair of shuffle nodes, check if the combined shuffle mask is trivially Undef. If so, immediately fold that pair of shuffles to Undef. The lack of checks for undef masks was the root-cause of a poor-codegen bug in the dag combiner. Example: %1 = shufflevector <4 x i32> %A, <4 x i32> %B, <4 x i32> <i32 4, i32 1, i32 1, i32 6> %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 0, i32 4, i32 1, i32 6> %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <4 x i32> <i32 1, i32 5, i32 3, i32 3> Before this patch, on x86 (with -mcpu=corei7) we failed to fold the entire sequence to Undef value and therefore we generated: shufps $-123, %xmm1, %xmm0 pshufd $-46, %xmm0, %xmm0 With this patch, the entire shuffle sequence is folded to Undef and no shuffles are generated in the output assembly. Added new test cases to test 'combine-vec-shuffle-5.ll'. llvm-svn: 215797	2014-08-16 00:29:44 +00:00
Hal Finkel	0815a05fd7	Make isAliased property for fixed-offset stack objects adjustable We used to assume that any fixed-offset stack object was not aliased. This meant that no IR value could point to the memory contained in such an object. This is a reasonable default, but is not a universally-correct target-independent fact. For example, on PowerPC (both Darwin and non-Darwin), some byval arguments are allocated at fixed offsets by the ABI. These, however, certainly can be pointed to by IR values. This change moves the 'isAliased' logic out of FixedStackPseudoSourceValue and into MFI, and allows the isAliased property to be overridden for fixed-offset objects. This will be used by an upcoming commit to the PowerPC backend to fix PR20280. No functionality change intended (the behavior of FixedStackPseudoSourceValue::isAliased has been made more conservative for callers that don't pass an MFI object, but I don't see any in-tree callers that do that). llvm-svn: 215794	2014-08-16 00:17:02 +00:00
Robin Morisset	d18cda620c	Fix typos in comments llvm-svn: 215777	2014-08-15 22:17:28 +00:00
Juergen Ributzka	5b1dbec1b4	[FastISel] Remove a performance debugging assert. As Jim pointed out this assert isn't really needed to test for correctness, because the code right afterwards does the same check and falls back to SelectionDAG - as intended. llvm-svn: 215735	2014-08-15 17:36:30 +00:00
Rafael Espindola	7bb91d942b	Delete dead code. NFC. llvm-svn: 215720	2014-08-15 14:58:22 +00:00
Juergen Ributzka	790bacf232	Revert several FastISel commits to track down a buildbot error. This reverts: r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants." r215594 "[FastISel][X86] Use XOR to materialize the "0" value." r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization." r215591 "[FastISel][AArch64] Make use of the zero register when possible." r215588 "[FastISel] Let the target decide first if it wants to materialize a constant." r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI." llvm-svn: 215673	2014-08-14 19:56:28 +00:00
Sanjay Patel	35d3133650	optimize vector fneg of bitcasted integer value This patch allows a vector fneg of a bitcasted integer value to be optimized in the same way that we already optimize a scalar fneg. If the integer variable is a constant, we can precompute the result and not require any logic ops. This patch is very similar to a fabs patch committed at r214892. Differential Revision: http://reviews.llvm.org/D4852 llvm-svn: 215646	2014-08-14 15:15:28 +00:00
Chandler Carruth	7cd15be784	[SDAG] Fix a bug in the DAG combiner where we would fail to return the input node after manually adding it to the worklist and using CombineTo. Once we use CombineTo the input node may have been deleted. Despite this being completely confusing and somewhat broken, the only way to "correctly" return from a DAG combine after potentially deleting the input node is to return that exact node.... But really, this code should just never have used CombineTo. It won't do what it wants (returning the node as mentioned above just causes the combine to infloop). The correct way to combine away a casted load to a load of the correct type is to RAUW the chain directly and then return the loaded value to replace the actual value node. I managed to find this with the vector shuffle fuzzer even though it clearly has nothing at all to do with vector shuffles and rather those happen to trigger a load of a constant pool that hits this combine just right. I've included the test as it is small and a nice stress test that the infrastructure isn't asserting. llvm-svn: 215622	2014-08-14 08:18:34 +00:00
Chandler Carruth	8039b16de7	[SDAG] Fix a case where we would iteratively legalize a node during combining by replacing it with something else but not re-process the node afterward to remove it. In a truly remarkable stroke of bad luck, this would (in the test case attached) end up getting some other node combined into it without ever getting re-processed. By adding it back on to the worklist, in addition to deleting the dead nodes more quickly we also ensure that if it stops being dead for any reason it makes it back through the legalizer. Without this, the test case will end up failing during instruction selection due to an and node with a type we don't have an instruction pattern for. It took many million runs of the shuffle fuzz tester to find this. llvm-svn: 215611	2014-08-14 01:07:37 +00:00
Juergen Ributzka	7cee768e55	[FastISel] Let the target decide first if it wants to materialize a constant. This changes the order in which FastISel tries to materialize a constant. Originally it would try to use a simple target-independent approach, which can lead to the generation of inefficient code. On X86 this would result in the use of movabsq to materialize any 64bit integer constant - even for simple and small values such as 0 and 1. Also some very funny floating-point materialization could be observed too. On AArch64 it would materialize the constant 0 in a register even though the architecture has an actual "zero" register. On ARM it would generate unnecessary mov instructions or not use mvn. This change simply changes the order and always asks the target first if it wants to materialize the constant. This doesn't fix all the issues mentioned above, but it enables the targets to implement such optimizations. Related to <rdar://problem/17420988>. llvm-svn: 215588	2014-08-13 22:08:02 +00:00
Gerolf Hoflehner	fe2c11ffd6	[MachineCombiner] Removal of dangling DBG_VALUES after combining [20598] This is a cleaner solution to the problem described in r215431. When instructions are combined a dangling DBG_VALUE is removed. This resolves bug 20598. llvm-svn: 215587	2014-08-13 22:07:36 +00:00
Gerolf Hoflehner	caa8bfd13b	[Cleanup] Utility function to erase instruction and mark DBG_Values New function to erase a machine instruction and mark DBG_VALUE for removal. A DBG_VALUE is marked for removal when it references an operand defined in the instruction. Use the new function to cleanup code in dead machine instruction removal pass. llvm-svn: 215580	2014-08-13 21:15:23 +00:00
Quentin Colombet	abea99f65a	[MachineDominatorTree] Provide a method to inform a MachineDominatorTree that a critical edge has been split. The MachineDominatorTree will then lazily update the underlying dominance properties when required. Context This is a follow-up of r215410. Each time a critical edge is split this invalidates the dominator tree information. Thus, subsequent queries of that interface will be slow until the underlying information is actually recomputed (costly). Problem Prior to this patch, splitting a critical edge needed to query the dominator tree to update the dominator information. Therefore, splitting a bunch of critical edges will likely produce poor performance as each query to the dominator tree will use the slow query path. This happens a lot in passes like MachineSink and PHIElimination. Proposed Solution Splitting a critical edge is a local modification of the CFG. Moreover, as soon as a critical edge is split, it is not critical anymore and thus cannot be a candidate for critical edge splitting anymore. In other words, the predecessor and successor of a basic block inserted on a critical edge cannot be inserted by critical edge splitting. Using these observations, we can pile up the splitting of critical edges and apply them at once before updating the DT information. The core of this patch moves the update of the MachineDominatorTree information from MachineBasicBlock::SplitCriticalEdge to a lazy MachineDominatorTree. Performance Thanks to this patch, the motivating example compiles in 4- minutes instead of 6+ minutes. No test case added as the motivating example has nothing special except being huge! The binaries are strictly identical for all the llvm test-suite + SPECs with and without this patch for both Os and O3. Regarding compile time, I observed only noise, although on average I saw a small improvement. <rdar://problem/17894619> llvm-svn: 215576	2014-08-13 21:00:07 +00:00
Benjamin Kramer	a7c40ef022	Canonicalize header guards into a common format. Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558	2014-08-13 16:26:38 +00:00
Andrea Di Biagio	ace8e1e3d4	[DAGCombiner] Improved target independent vector shuffle combine rule. This patch improves the existing algorithm in DAGCombiner that attempts to fold shuffles according to rule: shuffle(shuffle(x, y, M1), undef, M2) -> shuffle(y, undef, M3) Before this change, there were cases where the DAGCombiner conservatively avoided folding shuffles even if the resulting mask would have been legal. That is because the algorithm wrongly assumed that commuting an illegal shuffle mask would always produce an illegal mask. With this change, we now correctly compute the commuted shuffle mask before calling method 'isShuffleMaskLegal' on it. On X86, this improves for example the codegen for the following function: define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) { %1 = shufflevector <4 x i32> %B, <4 x i32> %A, <4 x i32> <i32 1, i32 2, i32 6, i32 7> %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 2, i32 3> ret <4 x i32> %2 } Before this change the X86 backend (-mcpu=corei7) generated the following assembly code for function @test: shufps $-23, %xmm0, %xmm1 # xmm1 = xmm1[1,2],xmm0[2,3] movhlps %xmm1, %xmm1 # xmm1 = xmm1[1,1] movaps %xmm1, %xmm0 Now we produce: movhlps %xmm0, %xmm0 # xmm0 = xmm0[1,1] Added extra test cases in combine-vec-shuffle-2.ll to verify that we correctly fold according to the above-mentioned rule. llvm-svn: 215555	2014-08-13 16:09:40 +00:00
Hal Finkel	46ef7ce283	[PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsic This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store intrinsics. As with the construction of the MachineMemOperands for the intrinsic calls used for unaligned load/store lowering, the only slight complication is that we need to represent a larger memory range than the loaded/stored value-type size (because the address is rounded down to an aligned address, and we need to conservatively represent the entire possible range of the actual access). This required adding an extra size field to TargetLowering::IntrinsicInfo, and this was done in a way that required no modifications to other targets (the size defaults to the store size of the provided memory data type). This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed). llvm-svn: 215512	2014-08-13 01:15:40 +00:00
Adrian Prantl	5e1fa85ec6	Remove a condition that can never be true, as witnessed by the assert above. llvm-svn: 215477	2014-08-12 21:55:58 +00:00
Quentin Colombet	8427df974e	Fix a parentheses warning introduced in r215394. llvm-svn: 215459	2014-08-12 17:11:26 +00:00
Eric Christopher	ce40dbcbaa	Have MachineRegisterInfo take and store the MachineFunction it was created for rather than the TargetMachine since we only needed the TM for the subtarget and we can get that from the MF. llvm-svn: 215432	2014-08-12 08:00:56 +00:00
Adrian Prantl	9724b5c9a4	DebugLocEntry: Restore the comparison predicate from before the refactoring in 215384. This way it can unique multiple entries describing the same piece even if they don't have the exact same location. (The same piece may get merged in and be added from OpenRanges). There ought to be a more elegant solution for this, though. llvm-svn: 215418	2014-08-12 01:07:53 +00:00
David Blaikie	f73ae4fbf6	Revert "Partially revert r214761 that asserted that all concrete debug info variables had DIEs, due to a failure on Darwin." I believe this was addressed by r215157 and r215227, so let's have another go at the bots, etc. This reverts commit r214880. llvm-svn: 215412	2014-08-12 00:00:31 +00:00
Quentin Colombet	5cded89d12	[MachineSink] Improve the compile time by preserving the dominance information as long as possible. Context Each time the dominance information is modified, the dominator tree analysis switches to a slow query mode. After a few queries without any modification on the dominator tree, it performs an expensive update of its internal structure to provide fast queries again. Problem Prior to this patch, the MachineSink pass was splitting the critical edges on demand while relying heavily on the dominator tree information. In some cases, this leads to pathological behavior where: - We end up in the slow query mode right after splitting an edge. - We update the dominance information. - We break the dominance information again, thus ending up in the slow query mode and so on. Proposed Solution To mitigate this effect, this patch postpones all the splitting of the edges to the end of each iteration of the main loop. The benefits are: - The dominance information is valid for the lifetime of an iteration. - This simplifies the code as we do not have to specially treat instructions that are sunk on critical edges. Indeed, the related block will be available through the next iteration. The downside is that when edge splitting is required, this incurs an additional iteration of the main loop compared to the previous scheme. Performance Thanks to this patch, the motivating example compiles in 6+ minutes instead of 10+ minutes. No test case added as the motivating example has nothing special except being huge! I have measured only noise for both the compile time and the runtime on the llvm test-suite + SPECs with Os and O3. Note: The current implementation of MachineBasicBlock::SplitCriticalEdge also uses the dominance information and therefore, hits this problem. A subsequent patch will address that. <rdar://problem/17894619> llvm-svn: 215410	2014-08-11 23:52:01 +00:00
Michael J. Spencer	6b2f5b47d2	[x86] Fold extract_vector_elt of a load into the Load's address computation. llvm-svn: 215409	2014-08-11 23:49:33 +00:00
Adrian Prantl	76502d8417	Add a couple of convenience accessors to DebugLocEntry::Value to further simplify common usage patterns. llvm-svn: 215407	2014-08-11 23:22:59 +00:00
Adrian Prantl	e8bde9f070	Make these DebugLocEntry::Value comparison operators friend functions as suggested by dblaikie in a comment on r215384. llvm-svn: 215403	2014-08-11 22:52:56 +00:00
Quentin Colombet	d533cdf26f	Add isRegSequence property. This patch adds a new property: isRegSequence and the related target hooks: TargetInstrInfo::getRegSequenceInputs and TargetInstrInfo::getRegSequenceLikeInputs to specify that a target specific instruction is a (kind of) REG_SEQUENCE. <rdar://problem/12702965> llvm-svn: 215394	2014-08-11 22:17:14 +00:00
Adrian Prantl	be4b5171d3	Debug info: Remove an obsolete constructor from DebugLocEntry. llvm-svn: 215387	2014-08-11 21:06:03 +00:00
Adrian Prantl	1c6f2ec112	Debug info: Modify DebugLocEntry::addValue to take multiple values so it only has to sort/unique values once per batch. llvm-svn: 215386	2014-08-11 21:06:00 +00:00
Adrian Prantl	caaf053c79	Debug info: Further simplify the implementation of buildLocationList by getting rid of the redundant DIVariable in the OpenRanges pair. llvm-svn: 215385	2014-08-11 21:05:57 +00:00
Adrian Prantl	293dd93f95	Debug Info: Move the sorting and uniqueing of pieces from emitLocPieces() into buildLocationList(). By keeping the list of Values sorted, DebugLocEntry::Merge can also merge multi-piece entries. llvm-svn: 215384	2014-08-11 21:05:55 +00:00
Adrian Prantl	e09ee3faaf	Debug info: Refactor DebugLocEntry's Merge function to make buildLocationLists easier to read. The previous implementation conflated the merging of individual pieces and the merging of entire DebugLocEntries. By splitting this functionality into two separate functions the intention of the code should be clearer. llvm-svn: 215383	2014-08-11 20:59:28 +00:00
Hans Wennborg	97a59ae589	PeepholeOptimizer: make parameter ref to SmallPtrSetImpl This makes the function type independent of the in-line size of LocalMIs. llvm-svn: 215356	2014-08-11 13:52:46 +00:00
Hans Wennborg	5f5b8cc04f	Make this SmallVector size a power of two as suggested by Chandler llvm-svn: 215355	2014-08-11 13:47:57 +00:00
Jiangning Liu	dd6e12d71c	In Machine CSE pass, the source register of a COPY machine instruction can be propagated to all its users, and this propagation could increase the probability of finding common subexpressions. If the COPY has only one user, the COPY itself can be removed. llvm-svn: 215344	2014-08-11 05:17:19 +00:00
Hans Wennborg	941a5709dc	Re-commit "Increase the size of this SmallVector in PeepholeOptimizer." (r215340) This time, also update the function that receives a reference to the SmallPtrSet as a parameter. llvm-svn: 215342	2014-08-11 02:50:43 +00:00
Hans Wennborg	98b3cf8594	Revert "Increase the size of this SmallVector in PeepholeOptimizer." (r215340) That broke the build: /data/buildslave/clang-amd64-freebsd/src-llvm/lib/CodeGen/PeepholeOptimizer.cpp:729:46: error: non-const lvalue reference to type 'SmallPtrSet<[...], 8>' cannot bind to a value of unrelated type 'SmallPtrSet<[...], 16>' Changed |= optimizeExtInstr(MI, MBB, LocalMIs); ^~~~~~~~ /data/buildslave/clang-amd64-freebsd/src-llvm/lib/CodeGen/PeepholeOptimizer.cpp:265:49: note: passing argument to parameter 'LocalMIs' here SmallPtrSet<MachineInstr*, 8> &LocalMIs) { ^ llvm-svn: 215341	2014-08-11 02:34:52 +00:00
Hans Wennborg	5b439f9c8a	Increase the size of this SmallVector in PeepholeOptimizer. During a Clang build, the median size of this was 9 llvm-svn: 215340	2014-08-11 02:21:34 +00:00
Hans Wennborg	12b996da88	Increase the size of SpillPlacement::BlockFrequencies. This SmallVector's median size during a Clang build was 7. llvm-svn: 215338	2014-08-11 02:21:30 +00:00
Hans Wennborg	01416e66b6	Increase the size of this SmallVector in CloneNodeWithValues. In a Clang bootstrap, the size of this vector was always 6. llvm-svn: 215335	2014-08-11 02:21:19 +00:00
Hans Wennborg	b8cff696cd	Increase the size of DwarfAccelTable::TableHeaderData::Atoms. During a Clang bootstrap, it seems this SmallVector always contains 3 elements. llvm-svn: 215334	2014-08-11 02:18:15 +00:00
Petar Jovanovic	3a908a0bfc	Add support for scalarizing cttz_zero_undef Follow up to r214266. Add missing case in ScalarizeVectorResult() for cttz_zero_undef. Differential Revision: http://reviews.llvm.org/D4813 llvm-svn: 215330	2014-08-10 22:49:54 +00:00
Saleem Abdulrasool	f158ca353f	CodeGen: switch to a range based for loop Use a range based for loop instead of manual iteration. NFC. llvm-svn: 215287	2014-08-09 17:21:29 +00:00
David Blaikie	bd56fbb976	DebugInfo: Recommit (reverted in r215217, originally committed in r215157) the assertion that no argument variable is overwritten by subsequent argument variables. This turned up a bug in clang where arguments were emitted with duplicate argument numbers (see r215227). llvm-svn: 215228	2014-08-08 17:12:35 +00:00
Pedro Artigas	caa565887d	Added a TLI hook to signal that the target does not have or does not care about floating point exceptions, added use of flag to fold potentially exception raising floating point math in selection DAG. No functionality change, as targets have to explicitly ask for this behavior and none does today. llvm-svn: 215222	2014-08-08 16:46:53 +00:00
David Blaikie	2b07c88668	DebugInfo: Remove assertion (added in r215157) that's firing on a blocks test in the test-suite while I investigate further. llvm-svn: 215217	2014-08-08 16:21:50 +00:00
Josh Klontz	ac0d28dfe6	Add missing Interpreter intrinsic lowering for sin, cos and ceil llvm-svn: 215209	2014-08-08 15:00:12 +00:00
Patrik Hagglund	b0e86ec814	[pr19635] Revert most of r170537, and add new testcase. Patch provided by Andrey Kuharev. Sorry, r170537 was obviously wrong. llvm-svn: 215190	2014-08-08 08:21:19 +00:00
Akira Hatanaka	5acc58fcfb	[stack protector] Look through bitcasts to get global variable __stack_chk_guard. Handle the case where the pointer operand of the load instruction that loads the stack guard is not a global variable but instead a bitcast. %StackGuard = load i8 bitcast (i64 @__stack_chk_guard to i8*) call void @llvm.stackprotector(i8 %StackGuard, i8** %StackGuardSlot) Original test case provided by Ana Pazos. This fixes PR20558. llvm-svn: 215167	2014-08-07 23:08:24 +00:00
David Blaikie	09fdfabdda	DebugInfo: Fix overwriting/loss of inlined arguments to recursively inlined functions. Due to an unnecessary special case, inlined arguments that happened to be from the same function as they were inlined into were misclassified as non-inline arguments and would overwrite the non-inlined arguments. Assert that we never overwrite a function's arguments, and stop misclassifying inlined arguments as non-inline arguments to fix this issue. Excuse the rather crappy test case - handcrafted IR might do better, or someone who understands better how to tickle the inliner to create a recursive inlining situation like this (though it may also be necessary to tickle the variable in a particular way to cause it to be recorded in the MMI side table and go down this particular path for location information). llvm-svn: 215157	2014-08-07 22:22:49 +00:00
Eric Christopher	b9fd9ed37e	Temporarily Revert "Nuke the old JIT." as it's not quite ready to be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 215154	2014-08-07 22:02:54 +00:00
Gerolf Hoflehner	b5220dc779	Debugging Utility - optional ability for dumping critical path length llvm-svn: 215153	2014-08-07 21:49:44 +00:00
Gerolf Hoflehner	97c383bc36	MachineCombiner Pass for selecting faster instruction sequence on AArch64 Re-commit of r214832,r21469 with a work-around that avoids the previous problem with gcc build compilers The work-around is to use SmallVector instead of ArrayRef of basic blocks in preservesResourceLen()/MachineCombiner.cpp llvm-svn: 215151	2014-08-07 21:40:58 +00:00
Frederic Riss	e6bb1871eb	test commit: remove trailing whitespace. llvm-svn: 215138	2014-08-07 20:04:00 +00:00
Akira Hatanaka	bbd33f6766	[Branch probability] Recompute branch weights of tail-merged basic blocks. BranchFolderPass was not correctly setting the basic block branch weights when tail-merging created or merged blocks. This patch recomputes the weights of tail-merged blocks using the following formula: branch_weight(merged block to successor j) = sum(block_frequency(bb) * branch_probability(bb -> j)) bb is a block that is in the set of merged blocks. <rdar://problem/16256423> llvm-svn: 215135	2014-08-07 19:30:13 +00:00
Rafael Espindola	f8b27c41e8	Nuke the old JIT. I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement! llvm-svn: 215111	2014-08-07 14:21:18 +00:00
David Blaikie	ff3dd1701c	Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."" This reverts commit r214761. Revert while Reid investigates & provides a reproduction for an assertion failure for this on Windows. llvm-svn: 214999	2014-08-06 22:30:12 +00:00
Eric Christopher	b5217507c7	Remove the target machine from CCState. Previously it was only used to get the subtarget and that's accessible from the MachineFunction now. This helps clear the way for smaller changes where getting a subtarget will require passing in a MachineFunction/Function as well. llvm-svn: 214988	2014-08-06 18:45:26 +00:00
Adrian Prantl	364d13170a	Improve performance of calculateDbgValueHistory. In r210492 the logic of calculateDbgValueHistory was changed to end register variable live ranges at the end of MBB conditionally on whether the register was clobbered by the function body. This requires an initial scan of all the operands of the function to collect all clobbered registers. In a second pass over all instructions, we compare this set with the set of clobbered registers for the current MachineInstruction. This modification incurred a compilation time regression on some benchmarks: the debug info emission phase takes ~10% more time. While a small performance hit is unavoidable due to the initial scan requirement, we can improve the situation by not creating too many temporary sets and instead using lambdas to work directly on the result of the initial scan. Fixes <rdar://problem/17884104> Patch by Frederic Riss! llvm-svn: 214987	2014-08-06 18:41:24 +00:00
Adrian Prantl	e2d637597c	Cleanup collectChangingRegs The handling of the epilogue is best expressed as an early exit and there is no reason to look for register defs in DbgValue MIs. Patch by Frederic Riss! llvm-svn: 214986	2014-08-06 18:41:19 +00:00
Reid Kleckner	e41d957028	Round up the size of byval arguments to MinAlign Otherwise we can end up with an argument frame size that is not a multiple of stack slot size, which is very awkward. This fixes PR20547, which was a bug in x86_64 Sys V vararg handling. However, it's much easier to test this with x86 callee-cleanup functions, which previously ended in "retl $6" instead of "retl $8". This does affect behavior of all backends, but it presumably fixes the same bug in all of them. llvm-svn: 214980	2014-08-06 17:57:23 +00:00
Sanjay Patel	d26358e12d	use register iterators that include self to reduce code duplication in CriticalAntiDepBreaker This patch addresses 2 FIXME comments that I added to CriticalAntiDepBreaker while fixing PR20020. Initialize an MCSubRegIterator and an MCRegAliasIterator to include the self reg. Assuming that works as advertised, there should be no functional difference with this patch, just less code. Also, remove the associated asserts - we're setting those values just before, so the asserts don't do anything meaningful. Differential Revision: http://reviews.llvm.org/D4566 llvm-svn: 214973	2014-08-06 15:58:15 +00:00
David Blaikie	fb0412f039	DebugInfo: Assert that any CU for which debug_loc lists are emitted, has at least one range. This was coming in weird debug info that had variables (and hence debug_locs) but was in GMLT mode (because it was missing the 13th field of the compile_unit metadata) so no ranges were constructed. We should always have at least one range for any CU with a debug_loc in it - because the range should cover the debug_loc. The assertion just ensures that the "!= 1" range case inside the subsequent loop doesn't get entered for the case where there are no ranges at all, which should never reach here in the first place. llvm-svn: 214939	2014-08-06 00:21:25 +00:00
David Blaikie	e1a26a624d	DebugInfo: Move the reference to the CU from the location list entry to the list itself, since it is constant across an entire list. This simplifies construction and usage while making the data structure smaller. It was a holdover from the days when we didn't have a separate DebugLocList and all we had was a flat list of DebugLocEntries. llvm-svn: 214933	2014-08-05 23:14:16 +00:00
Sanjay Patel	8e5beb6edb	Optimize vector fabs of bitcasted constant integer values. Allow vector fabs operations on bitcasted constant integer values to be optimized in the same way that we already optimize scalar fabs. So for code like this: %bitcast = bitcast i64 18446744069414584320 to <2 x float> ; 0xFFFF_FFFF_0000_0000 %fabs = call <2 x float> @llvm.fabs.v2f32(<2 x float> %bitcast) %ret = bitcast <2 x float> %fabs to i64 Instead of generating something like this: movabsq (constant pool load of mask for sign bits) vmovq (move from integer register to vector/fp register) vandps (mask off sign bits) vmovq (move vector/fp register back to integer return register) We should generate: mov (put constant value in return register) I have also removed a redundant clause in the first 'if' statement: N0.getOperand(0).getValueType().isInteger() is the same thing as: IntVT.isInteger() Testcases for x86 and ARM added to existing files that deal with vector fabs. One existing testcase for x86 removed because it is no longer ideal. For more background, please see: http://reviews.llvm.org/D4770 And: http://llvm.org/bugs/show_bug.cgi?id=20354 Differential Revision: http://reviews.llvm.org/D4785 llvm-svn: 214892	2014-08-05 17:35:22 +00:00
David Blaikie	b706b58e78	Partially revert r214761 that asserted that all concrete debug info variables had DIEs, due to a failure on Darwin. I'll work on a reduction and fix after this. llvm-svn: 214880	2014-08-05 16:47:23 +00:00
Eric Christopher	fc6de428c8	Have MachineFunction cache a pointer to the subtarget to make lookups shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838	2014-08-05 02:39:49 +00:00
Pedro Artigas	ec7cbd7d14	Changed the liveness tracking in the RegisterScavenger to use register units instead of registers. reviewed by Jakob Stoklund Olesen. llvm-svn: 214798	2014-08-04 23:07:49 +00:00
Chandler Carruth	40dbd382ad	[SDAG] Fix a really, really terrible bug in the DAG combiner. This code is completely wrong. It is also dead, as if it were to ever run, it would crash. Fortunately, after my work to the combiner, it is at least possible to reach the code, and llvm-stress has found a test case. Thanks to Patrick for reporting. It would be really good if anyone who remembers how this code works and what it was intended to do could add some more obvious test coverage instead of my completely contrived and reduced test case. My test case was so brittle I left a bread crumb comment in it to help the next person to stumble on it and not know what it was actually testing for. llvm-svn: 214785	2014-08-04 21:29:59 +00:00
Eric Christopher	d913448b38	Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change. llvm-svn: 214781	2014-08-04 21:25:23 +00:00
David Blaikie	448c066eea	Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." Originally reverted in r213432 with flakey failures on an ASan self-host build. After reduction it seems to be the same issue fixed in r213805 (ArgPromo + DebugInfo: Handle updating debug info over multiple applications of argument promotion) and r213952 (by having LiveDebugVariables strip dbg_value intrinsics in functions that are not described by debug info). Though I cannot explain why this failure was flakey... llvm-svn: 214761	2014-08-04 19:30:08 +00:00
Chandler Carruth	cde4eb56fe	[x86] Don't add nodes to the combined set (and prune subsequent combines) until they are legal. Doing it the old way could, when the stars align just right, cause a node to get into the combine set prior to being legalized. Then, when the same node showed up as an operand to another node later on (but not so much later on that it had been deleted as dead) we would fail to add it back to the worklist thinking it had already been combined. This would in turn cause it to not be legalized. Fortunately, we can also walk the operands looking for uncombined (and thus potentially un-legalized) nodes late. It will still ensure that we walk all operands of all nodes and send all of them through both the legalizer without changes and the combiner at least once. (Which was the original goal of this). I have a test case for this bug, but it is terribly brittle. For example, it will stop finding the bug the moment I enable the new shuffle lowering. I don't yet have any test case that reliably exercises this bug, and it isn't clear that it will be possible to craft one. It is entirely possible that with the new shuffle lowering the two forms of doing this are precisely equivalent. That doesn't mean we shouldn't take the more conservative approach of insisting on things in the combined set having survived the legalizer. llvm-svn: 214673	2014-08-03 23:10:59 +00:00
Saleem Abdulrasool	befa21532c	CodeGen: silence a warning GCC 4.8.2 objects to the tautological condition in the assert as the unsigned value is guaranteed to be >= 0. Simplify the assertion by dropping the tautological condition. llvm-svn: 214671	2014-08-03 23:00:38 +00:00
Sanjay Patel	2ef67440fc	fix for PR20354 - Miscompile of fabs due to vectorization This is intended to be the minimal change needed to fix PR20354 ( http://llvm.org/bugs/show_bug.cgi?id=20354 ). The check for a vector operation was wrong; we need to check that the fabs itself is not a vector operation. This patch will not generate the optimal code. A constant pool load and 'and' op will be generated instead of just returning a value that we can calculate in advance (as we do for the scalar case). I've put a 'TODO' comment for that here and expect to have that patch ready soon. There is a very similar optimization that we can do in visitFNEG, so I've put another 'TODO' there and expect to have another patch for that too. llvm-svn: 214670	2014-08-03 22:48:23 +00:00
Gerolf Hoflehner	5e1207e54c	MachineCombiner Pass for selecting faster instruction sequence - target independent framework When the DAGcombiner selects instruction sequences it could increase the critical path or resource len. For example, on arm64 there are multiply-accumulate instructions (madd, msub). If e.g. the equivalent multiply-add sequence is not on the critical path it makes sense to select it instead of the combined, single accumulate instruction (madd/msub). The reason is that the conversion from add+mul to the madd could lengthen the critical path by the latency of the multiply. But the DAGCombiner would always combine and select the madd/msub instruction. This patch uses machine trace metrics to estimate critical path length and resource length of an original instruction sequence vs a combined instruction sequence and picks the faster code based on its estimates. This patch only commits the target independent framework that evaluates and selects code sequences. The machine instruction combiner is turned off for all targets and expected to evolve over time by gradually handling DAGCombiner pattern in the target specific code. This framework lays the groundwork for fixing rdar://16319955 llvm-svn: 214666	2014-08-03 21:35:39 +00:00
James Molloy	ce45be0465	[AArch64] Teach DAGCombiner that converting two consecutive loads into a vector load is not a good transform when paired loads are available. The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers! llvm-svn: 214634	2014-08-02 14:51:24 +00:00
Chandler Carruth	18066974d4	[SDAG] Refactor the code which deletes nodes in the DAG combiner to do so using a single helper which adds operands back onto the worklist. Several places didn't rigorously do this but a couple already did. Factoring them together and doing it rigorously is important to delete things recursively early on in the combiner and get a chance to see accurate hasOneUse values. While no existing test cases change, an upcoming patch to add DAG combining logic for PSHUFB requires this to work correctly. llvm-svn: 214623	2014-08-02 10:02:07 +00:00
Owen Anderson	9d5a8c2813	Fix issues with ISD::FNEG and ISD::FMA SDNodes where they would not be constant-folded during DAGCombine in certain circumstances. Unfortunately, the circumstances required to trigger the issue seem to require a pretty specific interaction of DAGCombines, and I haven't been able to find a testcase that reproduces on X86, ARM, or AArch64. The functionality added here is replicated in essentially every other DAG combine, so it seems pretty obviously correct. llvm-svn: 214622	2014-08-02 08:45:33 +00:00
Justin Bogner	0950d79f60	CodeGen: Remove commented out code These two lines have been commented out for over 4 years. They aren't helping anyone. llvm-svn: 214615	2014-08-02 06:47:07 +00:00
Adrian Prantl	a6cf448226	Attempt to increase the overall happiness of the MSVC-based buildbots. llvm-svn: 214588	2014-08-01 22:56:10 +00:00
Adrian Prantl	b1416837f9	Debug info: Infrastructure to support debug locations for fragmented variables (for example, by-value struct arguments passed in registers, or large integer values split across several smaller registers). On the IR level, this adds a new type of complex address operation OpPiece to DIVariable that describes size and offset of a variable fragment. On the DWARF emitter level, all pieces describing the same variable are collected, sorted and emitted as DWARF expressions using the DW_OP_piece and DW_OP_bit_piece operators. http://reviews.llvm.org/D3373 rdar://problem/15928306 What this patch doesn't do / Future work: - This patch only adds the backend machinery to make this work, patches that change SROA and SelectionDAG's type legalizer to actually create such debug info will follow. (http://reviews.llvm.org/D2680) - Making the DIVariable complex expressions into an argument of dbg.value will reduce the memory footprint of the debug metadata. - The sorting/uniquing of pieces should be moved into DebugLocEntry, to facilitate the merging of multi-piece entries. llvm-svn: 214576	2014-08-01 22:11:58 +00:00
Chandler Carruth	356665a36c	[SDAG] MorphNodeTo recursively deletes dead operands of the old formulation of the node, which isn't really the desired behavior from within the combiner or legalizer, but is necessary within ISel. I've added a hopefully helpful comment and fixed the only two places where this took place. Yet another step toward the combiner and legalizer not needing to use update listeners with virtual calls to manage the worklists behind legalization and combining. llvm-svn: 214574	2014-08-01 22:09:43 +00:00
Chandler Carruth	1f52b3da0a	[SDAG] Begin simplifying the way in which the legalizer deletes nodes. This lifts the (very few) places the legalizer would delete dead nodes into the outer loop around the legalizer. This is significantly simpler because it doesn't require the legalizer itself to manage the iterator validity, and it doesn't require the legalizer to be a DAG update listener in order to remove things from the legalized set. It also makes the interface much less contrived for the case of the legalizer running inside the last phase of DAG combining. I'm working on centralizing the deletion of nodes during both legalizing and combining as much as possible. My hope is to remove the need for DAG update listeners from the combiner next, which would remove a costly virtual dispatch chain on every deletion. This in turn should allow us to more aggressively delete DAG nodes during combining which will in turn allow us to combine more aggressively by exposing the actual nodes which have single users to the combine phases. llvm-svn: 214546	2014-08-01 19:49:59 +00:00
Philip Reames	87c2b605f5	Explicitly report runtime stack realignment in StackMap section This change adds code to explicitly mark a function which requires runtime stack realignment as not having a fixed frame size in the StackMap section. As it happens, this is not actually a functional change. The size that would be reported without the check is also "-1", but as far as I can tell, that's an accident. The code change makes this explicit. Note: There's a separate bug in handling of stackmaps and patchpoints in functions which need dynamic frame realignment. The current code assumes that offsets can be calculated from RBP, but realigned frames must use RSP. (There's a variable gap between RBP and the spill slots.) This change set does not address that issue. Reviewers: atrick, ributzka Differential Revision: http://reviews.llvm.org/D4572 llvm-svn: 214534	2014-08-01 18:26:27 +00:00
Hal Finkel	b6d0d6b263	[PowerPC] Generate unaligned vector loads using intrinsics instead of regular loads Altivec vector loads on PowerPC have an interesting property: They always load from an aligned address (by rounding down the address actually provided if necessary). In order to generate an actual unaligned load, you can generate two load instructions, one with the original address, one offset by one vector length, and use a special permutation to extract the bytes desired. When this was originally implemented, I generated these two loads using regular ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with this: The alignment of a load does not contribute to its identity, and SDNodes are uniqued. So, imagine that we have some unaligned load, L1, that is not aligned. The routine will create two loads, L1(aligned) and (L1+16)(aligned). Further imagine that there had already existed a load (L1+16)(unaligned) with the same chain operand as the load L1. When (L1+16)(aligned) is created as part of the lowering of L1, this load is also the (L1+16)(unaligned) node, just now marked as aligned (because the new alignment overwrites the old). But the original users of (L1+16)(unaligned) now get the data intended for the permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists to get its own permutation-based expansion. This was PR19991. A second potential problem has to do with the MMOs on these loads, which can be used by AA during instruction scheduling to break chain-based dependencies. If the new "aligned" loads get the MMO from the original unaligned load, this does not represent the fact that it will load data from below the original address. Normally, this would not matter, but this load might be combined with another load pair for a previous vector, and then the dependency on the otherwise- ignored lower bytes can matter. 
To fix both problems, instead of generating the necessary loads using regular ISD::LOAD instructions, ppc_altivec_lvx intrinsics are used instead. These are provided with MMOs with a conservative address range. Unfortunately, I no longer have a failing test case (since PR19991 was reported, other changes in CodeGen have forced this bug back into hiding again). Nevertheless, this should fix the underlying problem. llvm-svn: 214481	2014-08-01 05:20:41 +00:00
Louis Gerbarg	09b8cdee12	White space fix. llvm-svn: 214455	2014-07-31 22:57:46 +00:00
Louis Gerbarg	67474e3755	Make sure no loads resulting from load->switch DAGCombine are marked invariant Currently when DAGCombine converts loads feeding a switch into a switch of addresses feeding a load the new load inherits the isInvariant flag of the left side. This is incorrect since invariant loads can be reordered in cases where it is illegal to reorder normal loads. This patch adds an isInvariant parameter to getExtLoad() and updates all call sites to pass in the data if they have it or false if they don't. It also changes the DAGCombine to use that data to make the right decision when creating the new load. llvm-svn: 214449	2014-07-31 21:45:05 +00:00
Will Schmidt	44ff8f06ec	Disable IsSub subregister assert. pr18663. This is a follow-up to the activity in the bug at http://llvm.org/bugs/show_bug.cgi?id=18663 . The underlying issue has to do with how the KILL pseudo-instruction is handled. I defer to Hal/Jakob/Uli for additional details and background. This will disable the (bad?) assert, add an associated fixme comment, and add a pair of tests. The code change and the pr18663-2.ll test are copied from the referenced bug. That test does not immediately fail in my environment, but I have added the pr18663.ll test which does. (Comment from Hal) to provide everyone else with some context, this assert was not bad when it was written. At that time, we only generated KILL pseudo instructions around subregister copies. This logic, unfortunately, had its own problems. In r199797, the relevant logic in MachineCopyPropagation was replaced to generate KILLs for other kinds of copies too. This change in semantics broke this now-problematic assumption in AggressiveAntiDepBreaker. The AggressiveAntiDepBreaker really needs a proper cleanup to deal with the change, but removing the assert (which just allows the function to return false) is a safe conservative behavior, and should do for the time being. llvm-svn: 214429	2014-07-31 19:50:53 +00:00
Juergen Ributzka	e8514fc1f7	[FastISel] Fix the patchpoint intrinsic lowering in FastISel for large target addresses. This fixes a mistake where I accidentally dropped the upper 32 bits of a 64-bit pointer during FastISel lowering of the patchpoint intrinsic. llvm-svn: 214367	2014-07-31 00:11:16 +00:00
Rafael Espindola	f21434ccb0	Refactor duplicated code. llvm-svn: 214328	2014-07-30 19:42:16 +00:00
Louis Gerbarg	4fc09b36de	Retain alignment requirements for load->selects modified by DAGCombine DAGCombine may choose to rewrite graphs where two loads feed a select into graphs where a select of two addresses feed a load. While it sanity checks the loads to make sure they are broadly equivalent it currently just uses the alignment restriction of the left node. In cases where the right node has stronger alignment requirements this may lead to bad codegen, such as generating an aligned load where an unaligned load is required. This patch makes the combine generate a load with an alignment that is the same as whichever is more restrictive of the two alignments. Tests included. rdar://17762530 llvm-svn: 214322	2014-07-30 18:24:41 +00:00
Rafael Espindola	3cf4af11d5	Add the missing hasLinkOnceODRLinkage predicate. llvm-svn: 214312	2014-07-30 15:57:51 +00:00
Chandler Carruth	681069d675	Don't manually (and forcibly) run the verifier on the entire module from the jump instruction table pass. First, the verifier is already built into all the tools. The test case is adapted to just run llvm-as demonstrating that we still catch the broken module. Second, the verifier is extremely slow. This was responsible for very significant compile time regressions. If you have deployed a Clang binary anywhere from r210280 to this commit, you really want to re-deploy. llvm-svn: 214287	2014-07-30 05:44:04 +00:00
Petar Jovanovic	b7c305f091	Add support for scalarizing ctlz_zero_undef Fix the missing case in ScalarizeVectorResult() that was exposed with libclcore.bc in Android. Differential Revision: http://reviews.llvm.org/D4645 llvm-svn: 214266	2014-07-30 00:44:03 +00:00
Richard Smith	5e23fb8691	Header hygiene: remove using directive and #undef DEBUG_TYPE once we're done. llvm-svn: 214263	2014-07-30 00:25:24 +00:00
Manman Ren	72b07e8578	Feedback on r214189, no functionality change. llvm-svn: 214240	2014-07-29 22:58:13 +00:00
Manman Ren	f93ac4bfad	[Debug Info] remove DITrivialType and use null to represent unspecified param. Per feedback on r214111, we are going to use null to represent unspecified parameter. If the type array is {null}, it means a function that returns void; If the type array is {null, null}, it means a variadic function that returns void. In summary if we have more than one element in the type array and the last element is null, it is a variadic function. rdar://17628609 llvm-svn: 214189	2014-07-29 18:20:39 +00:00
Tim Northover	e2239ff3eb	CodeGenPrep: fall back to MVT::Other if instruction's type isn't an EVT. The test being performed is just an approximation anyway, so it really shouldn't crash when things don't go entirely as expected. Should fix PR20474. llvm-svn: 214177	2014-07-29 10:20:22 +00:00
Tim Northover	f67bb2079d	ARM: fix @llvm.convert.from.fp16 on softfloat targets. We need to make sure we use the softened version of all appropriate operands in the libcall, or things go horribly wrong. This may entail actually executing a 1-stage softening. llvm-svn: 214175	2014-07-29 09:56:38 +00:00
Jiangning Liu	c3053129b9	Add TargetInstrInfo interface isAsCheapAsAMove. llvm-svn: 214158	2014-07-29 01:55:19 +00:00
Manman Ren	bd1628a595	[Debug Info] unique MDNodes in the enum types of each compile unit. The enum types array by design contains pointers to MDNodes rather than DIRefs. Unique them when handling the enum types in DwarfDebug. rdar://17628609 llvm-svn: 214139	2014-07-28 23:04:20 +00:00
Manman Ren	f8a1967c8c	[Debug Info] add DISubroutineType and its creation takes DITypeArray. DITypeArray is an array of DITypeRef, at its creation, we will create DITypeRef (i.e. use the identifier if the type node has an identifier). This is the last patch to unique the type array of a subroutine type. rdar://17628609 llvm-svn: 214132	2014-07-28 22:24:06 +00:00
Manman Ren	ab8ffbaaee	[Debug Info] rename getTypeArray to getElements, setTypeArray to setArrays. This is the second of a series of patches to handle type uniqueing of the type array for a subroutine type. For vector and array types, getElements returns the array of subranges, so it is a better name than getTypeArray. Even for class, struct and enum types, getElements returns the members, which can be subprograms. setArrays can set up to two arrays, the second is the templates. This commit should have no functionality change. llvm-svn: 214112	2014-07-28 19:14:13 +00:00
Chandler Carruth	b143274ad0	[SDAG] Add DEBUG logging to the legalizer, fixing a "bug" found by inspection in the process, and shuffle the logging in the DAG combiner around a bit. With this it is much easier to follow what the legalizer is doing. It should even accurately present most of the strange legalization operations where a single node is replaced by multiple nodes, etc. There is still some information lost (we log SDNodes not SDValues so we don't log which result is used for which thing), but I think this is much closer to a usable system. Notably, this will make it much more apparent when legalization is actually happening inside the combiner, or when there is a cycle caused by interactions of the legalizer and the combiner. The "bug" I fixed here I'm not sure is remotely possible to trigger. We were only adding one of the nodes in a replacement to the updated set rather than all of the nodes in the replacement. Realistically, the worst result of this are nodes not getting back onto the worklist in the DAG combiner. I doubt it is possible to trigger this today, and I certainly don't have any ideas about how, but this at least brings the code into alignment with the principled operation of the routine. llvm-svn: 214105	2014-07-28 17:55:07 +00:00
Matt Arsenault	6f2a526101	Add alignment value to allowsUnalignedMemoryAccess Rename to allowsMisalignedMemoryAccess. On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment, and don't need to be split into multiple accesses. Vector loads with an alignment of the element type are not uncommon in OpenCL code. llvm-svn: 214055	2014-07-27 17:46:40 +00:00
Chandler Carruth	5a85c7beb8	[SDAG] Add an assert that we don't mess up the number of values when replacing nodes in the legalizer. This caught a number of bugs for me during development. llvm-svn: 214022	2014-07-26 05:53:16 +00:00
Chandler Carruth	98655fa4d8	[SDAG] Simplify the code for handling single-value nodes and add a missing transfer of debug information (without which tests fail). llvm-svn: 214021	2014-07-26 05:52:51 +00:00
Chandler Carruth	411fb407f8	[SDAG] When performing post-legalize DAG combining, run the legalizer over each node in the worklist prior to combining. This allows the combiner to produce new nodes which need to go back through legalization. This is particularly useful when generating operands to target specific nodes in a post-legalize DAG combine where the operands are significantly easier to express as pre-legalized operations. My immediate use case will be PSHUFB formation where we need to build a constant shuffle mask with a build_vector node. This also refactors the relevant functionality in the legalizer to support this, and updates relevant tests. I've spoken to the R600 folks and these changes look like improvements to them. The avx512 change needs to be investigated, I suspect there is a disagreement between the legalizer and the DAG combiner there, but it seems a minor issue so leaving it to be re-evaluated after this patch. Differential Revision: http://reviews.llvm.org/D4564 llvm-svn: 214020	2014-07-26 05:49:40 +00:00
Hal Finkel	930469107d	Add @llvm.assume, lowering, and some basic properties This is the first commit in a series that add an @llvm.assume intrinsic which can be used to provide the optimizer with a condition it may assume to be true (when the control flow would hit the intrinsic call). Some basic properties are added here: - llvm.assume(true) is dead. - llvm.assume(false) is unreachable (this directly corresponds to the documented behavior of MSVC's __assume(0)), so is llvm.assume(undef). The intrinsic is tagged as writing arbitrarily, in order to maintain control dependencies. BasicAA has been updated, however, to return NoModRef for any particular location-based query so that we don't unnecessarily block code motion. llvm-svn: 213973	2014-07-25 21:13:35 +00:00
Akira Hatanaka	e5b6e0d231	[stack protector] Fix a potential security bug in stack protector where the address of the stack guard was being spilled to the stack. Previously the address of the stack guard would get spilled to the stack if it was impossible to keep it in a register. This patch introduces a new target independent node and pseudo instruction which gets expanded post-RA to a sequence of instructions that load the stack guard value. Register allocator can now just remat the value when it can't keep it in a register. <rdar://problem/12475629> llvm-svn: 213967	2014-07-25 19:31:34 +00:00
David Blaikie	29459ae83c	Reapply "DebugInfo: Don't put fission type units in comdat sections." This recommits r208930, r208933, and r208975 (by reverting r209338) and reverts r209529 (the FIXME to readd this functionality once the tools were fixed) now that DWP has been fixed to cope with a single section for all fission type units. Original commit message: "Since type units in the dwo file are handled by a debug aware tool, they don't need to leverage the ELF comdat grouping to implement deduplication. Avoid creating all the .group sections for these as a space optimization." llvm-svn: 213956	2014-07-25 17:11:58 +00:00
David Blaikie	2f04011435	Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information. Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped provide a reduced reproduction, though the failure wasn't too hard to guess, and even easier with the example to confirm. The assertion that the subprogram metadata associated with an llvm::Function matches the scope data referenced by the DbgLocs on the instructions in that function is not valid under LTO. In LTO, a C++ inline function might exist in multiple CUs and the subprogram metadata nodes will refer to the same llvm::Function. In this case, depending on the order of the CUs, the first instance of the subprogram metadata may not be the one referenced by the instructions in that function and the assertion will fail. A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the assertion removed and a comment added to explain this situation. This was then reverted again in r213581 as it caused PR20367. The root cause of this was the early exit in LiveDebugVariables meant that spurious DBG_VALUE intrinsics that referenced dead variables were not removed, causing an assertion/crash later on. The fix is to have LiveDebugVariables strip all DBG_VALUE intrinsics in functions without debug info as they're not needed anyway. Test case added to cover this situation (that occurs when a debug-having function is inlined into a nodebug function) in test/DebugInfo/X86/nodebug_with_debug_loc.ll Original commit message: If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. 
While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 213952	2014-07-25 16:10:16 +00:00
Chandler Carruth	eae2d28cc9	[SDAG] Don't insert the VRBase into a mapping from SDValues when the def doesn't actually correspond to an SDValue at all. Fixes most of the remaining asserts on out-of-range SDValue result numbers. llvm-svn: 213930	2014-07-25 09:19:18 +00:00
Matt Arsenault	197a1e26e3	Store nodes only have 1 result. llvm-svn: 213928	2014-07-25 07:56:42 +00:00
Chandler Carruth	94bd553eb8	[SDAG] Start plumbing an assert into SDValues that we don't form one with a result number outside the range of results for the node. I don't know how we managed to not really check this very basic invariant for so long, but the code is very broken at this point. I have over 270 test failures with the assert enabled. I'm committing it disabled so that others can join in the cleanup effort and reproduce the issues. I've also included one of the obvious fixes that I already found. More fixes to come. llvm-svn: 213926	2014-07-25 07:23:23 +00:00
Chandler Carruth	9f4530b95d	[SDAG] Introduce a combined set to the DAG combiner which tracks nodes which have successfully round-tripped through the combine phase, and use this to ensure all operands to DAG nodes are visited by the combiner, even if they are only added during the combine phase. This is critical to have the combiner reach nodes that are introduced during combining. Previously these would sometimes be visited and sometimes not be visited based on whether they happened to end up on the worklist or not. Now we always run them through the combiner. This fixes quite a few bad codegen test cases lurking in the suite while also being more principled. Among these, the TLS codegeneration is particularly exciting for programs that have this in the critical path like TSan-instrumented binaries (although I think they engineer to use a different TLS that is faster anyways). I've tried to check for compile-time regressions here by running llc over a merged (but not LTO-ed) clang bitcode file and observed at most a 3% slowdown in llc. Given that this is essentially a worst case (none of opt or clang are running at this phase) I think this is tolerable. The actual LTO case should be even less costly, and the cost in normal compilation should be negligible. With this combining logic, it is possible to re-legalize as we combine which is necessary to implement PSHUFB formation on x86 as a post-legalize DAG combine (my ultimate goal). Differential Revision: http://reviews.llvm.org/D4638 llvm-svn: 213898	2014-07-24 22:15:28 +00:00
Chandler Carruth	80b869461e	[x86] Make vector legalization of extloads work more like the "normal" vector operation legalization with support for custom target lowering and fallback to expand when it fails, and use this to implement sext and anyext load lowering for x86 in a more principled way. Previously, the x86 backend relied on a target DAG combine to "combine away" sextload and extload nodes prior to legalization, or would expand them during legalization with terrible code. This is particularly problematic because the DAG combine relies on running over non-canonical DAG nodes at just the right time to match several common and important patterns. It used a combine rather than lowering because we didn't have good lowering support, and to expose some tricks being employed to more combine phases. With this change it becomes a proper lowering operation, the backend marks that it can lower these nodes, and I've added support for handling the canonical forms that don't have direct legal representations such as sextload of a v4i8 -> v4i64 on AVX1. With this change, our test cases for this behavior continue to pass even after the DAG combiner begins running more systematically over every node. There is some noise caused by this in the test suite where we actually use vector extends instead of subregister extraction. This doesn't really seem like the right thing to do, but is unlikely to be a critical regression. We do regress in one case where by lowering to the target-specific patterns early we were able to combine away extraneous legal math nodes. However, this regression is completely addressed by switching to a widening based legalization which is what I'm working toward anyways, so I've just switched the test to that mode. Differential Revision: http://reviews.llvm.org/D4654 llvm-svn: 213897	2014-07-24 22:09:56 +00:00
Lang Hames	f49bc3f1b1	[X86] Optimize stackmap shadows on X86. This patch minimizes the number of nops that must be emitted on X86 to satisfy stackmap shadow constraints. To minimize the number of nops inserted, the X86AsmPrinter now records the size of the most recent stackmap's shadow in the StackMapShadowTracker class, and tracks the number of instruction bytes emitted since that stackmap instruction was encountered. Padding is emitted (if it is required at all) immediately before the next stackmap/patchpoint instruction, or at the end of the basic block. This optimization should reduce code-size and improve performance for people using the llvm stackmap intrinsic on X86. <rdar://problem/14959522> llvm-svn: 213892	2014-07-24 20:40:55 +00:00
Hal Finkel	9414665a3b	Add scoped-noalias metadata This commit adds scoped noalias metadata. The primary motivations for this feature are: 1. To preserve noalias function attribute information when inlining 2. To provide the ability to model block-scope C99 restrict pointers Neither of these two abilities are added here, only the necessary infrastructure. In fact, there should be no change to existing functionality, only the addition of new features. The logic that converts noalias function parameters into this metadata during inlining will come in a follow-up commit. What is added here is the ability to generally specify noalias memory-access sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA nodes: !scope0 = metadata !{ metadata !"scope of foo()" } !scope1 = metadata !{ metadata !"scope 1", metadata !scope0 } !scope2 = metadata !{ metadata !"scope 2", metadata !scope0 } !scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 } !scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 } Loads and stores can be tagged with an alias-analysis scope, and also, with a noalias tag for a specific scope: ... = load %ptr1, !alias.scope !{ !scope1 } ... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 } When evaluating an aliasing query, if one of the instructions is associated with an alias.scope id that is identical to the noalias scope associated with the other instruction, or is a descendant (in the scope hierarchy) of the noalias scope associated with the other instruction, then the two memory accesses are assumed not to alias. Note that if the first element of the scope metadata is a string, then it can be combined across functions and translation units. The string can be replaced by a self-reference to create globally unique scope identifiers. 
[Note: This overview is slightly stylized, since the metadata nodes really need to just be numbers (!0 instead of !scope0), and the scope lists are also global unnamed metadata.] Existing noalias metadata in a callee is "cloned" for use by the inlined code. This is necessary because the aliasing scopes are unique to each call site (because of possible control dependencies on the aliasing properties). For example, consider a function: foo(noalias a, noalias b) { a = b; } that gets inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } -- now just because we know that a1 does not alias with b1 at the first call site, and a2 does not alias with b2 at the second call site, we cannot let inlining these functions have the metadata imply that a1 does not alias with b2. llvm-svn: 213864	2014-07-24 14:25:39 +00:00
Hal Finkel	cc39b67530	AA metadata refactoring (introduce AAMDNodes) In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode). This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode in AliasAnalysis::Location, etc. No functionality change intended. llvm-svn: 213859	2014-07-24 12:16:19 +00:00
Eric Christopher	f19d12ba3c	Fix indenting. llvm-svn: 213811	2014-07-23 22:34:13 +00:00
Eric Christopher	6d0e40bfbf	Reorganize and simplify local variables. llvm-svn: 213809	2014-07-23 22:27:10 +00:00
Eric Christopher	9d9167950e	Remove the query for TargetMachine and TargetInstrInfo since we're already inside TargetInstrInfo. llvm-svn: 213806	2014-07-23 22:12:03 +00:00
Jim Grosbach	19dd3088c0	DAG: fp->int conversion for non-splat constants. Constant fold the lanes of the input constant build_vector individually so we correctly handle when the vector elements are not all the same constant value. PR20394 llvm-svn: 213798	2014-07-23 20:41:31 +00:00
Chad Rosier	17020f96c7	[AArch64] Lower sdiv x, pow2 using add + select + shift. The target-independent DAGcombiner will generate: asr w1, X, #31 w1 = splat sign bit. add X, X, w1, lsr #28 X = X + 0 or pow2-1 asr w0, X, asr #4 w0 = X/pow2 However, the add + shifts is expensive, so generate: add w0, X, 15 w0 = X + pow2-1 cmp X, wzr X - 0 csel X, w0, X, lt X = (X < 0) ? X + pow2-1 : X; asr w0, X, asr 4 w0 = X/pow2 llvm-svn: 213758	2014-07-23 14:57:52 +00:00
James Molloy	bc9fed82cc	Enable partial libcall inlining for all targets by default. This pass attempts to speculatively use a sqrt instruction if one exists on the target, falling back to a libcall if the target instruction returned NaN. This was enabled for MIPS and System-Z, but is well guarded and is good for most targets - GCC does this for (that I've checked) X86, ARM and AArch64. llvm-svn: 213752	2014-07-23 13:33:00 +00:00
Chandler Carruth	9a0051cd59	[SDAG] Make the DAGCombine worklist not grow endlessly due to duplicate insertions. The old behavior could cause arbitrarily bad memory usage in the DAG combiner if there was heavy traffic of adding nodes already on the worklist to it. This commit switches the DAG combine worklist to work the same way as the instcombine worklist where we null-out removed entries and only add new entries to the worklist. My measurements of codegen time shows slight improvement. The memory utilization is unsurprisingly dominated by other factors (the IR and DAG itself I suspect). This change results in subtle, frustrating churn in the particular order in which DAG combines are applied which causes a number of minor regressions where we fail to match a pattern previously matched by accident. AFAICT, all of these should be using AddToWorklist directly or should be written in a less brittle way. None of the changes seem drastically bad, and a few of the changes seem distinctly better. A major change required to make this work is to significantly harden the way in which the DAG combiner handles nodes which become dead (zero-uses). Previously, we relied on the ability to "priority-bump" them on the combine worklist to achieve recursive deletion of these nodes and ensure that the frontier of remaining live nodes all were added to the worklist. Instead, I've introduced a routine to just implement that precise logic with no indirection. It is a significantly simpler operation than that of the combiner worklist proper. I suspect this will also fix some other problems with the combiner. I think the x86 changes are really minor and uninteresting, but the avx512 change at least is hiding a "regression" (despite the test case being just noise, not testing some performance invariant) that might be looked into. Not sure if any of the others impact specific "important" code paths, but they didn't look terribly interesting to me, or the changes were really minor. 
The consensus in review is to fix any regressions that show up after the fact here. Thanks to the other reviewers for checking the output on other architectures. There is a specific regression on ARM that Tim already has a fix prepped to commit. Differential Revision: http://reviews.llvm.org/D4616 llvm-svn: 213727	2014-07-23 07:08:53 +00:00
Chandler Carruth	41b20e7783	[SDAG] Refactor the code for inserting a newly allocated SDNode into the DAG into a helper function. This adds a trip through the (very minimal) verification logic in a bunch of places that were missing it, but shouldn't have any other impact outside of refactoring. I'm hoping to use this to do more clever things when DAG nodes are inserted into the graph. llvm-svn: 213612	2014-07-22 04:07:55 +00:00
Chandler Carruth	2fc9a2b8eb	[SDAG] Remove a giant pile of asserts that may have helped track down a bug in 2010 when they were added but are adding no value today. In fact, they are utter lies. NodeAllocator is used to allocate almost all of these node types. I don't know what we were trying to assert here, and the docs don't give any answer. Until we once again stumble upon a bug needing help, let's clear the path for improvements. llvm-svn: 213610	2014-07-22 04:03:22 +00:00
David Blaikie	26f2268cc5	Revert "Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information." This reverts commit r212649 while I investigate/reduce/etc PR20367. llvm-svn: 213581	2014-07-21 20:45:59 +00:00
Logan Chien	63bee2a2bb	Replace the result usages while legalizing cmpxchg. We should update the usages to all of the results; otherwise, we might get assertion failure or SEGV during the type legalization of ATOMIC_CMP_SWAP_WITH_SUCCESS with two or more illegal types. For example, in the following sequence, both i8 and i1 might be illegal in some target, e.g. armv5, mipsel, mips64el, %0 = cmpxchg i8* %ptr, i8 %desire, i8 %new monotonic monotonic %1 = extractvalue { i8, i1 } %0, 1 Since both i8 and i1 should be legalized, the corresponding ATOMIC_CMP_SWAP_WITH_SUCCESS dag will be checked/replaced/updated twice. If we don't update the usage to ALL of the results in the first round, the DAG for extractvalue might be processed earlier. The GetPromotedInteger() will result in assertion failure, because its operand (i.e. the success bit of cmpxchg) is not promoted beforehand. llvm-svn: 213569	2014-07-21 17:33:44 +00:00
Duncan P. N. Exon Smith	6c99015fe2	Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges." This reverts commit r213474 (and r213475), which causes a miscompile on a stage2 LTO build. I'll reply on the list in a moment. llvm-svn: 213562	2014-07-21 17:06:51 +00:00
Tim Northover	f7a02c1762	CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc, and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation path is now uniform, regardless of the input IR: fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall fpext -> FP16_TO_FP (if f16 illegal) -> libcall Each target should be able to select the version that best matches its operations and not be required to duplicate patterns for both fptrunc and FP_TO_FP16 (for example). As a result we can remove some redundant AArch64 patterns. llvm-svn: 213507	2014-07-21 09:13:56 +00:00
Chandler Carruth	3c0012beb6	[SDAG,cleanup] Switch the DAG combiner over to use the spelling 'Worklist' consistently rather than a deeply confusing mixture of 'WorkList' and 'Worklist'. Notably, the very 'WorkList' of the DAG combiner was exposed to target specific DAG combines under an interface 'AddToWorklist' which was implemented by in turn calling 'AddToWorkList' in the combiner. This has sent me circling with the wrong case in grep one too many times. I chose to normalize on 'Worklist' because that one won the grep-vote for llvm/lib/... by a hundred hits or so, and it is used in places relatively "canonical" such as InstCombine's Worklist. Let's all just pick this casing, whether "correct", "good", or "bad" and be consistent... llvm-svn: 213506	2014-07-21 08:56:44 +00:00
Chandler Carruth	24ceb0ce66	[SDAG] Rather than using a narrow test against the one dummy node on the stack, filter all handle nodes from the DAG combiner worklist. This will also handle cases where other handle nodes might be (erroneously) added to the worklist and then cause bugs and explosions when deleted. For example, when running the legalizer within the DAG combiner, there are times when other handle nodes are used and can end up here. llvm-svn: 213505	2014-07-21 08:32:31 +00:00
Andrea Di Biagio	0fb2013192	[DAGCombiner] Improve the shuffle-vector folding logic. Canonicalize shuffles according to rules: * shuffle(A, shuffle(A, B)) -> shuffle(shuffle(A,B), A) * shuffle(B, shuffle(A, B)) -> shuffle(shuffle(A,B), B) * shuffle(B, shuffle(A, Undef)) -> shuffle(shuffle(A, Undef), B) This patch helps identifying more shuffle pairs that could be combined reusing the already existing rules in the DAGCombiner. Added new test 'combine-vec-shuffle-5.ll' to verify that the canonicalized shuffles are now folded into a single shuffle node by the DAGCombiner. Added more test cases to 'combine-vec-shuffle-4.ll'. llvm-svn: 213504	2014-07-21 07:30:54 +00:00
Andrea Di Biagio	4d8bd41600	[DAG] Refactor some logic. No functional change. This patch removes function 'CommuteVectorShuffle' from X86ISelLowering.cpp and moves its logic into SelectionDAG.cpp as method 'getCommutedVectorShuffles'. This refactoring is in preparation for an upcoming change to the DAGCombiner. llvm-svn: 213503	2014-07-21 07:28:51 +00:00
NAKAMURA Takumi	74a5332235	MachineRegionInfo.cpp: Another fix on MachineRegionInfo::MachineRegionInfo::recalculate() to appease msc17. llvm-svn: 213476	2014-07-20 11:14:55 +00:00
Manuel Jacob	d11beffef4	[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges. Summary: This patch introduces two new iterator ranges and updates existing code to use them. No functional change intended. Test Plan: All tests (make check-all) still pass. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4481 llvm-svn: 213474	2014-07-20 09:10:11 +00:00
NAKAMURA Takumi	118b0c789d	Fix -Asserts build introduced since r213456. llvm-svn: 213465	2014-07-20 00:00:42 +00:00
David Blaikie	ba80ee392a	Shore up ownership passing of the PBQPBuilder by passing unique_ptrs by value rather than lvalue reference. Also removes an unnecessary '.release()' that should've been a std::move anyway. (I'm on a hunt for '.release()' calls) llvm-svn: 213464	2014-07-19 21:19:45 +00:00
Matt Arsenault	1b8d83796d	Templatify RegionInfo so it works on MachineBasicBlocks llvm-svn: 213456	2014-07-19 18:29:29 +00:00
David Blaikie	b61064ed39	Remove uses of the redundant ".reset(nullptr)" of unique_ptr, in favor of ".reset()" It's also possible to just write "= nullptr", but there's some question of whether that's as readable, so I leave it up to authors to pick which they prefer for now. If we want to discuss standardizing on one or the other, we can do that at some point in the future. llvm-svn: 213438	2014-07-19 01:05:11 +00:00
Eric Christopher	cfd17dd2be	Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."""" After a successful build it seems to have come back on a later build. This reverts commit r213391. llvm-svn: 213432	2014-07-18 23:57:20 +00:00
David Blaikie	db5371b3bb	DebugInfo: Assert that all abstract scopes are subprograms, rather than conditionalizing. There's nothing else these should ever be... llvm-svn: 213417	2014-07-18 22:26:59 +00:00
David Blaikie	5450240219	Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.""" Recommits 212776 which was reverted in r212793. This has been committed and recommitted a few times as I try to test it harder and find/fix more issues. The most recent revert was due to an asan bot failure which I can't seem to reproduce locally, though I believe I'm following all the steps the buildbot does. So I'm going to recommit this in the hopes of investigating the failure on the buildbot itself... apologies in advance for the bot noise. If anyone sees failures with this /please/ provide me with any reproductions, etc. llvm-svn: 213391	2014-07-18 17:49:10 +00:00
Tim Northover	4e80b584fe	ARM: support legalisation of "fptrunc ... to half" operations. llvm-svn: 213373	2014-07-18 13:01:19 +00:00
Tim Northover	20bd0ced30	CodeGen: soften f16 type by default instead of marking legal. Actual support for softening f16 operations is still limited, and can be added when it's needed. But Soften is much closer to being a useful thing to try than keeping it Legal when no registers can actually hold such values. Longer term, we probably want something between Soften and Promote semantics for most targets, it'll be more efficient to promote the 4 basic operations to f32 than libcall them. llvm-svn: 213372	2014-07-18 12:41:46 +00:00
Jim Grosbach	f7502c4884	AArch64: Constant fold converting vector setcc results to float. Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can move unary operations, in this case [su]int_to_fp through the mask operation and constant fold the operation away. Generally speaking: UNARYOP(AND(VECTOR_CMP(x,y), constant)) --> AND(VECTOR_CMP(x,y), constant2) where constant2 is UNARYOP(constant). This implements the transform where UNARYOP is [su]int_to_fp. For example, consider the simple function: define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind { %cmp = fcmp oeq <4 x float> %val, %test %ext = zext <4 x i1> %cmp to <4 x i32> %result = sitofp <4 x i32> %ext to <4 x float> ret <4 x float> %result } Before this change, the code is generated as: fcmeq.4s v0, v0, v1 movi.4s v1, #0x1 // Integer splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. scvtf.4s v0, v0 // Convert each lane to f32. ret After, the code is improved to: fcmeq.4s v0, v0, v1 fmov.4s v1, #1.00000000 // f32 splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. ret The scvtf.4s has been constant folded away and the floating point 1.0f vector lanes are materialized directly via fmov.4s. Rather than do the folding manually in the target code, teach getNode() in the generic SelectionDAG to handle folding constant operands of vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do additional constant folding there as well, but I don't have test cases for those operations, so leaving them for another time when it becomes appropriate. rdar://17693791 llvm-svn: 213341	2014-07-18 00:40:52 +00:00
Michael J. Spencer	1eb023013e	Revert "[x86] Fold extract_vector_elt of a load into the Load's address computation." There's a bug where this can create cycles in the DAG. It will take a bit to fix, so I'm backing it out for now. llvm-svn: 213339	2014-07-18 00:15:50 +00:00
Tim Northover	84ce0a642e	CodeGen: generate single libcall for fptrunc -> f16 operations. Previously we asserted on this code. Currently compiler-rt doesn't actually implement any of these new libcalls, but external help is pretty much the only viable option for LLVM. I've followed the much more generic "__truncST2" naming, as opposed to the odd name for f32 -> f16 truncation. This can obviously be changed later, or overridden by any targets that need to. llvm-svn: 213252	2014-07-17 11:12:12 +00:00
Tim Northover	fd7e424935	CodeGen: extend f16 conversions to permit types > float. This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too. During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here. Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here. llvm-svn: 213248	2014-07-17 10:51:23 +00:00
Sanjay Patel	d3bbfa1cb6	Fixed formatting, removed bug reference, renamed testcase Thanks to Duncan Exon Smith for reviewing and cleanup suggestions. llvm-svn: 213205	2014-07-16 22:40:28 +00:00
Juergen Ributzka	618ce3e85e	[FastISel] Local values shouldn't be alive across an inline asm call with side effects. This fixes an issue where a local value is defined before and used after an inline asm call with side effects. This fix simply flushes the local value map, which updates the insertion point for the inline asm call to be above any previously defined local values. This fixes <rdar://problem/17694203> llvm-svn: 213203	2014-07-16 22:20:51 +00:00
Sanjay Patel	ab60d04363	trivial fix for PR20314 Make sure that the AddrInst is an Instruction. llvm-svn: 213197	2014-07-16 21:08:10 +00:00
Chris Bieneman	df4b763be5	[RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo. llvm-svn: 213188	2014-07-16 20:13:31 +00:00
Tim Northover	7f3e11e7c0	CodeGen: don't form illegal EXTLOAD operations. It turns out that in most cases (the main exception being i1-related types) once these operations are formed we cannot separate them and the targets end up having to deal with them whether they want to or not. This is not a good situation, and a more reasonable default can be formed by acknowledging this and having targets leave them as Legal. Only x86 seems to be affected (other targets don't even try marking the operation Expand). Mostly there's no visible change here yet, but it will be useful to have truly expanded EXTLOADS for MVT::f16 softening support. llvm-svn: 213162	2014-07-16 15:37:24 +00:00
Juergen Ributzka	480872b4ce	Remove TLI from isInTailCallPosition's arguments. NFC. There is no need to pass on TLI separately to the function. As Eric pointed out the Target Machine already provides everything we need. llvm-svn: 213108	2014-07-16 00:01:22 +00:00
Sanjay Patel	a2f658d69d	Move Post RA Scheduling flag bit into SchedMachineModel Refactoring; no functional changes intended Removed PostRAScheduler bits from subtargets (X86, ARM). Added PostRAScheduler bit to MCSchedModel class. This bit is set by a CPU's scheduling model (if it exists). Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses. Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!). Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling. Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values. Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. c. PPC overrides the CPU's postRA settings by enabling postRA for everything. d. X86 is the only target that actually has postRA specified via sched model info. Differential Revision: http://reviews.llvm.org/D4217 llvm-svn: 213101	2014-07-15 22:39:58 +00:00
Chris Bieneman	03695ab57e	[RegisterCoalescer] Add new subtarget hook allowing targets to opt-out of coalescing. The coalescer is very aggressive at propagating constraints on the register classes, and the register allocator doesn’t know how to split sub-registers later to recover. This patch provides an escape valve for targets that encounter this problem to limit coalescing. This patch also implements such for ARM to lower register pressure when using lots of large register classes. This works around PR18825. llvm-svn: 213078	2014-07-15 17:18:41 +00:00
Andrea Di Biagio	bd5555cc3f	[DAGCombiner] Add more rules to fold shuffles. This patch adds two new rules to the DAGCombiner: 1. shuffle (shuffle A, Undef, M0), B, M1 -> shuffle A, B, M2 2. shuffle (shuffle A, Undef, M0), A, M1 -> shuffle A, Undef, M2 We only do this if the combined shuffle is legal for the target. Example: ;; define <4 x float> @test(<4 x float> %a, <4 x float> %b) { %1 = shufflevector <4 x float> %a, <4 x float> undef, <4 x i32><i32 6, i32 0, i32 1, i32 7> %2 = shufflevector <4 x float> %1, <4 x float> %b, <4 x i32><i32 1, i32 2, i32 4, i32 5> ret <4 x i32> %2 } ;; (using llc -mcpu=corei7 -march=x86-64) Before, the x86 backend generated: pshufd $120, %xmm0, %xmm0 shufps $-108, %xmm0, %xmm1 movaps %xmm1, %xmm0 Now the x86 backend generates: movsd %xmm1, %xmm0 llvm-svn: 213069	2014-07-15 13:26:28 +00:00
Juergen Ributzka	718bb71ade	[FastISel] Insert patchpoint instruction before the target generated call instruction. The patchpoint instruction should have been inserted before the target generated call instruction to be inside the ADJSTACKDOWN/ADJSTACKUP call sequence window. llvm-svn: 213034	2014-07-15 02:22:46 +00:00
Juergen Ributzka	a415943590	[FastISel] Fix patchpoint lowering to set the result register. Always update the value map with the result register (if there is one), for the patchpoint instruction we created to replace the target-specific call instruction. llvm-svn: 213033	2014-07-15 02:22:43 +00:00
Andrea Di Biagio	2152a6c78b	[DAGCombiner] Avoid calling method 'isShuffleMaskLegal' on illegal vector types. This patch fixes a crasher in method 'DAGCombiner::visitOR' due to an invalid call to method 'isShuffleMaskLegal'. On x86, method 'isShuffleMaskLegal' always expects a legal vector value type in input. With this patch, we immediately check if the input OR dag node has a legal vector type; we only try to fold a OR dag node into a single shufflevector if we know that the resulting shuffle will have a legal type. This is to avoid calling method 'isShuffleMaskLegal' on a potentially illegal vector value type. Added a new test-case to file 'CodeGen/X86/combine-or.ll' to verify that DAGCombiner doesn't crash in the attempt to check/combine an OR between shuffles with illegal types. llvm-svn: 213020	2014-07-15 00:02:32 +00:00
David Majnemer	8bce66b093	CodeGen: Stick constant pool entries in COMDAT sections for WinCOFF COFF lacks a feature that other object file formats support: mergeable sections. To work around this, MSVC sticks constant pool entries in special COMDAT sections so that each constant is in its own section. This permits unused constants to be dropped and it also allows duplicate constants in different translation units to get merged together. This fixes PR20262. Differential Revision: http://reviews.llvm.org/D4482 llvm-svn: 213006	2014-07-14 22:57:27 +00:00
Andrea Di Biagio	3960a9571f	[DAGCombiner] Add more rules to combine shuffle vector dag nodes. This patch teaches the DAGCombiner how to fold a pair of shuffles according to rules: 1. shuffle(shuffle(A, B, M0), B, M1) -> shuffle(A, B, M2) 2. shuffle(shuffle(A, B, M0), A, M1) -> shuffle(A, B, M3) The new rules would only trigger if the resulting shuffle has legal type and legal mask. Added test 'combine-vec-shuffle-3.ll' to verify that DAGCombiner correctly folds shuffles on x86 when the resulting mask is legal. Also added some negative cases to verify that we avoid introducing illegal shuffles. llvm-svn: 213001	2014-07-14 22:46:26 +00:00
David Majnemer	5a1c4b8283	CodeGen: Add a getSectionKind method to MachineConstantPoolEntry This is just a helper routine, no functionality has changed. llvm-svn: 212993	2014-07-14 22:06:29 +00:00
Bill Wendling	c80b6c92e2	Unify the lowering of arguments during SjLj prepare. The 'select true, %arg, undef' instruction can be used for both aggregate and non-aggregate arguments. llvm-svn: 212967	2014-07-14 18:21:11 +00:00
Sanjay Patel	b49bf168f2	fixed typo llvm-svn: 212966	2014-07-14 18:21:07 +00:00
Saleem Abdulrasool	271ac58eb3	CodeGen: add missing include Found during windows unwinding work. This header is indirectly included through a chain leading through Support/Win64EH.h. Explicitly include the header. NFC. llvm-svn: 212955	2014-07-14 16:28:09 +00:00
Bill Wendling	151b44d653	Support lowering of empty aggregates. This crash was pretty common while compiling Rust for iOS (armv7). Reason - SjLj preparation step was lowering aggregate arguments as ExtractValue + InsertValue. ExtractValue has assertion which checks that there is some data in value, which is not true in case of empty (no fields) structures. Rust uses them quite extensively so this patch uses a 'select true, %val, undef' instruction to lower the argument. Patch by Valerii Hiora. llvm-svn: 212922	2014-07-14 06:22:36 +00:00
Andrea Di Biagio	67d8b2e2b0	[DAGCombiner] Fix a crash caused by a missing check for legal type when trying to fold shuffles. Verify that DAGCombiner does not crash when trying to fold a pair of shuffles according to rule (added at r212539): (shuffle (shuffle A, Undef, M0), Undef, M1) -> (shuffle A, Undef, M2) The DAGCombiner avoids folding shuffles if the resulting shuffle dag node is not legal for the target. That means, the resulting shuffle must have legal type and legal mask. Before, the DAGCombiner only called method 'TargetLowering::isShuffleMaskLegal' to check if it was "safe" to fold according to the above-mentioned rule. However, this caused a crash in the x86 backend since method 'isShuffleMaskLegal' always expects to be called on a legal vector type. llvm-svn: 212915	2014-07-13 21:02:14 +00:00
Matt Arsenault	4181ea36a9	Templatify DominanceFrontier. Theoretically this should now work for MachineBasicBlocks. llvm-svn: 212885	2014-07-12 21:59:52 +00:00
Reid Kleckner	fb9519838a	Avoid a warning from MSVC on "*/" in this code by inserting a space llvm-svn: 212862	2014-07-12 00:06:46 +00:00
Juergen Ributzka	3d9e6755e4	[FastISel] Add target-independent patchpoint intrinsic support. WIP. This implements the target-independent lowering for the patchpoint intrinsic. Targets have to implement the FastLowerCall hook to support this intrinsic. Related to <rdar://problem/17427052> llvm-svn: 212849	2014-07-11 22:19:02 +00:00
Juergen Ributzka	8179e9e5ad	[FastISel] Add basic infrastructure to support a target-independent call lowering hook in FastISel. WIP The infrastructure mimics the call lowering we have already in place for SelectionDAG, but with limitations. For example structure return demotion and non-simple types are not supported (yet). Currently every backend has its own implementation and duplicated code for call lowering. There is also no specified interface that could be called from target-independent code. The target-hook is opt-in and doesn't affect current implementations. llvm-svn: 212848	2014-07-11 22:01:42 +00:00
Juergen Ributzka	4ce9863d0b	[FastISel] Make isInTailCallPosition independent of SelectionDAG. Break out the arguments required from SelectionDAG, so that this function can also be used by FastISel. llvm-svn: 212844	2014-07-11 20:50:47 +00:00
Juergen Ributzka	5dd32136b9	[FastISel] Break out intrinsic lowering into a separate function and add a target-hook. Create a separate helper function for target-independent intrinsic lowering. Also add a target-hook that allows calling directly into a target-specific intrinsic lowering method. Currently the implementation is opt-in and doesn't affect existing target implementations. llvm-svn: 212843	2014-07-11 20:42:12 +00:00
Oliver Stannard	6eda6ffc0c	ARM: Allow __fp16 as a function arg or return type for AArch64 ACLE 2.0 allows __fp16 to be used as a function argument or return type. This enables this for AArch64. llvm-svn: 212812	2014-07-11 13:33:46 +00:00
David Blaikie	de1e1a60e8	Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."" This reverts commit r212776. Nope, still seems to be failing on the sanitizer bots... but hey, not the msan self-host anymore, it's failing in asan now. I'll start looking there next. llvm-svn: 212793	2014-07-11 02:42:57 +00:00
David Blaikie	3ca92d2406	Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." Committed in r212205 and reverted in r212226 due to msan self-hosting failure, I believe I've got that fixed by r212761 to Clang. Original commit message: "Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), committed again in r212085 and reverted again in r212089 after fixing some other cases, such as debug info subprogram lists not keeping track of the function they represent (r212128) and then short-circuiting things like LiveDebugVariables that build LexicalScopes for functions that might not have full debug info. And again, I believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions." llvm-svn: 212776	2014-07-10 22:59:39 +00:00
Jan Vesely	eca89d283e	SelectionDAG: Factor FP_TO_SINT lower code out of DAGLegalizer Move the code to a helper function to allow calls from TypeLegalizer. No functionality change intended Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Owen Anderson <resistor@mac.com> llvm-svn: 212772	2014-07-10 22:40:18 +00:00
Matt Arsenault	3332b70627	Revert "Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine."" Don't try to convert the select condition type. llvm-svn: 212750	2014-07-10 18:21:04 +00:00
Andrea Di Biagio	b2921c7ca0	[DAG] Further improve the logic in DAGCombiner that folds a pair of shuffles into a single shuffle if the resulting mask is legal. This patch teaches the DAGCombiner how to fold shuffles according to the following new rules: 1. shuffle(shuffle(x, y), undef) -> x 2. shuffle(shuffle(x, y), undef) -> y 3. shuffle(shuffle(x, y), undef) -> shuffle(x, undef) 4. shuffle(shuffle(x, y), undef) -> shuffle(y, undef) The backend avoids combining shuffles according to rules 3. and 4. if the resulting shuffle does not have a legal mask. This is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes during vector legalization. Added test case combine-vec-shuffle-2.ll to verify that we correctly trigger the new rules when combining shuffles. llvm-svn: 212748	2014-07-10 18:04:55 +00:00
Chandler Carruth	0b666e0648	[x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous to the zero-extend-vector-inreg node introduced previously for the same purpose: manage the type legalization of widened extend operations, especially to support the experimental widening mode for x86. I'm adding both because sign-extend is expanded in terms of any-extend with shifts to propagate the sign bit. This removes the last fundamental scalarization from vec_cast2.ll (a test case that hit many really bad edge cases for widening legalization), although the trunc tests in that file still appear scalarized because the shuffle legalization is scalarizing. Funny thing, I've been working on that. Some initial experiments with this and SSE2 scenarios are showing moderately good behavior already for sign extension. Still some work to do on the shuffle combining on X86 before we're generating optimal sequences, but avoiding scalarization is a huge step forward. llvm-svn: 212714	2014-07-10 12:32:32 +00:00
NAKAMURA Takumi	f862ce8908	Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine." This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations. llvm-svn: 212708	2014-07-10 11:37:28 +00:00
Daniel Sanders	cbd44c591d	Make it possible for ints/floats to return different values from getBooleanContents() Summary: On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1. Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ: - ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node). - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too. Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4389 llvm-svn: 212697	2014-07-10 10:18:12 +00:00
Hao Liu	71224b02fb	[AArch64] Fix an assertion failure in DAG Combiner about concatenating 2 build_vector nodes. llvm-svn: 212677	2014-07-10 03:41:50 +00:00
Chandler Carruth	d3561f6fec	[SDAG] Make the new zext-vector-inreg node default to expand so targets don't need to set it manually. This is based on feedback from Tom who pointed out that if every target needs to handle this we need to reach out to those maintainers. In fact, it doesn't make sense to duplicate everything when anything other than expand seems unlikely at this stage. llvm-svn: 212661	2014-07-09 22:53:04 +00:00
David Blaikie	029bd3350e	Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information. Reverted by Eric Christopher (Thanks!) in r212298 after Bob Wilson reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped provide a reduced reproduction, though the failure wasn't too hard to guess, and even easier with the example to confirm. The assertion that the subprogram metadata associated with an llvm::Function matches the scope data referenced by the DbgLocs on the instructions in that function is not valid under LTO. In LTO, a C++ inline function might exist in multiple CUs and the subprogram metadata nodes will refer to the same llvm::Function. In this case, depending on the order of the CUs, the first instance of the subprogram metadata may not be the one referenced by the instructions in that function and the assertion will fail. A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the assertion removed and a comment added to explain this situation. Original commit message: If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 212649	2014-07-09 21:02:41 +00:00
Matt Arsenault	658c5576d1	Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine. Do this if the truncate is free and the select is legal. llvm-svn: 212640	2014-07-09 19:12:07 +00:00
Chandler Carruth	5865a73a82	[x86] Fix a bug in my new zext-vector-inreg DAG trickery where we were not widening the input type to the node sufficiently to let the ext take place in a register. This would in turn result in a mysterious bitcast assertion failure downstream. First change here is to add back the helpful assert I had in an earlier version of the code to catch this immediately. Next change is to add support to the type legalization to detect when we have widened the operand either too little or too much (for whatever reason) and find a size-matched legal vector type to convert it to first. This can also fail so we get a new fallback path, but that seems OK. With this, we no longer crash on vec_cast2.ll when using widening. I've also added the CHECK lines for the zero-extend cases here. We still need to support sign-extend and trunc (or something) to get plausible code for the other two thirds of this test which is one of the regression tests that showed the most scalarization when widening was force-enabled. Slowly closing in on widening being a viable legalization strategy without it resorting to scalarization at every turn. =] llvm-svn: 212614	2014-07-09 12:36:54 +00:00
Chandler Carruth	14cad41e14	Sink two variables only used in an assert into the assert itself. Should fix the release builds with Werror. llvm-svn: 212612	2014-07-09 11:13:16 +00:00
Chandler Carruth	afe4b2507e	[x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening vector types to be legal and a ZERO_EXTEND node is encountered. When we use widening to legalize vector types, extend nodes are a real challenge. Either the input or output is likely to be legal, but in many cases not both. As a consequence, we don't really have any way to represent this situation and the prior code in the widening legalization framework would just scalarize the extend operation completely. This patch introduces a new DAG node to represent doing a zero extend of a vector "in register". The core of the idea is to allow legal but different vector types in the input and output. The output vector must have fewer lanes but wider elements. The operation is defined to zero extend the low elements of the input to the size of the output elements, and drop all of the high elements which don't have a corresponding lane in the output vector. It also includes generic expansion of this node in terms of blending a zero vector into the high elements of the vector and bitcasting across. This in turn yields extremely nice code for x86 SSE2 when we use the new widening legalization logic in conjunction with the new shuffle lowering logic. There is still more to do here. We need to support sign extension, any extension, and potentially int-to-float conversions. My current plan is to continue using similar synthetic nodes to model each of these transitions with generic lowering code for each one. However, with this patch LLVM already reaches performance parity with GCC for the core C loops of the x264 code (assuming you disable the hand-written assembly versions) when compiling for SSE2 and SSE3 architectures and enabling the new widening and lowering logic for vectors. Differential Revision: http://reviews.llvm.org/D4405 llvm-svn: 212610	2014-07-09 10:58:18 +00:00
Chandler Carruth	f0a33b71e9	[SDAG] At the suggestion of Hal, switch to an output parameter that tracks which elements of the build vector are in fact undef. This should make actually inspecting them (likely in my next patch) reasonably pretty. Also makes the output parameter optional as it is clear now that most users are happy with undefs in their splats. llvm-svn: 212581	2014-07-09 00:41:34 +00:00
Andrea Di Biagio	d261e98f3d	[DAG] Teach how to combine a pair of shuffles into a single shuffle if the resulting mask is legal. This patch teaches how to fold a shuffle according to rule: shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2) We do this only if the resulting mask M2 is legal; this is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes. This patch has the advantage of being target independent, since it works on ISD nodes. Therefore, all targets (not only x86) can take advantage of this rule. The idea behind this patch is that most shuffle pairs can be safely combined before we run the legalizer on vector operations. This allows us to combine/simplify dag nodes earlier in the process and not only immediately before instruction selection stage. That said, this patch is not meant to replace any existing target specific combine rules; backends might still introduce new shuffles during legalization stage. Also, this rule is very simple and avoids aggressively optimizing shuffles. llvm-svn: 212539	2014-07-08 15:22:29 +00:00
Benjamin Kramer	cccdadca45	Fix some Twine locals. Two of those are use after frees. Found by clang-tidy, fixed by me. llvm-svn: 212537	2014-07-08 14:55:06 +00:00
Chandler Carruth	142e966261	[x86,SDAG] Sink the logic for folding shuffles of splats more aggressively from the x86 shuffle lowering to the generic SDAG vector shuffle formation code. This code already tried to fold away shuffles of splats! It just had lots of bugs and couldn't handle the case my new x86 shuffle lowering needed. First, it failed to correctly compute whether N2 was undef because it pre-computed this, then did transformations which could make N2 undef, then failed to ever re-consider the precomputed state. Second, it didn't look through bitcasts at all, even in the safe cases where they are just element-type bitcasts with no change to the number of elements. Third, it didn't handle all-zero bit casts nicely the way my code in the x86 side of things did, which is essential to getting good zext-shuffle lowerings. But all of these are generic. I just ported the code down to this layer and fixed the surrounding bugs. Tests exercising this in the x86 backend still pass and some silly code in widen_cast-6.ll gets better. I updated that test to be a bit more precise but it's still pretty unclear what the value of the test is in this day and age. llvm-svn: 212517	2014-07-08 08:45:38 +00:00
Chandler Carruth	efbce58775	[SDAG] Actually check for a non-constant splat and clarify comments around the handling of UNDEF lanes in boolean vector content analysis. The code before my changes here also failed to check for non-constant splats in a buildvector. I have no idea how to trigger this, I just spotted it by inspection when trying to understand the code. It seems extremely unlikely to be worth the trouble to teach the only caller of this code (DAG combining setcc patterns) how to cleverly handle undef lanes, so I've just commented more thoroughly that we're giving up there. llvm-svn: 212515	2014-07-08 07:44:15 +00:00
Chandler Carruth	b844e72e85	[SDAG] Build up a more rich set of APIs for querying build-vector SDAG nodes about whether they are splats. This is factored out and improved from r212324 which got reverted as it was far too aggressive. The new API should help more conservatively handle buildvectors that are a mixture of splatted and undef values. No functionality change at this point. The hope is to slowly re-introduce the undef-tolerant optimization of splats, but each time being forced to make a conscious decision about how to handle the undefs in a way that doesn't lead to contradicting assumptions about the collapsed value. Hal has pointed out in discussions that this may not end up being the desired API and instead it may be more convenient to get a mask of the undef elements or something similar. I'm starting simple and will expand the API as I adapt actual callers and see exactly what they need. llvm-svn: 212514	2014-07-08 07:19:55 +00:00
Chandler Carruth	beeacac0b3	[x86] Revert r212324 which was too aggressive w.r.t. allowing undef lanes in vector splats. The core problem here is that undef lanes can't unilaterally be considered to contribute to splats. Their handling needs to be more cautious. There is also a reported failure of the nightly testers (thanks Tobias!) that may well stem from the same core issue. I'm going to fix this theoretical issue, factor the APIs a bit better, and then verify that I don't see anything bad with Tobias's reduction from the test suite before recommitting. Original commit message for r212324: [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect it's a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212475	2014-07-07 19:03:32 +00:00
Benjamin Kramer	6cbe670db8	Make helper functions static. llvm-svn: 212460	2014-07-07 14:47:51 +00:00
Tim Northover	55beb64bd0	CodeGen: it turns out that NAND is not the same thing as BIC. At all. We've been performing the wrong operation on ARM for "atomicrmw nand" for years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b". This bled over into the generic expansion pass. So I assume no-one has ever actually tried to do an atomic nand in the real world. Oh well. llvm-svn: 212443	2014-07-07 09:06:35 +00:00
Chandler Carruth	5d79bb5d32	[x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect it's a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212324	2014-07-04 08:11:49 +00:00
Eric Christopher	c1058df66f	Move function dependent resetting of a subtarget variable out of the subtarget. This involved having the movt predicate take the current function - since we care about size in instruction selection for whether or not to use movw/movt take the function so we can check the attributes. This required adding the current MachineFunction to FastISel and propagating through. llvm-svn: 212309	2014-07-04 01:55:26 +00:00
Eric Christopher	09f7131984	Temporarily revert "Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information." as it appears to be breaking some LTO constructs. This reverts commit r212203. llvm-svn: 212298	2014-07-03 22:24:54 +00:00
Sanjay Patel	dc574ab500	bug fix for PR20020: anti-dependency-breaker causes miscompilation This patch sets the 'KeepReg' bit for any tied and live registers during the PrescanInstruction() phase of the dependency breaking algorithm. It then checks those 'KeepReg' bits during the ScanInstruction() phase to avoid changing any tied registers. For more details, please see comments in: http://llvm.org/bugs/show_bug.cgi?id=20020 I added two FIXME comments for code that I think can be removed by using register iterators that include self. I don't want to include those code changes with this patch, however, to keep things as small as possible. The test case is larger than I'd like, but I don't know how to reduce it further and still produce the failing asm. Differential Revision: http://reviews.llvm.org/D4351 llvm-svn: 212275	2014-07-03 15:19:40 +00:00
Ulrich Weigand	f236bb1b5b	Fix ppcf128 component access on little-endian systems The PowerPC 128-bit long double data type (ppcf128 in LLVM) is in fact a pair of two doubles, where one is considered the "high" or more-significant part, and the other is considered the "low" or less-significant part. When a ppcf128 value is stored in memory or a register pair, the high part always comes first, i.e. at the lower memory address or in the lower-numbered register, and the low part always comes second. This is true both on big-endian and little-endian PowerPC systems. (Similar to how with a complex number, the real part always comes first and the imaginary part second, no matter the byte order of the system.) This was implemented incorrectly for little-endian systems in LLVM. This commit fixes three related issues: - When printing an immediate ppcf128 constant to assembler output in emitGlobalConstantFP, emit the high part first on both big- and little-endian systems. - When lowering a ppcf128 type to a pair of f64 types in SelectionDAG (which is used e.g. when generating code to load an argument into a register pair), use correct low/high part ordering on little-endian systems. - In a related issue, because lowering ppcf128 into a pair of f64 must operate differently from lowering an int128 into a pair of i64, bitcasts between ppcf128 and int128 must not be optimized away by the DAG combiner on little-endian systems, but must effect a word-swap. Reviewed by Hal Finkel. llvm-svn: 212274	2014-07-03 15:06:47 +00:00
Chandler Carruth	99b1104c46	[x86] Fix the completely broken vector widening legalization of bswap. This operation was classified as a binary operation in the widening logic for some reason (clearly, untested). It is in fact a unary operation. Add a RUN line to a test to exercise this for x86. Note that again the vector widening strategy doesn't regress anything and in one case removes a totally unnecessary instruction that we couldn't avoid when promoting the element type. llvm-svn: 212257	2014-07-03 07:04:38 +00:00
Chandler Carruth	9d010fffe1	[codegen,aarch64] Add a target hook to the code generator to control vector type legalization strategies in a more fine grained manner, and change the legalization of several v1iN types and v1f32 to be widening rather than scalarization on AArch64. This fixes an assertion failure caused by scalarizing nodes like "v1i32 trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32. This also provides a foundation for other targets to have more granular control over how vector types are legalized. Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow some work to start taking place on top of this patch as it adds some really important hooks to the backend that I'd like to immediately start using. =] http://reviews.llvm.org/D4322 llvm-svn: 212242	2014-07-03 00:23:43 +00:00
David Blaikie	9a0f7948a2	Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." This reverts commit r212205. Reverting this again, still seeing crashes when building compiler-rt... Sorry for the continued noise, not sure why I'm failing to reproduce this locally. llvm-svn: 212226	2014-07-02 21:42:28 +00:00
David Blaikie	9408f5282e	DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself. Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), committed again in r212085 and reverted again in r212089 after fixing some other cases, such as debug info subprogram lists not keeping track of the function they represent (r212128) and then short-circuiting things like LiveDebugVariables that build LexicalScopes for functions that might not have full debug info. And again, I believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 212205	2014-07-02 18:32:05 +00:00
Quentin Colombet	5caa6a2da1	[RegAllocGreedy] Provide a subtarget hook to disable the local reassignment heuristic. By default, no functionality change. This is a follow-up of r212099. This hook provides a finer grain to control the optimization. <rdar://problem/17444599> llvm-svn: 212204	2014-07-02 18:32:04 +00:00
David Blaikie	d47fb5b339	Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information. If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 212203	2014-07-02 18:31:35 +00:00
Chad Rosier	aba845e835	Revert "Revert "MachineScheduler: better book-keeping for asserts."" This reverts commit r212109, which reverted r212088. However, disable the assert as it's not necessary for correctness. There are several corner cases that the assert needed to handle better for in-order scheduling, but none of them are incorrect scheduler behavior. The assert is mainly there to collect good unit tests like this and ensure that the target-independent scheduler is working as expected with the various machine models. llvm-svn: 212187	2014-07-02 16:46:08 +00:00
Matt Arsenault	e9a5a50322	Fix missing const llvm-svn: 212168	2014-07-02 06:45:26 +00:00
Chandler Carruth	c1bedac3bd	[cleanup] Hoist an if-else chain on ISD opcodes (really designed for switches) into a switch, and sink them into a dispatch function that can return the result rather than awkward variable setting with breaks. llvm-svn: 212166	2014-07-02 06:23:34 +00:00
Chandler Carruth	722289f311	[cleanup] Remove dead 'break;' statements that I meant to nuke in r212158 but missed. Thanks to Craig for spotting the goof! llvm-svn: 212159	2014-07-02 04:39:34 +00:00
Chandler Carruth	2746c2861f	[cleanup] Hoist the promotion dispatch logic into the promote function so that we can use return to express it more cleanly and avoid so many nested switch statements. llvm-svn: 212158	2014-07-02 03:07:15 +00:00
Chandler Carruth	1cfa895c4a	[cleanup] Nuke the 'VectorOp' bit of the promote method names. This doesn't add any information for methods in the VectorLegalizer class that clearly take SDAG operations to legalize. llvm-svn: 212157	2014-07-02 03:07:11 +00:00
Chandler Carruth	68adf1568a	[x86] Clean up and modernize the doxygen and API comments for the vector operation legalization code. llvm-svn: 212155	2014-07-02 02:16:57 +00:00
Juergen Ributzka	190305b648	[FastISel] Factor out stackmap intrinsic selection code into a dedicated helper method. NFCI. llvm-svn: 212140	2014-07-01 22:25:49 +00:00
Juergen Ributzka	3bd03c7099	[DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI. The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135	2014-07-01 22:01:54 +00:00
Alp Toker	d8d510af92	Move remaining LLVM_ENABLE_DUMP conditionals out of the headers This macro is sometimes defined manually but isn't (and doesn't need to be) in llvm-config.h so shouldn't appear in the headers, likewise NDEBUG. Instead switch them over to LLVM_DUMP_METHOD on the definitions. llvm-svn: 212130	2014-07-01 21:19:13 +00:00
Chad Rosier	f575a73751	Revert "MachineScheduler: better book-keeping for asserts." This reverts commit r212088, which is causing a number of spec failures. Will provide reduced test cases shortly. PR20057 llvm-svn: 212109	2014-07-01 17:23:11 +00:00
Quentin Colombet	6d590d538f	[PeepholeOptimizer] Fix a typo in a comment. Spotted by Amara Emerson. llvm-svn: 212106	2014-07-01 16:23:44 +00:00
Quentin Colombet	1111e6fe84	[PeepholeOptimizer] Advanced rewriting of copies to avoid cross register banks copies. This patch extends the peephole optimization introduced in r190713 to produce register-coalescer friendly copies when possible. This extension taught the existing cross-bank copy optimization how to deal with the instructions that generate cross-bank copies, i.e., insert_subreg, extract_subreg, reg_sequence, and subreg_to_reg. E.g. b = insert_subreg e, A, sub0 <-- cross-bank copy ... C = copy b.sub0 <-- cross-bank copy Would produce the following code: b = insert_subreg e, A, sub0 <-- cross-bank copy ... C = copy A <-- same-bank copy This patch also introduces a new helper class for that: ValueTracker. This class implements the logic to look through the copy related instructions and get the related source. For now, the advanced rewriting is disabled by default as we are lacking the semantic on target specific instructions to catch the motivating examples. Related to <rdar://problem/12702965>. llvm-svn: 212100	2014-07-01 14:33:36 +00:00
Quentin Colombet	e1a36634b7	[RegAllocGreedy] Provide a flag to disable the local reassignment heuristic. By default, no functionality change. Before evicting a local variable, this heuristic tries to find another (set of) local(s) that can be reassigned to a free color. In some extreme cases (large basic blocks with tons of local variables), the compilation time is dominated by the local interference checks that this heuristic must perform, with no code gen gain. E.g., the motivating example takes 4 minutes to compile with this heuristic, 12 seconds without. Improving the situation will likely require drastic changes to the register allocator and/or the interference check framework. For now, provide this flag to better understand the impact of that heuristic. <rdar://problem/17444599> llvm-svn: 212099	2014-07-01 14:08:37 +00:00
David Blaikie	c8caa1702a	Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." This reverts commit r212085. This breaks the sanitizer bot... & I thought I'd tried pretty hard not to do that. Guess I need to try harder. llvm-svn: 212089	2014-07-01 04:11:45 +00:00
Andrew Trick	f1b307bcb0	MachineScheduler: better book-keeping for asserts. Fixes another test case under PR20057. llvm-svn: 212088	2014-07-01 03:23:13 +00:00
David Blaikie	b89e6d93d9	DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself. Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), and I now believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 212085	2014-07-01 03:11:59 +00:00
Alp Toker	cf21875d41	Fix 'platform-specific' hyphenations llvm-svn: 212056	2014-06-30 18:57:16 +00:00
Saleem Abdulrasool	67b548154e	CodeGen: rename Win64 ExceptionHandling to WinEH This exception format is not specific to Windows x64. A similar approach is taken on nearly all architectures. Generalise the name to reflect reality. This will eventually be used for Windows on ARM data emission as well. Switch the enum and namespace into an enum class. llvm-svn: 212000	2014-06-29 21:43:47 +00:00
Saleem Abdulrasool	7206a52522	MC: rename EmitWin64EH routines Rename the routines to reflect the reality that they are more related to call frame information than to Win64 EH. Although EH is implemented in an intertwined manner by augmenting with an exception handler and an associated parameter, the majority of these routines emit information required to unwind the frames. This also helps identify that these routines are generic for most windows platforms (they apply equally to nearly all architectures except x86) although the encoding of the information is architecture dependent. Unwinding data is emitted via EmitWinCFI* and exception handling information via EmitWinEH*. llvm-svn: 211994	2014-06-29 01:52:01 +00:00
Craig Topper	66e588be09	Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to simplify some code. llvm-svn: 211993	2014-06-29 00:40:57 +00:00
Chad Rosier	5235973ee0	[AArch64] Fix memset ICE when memset value is f128. llvm-svn: 211960	2014-06-27 21:05:09 +00:00
David Majnemer	dad0a645a7	IR: Add COMDATs to the IR This new IR facility allows us to represent the object-file semantic of a COMDAT group. COMDATs allow us to tie together sections and make the inclusion of one dependent on another. This is required to implement features like MS ABI VFTables and optimizing away certain kinds of initialization in C++. This functionality is only representable in COFF and ELF, Mach-O has no similar mechanism. Differential Revision: http://reviews.llvm.org/D4178 llvm-svn: 211920	2014-06-27 18:19:56 +00:00
David Blaikie	dada538bb4	Revert "Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.""" Reverting this again, didn't mean to commit it - while r211872 fixes one of the issues here, there are still others to figure out and address. This reverts commit r211871. llvm-svn: 211873	2014-06-27 05:34:05 +00:00
David Blaikie	8832992df5	Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."" This reverts commit r211724. llvm-svn: 211871	2014-06-27 05:31:49 +00:00
Andrew Trick	040c0da578	Left out the NDEBUG in the previous checkin. llvm-svn: 211867	2014-06-27 05:09:36 +00:00
Andrew Trick	5632722cab	MachineScheduler: add some book-keeping to fix an assert. Fix for Bug 20057 - Assertion failed in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"' Thanks to Chad for the test case. llvm-svn: 211865	2014-06-27 04:57:05 +00:00
Juergen Ributzka	009bff223b	[StackMaps] Enable patchpoint liveness analysis per default. llvm-svn: 211817	2014-06-26 23:39:52 +00:00
Juergen Ributzka	14871f73bb	[Stackmaps] Remove the liveness calculation for stackmap intrinsics. There is no need to calculate the liveness information for stackmaps. The liveness information is still available for the patchpoint intrinsic and that is also the intended usage model. Related to <rdar://problem/17473725> llvm-svn: 211816	2014-06-26 23:39:44 +00:00
Alp Toker	e69170a110	Revert "Introduce a string_ostream string builder facility" Temporarily back out commits r211749, r211752 and r211754. llvm-svn: 211814	2014-06-26 22:52:05 +00:00
Alp Toker	614717388c	Introduce a string_ostream string builder facility string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface. small_string_ostream<bytes> additionally permits an explicit stack storage size other than the default 128 bytes to be provided. Beyond that, storage is transferred to the heap. This convenient class can be used in most places an std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair would previously have been used, in order to guarantee consistent access without byte truncation. The patch also converts much of LLVM to use the new facility. These changes include several probable bug fixes for truncated output, a programming error that's no longer possible with the new interface. llvm-svn: 211749	2014-06-26 00:00:48 +00:00
Eric Christopher	dda00098bc	The includes were sorted. Revert r210578. llvm-svn: 211737	2014-06-25 22:36:37 +00:00
David Blaikie	2952956fd8	Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location." This reverts commit r211723. Breaks the ASan/compiler-rt build... guess I didn't test very far at all :/. llvm-svn: 211724	2014-06-25 18:20:54 +00:00
David Blaikie	442584588a	PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 211723	2014-06-25 18:03:10 +00:00
NAKAMURA Takumi	1db5995d14	Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter. -- This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 llvm-svn: 211691	2014-06-25 12:41:52 +00:00
NAKAMURA Takumi	c403be1991	Reformat. llvm-svn: 211689	2014-06-25 12:40:56 +00:00
Rafael Espindola	f491704e22	Print a=b as an assignment. In assembly the expression a=b is parsed as an assignment, so it should be printed as one. This removes a truly horrible hack for producing a label with "a=.". It would be used by codegen but would never be reached by the asm parser. Sorry I missed this when it was first committed. llvm-svn: 211639	2014-06-24 22:45:16 +00:00
Sanjay Patel	994751940c	fixed a few typos in comments llvm-svn: 211634	2014-06-24 21:11:51 +00:00
David Majnemer	102ff69693	CodeGen: Avoid multiple strlen calls Use a StringRef to hold our section prefix. This avoids multiple calls to strlen. llvm-svn: 211602	2014-06-24 16:01:53 +00:00
Kevin Qin	93d45ecdbf	[AArch64] Fix a build_vector pattern match fail caused by defect in isBuildVectorAllZeros(). llvm-svn: 211567	2014-06-24 05:37:27 +00:00
Rafael Espindola	73f364ef5f	Remove a temporary hack. Amusingly this survived a lot longer than the CFI transition. We don't even support non-cfi assemblers any more. llvm-svn: 211498	2014-06-23 14:22:55 +00:00
NAKAMURA Takumi	d77cefe633	Revert r211399, "Generate native unwind info on Win64" It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64. llvm-svn: 211480	2014-06-22 22:00:56 +00:00
Benjamin Kramer	b7f5fb5751	Legalizer: Add support for splitting insert_subvectors. We handle this by spilling the whole thing to the stack and doing the insertion as a store. PR19492. This happens in real code because the vectorizer creates v2i128 when AVX is enabled. llvm-svn: 211435	2014-06-21 12:56:42 +00:00
Richard Trieu	c1485223a6	Add back functionality removed in r210497. Instead of asserting, output a message stating that a null pointer was found. llvm-svn: 211430	2014-06-21 02:43:02 +00:00
Reid Kleckner	4a01230db4	Generate native unwind info on Win64 This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 llvm-svn: 211399	2014-06-20 20:35:47 +00:00
Rafael Espindola	1fc003e6c5	Allow a target to create a null streamer. Targets can assume that a target streamer is present, so they have to be able to construct a null streamer in order to set the target streamer in it to. Fixes a crash when using the null streamer with arm. llvm-svn: 211358	2014-06-20 13:11:28 +00:00
Yaron Keren	6d3194f7d5	The count() function for STL datatypes returns unsigned, even where it's only 1/0 result like std::set. Some of the LLVM ADT already return unsigned count(), while others still return bool count(). In continuation to r197879, this patch modifies DenseMap, DenseSet, ScopedHashTable, ValueMap:: count() to return size_type instead of bool, 1 instead of true and 0 instead of false. size_type is typedef-ed locally within each class to size_t. http://reviews.llvm.org/D4018 Reviewed by dblaikie. llvm-svn: 211350	2014-06-20 10:26:56 +00:00
Karthik Bhat	e03a25da70	Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer. This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and vectorizes them as vector shuffles if they are profitable. These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86. Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015 llvm-svn: 211339	2014-06-20 04:32:48 +00:00
Eric Christopher	c40e5edbbc	Add a new subtarget hook for whether or not we'd like to enable the atomic load linked expander pass to run for a particular subtarget. This requires a check of the subtarget and so save the TargetMachine rather than only TargetLoweringInfo and update all callers. llvm-svn: 211314	2014-06-19 21:03:04 +00:00
David Blaikie	de8e12a49a	DebugInfo: Fission: Ensure the address pool entries for location lists are emitted. The address pool was being emitted before location lists. The latter could add more entries to the pool which would be lost/never emitted. llvm-svn: 211284	2014-06-19 17:59:14 +00:00
Jingyue Wu	37fcb5919d	[ValueTracking] Extend range metadata to call/invoke Summary: With this patch, range metadata can be added to call/invoke including IntrinsicInst. Previously, it could only be added to load. Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because range metadata is not only used by load. Update the language reference to reflect this change. Test Plan: Add several tests in range-2.ll to confirm the verifier is happy with having range metadata on call/invoke. Add two tests in AddOverFlow.ll to confirm annotating range metadata to call/invoke can benefit InstCombine. Reviewers: meheff, nlewycky, reames, hfinkel, eliben Reviewed By: eliben Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4187 llvm-svn: 211281	2014-06-19 16:50:16 +00:00
Oliver Stannard	f7693f4c1f	Emit DWARF3 call frame information when DWARF3+ debug info is requested Currently, llvm always emits a DWARF CIE with a version of 1, even when emitting DWARF 3 or 4, which both support CIE version 3. This patch makes it emit the newer CIE version when we are emitting DWARF 3 or 4. This will not reduce compatibility, as we already emit other DWARF3/4 features, and is worth doing as the DWARF3 spec removed some ambiguities in the interpretation of call frame information. It also fixes a minor bug where the "return address" field of the CIE was encoded as a ULEB128, which is only valid when the CIE version is 3. There are no test changes for this, because (as far as I can tell) none of the platforms that we test have a return address register with a DWARF register number >127. llvm-svn: 211272	2014-06-19 15:39:33 +00:00
Eric Christopher	4c5bff36ad	Move -dwarf-version to an MC level command line option so it's used by all of the MC level tools and codegen. Fix up all uses in the compiler to use this and set it on the context accordingly. llvm-svn: 211257	2014-06-19 06:22:08 +00:00
Eric Christopher	07634e2a5b	Remove unnecessary include. llvm-svn: 211256	2014-06-19 06:22:05 +00:00
Tim Northover	d82ed2e581	DAG: move sret demotion into most basic LowerCallTo implementation. It looks like there are two versions of LowerCallTo here: the SelectionDAGBuilder one is designed to operate on LLVM IR, and the TargetLowering one in the case where everything is at DAG level. Previously, only the SelectionDAGBuilder variant could handle demoting an impossible return to sret semantics (before delegating to the TargetLowering version), but this functionality is also useful for certain libcalls (e.g. 128-bit operations on 32-bit x86). So this commit moves the sret handling down a level. rdar://problem/17242889 llvm-svn: 211155	2014-06-18 11:52:44 +00:00
Tom Stellard	aad4659470	SelectionDAG: Expand i64 = FP_TO_SINT i32 llvm-svn: 211108	2014-06-17 16:53:07 +00:00
David Blaikie	b9597a8e57	PR20038: DebugInfo missing DIEs for some concrete variables. I haven't nailed this down entirely, but this is about as small of a test case as I can seem to construct and adequately demonstrates the crasher. I'll continue investigating the root cause/fix(es). llvm-svn: 210993	2014-06-15 19:34:26 +00:00
Tim Northover	65277a2bc0	LegalizeDAG: make sure cast is unsigned before using FP_TO_UINT. It's valid to use FP_TO_SINT when asking for a smaller type (e.g. all "unsigned int16" values fit into a "signed int32"), but the reverse isn't true. Unfortunately, I'm not actually aware of any architecture with asymmetric FP_TO_SINT and FP_TO_UINT handling and the logic happens to work in the symmetric case, so I can't actually write a test for this. llvm-svn: 210986	2014-06-15 09:27:20 +00:00
David Blaikie	6f9e867c45	DebugInfo: Remove some extra handling of abstract variables and instead rely solely on the delayed handling introduced in r210946 Now that we handle finding abstract variables at the end of the module, remove the upfront handling and just ensure the abstract variable is built when necessary. In theory we could have a split implementation, where inlined variables are immediately constructed referencing the abstract definition, and concrete variables are delayed - but let's go with one solution for now unless there's a reason not to. llvm-svn: 210961	2014-06-13 23:52:55 +00:00
Jiangning Liu	96e92c1d75	Move GlobalMerge from Transform to CodeGen. This patch moves the GlobalMerge pass from Transform/Scalar to CodeGen, because GlobalMerge depends on TargetMachine. In the meantime, the macro INITIALIZE_TM_PASS is also moved to CodeGen/Passes.h. With this fix we can avoid making libScalarOpts depend on libCodeGen. llvm-svn: 210951	2014-06-13 22:57:59 +00:00
Eric Christopher	f047bfd115	The hazard recognizer only needs a subtarget, not a target machine so make it take one. Fix up all users accordingly. llvm-svn: 210948	2014-06-13 22:38:52 +00:00
David Blaikie	e847f132f7	DebugInfo: Reference abstract definitions from variables in concrete definitions that precede their first inline definition. Rather than relying on abstract variables looked up at the time the concrete variable is created, look them up at the end of the module to ensure they're referenced even if they're created after the concrete definition. This completes/matches the work done in r209677 to handle this for the subprograms themselves. llvm-svn: 210946	2014-06-13 22:35:44 +00:00
David Blaikie	be7c677008	DwarfDebug::getExistingAbstractVariable: constify an existing reference parameter that didn't need to be mutated. llvm-svn: 210944	2014-06-13 22:29:31 +00:00
David Blaikie	eb1a27239c	DebugInfo: Following up to r209677, refactor local variable emission to delay the choice between emitting the definition attributes or using DW_AT_abstract_definition This doesn't fix the abstract variable handling yet, but it introduces a similar delay mechanism as was added for subprograms, causing DW_AT_location to be reordered to the beginning of the attribute list for local variables, and fixes all the test fallout for that. A subsequent commit will remove the abstract variable handling in DbgVariable and just do the abstract variable lookup at module end to ensure that abstract variables introduced after their concrete counterparts are appropriately referenced by the concrete variable. llvm-svn: 210943	2014-06-13 22:18:23 +00:00
Tim Northover	20b9f739eb	Atomics: make use of the "cmpxchg weak" instruction. This also simplifies the IR we create slightly: instead of working out where success & failure should go manually, it turns out we can just always jump to a success/failure block created for the purpose. Later phases will sort out the mess without much difficulty. llvm-svn: 210917	2014-06-13 16:45:52 +00:00
Tim Northover	d039abdeeb	Atomics: switch direction of cmpxchg comparison This has two benefits: it makes the result more suitable for direct insertion into the struct to emulate the new cmpxchg, and it means the name we give the instruction matches its actual effect better. llvm-svn: 210916	2014-06-13 16:45:36 +00:00
Tim Northover	420a216817	IR: add "cmpxchg weak" variant to support permitted failure. This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity all cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903	2014-06-13 14:24:07 +00:00
Juergen Ributzka	454d374e37	[FastISel][X86] - Add branch weights Add branch weights to branch instructions, so that the following passes can optimize based on it (i.e. basic block ordering). llvm-svn: 210863	2014-06-13 00:45:11 +00:00
Juergen Ributzka	349777d3ea	[FastISel][X86] Add MachineMemOperand to load/store instructions. This commit adds MachineMemOperands to load and store instructions. This allows the peephole optimizer to fold load instructions. Unfortunately the peephole optimizer currently doesn't run at -O0. llvm-svn: 210858	2014-06-12 23:27:57 +00:00
Andrew Trick	491e34a139	Fix the scheduler's MaxObservedStall computation. WenHan Gu pointed out this bug that results in an assert not being effective in some cases. llvm-svn: 210846	2014-06-12 22:36:28 +00:00
Tom Stellard	7783b0adf4	Revert "SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors" This reverts commit r210540, adds a testcase for the regression it caused, and marks the R600 test it was supposed to fix as XFAIL. llvm-svn: 210792	2014-06-12 16:04:47 +00:00
Juergen Ributzka	04558dc77a	[FastISel] Add support for the stackmap intrinsic. This implements target-independent FastISel lowering for the stackmap intrinsic. llvm-svn: 210742	2014-06-12 03:29:26 +00:00
Eric Christopher	4fdc765b13	Revert r210613 to conform to coding standards. Thanks Duncan for noticing. llvm-svn: 210662	2014-06-11 16:59:33 +00:00
Jiangning Liu	d623c528c5	Create macro INITIALIZE_TM_PASS. Pass initialization requires initializing TargetMachine for back-end specific passes. This commit creates a new macro INITIALIZE_TM_PASS to simplify this kind of initialization. llvm-svn: 210641	2014-06-11 07:04:37 +00:00
Saleem Abdulrasool	8076cab0ce	CodeGen: refactor DwarfException DwarfException served as a base class for exception handling directive emission. However, this is also used by other exception models (e.g. Win64EH). Rename this class to EHStreamer and split it out of DwarfException.h. NFC. Use the opportunity to fix up some of the documentation comments to match current LLVM style. Also rename some functions to conform better with current LLVM coding style. llvm-svn: 210622	2014-06-11 01:19:03 +00:00
Eric Christopher	946a6581ea	Sort includes. llvm-svn: 210613	2014-06-11 00:25:16 +00:00
Eric Christopher	576d36ae05	Have isInTailCallPosition take the DAG so that we can use the version of TargetLowering/Machine from there on the way to avoiding TargetMachine in TargetLowering. llvm-svn: 210579	2014-06-10 20:39:38 +00:00
Eric Christopher	09fc276d08	Reorder includes to be sorted. llvm-svn: 210578	2014-06-10 20:39:35 +00:00
Eric Christopher	db5028bd5b	Fix typos. llvm-svn: 210571	2014-06-10 20:07:29 +00:00
Juergen Ributzka	89fe23e888	[FastISel] Collect statistics about failing intrinsic calls. Add more instruction-specific statistics about failing intrinsic calls during FastISel. llvm-svn: 210556	2014-06-10 18:17:00 +00:00
Tom Stellard	3787b12255	SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CC The SelectionDAG had a special case for ISD::SELECT_CC, where it would allow targets to specify: setOperationAction(ISD::SELECT_CC, MVT::Other, Expand); to indicate that they wanted to expand ISD::SELECT_CC for all types. This wasn't applied correctly everywhere, and it makes writing new DAG patterns with ISD::SELECT_CC difficult. llvm-svn: 210541	2014-06-10 16:01:29 +00:00
Tom Stellard	b9a023383e	SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors This prevents a future commit from regressing: test/CodeGen/R600/setcc-equivalent.ll llvm-svn: 210540	2014-06-10 16:01:25 +00:00
Tom Stellard	3ca1bfc728	SelectionDAG: Expand SELECT_CC to SELECT + SETCC This consolidates code from the Hexagon, R600, and XCore targets. No functionality change intended. llvm-svn: 210539	2014-06-10 16:01:22 +00:00
Richard Trieu	a23043cb9c	Removing an "if (!this)" check from two print methods. The condition will never be true in a well-defined context. The checking for null pointers has been moved into the caller logic so it does not rely on undefined behavior. llvm-svn: 210497	2014-06-09 22:53:16 +00:00
Alexey Samsonov	8000e2734e	Generate better location ranges for some register-described variables. Don't terminate location ranges for register-described variables at the end of machine basic block if this register is never modified in the function body, except for the prologue and epilogue. Prologue location is guessed by FrameSetup flags on MachineInstructions, while epilogue location is deduced from debug locations of instructions in the basic blocks ending with return instructions. This patch is mostly targeted to fix non-trivial debug locations for variables addressed via stack and frame pointers. It is not really a generic fix. We can still produce poor debug info for register-described variables if this register is modified somewhere in the function, but in unrelated places. This might be the case for the debug info in optimized binaries (e.g. for local variables in inlined functions). LiveDebugVariables pass in CodeGen attempts to fix this problem by adjusting DBG_VALUE instructions, but this pass is tied to greedy register allocator, which is used in optimized builds only. Proper fix would likely involve generalizing LiveDebugVariables to all register allocators. See more discussion in http://reviews.llvm.org/D3933 review thread. I'm proceeding with this patch to fix immediate severe problems and important cases, e.g. fix completely broken debug info with AddressSanitizer and fix PR19307 (missing debug info for by-value std::string arguments). llvm-svn: 210492	2014-06-09 21:53:47 +00:00
Andrea Di Biagio	f99dd64f0a	[X86] Add target combine rules for horizontal add/sub. This patch adds new target specific combine rules to identify horizontal add/sub idioms from BUILD_VECTOR dag nodes. This patch also teaches the DAGCombiner how to canonicalize sequences of insert_vector_elt dag nodes according to the following rule: (insert_vector_elt (insert_vector_elt A, I0), I1) -> (insert_vector_elt (insert_vector_elt A, I1), I0) This new canonicalization rule only triggers if the inner insert_vector dag node has exactly one use; also, both indices must be known constants, and I1 < I0. This last rule made it possible to write a simpler algorithm to identify horizontal add/sub patterns because now we don't have to worry about the ordering of insert_vector_elt dag nodes. llvm-svn: 210477	2014-06-09 16:54:41 +00:00
Andrea Di Biagio	4db1abea15	[DAG] Expose NoSignedWrap, NoUnsignedWrap and Exact flags to SelectionDAG. This patch modifies SelectionDAGBuilder to construct SDNodes with associated NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator instructions. Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing nsw/nuw/exact flags during codegen. Patch by Marcello Maggioni. llvm-svn: 210467	2014-06-09 12:32:53 +00:00
Craig Topper	66f09ad041	[C++11] Use 'nullptr'. llvm-svn: 210442	2014-06-08 22:29:17 +00:00
Alp Toker	5c53639492	Fix typos llvm-svn: 210401	2014-06-07 21:23:09 +00:00
Andrew Trick	7f1ebbeb8f	Fix the MachineScheduler's logic for updating ready times for in-order. Now the scheduler updates a node's ready time as soon as it is scheduled, before releasing dependent nodes. There was a reason I didn't do this initially but it no longer applies. A53 is in-order and was running into an issue where nodes were added to the readyQ too early. That's now fixed. This also makes it easier for custom scheduling strategies to build heuristics based on the actual cycles that the node was scheduled at. The only impact on OOO (sandybridge/cyclone) is that ready times will be slightly more accurate. I didn't measure any significant regressions. llvm-svn: 210390	2014-06-07 01:48:43 +00:00
David Blaikie	3dca59902b	DebugInfo: Use the scope of the function declaration, if any, to name a function in DWARF pubnames This ensures that member functions, for example, are entered into pubnames with their fully qualified name, rather than inside the global namespace. llvm-svn: 210379	2014-06-06 22:29:05 +00:00
David Blaikie	553eb4a880	DebugInfo: pubnames: include file-local (static or anonymous namespace) variables and anonymous namespaces themselves. Still some issues with name qualification, FIXMEs added to test cases and fixes will come next. llvm-svn: 210378	2014-06-06 22:16:56 +00:00
Rafael Espindola	0766ae08e5	Fix a few issues with comdat handling on COFF. * Section association cannot use just the section name as many sections can have the same name. With this patch, the comdat symbol in an assoc section is interpreted to mean a symbol in the associated section and the mapping is discovered from it. * Comdat symbols were not being set correctly. Instead we were getting whatever was output first for that section. A consequence is that associative sections now must use .section to set the association. Using .linkonce would not work since it is not possible to change a section's comdat symbol (it is used to decide if we should create a new section or reuse an existing one). This includes r210298, which was reverted because it was asserting on an associated section having the same comdat as the associated section. llvm-svn: 210367	2014-06-06 19:26:12 +00:00
Eric Christopher	0dd8d486b3	Have TargetSelectionDAGInfo take a DataLayout initializer rather than a TargetMachine since the only thing it wants is DataLayout. llvm-svn: 210366	2014-06-06 19:04:48 +00:00
Alexey Samsonov	45d638a3fd	Fix null dereference with -debug-only=dwarfdebug llvm-svn: 210299	2014-06-05 23:10:19 +00:00
Tom Roeder	44cb65fff1	Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute. It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables. This also adds backend support for generating the jump-instruction tables on ARM and X86. Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr. llvm-svn: 210280	2014-06-05 19:29:43 +00:00
Sasa Stankovic	56c12e679a	Prevent hoisting the instruction whose def might be clobbered by the terminator. llvm-svn: 210261	2014-06-05 13:42:48 +00:00
David Blaikie	72c3aa39b7	Revert r210221 again, due to a crash Richard Smith has provided involving self-hosting LLVM with libc++. Test case coming, once I reduce it. llvm-svn: 210236	2014-06-05 02:04:59 +00:00
David Blaikie	367fb01d70	DebugInfo: Reuse existing LexicalScope to retrieve the scope's MDNode, rather than looking it up through the DebugLoc. No functional change intended, just streamlines the abstract variable lookup/construction to use a common entry point. llvm-svn: 210234	2014-06-05 01:30:50 +00:00
David Blaikie	087e7203a9	DebugInfo: Roll argument insertion into variable insertion to ensure arguments are correctly handled in all cases. No functional change intended. llvm-svn: 210233	2014-06-05 01:04:20 +00:00
David Blaikie	bb6a4e2fea	PR19388: DebugInfo: Emit dead arguments in their originally declared order. Unused arguments were not being added to the argument list, but instead treated as arbitrary scope variables. This meant they weren't carefully added in the original argument order. In this particular example, though, it turns out the argument is only /mostly/ unused (well, actually it's entirely used, but in a specific way). It's a struct that, due to ABI reasons, is decomposed into chunks (exactly one chunk, since it has one member) and then passed. Since only one of those chunks is used (SROA, etc, kill the original reconstitution code) we don't have a location to describe the whole variable. In this particular case, since the struct consists of just the one int, once we have partial location information, this should have a location that describes the entire variable (since the piece is the entirety of the object). And at some point we'll need to describe the location of even /entirely/ unused arguments so that they can at least be printed on function entry. llvm-svn: 210231	2014-06-05 00:51:35 +00:00
David Blaikie	6cfa9e1a6d	DebugInfo: Add comments/assert description to r209674 based on Eric Christopher's post-commit review feedback. llvm-svn: 210228	2014-06-05 00:25:26 +00:00
David Blaikie	36408e7569	DebugInfo: Reapply r209984 (reverted in r210143), asserting that abstract DbgVariables have DIEs. Abstract variables within abstract scopes that are entirely optimized away in their first inlining are omitted because their scope is not present so the variable is never created. Instead, we should ensure the scope is created so the variable can be added, even if it's been optimized away in its first inlining. This fixes the incorrect debug info in missing-abstract-variable.ll (added in r210143) and passes an asserts self-hosting build, so hopefully there's not more of these issues left behind... fingers crossed. llvm-svn: 210221	2014-06-04 23:50:52 +00:00
Hans Wennborg	8e873329a1	Don't emit structors for available_externally globals (PR19933) We would previously assert here when trying to figure out the section for the global. This makes us handle the situation more gracefully since the IR isn't malformed. Differential Revision: http://reviews.llvm.org/D4022 llvm-svn: 210215	2014-06-04 21:04:54 +00:00
Andrew Trick	8d2ee37f31	Add a subtarget hook: enablePostMachineScheduler. As requested by AArch64 subtargets. Note that this will have no effect until the AArch64 target actually enables the pass like this: substitutePass(&PostRASchedulerID, &PostMachineSchedulerID); As soon as armv7 switches over, PostMachineScheduler will become the default postRA scheduler, so this won't be necessary any more. Targets using the old postRA schedule would then do: substitutePass(&PostMachineSchedulerID, &PostRASchedulerID); llvm-svn: 210167	2014-06-04 07:06:27 +00:00
Andrew Trick	3ccf71d4d6	Move GenericScheduler and PostGenericScheduler into a header. These were not exposed previously because I didn't want out-of-tree targets to be too dependent on their internals. They can be reused for a very wide variety of processors with casual scheduling needs without exposing the classes by instead using hooks defined in MachineSchedPolicy (we can add more if needed). When targets are more aggressively tuned or want to provide custom heuristics, they can define their own MachineSchedStrategy. I tend to think this is better once you start customizing heuristics because you can copy over only what you need. I don't think that layering heuristics generally works well. However, AArch64 targets now want to reuse the Generic scheduling logic but also provide extensions. I don't see much harm in exposing the Generic scheduling classes with a major caveat: these scheduling strategies may change in the future without validating performance on less mainstream processors. If you want to be immune from changes, just define your own MachineSchedStrategy. llvm-svn: 210166	2014-06-04 07:06:18 +00:00
David Blaikie	19a8b90763	DebugInfo: Partial revert r209984 due to more cases where abstract DbgVariables do not have associated DIEs. Along with a test case to demonstrate that due to inlining order there are cases where abstract variable DIEs are not constructed since the abstract subprogram was built due to a previous inlining that optimized away those variables. This produces incorrect debug info (the 'missing' abstract variable causes the inlined instance of that variable to be emitted with a full description (name, line, file) rather than referencing the abstract origin), but this commit at least ensures that it doesn't crash... llvm-svn: 210143	2014-06-04 01:30:59 +00:00
Pete Cooper	7223557752	Calculate dead instructions when a live interval is created. This gets us closer to being able to remove LiveVariables entirely which is where dead instructions are currently tagged as such. Reviewed by Jakob Olesen llvm-svn: 210132	2014-06-03 22:42:10 +00:00
Rafael Espindola	64c1e18033	Allow alias to point to an arbitrary ConstantExpr. This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is up to MC (or the system assembler) to decide if that expression is valid or not. This reduces our ability to diagnose invalid uses and how early we can spot them, but it also lets us do things like @test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32), i32 ptrtoint (i32* @bar to i32)) to i32) An important implication of this patch is that the notion of aliased global doesn't exist any more. The alias has to encode the information needed to access it in its metadata (linkage, visibility, type, etc). Another consequence to notice is that getSection has to return a "const char*". It could return a NullTerminatedStringRef if there was such a thing, but when that was proposed the decision was to just use "const char*" for that. llvm-svn: 210062	2014-06-03 02:41:57 +00:00
Eric Christopher	d91d605f7f	InitLibcallNames can take a Triple instead of a TargetMachine. llvm-svn: 210045	2014-06-02 20:51:49 +00:00
David Blaikie	23b4ecbff4	DebugInfo: Assert that DbgVariables have associated DIEs This was previously committed in r209680 and reverted in r209683 after it caused sanitizer builds to crash. The issue seems to be that the DebugLoc associated with dbg.value IR intrinsics isn't necessarily accurate. Instead, we duplicate the DIVariables and add an InlinedAt field to them to record their location. We were using this InlinedAt field to compute the LexicalScope for the variable, but not using it in the abstract DbgVariable construction and mapping. This resulted in a formal parameter to the current concrete function, correctly having no InlinedAt information, but incorrectly having a DebugLoc that described an inlined location within the function... thus an abstract DbgVariable was created for the variable, but its DIE was never constructed (since the LexicalScope had no such variable). This DbgVariable was silently ignored (by testing for a non-null DIE on the abstract DbgVariable). So, fix this by using the right scoping information when constructing abstract DbgVariables. In the long run, I suspect we want to undo the work that added this second kind of location tracking and fix the places where the DebugLoc propagation on the dbg.value intrinsic fails. This will shrink debug info (by not duplicating DIVariables), make it more efficient (by not having to construct new DIVariable metadata nodes to try to map back to a single variable), and benefit all instructions. But perhaps there are insurmountable issues with DebugLoc quality that I'm unaware of... I just don't know how we can't /just keep the DebugLoc from the dbg.declare to the dbg.values and never get this wrong/. Some history context: http://llvm.org/viewvc/llvm-project?view=revision&revision=135629 http://llvm.org/viewvc/llvm-project?view=revision&revision=137253 llvm-svn: 209984	2014-06-01 03:38:13 +00:00
Alp Toker	da0c7933cf	Fix typos llvm-svn: 209982	2014-05-31 21:26:28 +00:00
Adam Nemet	b4690e3fd1	[SelectionDAG] Force cycle detection in AssignTopologicalOrder before aborting DAG cycle detection is only enabled with ENABLE_EXPENSIVE_CHECKS. However we can run it just before we would crash in order to provide more informative diagnostics. Now in addition to the "Overran sorted position" message we also get the Node printed if a cycle was detected. Tested by building several configs: Debug+Assert, Debug+Assert+Check (this is ENABLE_EXPENSIVE_CHECKS), Release+Assert and Release. Also tried that the AssignTopologicalOrder assert produces the expected results. llvm-svn: 209977	2014-05-31 16:23:20 +00:00
Adam Nemet	7d39430a14	[SelectionDAG] Pass DAG to checkForCycles Pass the DAG down to checkForCycles from all callers where we have it. This allows target-specific nodes to be printed properly. Also print some missing newlines. llvm-svn: 209976	2014-05-31 16:23:17 +00:00
Andrea Di Biagio	446a527905	[X86] Add two combine rules to simplify dag nodes introduced during type legalization when promoting nodes with illegal vector type. This patch teaches the backend how to simplify/canonicalize dag node sequences normally introduced by the backend when promoting certain dag nodes with illegal vector type. This patch adds two new combine rules: 1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) -> (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>) 2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) -> (shuffle (BINOP A, B), Undef, <Mask>). Both rules are only triggered on the type-legalized DAG. In particular, rule 1. is a target specific combine rule that attempts to sink a bitconvert into the operands of a binary operation. Rule 2. is a target independent rule that attempts to move a shuffle immediately after a binary operation. llvm-svn: 209930	2014-05-30 23:17:53 +00:00
Filipe Cabecinhas	82111f12fb	Convert a vselect into a concat_vector if possible Summary: If both vector args to vselect are concat_vectors and the condition is constant and picks half a vector from each argument, convert the vselect into a concat_vectors. Added a test. The ConvertSelectToConcatVector is assuming it doesn't get vselects with arguments of, for example, <undef, undef, true, true>. Those get taken care of in the checks above its call. Reviewers: nadav, delena, grosbach, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3916 llvm-svn: 209929	2014-05-30 23:03:11 +00:00
Adrian Prantl	c11975439c	Roll DbgVariable::setMInsn into the constructor. No functional changes. llvm-svn: 209920	2014-05-30 21:10:13 +00:00
Logan Chien	c002981084	Fix MIPS exception personality encoding. For MIPS, we have to encode the personality routine with an indirect pointer to absptr; otherwise, a link warning will be raised, and the program might crash on some early MIPS Android devices. llvm-svn: 209907	2014-05-30 16:48:56 +00:00
Rafael Espindola	92945eee80	[pr19636] Fix known bit computation in urem instruction with power of two. Patch by Andrey Kuharev. llvm-svn: 209902	2014-05-30 15:00:45 +00:00
Tim Northover	d622e1282c	SelectionDAG: skip barriers for unordered atomic operations Unordered is strictly weaker than monotonic, so if the latter doesn't have any barriers then the former certainly shouldn't. rdar://problem/16548260 llvm-svn: 209901	2014-05-30 14:41:51 +00:00
Tim Northover	b4ddc0845a	ARM & AArch64: make use of common cmpxchg idioms after expansion The C and C++ semantics for compare_exchange require it to return a bool indicating success. This gets mapped to LLVM IR which follows each cmpxchg with an icmp of the value loaded against the desired value. When lowered to ldxr/stxr loops, this extra comparison is redundant: its results are implicit in the control-flow of the function. This commit makes two changes: it replaces that icmp with appropriate PHI nodes, and then makes sure earlyCSE is called after expansion to actually make use of the opportunities revealed. I've also added -{arm,aarch64}-enable-atomic-tidy options, so that existing fragile tests aren't perturbed too much by the change. Many of them either rely on undef/unreachable too pervasively to be restored to something well-defined (particularly while making sure they test the same obscure assert from many years ago), or depend on a particular CFG shape, which is disrupted by SimplifyCFG. rdar://problem/16227836 llvm-svn: 209883	2014-05-30 10:09:59 +00:00
Richard Trieu	c0f9121e71	Remove use of comma operator. llvm-svn: 209871	2014-05-30 03:15:17 +00:00
Adrian Prantl	fef140df96	Debug Info: Remove unused code. The MInsn of an _abstract_ variable is never used again and updating the abstract variable for each inlined instance of it was questionable in the first place. llvm-svn: 209829	2014-05-29 16:56:48 +00:00
Hao Liu	4091450181	Fix an assertion failure caused by v1i64 in DAGCombiner Shrink. llvm-svn: 209798	2014-05-29 09:19:07 +00:00
Michael J. Spencer	f375d80635	[x86] Fold extract_vector_elt of a load into the Load's address computation. An address only use of an extract element of a load can be simplified to a load. Without this the result of the extract element is spilled to the stack so that an address is available. llvm-svn: 209788	2014-05-29 01:42:45 +00:00
Matt Arsenault	3ee3746374	Fix wrong setcc result type when legalizing uaddo/usubo No test because no in-tree targets change the bitwidth of the setcc type depending on the bitwidth of the compared type. Patch by Ke Bai llvm-svn: 209771	2014-05-28 20:51:42 +00:00
Rafael Espindola	59f7eba2b5	[pr19844] Add thread local mode to aliases. This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how it is accessed (linkage, visibility, dll storage). Clang still has to be updated to expose this feature to C. llvm-svn: 209759	2014-05-28 18:15:43 +00:00
Hal Finkel	2c77fe59d9	Revert "[DAGCombiner] Split up an indexed load if only the base pointer value is live" This reverts r208640 (I've just XFAILed the test) because it broke ppc64/Linux self-hosting. Because nearly every regression test triggers a segfault, I hope this will be easy to fix. llvm-svn: 209747	2014-05-28 15:33:19 +00:00
Alexey Samsonov	bb2990df58	Change representation of instruction ranges where variable is accessible. Use more straightforward way to represent the set of instruction ranges where the location of a user variable is defined - vector of pairs of instructions (defining start/end of each range), instead of a flattened vector of instructions where some instructions are supposed to start the range, and the rest are supposed to "clobber" it. Simplify the code which generates actual .debug_loc entries. No functionality change. llvm-svn: 209698	2014-05-27 23:09:50 +00:00
Alexey Samsonov	8a86d6da26	Factor out looking for prologue end into a function llvm-svn: 209697	2014-05-27 22:47:41 +00:00
Alexey Samsonov	f0e0cca0c7	Don't pre-populate the set of keys in the map with variable locations history. Current implementation of calculateDbgValueHistory already creates the keys in the expected order (user variables are listed in order of appearance), and should do so later by contract. No functionality change. llvm-svn: 209690	2014-05-27 22:35:00 +00:00
David Blaikie	6900674aaf	DebugInfo: partially revert cleanup committed in r209680 I'm not sure exactly where/how we end up with an abstract DbgVariable with a null DIE, but we do... looking into it & will add a test and/or fix when I figure it out. Currently shows up in selfhost or compiler-rt builds. llvm-svn: 209683	2014-05-27 20:20:43 +00:00
David Blaikie	b85f0080e7	DebugInfo: Simplify solution to avoid DW_AT_artificial on inlined parameters. Originally committed in r207717, I clearly didn't look very closely at the code to understand how existing things were working... llvm-svn: 209680	2014-05-27 19:34:32 +00:00
David Blaikie	482097d098	DebugInfo: Create abstract function definitions even when concrete definitions precede inline definitions. After much puppetry, here's the major piece of the work to ensure that even when a concrete definition precedes all inline definitions, an abstract definition is still created and referenced from both concrete and inline definitions. Variables are still broken in this case (see comment in dbg-value-inlined-parameter.ll test case) and will be addressed in follow up work. llvm-svn: 209677	2014-05-27 18:37:55 +00:00
David Blaikie	2910f62084	DebugInfo: Avoid an extra map lookup when finding abstract subprogram DIEs. llvm-svn: 209676	2014-05-27 18:37:51 +00:00
David Blaikie	3c2fff3fe6	DebugInfo: Lazily construct subprogram definition DIEs. A further step to correctly emitting concrete out of line definitions preceding inlined instances of the same program. To do this, emission of subprograms must be delayed until required since we don't know which (abstract only (if there's no out of line definition), concrete only (if there are no inlined instances), or both) DIEs are required at the start of the module. To reduce the test churn in the following commit that actually fixes the bug, this commit introduces the lazy DIE construction and cleans up test cases that are impacted by the changes in the resulting DIE ordering. llvm-svn: 209675	2014-05-27 18:37:48 +00:00
David Blaikie	f7221adb8e	DebugInfo: Lazily attach definition attributes to definitions. This is a precursor to fixing inlined debug info where the concrete, out-of-line definition may precede any inlined usage. To cope with this, the attributes that may appear on the concrete definition or the abstract definition are delayed until the end of the module. Then, if an abstract definition was created, it is referenced (and no other attributes are added to the out-of-line definition), otherwise the attributes are added directly to the out-of-line definition. In a couple of cases this causes not just reordering of attributes, but reordering of types. When the creation of the attribute is delayed, if that creation would create a type (such as for a DW_AT_type attribute) then other top level DIEs may've been constructed during the delay, causing the referenced type to be created and added after those intervening DIEs. In the extreme case, in cross-cu-inlining.ll, this actually causes the DW_TAG_basic_type for "int" to move from one CU to another. llvm-svn: 209674	2014-05-27 18:37:43 +00:00
David Blaikie	7f91686f07	DebugInfo: Separate out the addition of subprogram attribute additions so that they can be added later depending on whether or not the function is inlined. llvm-svn: 209673	2014-05-27 18:37:38 +00:00
Tim Northover	4f1909f1da	ARM: teach AAPCS-VFP to deal with Cortex-M4. Cortex-M4 only has single-precision floating point support, so any LLVM "double" type will have been split into 2 i32s by now. Fortunately, the consecutive-register framework turns out to be precisely what's needed to reconstruct the double and follow AAPCS-VFP correctly! rdar://problem/17012966 llvm-svn: 209650	2014-05-27 10:43:38 +00:00
David Blaikie	ab53c91010	DwarfUnit: Remove some misleading no-op code introduced in r204162. Post commit review feedback from Manman called this out, but it looks like it slipped through the cracks. llvm-svn: 209611	2014-05-26 05:32:21 +00:00
David Blaikie	ea86226774	DebugInfo: Fix inlining with #file directives a little harder Seems my previous fix was insufficient - we were still not adding the inlined function to the abstract scope list. Which meant it wasn't flagged as inline, didn't have nested lexical scopes in the abstract definition, and didn't have abstract variables - so the inlined variable didn't reference an abstract variable, instead being described completely inline. llvm-svn: 209602	2014-05-25 18:11:35 +00:00
Benjamin Kramer	5256ce37ac	MachineVerifier: Clean up some syntactic weirdness left behind by find&replace. No functionality change. llvm-svn: 209581	2014-05-24 13:31:10 +00:00
Benjamin Kramer	389cec0d3e	CodeGen: Make MachineBasicBlock::back skip to the beginning of the last bundle. This makes front/back symmetric with begin/end, avoiding some confusion. Added instr_front/instr_back for the old behavior, corresponding to instr_begin/instr_end. Audited all three in-tree users of back(), all of them look like they don't want to look inside bundles. Fixes an assertion (PR19815) when generating debug info on mips, where a delay slot was bundled at the end of a branch. llvm-svn: 209580	2014-05-24 13:13:17 +00:00
David Blaikie	169ffe41af	DebugInfo: Put concrete definitions referencing abstract definitions in the same scope as the abstract definition. This seems like a simple cleanup/improved consistency, but also helps lay the foundation to fix the bug mentioned in the test case: concrete definitions preceding any inlined usage aren't properly split into concrete + abstract (because they're not known to need it until it's too late). Once we start deferring this choice until later, we won't have the choice to put concrete definitions for inlined subroutines in a different scope from concrete definitions for non-inlined subroutines (since we won't know at time-of-construction which one it'll be). This change brings those two cases into alignment ahead of that future change/fix. llvm-svn: 209547	2014-05-23 20:25:15 +00:00
David Blaikie	05b8584f16	Add FIXME comment based on code review feedback by Hal Finkel on r209338 llvm-svn: 209529	2014-05-23 16:53:14 +00:00
David Blaikie	4860225570	Rename a couple of variables to be more accurate. It's not really a "ScopeDIE", as such - it's the abstract function definition's DIE. And we usually use "SP" for subprograms, rather than "Sub". llvm-svn: 209499	2014-05-23 05:03:23 +00:00
David Blaikie	96fb9024f2	DebugInfo: Fix cross-CU references for scopes (and variables within those scopes) in abstract definitions of cross-CU inlined functions Found by Adrian Prantl during post-commit review of r209335. llvm-svn: 209498	2014-05-23 04:23:06 +00:00
Eric Christopher	9eff5178f1	Return false if we're not going to do anything. llvm-svn: 209455	2014-05-22 17:49:33 +00:00
Eric Christopher	65382d7316	Remove unused variable. llvm-svn: 209391	2014-05-22 05:33:03 +00:00
David Blaikie	8729bca333	DebugInfo: Simplify dead variable collection slightly. constructSubprogramDIE was already called for every subprogram in every CU when the module was started - there's no need to call it again at module finalization. llvm-svn: 209372	2014-05-22 00:48:36 +00:00
Eli Bendersky	f13a05607c	Similar to bitcast, treat addrspacecast as a foldable operand. Added a test sink-addrspacecast.ll to verify this change. Patch by Jingyue Wu. llvm-svn: 209343	2014-05-22 00:02:52 +00:00
Eric Christopher	3470bbbd54	Fix compilation issues. llvm-svn: 209342	2014-05-21 23:51:57 +00:00
Eric Christopher	6b0fcfee36	Make early if conversion dependent upon the subtarget and add a subtarget hook to enable. Unconditionally add to the pass pipeline for targets that might want to use it. No functional change. llvm-svn: 209340	2014-05-21 23:40:26 +00:00
David Blaikie	2da282b860	Revert "DebugInfo: Don't put fission type units in comdat sections." This reverts commit r208930, r208933, and r208975. It seems not all fission consumers are ready to handle this behavior. Reverting until tools are brought up to spec. llvm-svn: 209338	2014-05-21 23:27:41 +00:00
David Blaikie	1ea9db2dce	DebugInfo: Use the SPMap to find the parent CU of inlined functions as they may not be in the current CU Committed in r209178 then reverted in r209251 due to LTO breakage, here's a proper fix for the case of the missing subprogram DIE. The DIEs were there, just in other compile units. Using the SPMap we can find the right compile unit to search for and produce cross-unit references to describe this kind of inlining. One existing test case needed to be updated because it had a function that wasn't in the CU's subprogram list, so it didn't appear in the SPMap. llvm-svn: 209335	2014-05-21 23:14:12 +00:00
David Blaikie	825bdd2fc6	DebugInfo: Ensure concrete out of line variables from inlined functions reference their abstract origins. llvm-svn: 209327	2014-05-21 22:41:17 +00:00
David Blaikie	ce7a1bd038	DebugInfo: Simplify subprogram declaration creation/references and accidentally refix PR11300. Also simplifies the linkage name handling a little too. llvm-svn: 209311	2014-05-21 18:04:33 +00:00
Richard Smith	56f9c191e1	[modules] Add module maps for LLVM. These are not quite ready for prime-time yet, but only a few more Clang patches need to land. (I have 'ninja check' passing locally.) llvm-svn: 209269	2014-05-21 02:46:14 +00:00
Eric Christopher	eb71972887	Move the verbose asm option to be part of the options struct and set appropriately. llvm-svn: 209258	2014-05-20 23:59:50 +00:00
David Blaikie	374af662e9	Revert "DebugInfo: Assume all subprogram DIEs have been created before any abstract subprograms are constructed." This reverts commit r209178. This seems to be asserting in an LTO build on some internal Apple buildbots. No upstream reproduction (and I don't have an LLVM-aware gold built right now to reproduce it personally) but it's a small patch & the failure's semi-plausible so I'm going to revert first while I try to reproduce this. llvm-svn: 209251	2014-05-20 22:33:09 +00:00
David Blaikie	93ef46b02a	Unbreak the sanitizer buildbots after r209226 due to SROA issue described in http://reviews.llvm.org/D3714 Undecided whether this should include a test case - SROA produces bad dbg.value metadata describing a value for a reference that is actually the value of the thing the reference refers to. For now, loosening the assert lets this not assert, but it's still bogus/wrong output... If someone wants to tell me to add a test, I'm willing/able, just undecided. Hopefully we'll get SROA fixed soon & we can tighten up this assertion again. llvm-svn: 209240	2014-05-20 21:40:13 +00:00
David Blaikie	1d9aec67b0	Fix test breakage introduced in r209223. Oops, broke the broken enum constants again. llvm-svn: 209226	2014-05-20 18:36:35 +00:00
Alexey Samsonov	dfcaf9c8d8	Rewrite calculateDbgValueHistory to make it (hopefully) more transparent. This change preserves the original algorithm of generating history for user variables, but makes it more clear. High-level description of algorithm: Scan all the machine basic blocks and machine instructions in the order they are emitted to the object file. Do the following: 1) If we see a DBG_VALUE instruction, add it to the history of the corresponding user variable. Keep track of all user variables, whose locations are described by a register. 2) If we see a regular instruction, look at all the registers it clobbers, and terminate the location range for all variables described by these registers. 3) At the end of the basic block, terminate location ranges for all user variables described by some register. Although this change shouldn't be user-visible (the contents of .debug_loc section should be the same), it changes some internal assumptions about the set of instructions used to track the variable locations. Watching the bots. llvm-svn: 209225	2014-05-20 18:34:54 +00:00
David Blaikie	2af1c805b4	PR19767: DebugInfo emission of pointer constants. In refactoring DwarfUnit::isUnsignedDIType I restricted it to only work on values with signedness (unsigned or signed), asserting on anything else (which did uncover some bugs). But it turns out that we do need to emit constants of signless data, such as pointer constants - only null pointer constants are known to need this so far, but it's conceivable that there might be non-null pointer constants at some point (hardcoded address offsets for device drivers?). This patch just uses 'unsigned' for signless data such as pointer constants. Arguably we could use signless representations (DW_FORM_dataN) instead, allowing a trinary result from isUnsignedDIType (signed, unsigned, signless), but this seems reasonable for now. llvm-svn: 209223	2014-05-20 18:21:51 +00:00
Eric Christopher	650c8f2a06	Clean up language and grammar. Based on a patch by jfcaron3@gmail.com! PR19806 llvm-svn: 209216	2014-05-20 17:11:11 +00:00
Benjamin Kramer	7bd6bee385	Legalizer: Make bswap promotion safe for vectors. llvm-svn: 209202	2014-05-20 09:42:31 +00:00
David Blaikie	8e1d489351	DebugInfo: Emit function definitions within their namespace scope. This workaround (presumably for ancient GDB) doesn't appear to be required (GDB 7.5 seems to tolerate function definition DIEs in namespace scope just fine). llvm-svn: 209189	2014-05-20 03:23:24 +00:00
David Blaikie	424b59b1ce	DebugInfo: Assume all subprogram DIEs have been created before any abstract subprograms are constructed. Since we visit the whole list of subprograms for each CU at module start, this is clearly true - don't test for the case, just assert it. A few old test cases seemed to have incomplete subprogram lists, but any attempt to reproduce them shows full subprogram lists that even include entities that have been completely inlined and the out of line definition removed. llvm-svn: 209178	2014-05-19 23:16:19 +00:00
David Blaikie	973141a035	DebugInfo: Don't include DW_AT_inline on each abstract definition multiple times. When I refactored this in r208636 I accidentally caused this to be added multiple times to each abstract subprogram (not accounting for the deduplicating effect of the InlinedSubprogramDIEs set). This got better in r208798 when the abstract definitions got the attribute added to them at construction time, but still had the redundant copies introduced in r208636. This commit removes those excess DW_AT_inlines and relies solely on the insertion in r208798. llvm-svn: 209166	2014-05-19 22:07:16 +00:00
David Blaikie	48b056bab0	DebugInfo: Fix missing inlined_subroutines caused by r208748. The check in DwarfDebug::constructScopeDIE was meant to consider inlined subroutines as any non-top-level scope that was a subprogram. Instead of checking "not top level scope" it was checking if the /subprogram's/ scope was non-top-level. Fix this and beef up a test case to demonstrate some of the missing inlined_subroutines are no longer missing. In the course of fixing this I also found that r208748 (with this fix) found one /extra/ inlined_subroutine in concrete_out_of_line.ll due to two inlined_subroutines having the same inlinedAt location. The previous implementation was collapsing these into a single inlined subroutine. I'm not sure what the original code was that created this .ll file so I'm not sure if this actually happens in practice today. Since we deliberately include column information to disambiguate two calls on the same line, that may've addressed this bug in the frontend, but it's good to know that workaround isn't necessary for this particular case anymore. llvm-svn: 209165	2014-05-19 21:54:31 +00:00
Eric Christopher	710c0ae7de	Fix typos. llvm-svn: 209164	2014-05-19 21:18:47 +00:00
Benjamin Kramer	f3ad23551d	SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not. - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though. - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal. - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled. llvm-svn: 209123	2014-05-19 13:12:38 +00:00
Saleem Abdulrasool	f3a5a5c546	Target: remove old constructors for CallLoweringInfo This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. llvm-svn: 209082	2014-05-17 21:50:17 +00:00
Saleem Abdulrasool	9f664c1083	Target: change member from reference to pointer This is a preliminary step to help ease the construction of CallLoweringInfo. Changing the construction to a chained function pattern requires that the parameter be nullable. However, rather than copying the vector, save a pointer rather than the reference to permit a late binding of the arguments. llvm-svn: 209080	2014-05-17 21:50:01 +00:00
Rafael Espindola	e0098928c9	Delete getAliasedGlobal. llvm-svn: 209040	2014-05-16 22:37:03 +00:00
David Blaikie	48369d1b8e	DebugInfo: Assert rather than conditionalizing when a CU's subprogram list contains declarations. llvm-svn: 209039	2014-05-16 22:21:45 +00:00
David Blaikie	c405c9cb0b	DebugInfo: Handle emitting constants of C++ unicode character type. Patch by Stephan Tolksdorf! (with some test case stuff by me) Differential Revision: http://reviews.llvm.org/D3810 llvm-svn: 209037	2014-05-16 21:53:09 +00:00
Reid Kleckner	fceb76f5f9	Add comdat key field to llvm.global_ctors and llvm.global_dtors This allows us to put dynamic initializers for weak data into the same comdat group as the data being initialized. This is necessary for MSVC ABI compatibility. Once we have comdats for guard variables, we can use the combination to help GlobalOpt fire more often for weak data with guarded initialization on other platforms. Reviewers: nlewycky Differential Revision: http://reviews.llvm.org/D3499 llvm-svn: 209015	2014-05-16 20:39:27 +00:00
David Blaikie	46d0ca5b40	DebugInfo: Add an assert regarding the subprogram in the subprogram map matching the abstract subprogram. I'm not sure this is how it'll be going forward (I'd rather prefer the definition to be in the main SP mapping, for various reasons) but this helps me understand how it is today. llvm-svn: 209009	2014-05-16 19:42:10 +00:00
David Blaikie	825f487b68	DebugInfo: Assume the CU's Subprogram list only contains definitions. DIBuilder maintains this invariant and the current DwarfDebug code could end up doing weird things if it contained declarations (such as putting the definition DIE inside a CU that contained the declaration - this doesn't seem like a good idea, so rather than adding logic to handle this case we'll just ban it for now & cross that bridge if we come to it later). llvm-svn: 209004	2014-05-16 18:26:53 +00:00
David Blaikie	4a3b84d2f5	DwarfDebug: Refactor AT_ranges/AT_high_pc+AT_low_pc emission into helper function. llvm-svn: 208997	2014-05-16 16:42:40 +00:00
Rafael Espindola	5a52b9f139	Revert "Implement global merge optimization for global variables." This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978	2014-05-16 13:02:18 +00:00
Eric Christopher	c21d3d5f90	Remove the Options query functions and just access our Options directly. llvm-svn: 208937	2014-05-16 00:32:52 +00:00
Jiangning Liu	932e1c3924	Implement global merge optimization for global variables. This commit implements two command line switches -global-merge-on-external and -global-merge-aligned, and both of them are false by default, so this optimization is disabled by default for all targets. For ARM64, some back-end behaviors need to be tuned to get this optimization further enabled. llvm-svn: 208934	2014-05-15 23:45:42 +00:00
David Blaikie	962c9a2d54	DebugInfo: Follow up to r208930, comment usage of 'using' to bring in base class overload. Code review feedback from Eric Christopher. llvm-svn: 208933	2014-05-15 23:29:53 +00:00
Eric Christopher	5d376066df	Move more MC options into the MCTargetOptions structure. No functional change. llvm-svn: 208932	2014-05-15 23:27:49 +00:00
David Blaikie	bc094f387b	DebugInfo: Don't put fission type units in comdat sections. Since type units in the dwo file are handled by a debug aware tool, they don't need to leverage the ELF comdat grouping to implement deduplication. Avoid creating all the .group sections for these as a space optimization. llvm-svn: 208930	2014-05-15 23:18:15 +00:00
David Blaikie	4c6d987b06	DebugInfo: Simplify retrieving filename/directory name for line table entry building. llvm-svn: 208911	2014-05-15 20:18:50 +00:00
Jay Foad	5a29c367f7	Instead of littering asserts throughout the code after every call to computeKnownBits, consolidate them into one assert at the end of computeKnownBits itself. llvm-svn: 208876	2014-05-15 12:12:55 +00:00
Alp Toker	beaca19c7c	Fix typos llvm-svn: 208839	2014-05-15 01:52:21 +00:00
David Blaikie	91e8104622	DwarfDebug: Don't set frame index locations on abstract variables. Abstract variables should never have/use locations. In this case the data wasn't used, so no functional change intended here, just simplification. llvm-svn: 208820	2014-05-14 22:51:59 +00:00
David Blaikie	9ba7254688	DebugInfo: Shore up subprogram variable list handling with more assertions and fewer conditionals. Many old tests using prior schemas still had some brokenness here (both indirect arrays and arrays with single bogus elements). Fixed those up so they don't hit the new assertions. Also reduced nesting in some places, etc. llvm-svn: 208817	2014-05-14 21:52:46 +00:00
David Blaikie	7af6e6f267	DebugInfo: Assert that a CU's subprogram list contains only subprograms. llvm-svn: 208816	2014-05-14 21:52:37 +00:00
Jay Foad	a0653a3e6c	Rename ComputeMaskedBits to computeKnownBits. "Masked" has been inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811	2014-05-14 21:14:37 +00:00
David Blaikie	f662f0a65e	DebugInfo: Do not delay attaching DW_AT_inline attribute to abstract definitions. This is just unnecessary - we only create abstract definitions when we're inlining anyway, so there's no reason to delay this to see if we're going to inline anything. llvm-svn: 208798	2014-05-14 17:58:53 +00:00
Logan Chien	95188b9092	Fix ARM EHABI when function has landingpad and nounwind. If the function has the landingpad instruction, then the handlerdata should be emitted even if the function has the nounwind attribute. Otherwise, the following code will not work: void test1() noexcept { try { throw_exception(); } catch (...) { log_unexpected_exception(); } } Since the cantunwind was incorrectly emitted and the LSDA is not available. llvm-svn: 208791	2014-05-14 16:38:30 +00:00
Jay Foad	e48d9e8efe	Update the comments for ComputeMaskedBits, which lost its Mask parameter in r154011. llvm-svn: 208757	2014-05-14 08:00:07 +00:00
David Blaikie	9b8c8cda0d	Recommit r208506: DebugInfo: Include lexical scopes in inlined subroutines. This was reverted in r208642 due to regressions surrounding file changes within lexical scopes causing inlining information to be lost. The issue was in LexicalScopes::getOrCreateInlinedScope, where I was previously testing "isLexicalBlock" which is false for "DILexicalBlockFile" (a scope used to represent changes in the current file name) and assuming it was then a function (breaking out of the inlined scope path and reaching for the parent non-inlined scopes). By inverting the condition and testing for "isSubprogram" the correct behavior is attained. (also found some weirdness in Clang, see r208742 when reducing this test case - the resulting test case doesn't apply with the Clang fix, but I've added a more realistic test case to inline-scopes.ll which does reproduce the issue and demonstrate the fix) llvm-svn: 208748	2014-05-14 01:08:28 +00:00
Louis Gerbarg	1b91aa2cf5	Add missing line breaks to debug output in CodeGenPrepare llvm-svn: 208731	2014-05-13 21:54:22 +00:00
Rafael Espindola	99e05cf163	Split GlobalValue into GlobalValue and GlobalObject. This allows code to statically accept a Function or a GlobalVariable, but not an alias. This is already a cleanup by itself IMHO, but the main reason for it is that it gives a lot more confidence that the refactoring to fix the design of GlobalAlias is correct. That will be a followup patch. llvm-svn: 208716	2014-05-13 18:45:48 +00:00
Joey Gouly	12a8bf09d0	[CGP] r205941 changed the logic, so that a cast happens before 'Result' is compared to 'AddrMode.BaseReg'. In the case that 'AddrMode.BaseReg' is nullptr, 'Result' will also be nullptr, so the cast causes an assertion. We should use dyn_cast_or_null here to check 'Result' is not null and it is an instruction. Bug found by Mats Petersson, and I reduced his IR to get a test case. llvm-svn: 208705	2014-05-13 15:42:45 +00:00
David Blaikie	290e22872d	Revert "DebugInfo: Include lexical scopes in inlined subroutines." This reverts commit r208506. Some inlined subroutine scopes appear to be missing with this change. Reverting while I investigate. llvm-svn: 208642	2014-05-12 23:53:03 +00:00
Pete Cooper	7fd1d725b9	Use a logical not when inverting SetCC. This unfortunately doesn't fire on any targets so I couldn't find a test case to trigger it. The problem occurs when a non-i1 setcc is inverted. For example 'i8 = setcc' will get 'xor 0xff' to invert this. This is clearly wrong when the boolean contents are ZeroOrOne. This patch introduces getLogicalNOT and updates SetCC legalisation to use it. Reviewed by Hal Finkel. llvm-svn: 208641	2014-05-12 23:26:58 +00:00
Adam Nemet	5d78558c2b	[DAGCombiner] Split up an indexed load if only the base pointer value is live Right now the load may not get DCE'd because of the side-effect of updating the base pointer. This can happen if we lower a read-modify-write of an illegal larger type (e.g. i48) such that the modification only affects one of the subparts (the lower i32 part but not the higher i16 part). See the testcase. In order to spot the dead load we need to revisit it when SimplifyDemandedBits decided that the value of the load is masked off. This is the CommitTargetLoweringOpt piece. I checked compile time with ARM64 by sending SPEC bitcode files through llc. No measurable change. Fixes <rdar://problem/16031651> llvm-svn: 208640	2014-05-12 23:00:03 +00:00
David Blaikie	525358db2c	DebugInfo: Attach DW_AT_inline to inlined subprograms at DIE-construction time rather than as a post-processing step. llvm-svn: 208636	2014-05-12 21:50:44 +00:00
David Blaikie	4abe19edad	DwarfDebug: Avoid an extra map lookup while constructing abstract scope DIEs and reduce nesting/conditionals. One test case had to be updated as it still had the extra indirection for the variable list - removing the extra indirection got it back to passing. llvm-svn: 208608	2014-05-12 18:23:35 +00:00
Matt Arsenault	2adca6090f	Make SimplifyDemandedBits understand BUILD_PAIR llvm-svn: 208598	2014-05-12 17:14:48 +00:00
Saleem Abdulrasool	fba09d47e9	CodeGen: add parenthesis around complex expression Add missing parenthesis suggested by GCC. NFC. llvm-svn: 208519	2014-05-12 06:08:18 +00:00
Hal Finkel	f0e086a0bc	Pass the value type to TLI::getRegisterByName We must validate the value type in TLI::getRegisterByName, because if we don't and the wrong type was used with the IR intrinsic, then we'll assert (because we won't be able to find a valid register class with which to construct the requested copy operation). For PPC64, additionally, the type information is necessary to decide between the 64-bit register and the 32-bit subregister. No functionality change. llvm-svn: 208508	2014-05-11 19:29:07 +00:00
David Blaikie	9576766be9	DebugInfo: Include lexical scopes in inlined subroutines. llvm-svn: 208506	2014-05-11 18:12:17 +00:00
David Blaikie	e0f14743c0	DwarfUnit: Make explicit a limitation/bug in enumeration constant emission. Filed as PR19712, LLVM fails to detect the right type of an enum constant when a frontend does not provide an underlying type for the enumeration type. llvm-svn: 208502	2014-05-11 17:04:05 +00:00
David Blaikie	60cae1ba49	DwarfUnit: Pick a winner between isTypeSigned and isUnsignedDIType. And the winner by a nose is isUnsignedDIType, for no particular reason. These two functions were just complements of each other and used in very related code, so refactor callers to just use one of them. llvm-svn: 208500	2014-05-11 16:08:41 +00:00
David Blaikie	c0a2841e2f	DwarfUnit: Factor out calling isUnsignedDIType into a utility function so each caller of emitConstantValue doesn't have to call it separately. llvm-svn: 208496	2014-05-11 15:56:59 +00:00
David Blaikie	c05c8f483b	DwarfUnit: Share common constant value emission between APInts of small (<= 64 bit) and MCOperand immediates. Doesn't seem a good reason to duplicate this code (it was more literally duplicated prior to r208494, and while the dataN code /does/ actually fire in this case, it doesn't seem necessary (and the DWARF standard recommends using udata/sdata pervasively instead of dataN, so as to indicate signedness of the values)) llvm-svn: 208495	2014-05-11 15:47:39 +00:00
David Blaikie	958647c36d	DebugInfo: Simplify constant value emission. This code looks to have become dead at some time in the past. I tried to reproduce cases where LLVM would emit constants with dataN, but could not. Upon inspection it seems the code doesn't do that anymore - the only time a size is provided by isTypeSigned is when the type is signed, and in those cases we use sdata. dataN is only used for unsigned types and isTypeSigned doesn't provide a value for sizeInBits in that case. Remove the dead cases/size plumbing. llvm-svn: 208494	2014-05-11 15:06:20 +00:00
Oliver Stannard	c24f2171ca	ARM: HFAs must be passed in consecutive registers When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must be passed in a block of consecutive floating-point registers, or on the stack. This means that unused floating-point registers cannot be back-filled with part of an HFA, however this can currently happen. This patch, along with the corresponding clang patch (http://reviews.llvm.org/D3083) prevents this. llvm-svn: 208413	2014-05-09 14:01:47 +00:00
Quentin Colombet	2eb151e29f	[TargetInstrInfo] Fix the implementation of commuteInstruction to match the comment of the API. Relaxes the behavior of TargetInstrInfo::commuteInstruction when TargetInstrInfo::findCommutedOpIndices returns false. Previously TargetInstrInfo triggered a fatal error in such situation whereas based on the comment in the API it should just return nullptr. Indeed the only precondition that should be ensured is that the instruction must be commutable. llvm-svn: 208371	2014-05-08 23:12:27 +00:00
David Blaikie	2f143e0c30	Reapply r207876 (Try simplifying LexicalScopes ownership again) including a workaround for an MSVC2012 bug regarding forward_as_tuple (r207876 was reverted in r208131 after seeing some consistent buildbot failure for MSVC 2012. The original commits were in r207724-r207726) Takumi was nice enough to dig into this and locate this Microsoft Connect issue: http://connect.microsoft.com/VisualStudio/feedback/details/814899/forward-as-tuple-debug-implementation-error describing a bug in MSVC2012's forward_as_tuple implementation. Since the parameters in this instance are trivial/small, pass them by value (using make_tuple) instead of perfectly-forwarded tuple of rvalue references (involving the broken forward_as_tuple). Hopefully this will satisfy MSVC2012. llvm-svn: 208364	2014-05-08 22:24:51 +00:00
Hal Finkel	e8172d85f9	Fix a spelling error llvm-svn: 208314	2014-05-08 13:42:57 +00:00
Hal Finkel	6532c20faa	Move late partial-unrolling thresholds into the processor definitions The old method used by X86TTI to determine partial-unrolling thresholds was messy (because it worked by testing target features), and also would not correctly identify the target CPU if certain target features were disabled. After some discussions on IRC with Chandler et al., it was decided that the processor scheduling models were the right containers for this information (because it is often tied to special uop dispatch-buffer sizes). This does represent a small functionality change: - For generic x86-64 (which uses the SB model and, thus, will get some unrolling). - For AMD cores (because they still currently use the SB scheduling model) - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump the default threshold to 50; we're working on a test case for this). Otherwise, nothing has changed for any other targets. The logic, however, has been moved into BasicTTI, so other targets may now also opt-in to this functionality simply by setting LoopMicroOpBufferSize in their processor model definitions. llvm-svn: 208289	2014-05-08 09:14:44 +00:00
Matt Arsenault	5f2fd4b22a	Fix using wrong result type for setcc. When reducing the bitwidth of a comparison against a constant, the original setcc's result type was used, which was incorrect. No test since I don't think any other in tree targets change the bitwidth of the setcc type depending on the bitwidth of the compared type. llvm-svn: 208236	2014-05-07 18:26:58 +00:00
Rafael Espindola	566fcfe69b	Remove the UseCFI option from createAsmStreamer. We were already always passing true, this just removes the option. llvm-svn: 208205	2014-05-07 13:00:43 +00:00
Zinovy Nis	da925c0d7c	[BUG][REFACTOR] 1) Fix for printing debug locations for absolute paths. 2) Location printing is moved into public method DebugLoc::print() to avoid re-inventing the wheel. Differential Revision: http://reviews.llvm.org/D3513 llvm-svn: 208177	2014-05-07 09:51:22 +00:00
David Blaikie	9dabbf6228	Revert "Try simplifying LexicalScopes ownership again." Speculatively reverting due to a suspicious failure on a Windows buildbot. This reverts commit 10c37a012ea11596d44cd9059fe09c959caf30c8. llvm-svn: 208131	2014-05-06 21:07:17 +00:00
Benjamin Kramer	1625bfccbe	TTI: Estimate @llvm.fmuladd cost as fmul + fadd when FMA's aren't legal on the target. llvm-svn: 208115	2014-05-06 18:36:23 +00:00
Renato Golin	c7aea40ec6	Implementing named register intrinsics This patch implements the infrastructure to use named register constructs in programs that need access to specific registers (bare metal, kernels, etc). So far, only the stack pointer is supported as a technology preview, but as it is, the intrinsic can already support all non-allocatable registers from any architecture. llvm-svn: 208104	2014-05-06 16:51:25 +00:00
David Blaikie	658a20b04d	Try simplifying LexicalScopes ownership again. Committed initially in r207724-r207726 and reverted due to compiler-rt crashes in r207732. Instead, fix this harder with unordered_map and store the LexicalScopes by value in the map. This did necessitate moving the definition of LexicalScope above the definition of LexicalScopes. Let's see how the buildbots/compilers tolerate unordered_map::emplace + std::piecewise_construct + std::forward_as_tuple... llvm-svn: 207876	2014-05-02 22:21:05 +00:00
Benjamin Kramer	6dd9f8feb3	Satisfy GCC's urgent need for parentheses around ‘&&’ within ‘||’. llvm-svn: 207871	2014-05-02 21:28:49 +00:00
Tim Northover	820e041a3c	DAGCombine: prevent formation of illegal ConstantFP nodes. llvm-svn: 207850	2014-05-02 17:25:02 +00:00
Benjamin Kramer	42d262f410	Allow SelectionDAG::FoldConstantArithmetic to work when it's called with a vector VT but scalar values. llvm-svn: 207835	2014-05-02 12:35:22 +00:00
Juergen Ributzka	37fc0a8ae8	[Stackmaps] Pacify windows buildbot. llvm-svn: 207807	2014-05-01 22:39:26 +00:00
Juergen Ributzka	673a762b80	[Stackmaps] Add command line option to specify the stackmap version. llvm-svn: 207805	2014-05-01 22:21:30 +00:00
Juergen Ributzka	6340195abd	[Stackmaps] Refactor serialization code. No functional change intended. llvm-svn: 207804	2014-05-01 22:21:27 +00:00
Juergen Ributzka	f01e809383	[Stackmaps] Replace the custom ConstantPool class with a MapVector. llvm-svn: 207803	2014-05-01 22:21:24 +00:00
Richard Smith	d730500706	Speculatively roll back r207724-r207726, which are code cleanup changes and appear to be breaking a bootstrapped build of compiler-rt. llvm-svn: 207732	2014-05-01 00:46:58 +00:00
David Blaikie	6b71cc7bac	LexicalScopes: Use unique_ptr to manage ownership of abstract LexicalScopes. llvm-svn: 207726	2014-04-30 23:46:27 +00:00
David Blaikie	998dedac98	Forgotten reformatting. llvm-svn: 207725	2014-04-30 23:42:04 +00:00
David Blaikie	b36914421b	LexicalScopes: use unique_ptr to own LexicalScope objects. Ownership of abstract scopes coming soon. llvm-svn: 207724	2014-04-30 23:40:59 +00:00
Alexey Samsonov	0436caa936	Use a single data structure to store all user variables in DwarfDebug Summary: Get rid of UserVariables set, and turn DbgValues into MapVector to get a fixed ordering, as suggested in review for http://reviews.llvm.org/D3573. Test Plan: llvm regression tests Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3579 llvm-svn: 207720	2014-04-30 23:02:40 +00:00
David Blaikie	899ae61fee	Revert "Emit DW_AT_object_pointer once, on the declaration, for each function." Breaks GDB buildbot (http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/14517) GCC emits DW_AT_object_pointer /everywhere/ (declaration, abstract definition, inlined subroutine), but it looks like GCC relies on it being somewhere other than the declaration, at least. I'll experiment further & can hopefully still remove it from the inlined_subroutine. This reverts commit r207705. llvm-svn: 207719	2014-04-30 22:58:19 +00:00
Joerg Sonnenberger	3c10817b92	Prepare support of Itanium ABI on ARM as opposed to EHABI by conditionally emitting .fnstart and friends only for EHABI. llvm-svn: 207718	2014-04-30 22:43:13 +00:00
David Blaikie	44078b3260	DebugInfo: Omit DW_AT_artificial on DW_TAG_formal_parameters in DW_TAG_inlined_subroutines. They just don't need to be there - they're inherited from the abstract definition. In theory I would like them to be inherited from the declaration, but the DWARF standard doesn't quite say that... we can probably do it anyway but I'm less confident about that so I'll leave it for a separate commit. llvm-svn: 207717	2014-04-30 22:41:33 +00:00
Alexey Samsonov	f74bde6735	Convert more loops to range-based equivalents llvm-svn: 207714	2014-04-30 22:17:38 +00:00
Alexey Samsonov	c74503ea21	Slightly simplify code in DwarfDebug::beginFunction llvm-svn: 207710	2014-04-30 21:44:17 +00:00
Alexey Samsonov	414b6fb170	Move logic for calculating DBG_VALUE history map into separate file/class. Summary: No functionality change. Test Plan: llvm regression test suite. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: echristo, llvm-commits Differential Revision: http://reviews.llvm.org/D3573 llvm-svn: 207708	2014-04-30 21:34:11 +00:00
David Blaikie	3b2a53a437	Emit DW_AT_object_pointer once, on the declaration, for each function. This effectively reverts r164326, but adds some comments and justification and ensures we /don't/ emit the DW_AT_object_pointer on the (abstract and concrete) definitions. (while still preserving it on standalone definitions involving ObjC Blocks) This does increase the size of member function declarations from 7 to 11 bytes, unfortunately, but still seems like the Right Thing to do so that callers that see only the declaration still have the information about the object pointer. That said, I don't know what, if any, DWARF consumers don't have a heuristic to guess this in the case of normal C++ member functions - perhaps we can remove it entirely. llvm-svn: 207705	2014-04-30 21:29:41 +00:00
Weiming Zhao	7f6daf1799	[ARM64] Prevent bit extraction to be adjusted by following shift For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx more difficult. For example: Given %shr = lshr i64 %x, 4 %and = and i64 %shr, 15 %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, %i64 2, i64 %and %0 = load i64* %arrayidx With current shift folding, it takes 3 instrs to compute base address: lsr x8, x0, #1 and x8, x8, #0x78 add x8, x9, x8 If using ubfx, it only needs 2 instrs: ubfx x8, x0, #4, #4 add x8, x9, x8, lsl #3 This fixes bug 19589 llvm-svn: 207702	2014-04-30 21:07:24 +00:00
Reid Kleckner	dd2647edcf	Fix the clang-cl self-host build by defining ~DwarfDebug out of line DwarfDebug.h has a SmallVector member containing a unique_ptr of an incomplete type. MSVC doesn't have key functions, so the vtable and dtor are emitted in AsmPrinter.cpp, where DwarfDebug's ctor is called. AsmPrinter.cpp include DwarfUnit.h and doesn't get a complete definition of DwarfTypeUnit. We could fix the problem by including DwarfUnit.h in DwarfDebug.h, but that would increase header bloat. Instead, define ~DwarfDebug out of line. llvm-svn: 207701	2014-04-30 20:34:31 +00:00
Alexey Samsonov	41b977dffd	Convert several loops over MachineFunction basic blocks to range-based loops llvm-svn: 207683	2014-04-30 18:29:51 +00:00
Craig Topper	2d2aa0ca1f	Use makeArrayRef instead of calling ArrayRef<T> constructor directly. I introduced most of these recently. llvm-svn: 207616	2014-04-30 07:17:30 +00:00
David Blaikie	4c1089d0f3	Fix some 80 cols violations committed in r207539 Caught by Eric Christopher in post-commit review. llvm-svn: 207595	2014-04-29 23:43:06 +00:00
Benjamin Kramer	d59664f4f7	raw_ostream: Forward declare OpenFlags and include FileSystem.h only where necessary. llvm-svn: 207593	2014-04-29 23:26:49 +00:00
Jim Grosbach	2eb60fdc85	Tidy up whitespace. llvm-svn: 207583	2014-04-29 22:41:50 +00:00
David Blaikie	e872a6eb91	DwarfDebug: Split the initialization of abstract and non-abstract subprogram DIEs. These were called from distinct places and had significant distinct behavior. No need to make that a dynamic check inside the function rather than just having two functions (refactoring some common code into a helper function to be called from the two separate functions). llvm-svn: 207539	2014-04-29 15:58:35 +00:00
Craig Topper	9d74a5a5f1	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. llvm-svn: 207511	2014-04-29 07:58:41 +00:00
David Blaikie	6ada8e332b	Remove DwarfUnit::LabelRange since it's unused. Seems at some point the intent was to emit fission ranges_base as unique per CU but the code today emits ranges_base as the start of the ranges section for all CUs being compiled and all the ranges_base relative addresses are relative to that. So removing this dead code and leaving the status quo until there's a reason to change it (perhaps something's faster if it has distinct ranges for each CU). llvm-svn: 207464	2014-04-28 23:36:52 +00:00
David Blaikie	b2133cb88d	AddressPool::HasBeenUsed: Add comment explaining the use-case for this flag. Based on code review by Eric Christopher on r207323 llvm-svn: 207460	2014-04-28 22:52:50 +00:00
David Blaikie	46f8201187	DIE: Document some learnings about why the world isn't perfect. llvm-svn: 207458	2014-04-28 22:41:39 +00:00
David Blaikie	d67ffe8b73	Satisfy sub-optimal GCC warning. (Clang doesn't warn here because it knows the string is benign - the assert still checks what it's intended to - though putting the correct parens does make clang-format format the code a little better) llvm-svn: 207456	2014-04-28 22:27:26 +00:00
Eric Christopher	83dd2fad2a	We already calculate WideVT above, just reuse it. Patch by Jan Vesely <jan.vesely@rutgers.edu>. llvm-svn: 207455	2014-04-28 22:24:57 +00:00
Eli Bendersky	6ae9883eeb	Add (...) around && clause to appease gcc 4.8's warning llvm-svn: 207452	2014-04-28 22:19:12 +00:00
David Blaikie	bd57905321	DebugInfo: Just store the DIE by value in the DwarfUnit Since all 4 ctor calls in DwarfDebug just pass in a trivially constructed DIE with the right tag type, sink the tag selection down into the Dwarf*Unit ctors (removing the argument entirely from callers in DwarfDebug) and initialize the DIE member in DwarfUnit. llvm-svn: 207448	2014-04-28 21:14:27 +00:00
David Blaikie	92a2f8a836	Pass DIEs to DwarfUnit constructors by unique_ptr. llvm-svn: 207447	2014-04-28 21:04:29 +00:00
Eric Christopher	793c7479b5	Reformat, 80-col, tab characters, etc. llvm-svn: 207444	2014-04-28 20:42:22 +00:00
David Blaikie	f244922f43	Improve explicit memory ownership of DIEs Now that the subtle constructScopeDIE has been refactored into two functions - one returning memory to take ownership of, one returning a pointer to already owning memory - push unique_ptr through more APIs. I think this completes most of the unique_ptr ownership of DIEs. llvm-svn: 207442	2014-04-28 20:36:45 +00:00
David Blaikie	d8f0ac7b4a	DwarfDebug: Omit DW_AT_object_pointer on inlined_subroutines While refactoring out constructScopeDIE into two functions I realized we were emitting DW_AT_object_pointer in the inlined subroutine when we didn't need to (GCC doesn't, and the abstract subprogram definition has the information already). So here's the refactoring and the bug fix. This is one step of refactoring to remove some subtle memory ownership semantics. It turns out the original constructScopeDIE returned ownership in its return value in some cases and not in others. The split into two functions now separates those two semantics - further cleanup (unique_ptr, etc) will follow. llvm-svn: 207441	2014-04-28 20:27:02 +00:00
Craig Topper	8c0b4d0791	Convert more SelectionDAG functions to use ArrayRef. llvm-svn: 207397	2014-04-28 05:57:50 +00:00
Craig Topper	e73658ddbb	[C++] Use 'nullptr'. llvm-svn: 207394	2014-04-28 04:05:08 +00:00
Craig Topper	633d99b62d	Convert AddNodeIDNode and SelectionDAG::getNodeIfExists to use ArrayRef<SDValue> llvm-svn: 207383	2014-04-27 23:22:43 +00:00
Craig Topper	b2ba83cd30	Convert SelectionDAGISel::MorphNode to use ArrayRef. llvm-svn: 207379	2014-04-27 19:21:20 +00:00
Craig Topper	131de82adb	Convert SelectionDAG::MorphNodeTo to use ArrayRef. llvm-svn: 207378	2014-04-27 19:21:16 +00:00
Craig Topper	481fb2879f	Convert SelectionDAG::SelectNodeTo to use ArrayRef. llvm-svn: 207377	2014-04-27 19:21:11 +00:00
Craig Topper	dd5e16dd34	Convert one last signature of getNode to take an ArrayRef of SDUse. llvm-svn: 207376	2014-04-27 19:21:06 +00:00
Craig Topper	bb5330725e	Convert SDNode constructor to use ArrayRef. llvm-svn: 207375	2014-04-27 19:21:02 +00:00
Craig Topper	64941d9786	Convert SelectionDAG::getMergeValues to use ArrayRef. llvm-svn: 207374	2014-04-27 19:20:57 +00:00
Craig Topper	2d7d6052c6	Const-correct SelectionDAG::getAtomic. llvm-svn: 207373	2014-04-27 19:20:47 +00:00
Adrian Prantl	42a0d8c6ef	Clarify the doxygen comment for AsmPrinter::EmitDwarfRegOpPiece and add default arguments to the function. No functional change. llvm-svn: 207372	2014-04-27 18:50:45 +00:00
Benjamin Kramer	ce4b3fee72	X86TTI: Adjust sdiv cost now that we can lower it on plain SSE2. Includes a fix for a horrible typo that caused all SDIV costs to be slightly off :) llvm-svn: 207371	2014-04-27 18:47:54 +00:00
Adrian Prantl	d34db65c84	Debug info: Refactor EmitDwarfRegOpPiece to be a member function of AsmPrinter. No functional change. http://reviews.llvm.org/D3373 rdar://problem/15928306 llvm-svn: 207369	2014-04-27 18:25:45 +00:00
Adrian Prantl	e19e5efe5a	Debug Info: Prepare DebugLocEntry to handle more than a single value per entry. This is in preparation for generic DW_OP_piece support. No functional change so far. http://reviews.llvm.org/D3373 rdar://problem/15928306 llvm-svn: 207368	2014-04-27 18:25:40 +00:00
Benjamin Kramer	322053caa7	Make helper functions static. llvm-svn: 207359	2014-04-27 14:54:59 +00:00
David Blaikie	6afb267fb5	Remove redundant explicit default initialization of non-trivially constructed member. llvm-svn: 207357	2014-04-27 14:47:23 +00:00
NAKAMURA Takumi	4beba42e1e	Add the default constructor DwarfAccelTable::DataArray() to initialize (MCSymbol*)StrSym explicitly. It will fix crash in codegen on msvc x64. llvm-svn: 207356	2014-04-27 11:59:44 +00:00
Benjamin Kramer	6bca8ef667	SelectionDAG: Aggressively fold shuffles of constant splats. llvm-svn: 207352	2014-04-27 11:41:06 +00:00
Benjamin Kramer	da4841b3a9	DAGCombiner: Simplify code a bit, make more transforms work with vectors. llvm-svn: 207338	2014-04-26 23:09:49 +00:00
David Blaikie	45aa56b8ea	DwarfDebug: Roll argument into call. llvm-svn: 207334	2014-04-26 22:37:45 +00:00
David Blaikie	2b4669de8a	DebugInfo: Fix and test a regression caused by r207263 causing the DW_AT_object_pointer to go missing on blocks Noticed by inspection. Test coverage added. llvm-svn: 207333	2014-04-26 22:12:18 +00:00
Craig Topper	206fcd450a	Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size. llvm-svn: 207329	2014-04-26 19:29:41 +00:00
Craig Topper	48d114bed1	Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>. llvm-svn: 207327	2014-04-26 18:35:24 +00:00
Craig Topper	963c5d5ef8	Remove an unused version of getMemIntrinsicNode and getNode. Additionally, these were calling makeVTList with the pointers passed in, which were unlikely to belong to SelectionDAG and likely would have just been stack pointers. llvm-svn: 207326	2014-04-26 18:35:13 +00:00
David Blaikie	e12b49a6e8	DWARF Type Units: Avoid emitting type units under fission if the type requires an address. Since there's no way to ensure the type unit in the .dwo and the type unit skeleton in the .o are correlated, this cannot work. This implementation is a bit inefficient for a few reasons, called out in comments. llvm-svn: 207323	2014-04-26 17:27:38 +00:00
David Blaikie	f3de2ab46c	DwarfDebug: Minor refactoring around type unit construction Sinking addition of the declaration attribute down to where the signature is added. So that if the signature is not added neither is the declaration attribute (this will come in handy when aborting type unit construction to instead emit the type into the CU directly in some cases) Pull out type unit identifier hashing just to simplify the function a little, it'll be getting longer. llvm-svn: 207321	2014-04-26 16:26:41 +00:00
Benjamin Kramer	ad0168702a	Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner transform work on vectors. llvm-svn: 207316	2014-04-26 13:00:53 +00:00
Benjamin Kramer	4dae598bc8	DAGCombiner: Turn divs of vector splats into vectorized multiplications. Otherwise the legalizer would just scalarize everything. Support for mulhi in the targets isn't that great yet so on most targets we get exactly the same scalarized output. Add a test for x86 vector udiv. I had to disable the mulhi nodes on ARM because there aren't any patterns for it. As far as I know ARM has instructions for getting the high part of a multiply so this should be fixed. llvm-svn: 207315	2014-04-26 12:06:28 +00:00
Michael Zolotukhin	1a97a7bcbf	Revert r206749 till a final decision about the intrinsics is made. llvm-svn: 207313	2014-04-26 09:56:41 +00:00
Juergen Ributzka	a6bda8bae2	[DAG] During DAG legalization keep opaque constants even after expanding. The included test case would return incorrect results, because the expansion of a shift with a constant shift amount of 0 would generate undefined behavior. This is because ExpandShiftByConstant assumes that all shifts by constants with a value of 0 have already been optimized away. This doesn't happen for opaque constants and usually this isn't a problem, because opaque constants won't take this code path - they are not supposed to. In the case that the opaque constant has to be expanded by the legalizer, the legalizer would drop the opaque flag. In this case we hit the limitations of ExpandShiftByConstant and create incorrect code. This commit fixes the legalizer by not dropping the opaque flag when expanding opaque constants and adding an assertion to ExpandShiftByConstant to catch this unsupported case in the future. This fixes <rdar://problem/16718472> llvm-svn: 207304	2014-04-26 02:58:04 +00:00
Eric Christopher	ece0e90e33	Make sure that rangelists are also relative to the compile unit low_pc similar to location lists. Fixes PR19563 llvm-svn: 207283	2014-04-25 22:23:54 +00:00
David Blaikie	772ab8ae5a	DwarfAccelTable: Store the string symbol in the accelerator table to avoid duplicate lookup. This also avoids the need for subtly side-effecting calls to manifest strings in the string table at the point where items are added to the accelerator tables. llvm-svn: 207281	2014-04-25 22:21:35 +00:00
David Blaikie	daefdbf3ad	Encapsulate the DWARF string pool in a separate type. Pulls out some more code from some of the rather monolithic DWARF classes. Unlike the address table, the string table won't move up into DwarfDebug - each DWARF file has its own string table (but there can be only one address table). llvm-svn: 207277	2014-04-25 21:34:35 +00:00
Adrian Prantl	32da88923a	This reapplies r207235 with an additional bugfix caught by the msan buildbot - do not insert debug intrinsics before phi nodes. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca where they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207269	2014-04-25 20:49:25 +00:00
David Blaikie	0eb13ce85a	DwarfUnit: Remove unused function llvm-svn: 207264	2014-04-25 20:02:24 +00:00
David Blaikie	914046e1e7	DIE: Pass ownership of children via std::unique_ptr rather than raw pointer. This should reduce the chance of memory leaks like those fixed in r207240. There's still some unclear ownership of DIEs happening in DwarfDebug. Pushing unique_ptr and references through more APIs should help expose the cases where ownership is a bit fuzzy. llvm-svn: 207263	2014-04-25 20:00:34 +00:00
David Blaikie	8dbcc3fe32	DIEEntry: Refer to the specified DIE via reference rather than pointer. Makes some more cases (the unit tests, specifically), lexically compatible with a change to unique_ptr. llvm-svn: 207261	2014-04-25 19:33:43 +00:00
David Blaikie	b0b3fcf6d3	DwarfUnit: return by reference from createAndAddDIE Since this doesn't return ownership (the DIE has been added to the specified parent already) nor return null, just return by reference. llvm-svn: 207259	2014-04-25 18:52:29 +00:00
David Blaikie	adcde36ceb	Return DIE by reference instead of pointer from DwarfUnit::getUnitDie llvm-svn: 207255	2014-04-25 18:35:57 +00:00
David Blaikie	65a7466675	DwarfUnit: Suddenly, DIE references, everywhere. This'll make changing to unique_ptr ownership of DIEs easier since the usages will now have '*' on them making them textually compatible between unique_ptr and raw pointer. llvm-svn: 207253	2014-04-25 18:26:14 +00:00
Adrian Prantl	d2d9b76e48	Revert "This reapplies r207130 with an additional testcase+and a missing check for" This reverts commit 207235 to investigate msan buildbot breakage. llvm-svn: 207250	2014-04-25 18:18:09 +00:00
David Blaikie	e071fc8082	Refactor some common logic in DwarfUnit::constructVariableDIE and pass non-null DIE by reference to DbgVariable::setDIE llvm-svn: 207244	2014-04-25 17:32:19 +00:00
Adrian Prantl	f5834a4b49	This reapplies r207130 with an additional testcase+and a missing check for AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca where they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207235	2014-04-25 17:01:00 +00:00
David Blaikie	69d0cf06bc	Add missing cpp file header Code review feedback from Paul Robinson on r207022 llvm-svn: 207198	2014-04-25 06:22:32 +00:00
Adrian Prantl	6e5de2ea06	Revert "This reapplies r207130 with an additional testcase+and a missing check for" Typo in testcase. llvm-svn: 207166	2014-04-25 00:42:50 +00:00
Adrian Prantl	3512190ab3	This reapplies r207130 with an additional testcase+and a missing check for AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca where they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207165	2014-04-25 00:38:40 +00:00
Adrian Prantl	ff4282a204	Revert "Debug info for optimized code: Support variables that are on the stack and" This reverts commit 207130 for buildbot breakage. llvm-svn: 207162	2014-04-25 00:04:49 +00:00
Richard Smith	a4b7cfd64f	Remove C++11ism (specializing a template in a surrounding namespace) to appease the buildbots. llvm-svn: 207136	2014-04-24 18:49:15 +00:00
Richard Smith	0d9ec713e7	[modules] "Specialize" a function by actually specializing a function template rather than by adding an overload and hoping that it's declared before the code that calls it. (In a modules build, it isn't.) llvm-svn: 207133	2014-04-24 18:27:29 +00:00
Adrian Prantl	f4223918de	Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into indirect DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca where they are (see comment) Other - regenerated/updated instcombine-intrinsics testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207130	2014-04-24 17:41:45 +00:00
Craig Topper	353eda484c	[C++] Use 'nullptr'. llvm-svn: 207083	2014-04-24 06:44:33 +00:00
David Blaikie	31f2900ae6	Remove unused parameter llvm-svn: 207061	2014-04-24 01:25:10 +00:00
David Blaikie	18d337508c	Remove the intermediate AccelTypes maps in DWARF units. llvm-svn: 207060	2014-04-24 01:23:49 +00:00
David Blaikie	ecf0415245	Remove the intermediate AccelNamespace maps in DWARF units. llvm-svn: 207059	2014-04-24 01:02:42 +00:00
David Blaikie	0ee82b95cb	Remove the intermediate AccelObjC maps in DWARF units llvm-svn: 207057	2014-04-24 00:53:32 +00:00
David Blaikie	27931a41e4	And actually use the DwarfDebug::AccelNames to emit the names. Fix for r207049 which would've emitted no accelerated names at all... llvm-svn: 207051	2014-04-23 23:46:25 +00:00
David Blaikie	f2505d6995	More formatting... llvm-svn: 207050	2014-04-23 23:38:39 +00:00
David Blaikie	2406a0627c	Remove intermediate accelerator table for names. (similar changes coming for the other accelerator tables) llvm-svn: 207049	2014-04-23 23:37:35 +00:00
David Blaikie	2c0f4ef241	DwarfAccelTable: Remove trivial dtor and simplify construction with an array. llvm-svn: 207044	2014-04-23 23:03:45 +00:00
David Blaikie	d75fb28ae7	Move the AddressPool from DwarfFile to DwarfDebug. There's only ever one address pool, not one per DWARF output file, so let's just have one. (similar refactoring of the string pool to come soon) llvm-svn: 207026	2014-04-23 21:20:10 +00:00
David Blaikie	8fb87eee17	clang-format for my previous commit (I keep forgetting... ) llvm-svn: 207025	2014-04-23 21:20:07 +00:00
David Blaikie	e226b08ee9	Separate out the DWARF address pool into its own type/files. llvm-svn: 207022	2014-04-23 21:04:59 +00:00
David Blaikie	05e736fb8a	clang-format r207010 llvm-svn: 207016	2014-04-23 19:44:08 +00:00
David Blaikie	85f80d7122	Split out DwarfFile from DwarfDebug into its own .h/.cpp files. Some of these types (DwarfDebug in particular) are quite large to begin with (and I keep forgetting whether DwarfFile is in DwarfDebug or DwarfUnit... ) so having a few smaller files seems like goodness. llvm-svn: 207010	2014-04-23 18:54:00 +00:00
Evgeniy Stepanov	0a951b775e	Create MCTargetOptions. For now it contains a single flag, SanitizeAddress, which enables AddressSanitizer instrumentation of inline assembly. Patch by Yuri Gorshenin. llvm-svn: 206971	2014-04-23 11:16:03 +00:00
David Blaikie	637cac42ed	Requisite reformatting for previous commit. llvm-svn: 206927	2014-04-22 23:09:36 +00:00
David Blaikie	f9b6a558c8	Push memory ownership of DwarfUnits into clients of DwarfFile. This prompted me to push references through most of DwarfDebug. Sorry for the churn. Honestly it's a bit silly that we're passing around units all over the place like that anyway and I think it's mostly due to the DIE attribute adding utility functions being utilities in DwarfUnit. I should have another go at moving them out of DwarfUnit... llvm-svn: 206925	2014-04-22 22:39:41 +00:00
David Blaikie	c33b3cdb0c	Use std::unique_ptr to handle ownership of DwarfUnits in DwarfFile. So Chandler - how about those range algorithms? (would really love a dereferencing range adapter for this sort of stuff) llvm-svn: 206921	2014-04-22 21:27:37 +00:00
David Blaikie	5f1a001071	Simplify address pool index assignment. llvm-svn: 206905	2014-04-22 17:21:40 +00:00
Hao Liu	c636d15284	Fix an infinite loop bug in DAG Combine caused by repeatedly transferring between ANY_EXTEND and SIGN_EXTEND. llvm-svn: 206873	2014-04-22 09:57:06 +00:00
David Blaikie	afd2c6be0e	Revert "Use value semantics to manage DbgVariables rather than dynamic allocation/pointers." This reverts commit r206780. This commit was regressing gdb.opt/inline-locals.exp in the GDB 7.5 test suite. Reverting until I can fix the issue. llvm-svn: 206867	2014-04-22 05:41:06 +00:00
Chandler Carruth	1b9dde087e	[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837	2014-04-22 02:02:50 +00:00
Quentin Colombet	d4f44690ef	[CodeGenPrepare] Use APInt to check the value of the immediate in an 'and' while checking candidates for bit field extract. Otherwise the value may not fit in uint64_t and this will trigger an assertion. This fixes PR19503. llvm-svn: 206834	2014-04-22 01:20:34 +00:00
Chandler Carruth	e96dd8975f	[Modules] Make Support/Debug.h modular. This requires it to not change behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects. This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro after header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed: - Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape. - We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant. Where necessary to support headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough. The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward. llvm-svn: 206822	2014-04-21 22:55:11 +00:00
Yi Jiang	b23edebdd2	Set default value of HasExtractBitsInsn to false llvm-svn: 206803	2014-04-21 22:22:44 +00:00
Hal Finkel	bae796f0dc	Remove seemingly-unneeded artificial dependency The rationale for this artificial dependency seems to have been lost to the ravages of time, it is covered by no regression tests, and has no impact on test-suite performance numbers on either x86 or PPC. For the test suite, on both x86 and PPC, I ran the test suite 10 times (both as a baseline and with this change), and found no statistically-significant changes. For PPC, I used a P7 box. For x86, I used an Intel Xeon E5430. Both with -O3 -mcpu=native. This was discussed on-list back in January, but I've not had a chance to run the performance tests until today. llvm-svn: 206795	2014-04-21 21:30:25 +00:00
David Blaikie	2b1dfa7244	Use unique_ptr to handle ownership of UserValues in LiveDebugVariablesImpl llvm-svn: 206785	2014-04-21 20:37:07 +00:00
David Blaikie	422b93dcf1	Use unique_ptr to manage objects owned by the ScheduleDAGMI. llvm-svn: 206784	2014-04-21 20:32:32 +00:00
David Blaikie	b0b7b18e8c	Use value semantics to manage DbgVariables rather than dynamic allocation/pointers. Requires switching some vectors to lists to maintain pointer validity. These could be changed to forward_lists (singly linked) with a bit more work - I've left comments to that effect. llvm-svn: 206780	2014-04-21 20:13:09 +00:00
Chandler Carruth	6d23a7b600	[Modules] Sink the DEBUG_TYPE macro out of LegalizeTypes.h and into the various .cpp files. This macro is inherently non-modular, and it wasn't even needed in this header file. llvm-svn: 206775	2014-04-21 19:43:07 +00:00
Yi Jiang	d069f6393a	ARM64: Combine shifts and uses from different basic blocks into a bit-extract instruction llvm-svn: 206774	2014-04-21 19:34:27 +00:00
Matt Arsenault	443252c011	Fix unnecessary line break llvm-svn: 206772	2014-04-21 18:39:13 +00:00
Duncan P. N. Exon Smith	10be9a8868	Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commit r206707, reapplying r206704. The preceding commit to CalcSpillWeights should have sorted out the failing buildbots. <rdar://problem/14292693> llvm-svn: 206766	2014-04-21 17:57:07 +00:00
Duncan P. N. Exon Smith	7af3432e22	CalcSpillWeights: Hack to prevent x87 nonsense This gross hack forces `hweight` into memory, preventing hidden precision from making `1 > 1` occasionally equal `true`. <rdar://problem/14292693> llvm-svn: 206765	2014-04-21 17:57:01 +00:00
Michael Zolotukhin	f2ba994bf6	Reapply r206732. This time without optimization of branches. llvm-svn: 206749	2014-04-21 12:01:33 +00:00
Chandler Carruth	a2533a7bef	Revert r206732 which is causing llc to crash on most of the build bots. Original commit message: Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64). llvm-svn: 206735	2014-04-21 07:11:15 +00:00
Michael Zolotukhin	137a84616c	Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64). llvm-svn: 206732	2014-04-21 05:33:09 +00:00
Duncan P. N. Exon Smith	e63327e967	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commit r206704, as expected. llvm-svn: 206707	2014-04-19 22:46:00 +00:00
Duncan P. N. Exon Smith	875ddfac75	Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commit r206677, reapplying my BlockFrequencyInfo rewrite. I've done a careful audit, added some asserts, and fixed a couple of bugs (unfortunately, they were in unlikely code paths). There's a small chance that this will appease the failing bots [1][2]. (If so, great!) If not, I have a follow-up commit ready that will temporarily add -debug-only=block-freq to the two failing tests, allowing me to compare the code path between what the failing bots and what my machines (and the rest of the bots) are doing. Once I've triggered those builds, I'll revert both commits so the bots go green again. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 [2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445 <rdar://problem/14292693> llvm-svn: 206704	2014-04-19 22:34:26 +00:00
Yaron Keren	d7ba46b287	Patch by Vadim Chugunov. The Win64 stack unwinder gets confused when execution flow "falls through" after a call to a 'noreturn' function. This fixes the "missing epilogue" problem by emitting a trap instruction for IR 'unreachable' on x86_64-pc-windows. A secondary use for it would be for anyone wanting to make double-sure that 'noreturn' functions, indeed, do not return. llvm-svn: 206684	2014-04-19 13:47:43 +00:00
Duncan P. N. Exon Smith	76b813619a	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2) This reverts commit r206666, as planned. Still stumped on why the bots are failing. Sanitizer bots haven't turned anything up. If anyone can help me debug either of the failures (referenced in r206666) I'll owe them a beer. (In the meantime, I'll be auditing my patch for undefined behaviour.) llvm-svn: 206677	2014-04-19 00:42:46 +00:00
Duncan P. N. Exon Smith	b3caf3646f	Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2) This reverts commit r206628, reapplying r206622 (and r206626). Two tests are failing only on buildbots [1][2]: i.e., I can't reproduce on Darwin, and Chandler can't reproduce on Linux. Asan and valgrind don't tell us anything, but we're hoping the msan bot will catch it. So, I'm applying this again to get more feedback from the bots. I'll leave it in long enough to trigger builds in at least the sanitizer buildbots (it was failing for reasons unrelated to my commit last time it was in), and hopefully a few others.... and then I expect to revert a third time. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 [2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445 llvm-svn: 206666	2014-04-18 22:30:03 +00:00
Duncan P. N. Exon Smith	0842ff36a6	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2) This reverts commit r206622 and the MSVC fixup in r206626. Apparently the remotely failing tests are still failing, despite my attempt to fix the nondeterminism in r206621. llvm-svn: 206628	2014-04-18 17:56:08 +00:00
Andrew Trick	1766f93b35	Better comments to explain buffered/unbuffered processor resources. llvm-svn: 206625	2014-04-18 17:35:08 +00:00
Duncan P. N. Exon Smith	f8361d127a	Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commit r206556, effectively reapplying commit r206548 and its fixups in r206549 and r206550. In an intervening commit I've added target triples to the tests that were failing remotely [1] (but passing locally). I'm hoping the mystery is solved? I'll revert this again if the tests are still failing remotely. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 llvm-svn: 206622	2014-04-18 17:22:25 +00:00
Duncan P. N. Exon Smith	e576167df8	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commits r206548, r206549 and r206550. There are some unit tests failing that aren't failing locally [1], so reverting until I have time to investigate. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 llvm-svn: 206556	2014-04-18 02:17:43 +00:00
Duncan P. N. Exon Smith	12e68e1733	blockfreq: Rewrite BlockFrequencyInfoImpl Rewrite the shared implementation of BlockFrequencyInfo and MachineBlockFrequencyInfo entirely. The old implementation had a fundamental flaw: precision losses from nested loops (or very wide branches) compounded past loop exits (and convergence points). The @nested_loops testcase at the end of test/Analysis/BlockFrequencyInfo/basic.ll is motivating. This function has three nested loops, with branch weights in the loop headers of 1:4000 (exit:continue). The old analysis gives nonsensical results: Printing analysis 'Block Frequency Analysis' for function 'nested_loops': ---- Block Freqs ---- entry = 1.0 for.cond1.preheader = 1.00103 for.cond4.preheader = 5.5222 for.body6 = 18095.19995 for.inc8 = 4.52264 for.inc11 = 0.00109 for.end13 = 0.0 The new analysis gives correct results: Printing analysis 'Block Frequency Analysis' for function 'nested_loops': block-frequency-info: nested_loops - entry: float = 1.0, int = 8 - for.cond1.preheader: float = 4001.0, int = 32007 - for.cond4.preheader: float = 16008001.0, int = 128064007 - for.body6: float = 64048012001.0, int = 512384096007 - for.inc8: float = 16008001.0, int = 128064007 - for.inc11: float = 4001.0, int = 32007 - for.end13: float = 1.0, int = 8 Most importantly, the frequency leaving each loop matches the frequency entering it. The new algorithm leverages BlockMass and PositiveFloat to maintain precision, separates "probability mass distribution" from "loop scaling", and uses dithering to eliminate probability mass loss. I have unit tests for these types out of tree, but it was decided in the review to make the classes private to BlockFrequencyInfoImpl, and try to shrink them (or remove them entirely) in follow-up commits. The new algorithm should generally have a complexity advantage over the old. The previous algorithm was quadratic in the worst case. The new algorithm is still worst-case quadratic in the presence of irreducible control flow, but it's linear without it. The key difference between the old algorithm and the new is that control flow within a loop is evaluated separately from control flow outside, limiting propagation of precision problems and allowing loop scale to be calculated independently of mass distribution. Loops are visited bottom-up, their loop scales are calculated, and they are replaced by pseudo-nodes. Mass is then distributed through the function, which is now a DAG. Finally, loops are revisited top-down to multiply through the loop scales and the masses distributed to pseudo-nodes. There are some remaining flaws. - Irreducible control flow isn't modelled correctly. LoopInfo and MachineLoopInfo ignore irreducible edges, so this algorithm will fail to scale accordingly. There's a note in the class documentation about how to get closer. See also the comments in test/Analysis/BlockFrequencyInfo/irreducible.ll. - Loop scale is limited to 4096 per loop (2^12) to avoid exhausting the 64-bit integer precision used downstream. - The "bias" calculation proposed on llvmdev is not incorporated here. This will be added in a follow-up commit, once comments from this review have been handled. llvm-svn: 206548	2014-04-18 01:57:45 +00:00
Diego Novillo	0915c047c2	Fix bug 19437 - Only add discriminators for DWARF 4 and above. Summary: This prevents the discriminator generation pass from triggering if the DWARF version being used in the module is prior to 4. Reviewers: echristo, dblaikie CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3413 llvm-svn: 206507	2014-04-17 22:33:50 +00:00
Josh Magee	adfde5fef6	[stack protector] Make the StackProtector pass respect ssp-buffer-size. Previously, SSPBufferSize was assigned the value of the "stack-protector-buffer-size" attribute after all uses of SSPBufferSize. The effect was that the default SSPBufferSize was always used during analysis. I moved the check for the attribute before the analysis; now --param ssp-buffer-size= works correctly again. Differential Revision: http://reviews.llvm.org/D3349 llvm-svn: 206486	2014-04-17 19:08:36 +00:00
Tim Northover	037f26f212	Atomics: promote ARM's IR-based atomics pass to CodeGen. Still only 32-bit ARM using it at this stage, but the promotion allows direct testing via opt and is a reasonably self-contained patch on the way to switching ARM64. At this point, other targets should be able to make use of it without too much difficulty if they want. (See ARM64 commit coming soon for an example). llvm-svn: 206485	2014-04-17 18:22:47 +00:00
Jim Grosbach	6623e7f94a	[c++11] Tidy up AsmPrinter.cpp. Range'ify loops and tidy up some by-reference handling. No functional change. llvm-svn: 206422	2014-04-16 22:38:02 +00:00
Tim Northover	863a789a99	DAGCombiner: don't optimise nonexistent litpool load. This particular DAG combine is designed to kick in when both ConstantFPs will end up being loaded via a litpool; however, those nodes have a semi-legal status, dictated by isFPImmLegal, so in some cases there wouldn't have been a litpool in the first place. Don't try to be clever in those circumstances. Picked up while merging some AArch64 tests. llvm-svn: 206365	2014-04-16 09:03:09 +00:00
Craig Topper	abb4ac7f87	Convert SelectionDAG::getVTList to use ArrayRef llvm-svn: 206357	2014-04-16 06:10:51 +00:00
Craig Topper	ada0857679	[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr. llvm-svn: 206356	2014-04-16 04:21:27 +00:00
Akira Hatanaka	3d90f99d1a	Make FastISel::SelectInstruction return before target specific fast-isel code handles Intrinsic::trap if TargetOptions::TrapFuncName is set. This fixes a bug in which the trap function was not taken into consideration when a program was compiled without optimization (at -O0). <rdar://problem/16291933> llvm-svn: 206323	2014-04-15 21:30:06 +00:00
Robert Lougher	a9bf2463b9	Revert r191049/r191059 as it can produce wrong code (see PR17975). It has already been reverted on the 3.4 branch in r196521. llvm-svn: 206311	2014-04-15 18:34:24 +00:00

17618 Commits