llvm-project

Commit Graph

Author	SHA1	Message	Date
David Majnemer	2df38cd0c4	[InstSimplify] add nuw %x, C2 must be at least C2 Use the fact that add nuw always creates a larger bit pattern when trying to simplify comparisons. llvm-svn: 245638	2015-08-20 23:01:41 +00:00
Sanjoy Das	e472d8a57a	[InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potentiall payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245635	2015-08-20 22:31:55 +00:00
Michael Zolotukhin	51b00e6d82	[SLP] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245633	2015-08-20 22:28:15 +00:00
Michael Zolotukhin	2a3d99fedf	[LoopVectorize] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245632	2015-08-20 22:27:38 +00:00
Ahmed Bougacha	0cdc7719f0	[X86] Look for scalar through one bitcast when lowering to VBROADCAST. Fixes PR23464: one way to use the broadcast intrinsics is: _mm256_broadcastw_epi16(_mm_cvtsi32_si128((int)src)); We don't currently fold this, but now that we use native IR for the intrinsics (r245605), we can look through one bitcast to find the broadcast scalar. Differential Revision: http://reviews.llvm.org/D10557 llvm-svn: 245613	2015-08-20 21:02:39 +00:00
Ahmed Bougacha	69a17acb74	[X86] Add some broadcast-from-memory tests. llvm-svn: 245612	2015-08-20 20:59:41 +00:00
Jingyue Wu	ca3ef11a9b	[NVPTX] truncating 64-bit to 32-bit is free Summary: Add an LSR test that exercises isTruncateFree. Without this change, LSR creates another indvar representing the truncated value. Reviewers: jholewinski, eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D12058 llvm-svn: 245611	2015-08-20 20:59:02 +00:00
Ahmed Bougacha	1a498705e4	[X86] Replace avx2 broadcast intrinsics with native IR. Since r245605, the clang headers don't use these anymore. r245165 updated some of the tests already; update the others, add an autoupgrade, remove the intrinsics, and cleanup the definitions. Differential Revision: http://reviews.llvm.org/D10555 llvm-svn: 245606	2015-08-20 20:36:19 +00:00
Adrian Prantl	baf90fc265	Fix a bug that caused SimplifyCFG to drop DebugLocs. Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all metadata in KnownSet, but the condition for DebugLocs was inverted. Most users of dropUnknownMetadata() actually worked around this by not adding LLVMContext::MD_dbg to their list of KnowIDs. This is now made explicit. llvm-svn: 245589	2015-08-20 18:24:02 +00:00
Adrian Prantl	a317cd2583	Fix a debug location handling bug in GVN. Caught by the famous "DebugLoc describes the currect SubProgram" assertion. When GVN is removing a nonlocal load it updates the debug location of the SSA value it replaced the load with with the one of the load. In the testcase this actually overwrites a valid debug location with an empty one. In reality GVN has to make an arbitrary choice between two equally valid debug locations. This patch changes to behavior to only update the location if the value doesn't already have a debug location. llvm-svn: 245588	2015-08-20 18:23:56 +00:00
Rafael Espindola	c30c7c493f	Fix symbol value computation when part of the expression is weak. This matches the behaviour of the gnu assembler and is part of fixing pr24486. llvm-svn: 245576	2015-08-20 16:18:30 +00:00
Douglas Katzman	58195a2d74	[Sparc]: correct the 'set' synthetic instruction Differential Revision: http://reviews.llvm.org/D12194 llvm-svn: 245575	2015-08-20 16:16:16 +00:00
Balaram Makam	ccf59731e3	Optimize bitwise even/odd test (-x&1 -> x&1) to not use negation. Summary: We know that -x & 1 is equivalent to x & 1, avoid using negation for testing if a negative integer is even or odd. Reviewers: majnemer Subscribers: junbuml, mssimpso, gberry, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D12156 llvm-svn: 245569	2015-08-20 15:35:00 +00:00
Zoran Jovanovic	56585d517b	[mips][microMIPS] Add microMIPS32r6 and microMIPS64r6 tests for existing 16-bit ADDIUR1SP, ADDIUR2, ADDIUS5 and ADDIUSP instructions Differential Revision: http://reviews.llvm.org/D10955 llvm-svn: 245554	2015-08-20 11:51:49 +00:00
Marina Yatsina	bce1ab67a5	[X86] Fix FBLD and FBSTP FBLD and FBSTP should receive TBYTE because it is defined as FBLD m80 FBSTP m80 Differential Revision: http://reviews.llvm.org/D11748 llvm-svn: 245553	2015-08-20 11:51:24 +00:00
Marina Yatsina	7a4e1ba737	[X86] Fix bug in COMISD and COMISS definition in td files COMISD should receive QWORD because it is defined as (V)COMISD xmm1, xmm2/m64 COMISS should receive DWORD because it is defined as (V)COMISS xmm1, xmm2/m32 Differential Revision: http://reviews.llvm.org/D11712 llvm-svn: 245551	2015-08-20 11:21:36 +00:00
David Majnemer	cfc1df553e	[X86] Fix the (shl (and (setcc_c), c1), c2) -> (and setcc_c, (c1 << c2)) fold We didn't check for the necessary preconditions before folding a mask/shift into a single mask. This fixes PR24516. llvm-svn: 245544	2015-08-20 09:00:56 +00:00
Bjorn Steinbrink	2e2f66557e	Revert "[DSE] Enable removal of lifetime intrinsics in terminating blocks" llvm-svn: 245543	2015-08-20 08:58:47 +00:00
Bjorn Steinbrink	cc7e8a9705	[DSE] Enable removal of lifetime intrinsics in terminating blocks Usually DSE is not supposed to remove lifetime intrinsics, but it's actually ok to remove them for dead objects in terminating blocks, because they convey no extra information there. Until we hit a lifetime start that cannot be removed, that is. Because from that point on the lifetime intrinsics become interesting again, e.g. for stack coloring. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11710 llvm-svn: 245542	2015-08-20 08:25:28 +00:00
Hal Finkel	9fdce9adee	[PowerPC] Fix value type on XVCMPEQDP for v2f64 comparisons XVCMPEQDP is used for VSX v2f64 equality comparisons, but the value type needs to be v2i64 (as that's the corresponding SETCC type). Fixes PR24225. llvm-svn: 245535	2015-08-20 03:02:02 +00:00
Hal Finkel	be78c25acb	[PowerPC] Fix the int2fp(fp2int(x)) DAGCombine to ignore ppc_fp128 This DAGCombine was creating custom SDAG nodes with an illegal ppc_fp128 operand type because it was triggering on f64/f32 int2fp(fp2int(ppc_fp128 x)), but shouldn't (it should only apply to f32/f64 types). The result was a crash. llvm-svn: 245530	2015-08-20 01:18:20 +00:00
Alex Lorenz	36efd3883d	MIR Serialization: Use the global value syntax for global value memory operands. This commit modifies the serialization syntax so that the global IR values in machine memory operands use the global value '@<name>' syntax instead of the current '%ir.<name>' syntax. The unnamed global IR values are handled by this commit as well, as the existing global value parsing method can parse the unnamed globals already. llvm-svn: 245527	2015-08-20 00:20:03 +00:00
Alex Lorenz	0d009645a1	MIR Serialization: Change syntax for the call entry pseudo source values. The global IR values in machine memory operands should use the global value '@<name>' syntax instead of the current '%ir.<name>' syntax. However, the global value call entry pseudo source values use the global value syntax already. Therefore, the syntax for the call entry pseudo source values has to be changed so that the global values and call entry global value PSVs can be parsed without ambiguities. llvm-svn: 245526	2015-08-20 00:12:57 +00:00
Alex Lorenz	dd13be0bcc	MIR Serialization: Serialize unnamed local IR values in memory operands. llvm-svn: 245521	2015-08-19 23:31:05 +00:00
Sanjay Patel	9e5927fdc3	[x86] enable machine combiner reassociations for scalar double-precision min/max llvm-svn: 245506	2015-08-19 21:27:27 +00:00
Sanjay Patel	4e3ee1e548	[x86] enable machine combiner reassociations for scalar single-precision maximums llvm-svn: 245504	2015-08-19 21:18:46 +00:00
Simon Pilgrim	35f528262f	[DAGCombiner] Added SMAX/SMIN/UMAX/UMIN constant folding We still need to add constant folding of vector comparisons to fold the tests for targets that don't support the respective min/max nodes I needed to update 2011-12-06-AVXVectorExtractCombine to load a vector instead of using a constant vector to prevent it folding Differential Revision: http://reviews.llvm.org/D12118 llvm-svn: 245503	2015-08-19 21:11:58 +00:00
Juergen Ributzka	b12248e9cd	[AArch64][FastISel] Don't fold shifts with UB. We are already falling back to SelectionDAG when encountering an shift with UB. This adds the same checks for shifts with UB that get folded into arithmetic or logical operations. This fixes rdar://problem/22345295. llvm-svn: 245499	2015-08-19 20:52:55 +00:00
David Majnemer	f25fe64716	[X86] Emit more efficient >= comparisons against 0 We don't do a great job with >= 0 comparisons against zero when the result is used as an i8. Given something like: void f(long long LL, bool B) { B = LL >= 0; } We used to generate: shrq $63, %rdi xorb $1, %dil movb %dil, (%rsi) Now we generate: testq %rdi, %rdi setns (%rsi) Differential Revision: http://reviews.llvm.org/D12136 llvm-svn: 245498	2015-08-19 20:51:40 +00:00
Dan Gohman	dde8dce6a9	[WebAssembly] Use the default alignment for SIMD types. Previously WebAssembly's datalayout string had -v128:8:128. This had been an attempt to declare a certain level of support for unaligned SIMD accesses. However, clang makes its own determinations for SIMD alignment that are independent of the datalayout string, so this wasn't actually meaningful. llvm-svn: 245494	2015-08-19 20:30:20 +00:00
Simon Pilgrim	989cbbd2f5	[DAGCombiner] Fold CONCAT_VECTORS of EXTRACT_SUBVECTOR (or undef) to VECTOR_SHUFFLE. Check to see if this is a CONCAT_VECTORS of a bunch of EXTRACT_SUBVECTOR operations. If so, and if the EXTRACT_SUBVECTOR vector inputs come from at most two distinct vectors the same size as the result, attempt to turn this into a legal shuffle. Differential Revision: http://reviews.llvm.org/D12125 llvm-svn: 245490	2015-08-19 20:09:50 +00:00
Paul Robinson	9c10414ce0	Minor tidying of regex in a test llvm-svn: 245486	2015-08-19 19:36:35 +00:00
Douglas Katzman	2362b69dd9	[Sparc]: asm-only support for the ldstub instruction. llvm-svn: 245485	2015-08-19 19:30:57 +00:00
Alex Lorenz	5ef93b0c4c	MIR Serialization: Serialize instruction's register ties. This commit serializes the machine instruction's register operand ties. The ties are printed out only when the instructon has register ties that are different from the ties that are specified in the instruction's description. llvm-svn: 245482	2015-08-19 19:05:34 +00:00
Nemanja Ivanovic	5f1cea4141	Temporary fix for the self-host failures introduced by rL244921. This revision has introduced an issue that only affects bootstrapped compiler when it is printing the ASM. I am working on resolving the issue, but in the meantime, I'm disabling the legalization of scalar_to_vector operation for v2i64 and the associated testing until I can get this fixed. llvm-svn: 245481	2015-08-19 19:04:47 +00:00
Alex Lorenz	e66a7ccf77	MIR Serialization: Serialize defined registers that require 'def' register flag. The defined registers are already serialized - they are represented by placing them before the '=' in a machine instruction. However, certain instructions like INLINEASM can have defined register operands after the '=', so this commit introduces the 'def' register flag for such operands. llvm-svn: 245480	2015-08-19 18:55:47 +00:00
Bruno Cardoso Lopes	27fd06922b	[PeepholeOptimizer] Look through PHIs to find additional register sources Reintroduce r245442. Remove an overly conservative assertion introduced in r245442. We could replace the assertion to use `shareSameRegisterFile` instead, but in that point in `insertPHI` we already lost the original Def subreg to check against. So drop the assertion completely. Original commit message: - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 245479	2015-08-19 18:53:36 +00:00
Douglas Katzman	e5485c651e	[SPARC] Enable writing to floating-point-state register. llvm-svn: 245475	2015-08-19 18:34:48 +00:00
Ahmed Bougacha	9e00ec6195	[AArch64] Improve short-form diags on long-form Match_InvalidOperand. Since r244955, we try to use the short-form ErrorInfo when both tries failed, and the long-form match failed on a suffix operand. However, this means we sometimes mix ErrorInfo and MatchResult (one manifestation of this being PR24498). Instead, restore both. llvm-svn: 245469	2015-08-19 17:40:19 +00:00
Derek Schuff	55817ee604	x32. Fixes a bug in x32 exception handling. This patch updates the X86 lowering so that the Exception Pointer and Selector are 64-bit wide only if Subtarget.isTarget64BitLP64. Patch by João Porto Reviewers: dschuff, rnk Differential Revision: http://reviews.llvm.org/D12111 llvm-svn: 245454	2015-08-19 16:28:21 +00:00
JF Bastien	5ab87edbb4	x32. Fixes jmp %reg in x32 x32 has 32-bit pointers; x86-64 can't jmp %r32. This patch addresses this issue by explicitly zero-extending brind's target to 64-bits. Author: jpp Reviewers: jfb, dschuff, pavel.v.chupin Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12112 llvm-svn: 245452	2015-08-19 16:17:08 +00:00
Bruno Cardoso Lopes	61009142b8	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Revert r245442 while investigating a fix. An assertion hit in http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/11380 llvm-svn: 245446	2015-08-19 15:10:32 +00:00
James Y Knight	d966fb6fef	[SPARC] Fix BooleanContents, so that select of a trunc doesn't eliminate the trunc. Differential Revision: http://reviews.llvm.org/D10442 llvm-svn: 245444	2015-08-19 14:47:04 +00:00
Bruno Cardoso Lopes	0a1c126684	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply r243486. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 245442	2015-08-19 14:34:41 +00:00
Silviu Baranga	ad1b19fcb7	[ARM] Add instruction selection patterns for vmin/vmax Summary: The mid-end was generating vector smin/smax/umin/umax nodes, but we were using vbsl to generatate the code. This adds the vmin/vmax patterns and a test to check that we are now generating vmin/vmax instructions. Reviewers: rengolin, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12105 llvm-svn: 245439	2015-08-19 14:11:27 +00:00
Joerg Sonnenberger	7d180c59bb	Map %fprs to %asr6 in the Sparc assembler parser. llvm-svn: 245437	2015-08-19 13:55:14 +00:00
Tobias Grosser	85508e804b	Revert "[X86] Widen the 'AND' mask if doing so shrinks the encoding size" This reverts commit 245169 which miscompiles MultiSource/Applications/siod from LNT. llvm-svn: 245432	2015-08-19 11:35:10 +00:00
Michael Kuperstein	9fe42604aa	[X86] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize There are some cases where the mul sequence is smaller, but for the most part, using a div is preferable. This does not apply to vectors, since x86 doesn't have vector idiv, and a vector mul/shifts sequence ought to be smaller than a scalarized division. Differential Revision: http://reviews.llvm.org/D12082 llvm-svn: 245431	2015-08-19 11:21:43 +00:00
Hal Finkel	0ef2b10f16	Fix how DependenceAnalysis calls delinearization Fix how DependenceAnalysis calls delinearization, mirroring what is done in Delinearization.cpp (mostly by making sure to call getSCEVAtScope before delinearizing, and by removing the unnecessary 'Pairs == 1' check). Patch by Vaivaswatha Nagaraj! llvm-svn: 245408	2015-08-19 02:56:36 +00:00
Eric Christopher	0efe9f60bb	Revert "Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks." This is causing bootstrap problems, e.g.: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/2960 This reverts r245195. llvm-svn: 245402	2015-08-19 02:15:13 +00:00
Hal Finkel	a8d205f145	Make ScalarEvolution::isKnownPredicate a little smarter Here we make ScalarEvolution::isKnownPredicate, indirectly, a little smarter. Given some relational comparison operator OP, and two AddRec SCEVs, {I,+,S} OP {J,+,T}, we can reduce this to the comparison I OP J when S == T, both AddRecs are for the same loop, and both are known not to wrap. As it turns out, because of the way that backedge-guard expressions can be leveraged when computing known predicates, this allows indvars to simplify the if-statement comparison in this loop: void foo (int a, int b, int n) { for (int i = 0; i < n; ++i) { if (i > n) a[i] = b[i] + 1; } } which, somewhat surprisingly, we were not previously optimizing away. llvm-svn: 245400	2015-08-19 01:51:51 +00:00
Chih-Hung Hsieh	fdcf541871	Split ARM and AArch64 emutls.ll test Differential Revision: http://reviews.llvm.org/D12127 llvm-svn: 245399	2015-08-19 01:44:51 +00:00
Alex Lorenz	df9e3c6fb0	MIR Serialization: Serialize MMI's variable debug information. llvm-svn: 245396	2015-08-19 00:13:25 +00:00
Quentin Colombet	861ad97e6f	[BasicAA] Add a test for PR24468 to be sure we won't regress when we finally get the GEP aliasing right. llvm-svn: 245395	2015-08-19 00:08:26 +00:00
Quentin Colombet	b700e357b5	[BasicAA] Revert r221876 because it can produce incorrect aliasing information: see PR24468. llvm-svn: 245394	2015-08-19 00:07:20 +00:00
Alex Lorenz	607efb6c7e	MIR Parser: Return true on error when parsing standalone registers. llvm-svn: 245384	2015-08-18 22:57:36 +00:00
Alex Lorenz	f3630113cd	MIR Serialization: Serialize the operand's bit mask target flags. This commit adds support for bit mask target flag serialization to the MIR printer and the MIR parser. It also adds support for the machine operand's target flag serialization to the AArch64 target. Reviewers: Duncan P. N. Exon Smith llvm-svn: 245383	2015-08-18 22:52:15 +00:00
Alex Lorenz	a314d81328	MIR Serialization: Serialize the frame information's stack protector index. llvm-svn: 245372	2015-08-18 22:26:26 +00:00
David Majnemer	c6bb0e2a51	[InstSimplify] Don't assume getAggregateElement will succeed It isn't always possible to get a value from getAggregateElement. This fixes PR24488. llvm-svn: 245365	2015-08-18 22:07:25 +00:00
Joerg Sonnenberger	b0ce8747c3	Load/store instructions for floating points with address space require SparcV9. To properly handle this, define the *a instructions as separate instruction classes by refactoring the LoadA and StoreA multiclasses. Move the instruction tests into the sparcv9 file to test the difference. llvm-svn: 245360	2015-08-18 21:31:46 +00:00
Simon Pilgrim	4bce6d6b73	[X86] Refreshed sign extension tests. llvm-svn: 245358	2015-08-18 21:21:35 +00:00
Simon Pilgrim	ce30ae62e2	[X86][AVX] Added shuffle concatenation tests llvm-svn: 245351	2015-08-18 20:51:15 +00:00
Matthias Braun	fa3b248a66	DAGCombiner: Improve DAGCombiner select normalization The current code normalizes select(C0, x, select(C1, x, y)) towards select(C0\|C1, x, y) if the targets prefers that form. This patch adds an additional rule that if the select(C1, x, y) part already exists in the function then we want to normalize into the other direction because the effects of reusing the existing value are bigger than transforming into the target preferred form. This addresses regressions following r238793, see also: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150727/290272.html Differential Revision: http://reviews.llvm.org/D11616 llvm-svn: 245350	2015-08-18 20:48:36 +00:00
Matthias Braun	2e920bd04f	DAGCombiner: Optimize SELECTs first before turning them into SELECT_CC This is part of http://reviews.llvm.org/D11616 - I just decided to split this up into a separate commit. llvm-svn: 245349	2015-08-18 20:48:29 +00:00
Simon Pilgrim	edaba3b7c3	Updated constants to give more useful min/max constant folding tests llvm-svn: 245348	2015-08-18 20:46:48 +00:00
David Majnemer	0ad363eebc	[WinEH] Calculate state numbers for the new EH representation State numbers are calculated by performing a walk from the innermost funclet to the outermost funclet. Rudimentary support for the new EH constructs has been added to the assembly printer, just enough to test the new machinery. Differential Revision: http://reviews.llvm.org/D12098 llvm-svn: 245331	2015-08-18 19:07:12 +00:00
Alex Lorenz	eb7c9be43c	MIR Parser: Implicit register verifier should accept unexpected implicit subregister operands. llvm-svn: 245315	2015-08-18 17:17:13 +00:00
Daniel Sanders	63f4a5dcad	[mips] Expand JAL instructions when PIC is enabled. Summary: This is the correct way to handle JAL instructions when PIC is enabled. Patch by Toma Tabacu Reviewers: seanbruno, tomatabacu Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D6231 llvm-svn: 245305	2015-08-18 16:18:09 +00:00
Davide Italiano	485cb66d0e	[MC] Convert another bunch of tests from macho-dump to llvm-readobj. This is (almost) everything under MC/MachO/ARM. There are still some cases missing, because llvm-readobj doesn't (yet) support some features, that macho-dump provides. I plan to reduce the gap between them shortly. llvm-svn: 245302	2015-08-18 16:05:13 +00:00
Zoran Jovanovic	2fe8466f6e	[mips][microMIPS] Implement DDIV, DMOD, DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D10953 llvm-svn: 245297	2015-08-18 14:40:43 +00:00
Zoran Jovanovic	a6593ff613	[mips][microMIPS] Implement SW and SWE instructions Differential Revision: http://reviews.llvm.org/D10869 llvm-svn: 245293	2015-08-18 12:53:08 +00:00
Simon Pilgrim	9f4374d361	Fixed max/min typo in test names llvm-svn: 245278	2015-08-18 09:02:51 +00:00
Simon Pilgrim	19ffd57f45	[X86][SSE} Added constant SMAX/SMIN/UMAX/UMIN tests Constant folding patch to follow soon llvm-svn: 245276	2015-08-18 08:52:43 +00:00
Simon Pilgrim	08d823afe4	[X86][SSE] Added extra vector truncation tests. Including cases for PR14866 llvm-svn: 245274	2015-08-18 08:37:09 +00:00
Justin Bogner	9f00ebaeda	Revert "Constant propagation after hiting llvm.assume" This was also failing bootstrap: http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build This reverts r245265. llvm-svn: 245269	2015-08-18 07:00:34 +00:00
Piotr Padlewski	94ca3783b8	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 245265	2015-08-18 03:55:30 +00:00
Guozhi Wei	f66d384443	Align SP adjustment in function getSPAdjust This commit adds a new function TargetFrameLowering::alignSPAdjust and calls it from TargetInstrInfo::getSPAdjust. It fixes PR24142. llvm-svn: 245253	2015-08-17 22:36:27 +00:00
Alex Lorenz	a56ba6a6dd	MIR Serialization: Serialize the local offsets for the stack objects. llvm-svn: 245249	2015-08-17 22:17:42 +00:00
Alex Lorenz	eb62568625	MIR Serialization: Serialize the memory operand's range metadata node. llvm-svn: 245247	2015-08-17 22:09:52 +00:00
Alex Lorenz	03e940d1f8	MIR Serialization: Serialize the memory operand's noalias metadata node. llvm-svn: 245246	2015-08-17 22:08:02 +00:00
Alex Lorenz	a16f624dc3	MIR Serialization: Serialize the memory operand's alias scope metadata node. llvm-svn: 245245	2015-08-17 22:06:40 +00:00
Alex Lorenz	a617c9162d	MIR Serialization: Serialize the memory operand's TBAA metadata node. llvm-svn: 245244	2015-08-17 22:05:15 +00:00
David Majnemer	83f4bb23c4	[WinEHPrepare] Replace unreasonable funclet terminators with unreachable It is possible to be in a situation where more than one funclet token is a valid SSA value. If we see a terminator which exits a funclet which doesn't use the funclet's token, replace it with unreachable. Differential Revision: http://reviews.llvm.org/D12074 llvm-svn: 245238	2015-08-17 20:56:39 +00:00
Douglas Katzman	685a7d1a70	[SPARC]: recognize '.' as the start of an assembler expression. llvm-svn: 245232	2015-08-17 19:55:01 +00:00
James Molloy	974838f294	[ARM] Fix crash when targetting CPU without NEON We emulate a scalar vmin/vmax with NEON instructions as they don't exist in the VFP ISA. So only mark these as legal when NEON is available. Found here: https://code.google.com/p/chromium/issues/detail?id=521671 llvm-svn: 245231	2015-08-17 19:37:12 +00:00
Silviu Baranga	b322aa6f53	[CostModel][AArch64] Increase cost of vector insert element and add missing cast costs Summary: Increase the estimated costs for insert/extract element operations on AArch64. This is motivated by results from benchmarking interleaved accesses. Add missing costs for zext/sext/trunc instructions and some integer to floating point conversions. These costs were previously calculated by scalarizing these operation and were affected by the cost increase of the insert/extract element operations. Reviewers: rengolin Subscribers: mcrosier, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11939 llvm-svn: 245226	2015-08-17 16:05:09 +00:00
Silviu Baranga	d5ac26937c	[CostModel][ARM] Increase cost of insert/extract operations Summary: This change limits the minimum cost of an insert/extract element operation to 2 in cases where this would result in mixing of NEON and VFP code. Reviewers: rengolin Subscribers: mssimpso, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12030 llvm-svn: 245225	2015-08-17 15:57:05 +00:00
Artur Pilipenko	34d8ba84c8	Take alignment into account in isSafeToSpeculativelyExecute and isSafeToLoadUnconditionally. Reviewed By: hfinkel, sanjoy, MatzeB Differential Revision: http://reviews.llvm.org/D9791 llvm-svn: 245223	2015-08-17 15:54:26 +00:00
Joseph Tremoulet	7031c9fc2e	[WinEHPrepare] Fix catchret successor phi demotion Summary: When demoting an SSA value that has a use on a phi and one of the phi's predecessors terminates with catchret, the edge needs to be split and the load inserted in the new block, else we'll still have a cross-funclet SSA value. Add a test for this, and for the similar case where a def to be spilled is on and invoke and a critical edge, which was already implemented but missing a test. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12065 llvm-svn: 245218	2015-08-17 13:51:37 +00:00
Daniel Sanders	a39ef1c68f	[mips] [IAS] Add support for the DLA pseudo-instruction and fix problems with DLI Summary: It is the same as LA, except that it can also load 64-bit addresses and it only works on 64-bit MIPS architectures. Reviewers: tomatabacu, seanbruno, vkalintiris Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D9524 llvm-svn: 245208	2015-08-17 10:11:55 +00:00
Michael Kuperstein	adc4e9c414	[GMR] isNonEscapingGlobalNoAlias() should look through Bitcasts/GEPs when looking at loads. This fixes yet another case from PR24288. Differential Revision: http://reviews.llvm.org/D12064 llvm-svn: 245207	2015-08-17 10:06:08 +00:00
James Molloy	ef183397b1	Generate FMINNAN/FMINNUM/FMAXNAN/FMAXNUM from SDAGBuilder. These only get generated if the target supports them. If one of the variants is not legal and the other is, and it is safe to do so, the other variant will be emitted. For example on AArch32 (V8), we have scalar fminnm but not fmin. Fix up a couple of tests while we're here - one now produces better code, and the other was just plain wrong to start with. llvm-svn: 245196	2015-08-17 07:13:10 +00:00
Karthik Bhat	3af28945b9	Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks. PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was deleting the next basic block iterator. Fixed the same by resetting the basic block iterator post call to DeleteDeadInstruction. llvm-svn: 245195	2015-08-17 05:51:39 +00:00
David Majnemer	8ed559ad22	Revert "[InstCombinePHI] Partial simplification of identity operations." This reverts commit r244887, it caused PR24470. llvm-svn: 245194	2015-08-17 03:11:26 +00:00
Chandler Carruth	2f1fd1658f	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't realy called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Sanjay Patel	57fd1dc5db	transform fmin/fmax calls when possible (PR24314) If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187	2015-08-16 20:18:19 +00:00
David Majnemer	e04443baff	Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks." This reverts commit r245025, it caused PR24469. llvm-svn: 245172	2015-08-16 07:11:59 +00:00
David Majnemer	dfa3b09541	[InstCombine] Replace an and+icmp with a trunc+icmp Bitwise arithmetic can obscure a simple sign-test. If replacing the mask with a truncate is preferable if the type is legal because it permits us to rephrase the comparison more explicitly. llvm-svn: 245171	2015-08-16 07:09:17 +00:00
David Majnemer	1a59e49f3c	[X86] Widen the 'AND' mask if doing so shrinks the encoding size We can set additional bits in a mask given that we know the other operand of an AND already has some bits set to zero. This can be more efficient if doing so allows us to use an instruction which implicitly sign extends the immediate. This fixes PR24085. Differential Revision: http://reviews.llvm.org/D11289 llvm-svn: 245169	2015-08-16 04:52:11 +00:00
Sanjay Patel	40d4eb40f6	[x86] enable machine combiner reassociations for scalar single-precision minimums llvm-svn: 245166	2015-08-15 17:01:54 +00:00
Simon Pilgrim	d65ace84c7	Updated broadcast stack folding test to avoid use of broadcast intrinsics. llvm-svn: 245165	2015-08-15 16:54:18 +00:00
Sanjay Patel	3b7e3677e3	fix typos; NFC llvm-svn: 245164	2015-08-15 16:53:08 +00:00
Sanjay Patel	9f6c7dddd2	add test case to show current codegen llvm-svn: 245163	2015-08-15 16:49:50 +00:00
Simon Pilgrim	0750c84623	[DAGCombiner] Attempt to mask vectors before zero extension instead of after. For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after a ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors. (zext (truncate x)) -> (zext (and(x, m)) Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code. This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles. Differential Revision: http://reviews.llvm.org/D11764 llvm-svn: 245160	2015-08-15 13:27:30 +00:00
David Majnemer	0bc0eef71c	[IR] Give catchret an optional 'return value' operand Some personality routines require funclet exit points to be clearly marked, this is done by producing a token at the funclet pad and consuming it at the corresponding ret instruction. CleanupReturnInst already had a spot for this operand but CatchReturnInst did not. Other personality routines don't need to use this which is why it has been made optional. llvm-svn: 245149	2015-08-15 02:46:08 +00:00
JF Bastien	5e4303dc14	Accelerate MergeFunctions with hashing This patch makes the Merge Functions pass faster by calculating and comparing a hash value which captures the essential structure of a function before performing a full function comparison. The hash is calculated by hashing the function signature, then walking the basic blocks of the function in the same order as the main comparison function. The opcode of each instruction is hashed in sequence, which means that different functions according to the existing total order cannot have the same hash, as the comparison requires the opcodes of the two functions to be the same order. The hash function is a static member of the FunctionComparator class because it is tightly coupled to the exact comparison function used. For example, functions which are equivalent modulo a single variant callsite might be merged by a more aggressive MergeFunctions, and the hash function would need to be insensitive to these differences in order to exploit this. The hashing function uses a utility class which accumulates the values into an internal state using a standard bit-mixing function. Note that this is a different interface than a regular hashing routine, because the values to be hashed are scattered amongst the properties of a llvm::Function, not linear in memory. This scheme is fast because only one word of state needs to be kept, and the mixing function is a few instructions. The main runOnModule function first computes the hash of each function, and only further processes functions which do not have a unique function hash. The hash is also used to order the sorted function set. If the hashes differ, their values are used to order the functions, otherwise the full comparison is done. Both of these are helpful in speeding up MergeFunctions. Together they result in speedups of 9% for mysqld (a mostly C application with little redundancy), 46% for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all three cases, the new speed of MergeFunctions is about half that of the module verifier, making it relatively inexpensive even for large LTO builds with hundreds of thousands of functions. The same functions are merged, so this change is free performance. Author: jrkoenig Reviewers: nlewycky, dschuff, jfb Subscribers: llvm-commits, aemerson Differential revision: http://reviews.llvm.org/D11923 llvm-svn: 245140	2015-08-15 01:18:18 +00:00
Matt Arsenault	427a0fd22e	LoopStrengthReduce: Try to pass address space to isLegalAddressingMode This seems to only work some of the time. In some situations, this seems to use a nonsensical type and isn't actually aware of the memory being accessed. e.g. if branch condition is an icmp of a pointer, it checks the addressing mode of i1. llvm-svn: 245137	2015-08-15 00:53:06 +00:00
Matt Arsenault	297ae311ce	AMDGPU/SI: Fix printing useless info with amdhsa The comments at the bottom would all report 0 if amdhsa was used. llvm-svn: 245135	2015-08-15 00:12:39 +00:00
Sanjay Patel	7332e0455f	make current codegen visible in the checks, so we can decide if it's right llvm-svn: 245120	2015-08-14 23:03:01 +00:00
Nick Lewycky	8075fd22b9	Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458! llvm-svn: 245119	2015-08-14 22:46:49 +00:00
Bjarke Hammersholt Roune	9791ed4705	[SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl Summary: http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions. This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit: for (int i = 0; i < numIterations; ++i) sum += ptr[i - offset]; for (int i = 0; i < numIterations; ++i) sum += ptr[i * stride]; for (int i = 0; i < numIterations; ++i) sum += ptr[3 * (i << 7)]; Reviewers: atrick, sanjoy Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben Differential Revision: http://reviews.llvm.org/D11860 llvm-svn: 245118	2015-08-14 22:45:26 +00:00
Pat Gavlin	b399095c3f	Add a target environment for CoreCLR. Although targeting CoreCLR is similar to targeting MSVC, there are certain important differences that the backend must be aware of (e.g. differences in stack probes, EH, and library calls). Differential Revision: http://reviews.llvm.org/D11012 llvm-svn: 245115	2015-08-14 22:41:43 +00:00
Sanjay Patel	dd175bc6c4	make current codegen visible in the checks, so we can decide if it's right llvm-svn: 245108	2015-08-14 22:10:59 +00:00
Ahmed Bougacha	cd35787217	[AArch64] Fix FMLS scalar-indexed-from-2s-after-neg patterns. We canonicalize V64 vectors to V128 through insert_subvector: the other FMLA/FMLS/FMUL/FMULX patterns match that already, but this one doesn't, so we'd fail to match fmls and generate fneg+fmla instead. The vector equivalents are already tested and functional. llvm-svn: 245107	2015-08-14 22:06:05 +00:00
Evgeniy Stepanov	24ac55d884	[msan] Fix handling of musttail calls. MSan instrumentation for return values of musttail calls is not allowed by the IR constraints, and not needed at the same time. llvm-svn: 245106	2015-08-14 22:03:50 +00:00
Alex Lorenz	577d271a75	MIR Serialization: Serialize the '.cfi_same_value' CFI directive. llvm-svn: 245103	2015-08-14 21:55:58 +00:00
Alex Lorenz	c3ba7508f6	MIR Serialization: Serialize the external symbol call entry pseudo source values. llvm-svn: 245098	2015-08-14 21:14:50 +00:00
Alex Lorenz	50b826fb75	MIR Serialization: Serialize the global value call entry pseudo source values. llvm-svn: 245097	2015-08-14 21:08:30 +00:00
Renato Golin	980b6cc42b	Revert "[ARM] Fix MachO CPU Subtype selection" This reverts commit r245081, as it breaks many builds. llvm-svn: 245086	2015-08-14 19:35:47 +00:00
Alex Lorenz	1039fd1ae5	MIR Serialization: Serialize the 'internal' register operand flag. llvm-svn: 245085	2015-08-14 19:07:07 +00:00
Alex Lorenz	f9a2b12361	MIR Serialization: Serialize the bundled machine instructions. llvm-svn: 245082	2015-08-14 18:57:24 +00:00
Vedant Kumar	2f079be789	[ARM] Fix MachO CPU Subtype selection This patch makes the Darwin ARM backend take advantage of TargetParser. It also teaches TargetParser about ARMV7K for the first time. This makes target triple parsing more consistent across llvm. Differential Revision: http://reviews.llvm.org/D11996 llvm-svn: 245081	2015-08-14 18:36:47 +00:00
Sanjay Patel	ed502905f7	[x86] fix allowsMisalignedMemoryAccess() implementation This patch fixes the x86 implementation of allowsMisalignedMemoryAccess() to correctly return the 'Fast' output parameter for 32-byte accesses. To test that, an existing load merging optimization is changed to use the TLI hook. This exposes a shortcoming in the current logic and results in the regression test update. Changing other direct users of the isUnalignedMem32Slow() x86 CPU attribute would be a follow-on patch. Without the fix in allowsMisalignedMemoryAccesses(), we will infinite loop when targeting SandyBridge because LowerINSERT_SUBVECTOR() creates 32-byte loads from two 16-byte loads while PerformLOADCombine() splits them back into 16-byte loads. Differential Revision: http://reviews.llvm.org/D10662 llvm-svn: 245075	2015-08-14 17:53:40 +00:00
Vedant Kumar	06f0678010	[test] Testing write access to llvm llvm-svn: 245074	2015-08-14 17:42:50 +00:00
Reid Kleckner	a57d015154	[sancov] Leave llvm.localescape in the entry block Summary: Similar to the change we applied to ASan. The same test case works. Reviewers: samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11961 llvm-svn: 245067	2015-08-14 16:45:42 +00:00
Chad Rosier	67dca908fe	Cleanup test whitespace or lack thereof. NFC. llvm-svn: 245065	2015-08-14 16:34:15 +00:00
Rafael Espindola	dbaf0498a9	Revert "Centralize the information about which object format we are using." This reverts commit r245047. It was failing on the darwin bots. The problem was that when running ./bin/llc -march=msp430 llc gets to if (TheTriple.getTriple().empty()) TheTriple.setTriple(sys::getDefaultTargetTriple()); Which means that we go with an arch of msp430 but a triple of x86_64-apple-darwin14.4.0 which fails badly. That code has to be updated to select a triple based on the value of march, but that is not a trivial fix. llvm-svn: 245062	2015-08-14 15:48:41 +00:00
Davide Italiano	c0ea1e6d63	Convert tests under MC/ELF from macho-dump to llvm-readobj. Yet another step towards deprecating macho-dump. llvm-svn: 245059	2015-08-14 15:16:37 +00:00
Rafael Espindola	90eb70c8a7	Centralize the information about which object format we are using. Other than some places that were handling unknown as ELF, this should have no change. The test updates are because we were detecting arm-coff or x86_64-win64-coff as ELF targets before. It is not clear if the enum should live on the Triple. At least now it lives in a single location and should be easier to move somewhere else. llvm-svn: 245047	2015-08-14 13:31:17 +00:00
Simon Pilgrim	1fdc177ccf	Renamed min tests (typo) llvm-svn: 245038	2015-08-14 11:03:31 +00:00
James Molloy	63be198712	[AArch64] FMINNAN/FMAXNAN on f16 is not legal. Spotted by Ahmed - in r244594 I inadvertently marked f16 min/max as legal. I've reverted it here, and marked min/max on scalar f16's as promote. I've also added a testcase. The test just checks that the compiler doesn't fall over - it doesn't create fmin nodes for f16 yet. llvm-svn: 245035	2015-08-14 09:08:50 +00:00
David Majnemer	b611e3f50e	[IR] Add token types This introduces the basic functionality to support "token types". The motivation stems from the need to perform operations on a Value whose provenance cannot be obscured. There are several applications for such a type but my immediate motivation stems from WinEH. Our personality routine enforces a single-entry - single-exit regime for cleanups. After several rounds of optimizations, we may be left with a terminator whose "cleanup-entry block" is not entirely clear because control flow has merged two cleanups together. We have experimented with using labels as operands inside of instructions which are not terminators to indicate where we came from but found that LLVM does not expect such exotic uses of BasicBlocks. Instead, we can use this new type to clearly associate the "entry point" and "exit point" of our cleanup. This is done by having the cleanuppad yield a Token and consuming it at the cleanupret. The token type makes it impossible to obscure or otherwise hide the Value, making it trivial to track the relationship between the two points. What is the burden to the optimizer? Well, it turns out we have already paid down this cost by accepting that there are certain calls that we are not permitted to duplicate, optimizations have to watch out for such instructions anyway. There are additional places in the optimizer that we will probably have to update but early examination has given me the impression that this will not be heroic. Differential Revision: http://reviews.llvm.org/D11861 llvm-svn: 245029	2015-08-14 05:09:07 +00:00
Karthik Bhat	ddc2a86a00	Add support for cross block dse. This patch enables dead stroe elimination across basicblocks. Example: define void @test_02(i32 %N) { %1 = alloca i32 store i32 %N, i32* %1 store i32 10, i32* @x %2 = load i32, i32* %1 %3 = icmp ne i32 %2, 0 br i1 %3, label %4, label %5 ; <label>:4 store i32 5, i32* @x br label %7 ; <label>:5 %6 = load i32, i32* @x store i32 %6, i32* @y br label %7 ; <label>:7 store i32 15, i32* @x ret void } In the above example dead store "store i32 5, i32* @x" is now eliminated. Differential Revision: http://reviews.llvm.org/D11143 llvm-svn: 245025	2015-08-14 04:17:23 +00:00
Jingyue Wu	1238f341ba	[SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow. Summary: This patch implements my promised optimization to reunites certain sexts from operands after we extract the constant offset. See the header comment of reuniteExts for its motivation. One key building block that enables this optimization is Bjarke's poison value analysis (D11212). That helps to prove "a +nsw b" can't overflow. Reviewers: broune Subscribers: jholewinski, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12016 llvm-svn: 245003	2015-08-14 02:02:05 +00:00
Alex Lorenz	5022f6bb81	MIR Serialization: Change MIR syntax - use custom syntax for MBBs. This commit modifies the way the machine basic blocks are serialized - now the machine basic blocks are serialized using a custom syntax instead of relying on YAML primitives. Instead of using YAML mappings to represent the individual machine basic blocks in a machine function's body, the new syntax uses a single YAML block scalar which contains all of the machine basic blocks and instructions for that function. This is an example of a function's body that uses the old syntax: body: - id: 0 name: entry instructions: - '%eax = MOV32r0 implicit-def %eflags' - 'RETQ %eax' ... The same body is now written like this: body: \| bb.0.entry: %eax = MOV32r0 implicit-def %eflags RETQ %eax ... This syntax change is motivated by the fact that the bundled machine instructions didn't map that well to the old syntax which was using a single YAML sequence to store all of the machine instructions in a block. The bundled machine instructions internally use flags like BundledPred and BundledSucc to determine the bundles, and serializing them as MI flags using the old syntax would have had a negative impact on the readability and the ease of editing for MIR files. The new syntax allows me to serialize the bundled machine instructions using a block construct without relying on the internal flags, for example: BUNDLE implicit-def dead %itstate, implicit-def %s1 ... { t2IT 1, 24, implicit-def %itstate %s1 = VMOVS killed %s0, 1, killed %cpsr, implicit killed %itstate } This commit also converts the MIR testcases to the new syntax. I developed a script that can convert from the old syntax to the new one. I will post the script on the llvm-commits mailing list in the thread for this commit. llvm-svn: 244982	2015-08-13 23:10:16 +00:00
Ahmed Bougacha	80e4ac802a	[AArch64] Provide "too few operands" diags on short-form NEON also. We used to just say "invalid type suffix for instruction", which is misleading. This is because we fallback to the long-form matcher if the short-form matcher failed, losing the error information on the way. Save it, so that we can provide a little better diagnostics when the long-form matcher thinks a suffix is the cause of the error. llvm-svn: 244955	2015-08-13 21:09:13 +00:00
Alex Lorenz	6866104073	MIR Parser: Don't allow negative alignments for memory operands. llvm-svn: 244953	2015-08-13 20:55:01 +00:00
Davide Italiano	a195386ca1	[SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttz If <src> is non-zero we can safely set the flag to true, and this results in less code generated for, e.g. ffs(x) + 1 on FreeBSD. Thanks to majnemer for suggesting the fix and reviewing. Code generated before the patch was applied: 0: 0f bc c7 bsf %edi,%eax 3: b9 20 00 00 00 mov $0x20,%ecx 8: 0f 45 c8 cmovne %eax,%ecx b: 83 c1 02 add $0x2,%ecx e: b8 01 00 00 00 mov $0x1,%eax 13: 85 ff test %edi,%edi 15: 0f 45 c1 cmovne %ecx,%eax 18: c3 retq Code generated after the patch was applied: 0: 0f bc cf bsf %edi,%ecx 3: 83 c1 02 add $0x2,%ecx 6: 85 ff test %edi,%edi 8: b8 01 00 00 00 mov $0x1,%eax d: 0f 45 c1 cmovne %ecx,%eax 10: c3 retq It seems we can still use cmove and save another 'test' instruction, but that can be tackled separately. Differential Revision: http://reviews.llvm.org/D11989 llvm-svn: 244947	2015-08-13 20:34:26 +00:00
Simon Pilgrim	829091e310	[X86][SSE] Tests for SMAX/SMIN/UMAX/UMIN vector instructions llvm-svn: 244944	2015-08-13 20:31:03 +00:00
Jingyue Wu	13a80eaceb	[SeparateConstOffsetFromGEP] strengthen the inbounds attribute We used to be over-conservative about preserving inbounds. Actually, the second GEP (which applies the constant offset) can inherit the inbounds attribute of the original GEP, because the resultant pointer is equivalent to that of the original GEP. For example, x = GEP inbounds a, i+5 => y = GEP a, i // inbounds removed x = GEP inbounds y, 5 // inbounds preserved llvm-svn: 244937	2015-08-13 18:48:49 +00:00
Nemanja Ivanovic	1c39ca6501	Scalar to vector conversions using direct moves This patch corresponds to review: http://reviews.llvm.org/D11471 It improves the code generated for converting a scalar to a vector value. With direct moves from GPRs to VSRs, we no longer require expensive stack operations for this. Subsequent patches will handle the reverse case and more general operations between vectors and their scalar elements. llvm-svn: 244921	2015-08-13 17:40:44 +00:00
Igor Laevsky	30143aee11	Emit argmemonly attribute for intrinsics. Differential Revision: http://reviews.llvm.org/D11352 llvm-svn: 244920	2015-08-13 17:40:04 +00:00
James Molloy	6f0a9d7e1b	[ARM] Rejig vmax tests a bit They rely on global fast-math options, but soon ISel will rely only on fast-math flags on the instructions themselves. Rip the fast checks out into their own file so we can mark their instructions as fast. llvm-svn: 244914	2015-08-13 17:28:16 +00:00
James Molloy	e7695bca0f	[AArch64] Small rejig of fmax tests, NFCI. These tests relied on -enable-no-nans-fp-math, whereas soon they'll take their no-nans hint from the FCMP instruction itself, so split the no-nans stuff out into its own test. Also do a slight rejig of instruction order. The old FMIN/MAX backend matching had to deal with looking through casts, which it never did particularly well. Now, instcombine will recognize such patterns and canonicalize the cast outside the select. So modify the test inputs to assume that instcombine has already run. llvm-svn: 244913	2015-08-13 17:28:10 +00:00
Erik Eckstein	11fc8175d9	[DeadStoreElimination] remove a redundant store even if the load is in a different block. DeadStoreElimination does eliminate a store if it stores a value which was loaded from the same memory location. So far this worked only if the store is in the same block as the load. Now we can also handle stores which are in a different block than the load. Example: define i32 @test(i1, i32) { entry: %l2 = load i32, i32 %1, align 4 br i1 %0, label %bb1, label %bb2 bb1: br label %bb3 bb2: ; This store is redundant store i32 %l2, i32* %1, align 4 br label %bb3 bb3: ret i32 0 } Differential Revision: http://reviews.llvm.org/D11854 llvm-svn: 244901	2015-08-13 15:36:11 +00:00
Petar Jovanovic	d22164dc3b	[mips][mcjit] Calculate correct addend for HI16 and PCHI16 reloc Previously, for O32 ABI we did not calculate correct addend for R_MIPS_HI16 and R_MIPS_PCHI16 relocations. This patch fixes that. Patch by Vladimir Radosavljevic. Differential Revision: http://reviews.llvm.org/D11186 llvm-svn: 244897	2015-08-13 15:12:49 +00:00
Joseph Tremoulet	c9ff914ced	[WinEHPrepare] Update demotion logic Summary: Update the demotion logic in WinEHPrepare to avoid creating new cleanups by walking predecessors as necessary to insert stores for EH-pad PHIs. Also avoid creating stores for EH-pad PHIs that have no uses. The store/load placement is still pretty naive. Likely future improvements (at least for optimized compiles) include: - Share loads for related uses as possible - Coalesce non-interfering use/def-related PHIs - Store at definition point rather than each PHI pred for non-interfering lifetimes. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11955 llvm-svn: 244894	2015-08-13 14:30:10 +00:00
Ulrich Weigand	a887f06214	[SystemZ] Support large LLVM IR struct return values Recent mesa/llvmpipe crashes on SystemZ due to a failed assertion when attempting to compile a routine with a return type of { <4 x float>, <4 x float>, <4 x float>, <4 x float> } on a system without vector instruction support. This is because after legalizing the vector type, we get a return value consisting of 16 floats, which cannot all be returned in registers. Usually, what should happen in this case is that the target's CanLowerReturn routine rejects the return type, in which case SelectionDAG falls back to implementing a structure return in memory via implicit reference. However, the SystemZ target never actually implemented any CanLowerReturn routine, and thus would accept any struct return type. This patch fixes the crash by implementing CanLowerReturn. As a side effect, this also handles fp128 return values, fixing a todo that was noted in SystemZCallingConv.td. llvm-svn: 244889	2015-08-13 13:37:06 +00:00
Charlie Turner	6153698f26	[InstCombinePHI] Partial simplification of identity operations. Consider this code: BB: %i = phi i32 [ 0, %if.then ], [ %c, %if.else ] %add = add nsw i32 %i, %b ... In this common case the add can be moved to the %if.else basic block, because adding zero is an identity operation. If we go though %if.then branch it's always a win, because add is not executed; if not, the number of instructions stays the same. This pattern applies also to other instructions like sub, shl, shr, ashr \| 0, mul, sdiv, div \| 1. Patch by Jakub Kuderski! llvm-svn: 244887	2015-08-13 12:38:58 +00:00
John Brawn	68acdcb435	[ARM] Reorganise and simplify thumb-1 load/store selection Other than PC-relative loads/store the patterns that match the various load/store addressing modes have the same complexity, so the order that they are matched is the order that they appear in the .td file. Rearrange the instruction definitions in ARMInstrThumb.td, and make use of AddedComplexity for PC-relative loads, so that the instruction matching order is the order that results in the simplest selection logic. This also makes register-offset load/store be selected when it should, as previously it was only selected for too-large immediate offsets. Differential Revision: http://reviews.llvm.org/D11800 llvm-svn: 244882	2015-08-13 10:48:22 +00:00
Simon Pilgrim	becd5e8abd	[InstCombine] SSE/AVX vector shifts demanded shift amount bits Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this. I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code, it means that SimplifyX86immshift now (re)decodes the type of shift. Differential Revision: http://reviews.llvm.org/D11938 llvm-svn: 244872	2015-08-13 07:39:03 +00:00
Ahmed Bougacha	a196661bb0	[CodeGen] Mark the promoted FCOPYSIGN result FP_ROUND as TRUNCating. Now that we can properly promote mismatched FCOPYSIGNs (r244858), we can mark the FP_ROUND on the result as truncating, to expose folding. FCOPYSIGN doesn't change anything but the sign bit, so (fp_round (fcopysign (fpext a), b)) is equivalent to (modulo the sign bit): (fp_round (fpext a)) which is a no-op. llvm-svn: 244862	2015-08-13 01:32:30 +00:00
Ahmed Bougacha	b2a9ed910e	[AArch64] Cleanup vector-fcopysign.ll test. NFC. llvm-svn: 244861	2015-08-13 01:20:38 +00:00
Ahmed Bougacha	2a97b1bcf8	[AArch64] Also custom-lowering mismatched vector/f16 FCOPYSIGN. We can lower them using our cool tricks if we fpext/fptrunc the second input, like we do for f32/f64. Follow-up to r243924, r243926, and r244858. llvm-svn: 244860	2015-08-13 01:13:56 +00:00
Ahmed Bougacha	40ded502ff	[CodeGen] When Promoting, don't extend the 2nd FCOPYSIGN operand. We don't care about its type, and there's even a combine that'll fold away the FP_EXTEND if we let it run. However, until it does, we'll have something broken like: (f32 (fp_extend (f64 v))) Scalar f16 follow-up to r243924. llvm-svn: 244858	2015-08-13 01:09:43 +00:00
Philip Reames	971dc3a82a	[RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints To be clear: this is an optimization not a correctness change. CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuze many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll. One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust. I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of it's inputs potentially. Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly. Differential Revision: http://reviews.llvm.org/D11819 llvm-svn: 244821	2015-08-12 22:11:45 +00:00
Alex Lorenz	2791dcca60	MIR Parser: Allow the MI IR references to reference global values. This commit fixes a bug where MI parser couldn't resolve the named IR references that referenced named global values. llvm-svn: 244817	2015-08-12 21:27:16 +00:00
Alex Lorenz	0cc671bf79	MIR Serialization: Serialize the fixed stack pseudo source values. llvm-svn: 244816	2015-08-12 21:23:17 +00:00
Alex Lorenz	4be56e9370	MIR Serialization: Serialize the jump table pseudo source values. llvm-svn: 244813	2015-08-12 21:11:08 +00:00
Alex Lorenz	d858f874fa	MIR Serialization: Serialize the GOT pseudo source values. llvm-svn: 244809	2015-08-12 21:00:22 +00:00
Philip Reames	9ac4e38a16	[RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :) This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed. Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement. That change will follow in the near future. llvm-svn: 244808	2015-08-12 21:00:20 +00:00
Alex Lorenz	46e9558ac6	MIR Serialization: Serialize the stack pseudo source values. llvm-svn: 244806	2015-08-12 20:44:16 +00:00
Alex Lorenz	91097a3ffa	MIR Serialization: Serialize the constant pool pseudo source values. llvm-svn: 244803	2015-08-12 20:33:26 +00:00
JF Bastien	71d29acecd	WebAssembly: floating-point comparisons Summary: D11924 implemented part of the floating-point comparisons, this patch implements the rest: * Tell ISelLowering that all booleans are either 0 or 1. * Expand the eq/ne/lt/le/gt/ge floating-point comparisons to the canonical ones (similar to what Mips32r6InstrInfo.td does). * Add tests for ord/uno. * Add tests for ueq/one/ult/ule/ugt/uge. * Fix existing comparison tests to remove the (res & 1) code, which setBooleanContents stops from generating. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11970 llvm-svn: 244779	2015-08-12 17:53:29 +00:00
Simon Pilgrim	a5737a44da	Cleaned up test. NFCI. llvm-svn: 244765	2015-08-12 17:00:50 +00:00
John Brawn	75fc09ddba	Redo "Make global aliases have symbol size equal to their type" r242520 was reverted in r244313 as the expected behaviour of the alias attribute in C is that the alias has the same size as the aliasee. However we can re-introduce adding the size on the alias when the aliasee does not, from a source code or object perspective, exist as a discrete entity. This happens when the aliasee is not a symbol, or when that symbol is private. Differential Revision: http://reviews.llvm.org/D11943 llvm-svn: 244752	2015-08-12 15:05:39 +00:00
John Brawn	0bef27d836	[GlobalMerge] Only emit aliases for internal linkage variables for non-Mach-O On Mach-O emitting aliases for the variables that make up a MergedGlobals variable can cause problems when linking with dead stripping enabled so don't do that, except for external variables where we must emit an alias. llvm-svn: 244748	2015-08-12 13:36:48 +00:00
Zoran Jovanovic	366783e14c	[mips][microMIPS] Create microMIPS64r6 subtarget and implement DALIGN, DAUI, DAHI, DATI, DEXT, DEXTM and DEXTU instructions Differential Revision: http://reviews.llvm.org/D10923 llvm-svn: 244744	2015-08-12 12:45:16 +00:00
Michael Kuperstein	fe0d9bb6eb	[X86] Disable mul -> shl + lea combine when compiling for minsize Differential Revision: http://reviews.llvm.org/D11904 llvm-svn: 244740	2015-08-12 11:27:26 +00:00
Davide Italiano	96887f755b	[MC] Convert the last test using macho-dump under X86/ to llvm-readobj. llvm-svn: 244732	2015-08-12 10:36:16 +00:00
Michael Kuperstein	bc7f99a3ab	[X86] Allow x86 call frame optimization to fold more loads into pushes This abstracts away the test for "when can we fold across a MachineInstruction" into the the MI interface, and changes call-frame optimization use the same test the peephole optimizer users. Differential Revision: http://reviews.llvm.org/D11945 llvm-svn: 244729	2015-08-12 10:14:58 +00:00
Matt Arsenault	c574686529	AMDGPU: Fix assert on dbg_value instructions llvm-svn: 244728	2015-08-12 09:04:44 +00:00
Simon Pilgrim	8c049d5c03	[InstCombine] Move SSE/AVX vector blend folding to instcombiner As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely). InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask). I also moved all the relevant combine tests into InstCombine/blend_x86.ll Differential Revision: http://reviews.llvm.org/D11934 llvm-svn: 244723	2015-08-12 08:08:56 +00:00
Sanjay Patel	260b6d36f4	[x86] enable machine combiner reassociations for 256-bit vector FP mul/add llvm-svn: 244705	2015-08-12 00:29:10 +00:00
Adam Nemet	e2f6d34d21	[LoopDist] Add test for missing coverage Add a testcase to ensure that if we can't find bounds for a necessary memcheck we don't distribute. llvm-svn: 244703	2015-08-12 00:21:59 +00:00
Adam Nemet	abc794d3db	[LAA] Fix typo in test llvm-svn: 244690	2015-08-11 23:03:09 +00:00
Mark Heffernan	438ffe5eac	Use 32-bit divides instead of 64-bit divides where possible. For NVPTX, try to use 32-bit division instead of 64-bit division when the dividend and divisor fit in 32 bits. This speeds up some internal benchmarks significantly. The underlying reason is that many index computations are carried out in 64-bits but never actually exceed the capacity of a 32-bit word. llvm-svn: 244684	2015-08-11 22:16:34 +00:00
Paul Robinson	78046b49a9	Make DW_AT_[MIPS_]linkage_name optional, and off by default for SCE. Mangled "linkage" names can be huge, and if the debugger (or other tools) have no use for them, the size savings can be very impressive (on the order of 40%). Add one test for controlling behavior, and modify a number of tests to either stop using linkage names, or make llc emit them (so these tests will still run when the default triple is for PS4). Differential Revision: http://reviews.llvm.org/D11374 llvm-svn: 244678	2015-08-11 21:36:45 +00:00
Sanjoy Das	827529e7a0	Fix PR24354. `InstCombiner::OptimizeOverflowCheck` was asserting an invariant (operands to binary operations are ordered by decreasing complexity) that wasn't really an invariant. Fix this by instead having `InstCombiner::OptimizeOverflowCheck` establish the invariant if it does not hold. llvm-svn: 244676	2015-08-11 21:33:55 +00:00
JF Bastien	da06bce8b5	WebAssembly: implement comparison. Some of the FP comparisons (ueq, one, ult, ule, ugt, uge) are currently broken, I'll fix them in a follow-up. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11924 llvm-svn: 244665	2015-08-11 21:02:46 +00:00
Sanjay Patel	2c6a01570d	[x86] enable machine combiner reassociations for 128-bit vector single/double multiplies llvm-svn: 244657	2015-08-11 20:19:23 +00:00
Sanjay Patel	82d91ddb4f	fix minsize detection: minsize attribute implies optimizing for size Also, add a test for optsize because this was not part of any existing regression test. llvm-svn: 244651	2015-08-11 19:39:36 +00:00
Jingyue Wu	99eb4685ef	SelectionDAG: Prefer to combine multiplication with less uses for fma Summary: For example: s6 = s0s5; s2 = s6s6 + s6; ... s4 = s6*s3; We notice that it is possible for s2 is folded to fma (s0, s5, fmul (s6 s6)). This only happens when Aggressive is true, otherwise hasOneUse() check already prevents from folding the multiplication with more uses. Test Plan: test/CodeGen/NVPTX/fma-assoc.ll Patch by Xuetian Weng Reviewers: hfinkel, apazos, jingyue, ohsallen, arsenm Subscribers: arsenm, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11855 llvm-svn: 244649	2015-08-11 19:21:46 +00:00
Chen Li	10f01bd4d3	[LowerSwitch] Fix a bug when LowerSwitch deletes the default block Summary: LowerSwitch crashed with the attached test case after deleting the default block. This happened because the current implementation of deleting dead blocks is wrong. After the default block being deleted, it contains no instruction or terminator, and it should no be traversed anymore. However, since the iterator is advanced before processSwitchInst() function is executed, the block advanced to could be deleted inside processSwitchInst(). The deleted block would then be visited next and crash dyn_cast<SwitchInst>(Cur->getTerminator()) because Cur->getTerminator() returns a nullptr. This patch fixes this problem by recording dead default blocks into a list, and delete them after all processSwitchInst() has been done. It still possible to visit dead default blocks and waste time process them. But it is a compile time issue, and I plan to have another patch to add support to skip dead blocks. Reviewers: kariddi, resistor, hans, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11852 llvm-svn: 244642	2015-08-11 18:12:26 +00:00
Sanjay Patel	070df89928	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244631	2015-08-11 17:04:31 +00:00
Sanjay Patel	caddda56aa	add missing tests for powi expansion with size optimizations The minsize test will be fixed in the next commit. llvm-svn: 244630	2015-08-11 16:58:49 +00:00
Sanjay Patel	c3e8349a3e	fixed to use FileCheck llvm-svn: 244627	2015-08-11 16:51:31 +00:00
Sanjay Patel	605b6adf31	fixed to test attribute, rather than CPU llvm-svn: 244625	2015-08-11 16:43:18 +00:00
Sanjay Patel	cdd5ec47ed	fix typos; NFC llvm-svn: 244619	2015-08-11 16:10:41 +00:00
Sanjay Patel	fec7965b36	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244617	2015-08-11 15:56:31 +00:00
John Brawn	863bfdbfb4	[GlobalMerge] Use private linkage for MergedGlobals variables Other objects can never reference the MergedGlobals symbol so external linkage is never needed. Using private instead of internal linkage means the object is more similar to what it looks like when global merging is not enabled, with the only difference being that the merged variables are addressed indirectly relative to the start of the section they are in. Also add aliases for merged variables with internal linkage, as this also makes the object be more like what it is when they are not merged. Differential Revision: http://reviews.llvm.org/D11942 llvm-svn: 244615	2015-08-11 15:48:04 +00:00
Mehdi Amini	b10555cc61	Fix InstCombine test: invalid CHECK line slipped in r231270 I incorrectly wrote CHECK-NEXT with followin with ':', the check was ignored by FileCheck. The non-inbound GEP is folded here because the DataLayout is no longer optional, the fold was originally guarded with a comment that said: We need TD information to know the pointer size unless this is inbounds. Now we always have "TD information" and perform the fold. Thanks Jonathan Roelofs for noticing. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 244613	2015-08-11 15:31:17 +00:00
Sanjay Patel	b5c0c58737	remove unnecessary settings/attributes from test case llvm-svn: 244612	2015-08-11 15:30:53 +00:00
Sanjay Patel	c454f07eb1	delete FIXME comment; it's fixed llvm-svn: 244605	2015-08-11 14:35:29 +00:00
Sanjay Patel	74ca312666	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244604	2015-08-11 14:31:14 +00:00
Sanjay Patel	52c2691829	add missing test for machine combiner when optimizing for size The minsize test will be fixed in the next commit. llvm-svn: 244603	2015-08-11 14:29:45 +00:00
Michael Kuperstein	243c073a2e	[X86] Allow merging of immediates within a basic block for code size savings First step in preventing immediates that occur more than once within a single basic block from being pulled into their users, in order to prevent unnecessary large instruction encoding .Currently enabled only when optimizing for size. Patch by: zia.ansari@intel.com Differential Revision: http://reviews.llvm.org/D11363 llvm-svn: 244601	2015-08-11 14:10:58 +00:00
James Molloy	b7b2a1e9b4	[AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an intrinsic. Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change: - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant. - f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall! llvm-svn: 244595	2015-08-11 12:06:37 +00:00
Marina Yatsina	8c997af103	[X86] Add SAL mnemonics for Intel syntax SAL and SHL instructions perform the same operation Differential Revision: http://reviews.llvm.org/D11882 llvm-svn: 244588	2015-08-11 12:05:06 +00:00
Marina Yatsina	d353c45eaf	[X86] Fix REPE, REPZ, REPNZ for intel syntax REPE, REPZ, REPNZ, REPNE should have mnemonics for Intel syntax as well. Currently using these instructions causes compilation errors for Intel syntax. Differential Revision: http://reviews.llvm.org/D11794 llvm-svn: 244584	2015-08-11 11:28:10 +00:00
Marina Yatsina	f6bc15d763	[X86] Fix imul alias for intel syntax The "imul reg, imm" alias is not defined for intel syntax. In intel syntax there is no w/l/q suffix for the imul instruction. Differential Revision: http://reviews.llvm.org/D11887 llvm-svn: 244582	2015-08-11 10:43:04 +00:00
James Molloy	134bec2722	Add support for floating-point minnum and maxnum The select pattern recognition in ValueTracking (as used by InstCombine and SelectionDAGBuilder) only knew about integer patterns. This teaches it about minimum and maximum operations. matchSelectPattern() has been extended to return a struct containing the existing Flavor and a new enum defining the pattern's behavior when given one NaN operand. C minnum() is defined to return the non-NaN operand in this case, but the idiomatic C "a < b ? a : b" would return the NaN operand. ARM and AArch64 at least have different instructions for these different cases. llvm-svn: 244580	2015-08-11 09:12:57 +00:00
Vasileios Kalintiris	1c78ca6a09	[mips] Remap move as or. Summary: This patch remaps the assembly idiom 'move' to 'or' instead of 'daddu' or 'addu'. The use of addu/daddu instead of or as move was highlighted as a performance issue during the analysis of a recent 64bit design. Originally move was encoded as 'or' by binutils but was changed for the r10k cpu family due to their pipeline which had 2 arithmetic units and a single logical unit, and so could issue multiple (d)addu based moves at the same time but only 1 logical move. This patch preserves the disassembly behaviour so that disassembling a old style (d)addu move still appears as move, but assembling move always gives an or Patch by Simon Dardis. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11796 llvm-svn: 244579	2015-08-11 08:56:25 +00:00
Michael Kuperstein	7337ee23d8	[X86] When optimizing for minsize, use POP for small post-call stack clean-up When optimizing for size, replace "addl $4, %esp" and "addl $8, %esp" following a call by one or two pops, respectively. We don't try to do it in general, but only when the stack adjustment immediately follows a call - which is the most common case. That allows taking a short-cut when trying to find a free register to pop into, instead of a full-blown liveness check. If the adjustment immediately follows a call, then every register the call clobbers but doesn't define should be dead at that point, and can be used. Differential Revision: http://reviews.llvm.org/D11749 llvm-svn: 244578	2015-08-11 08:48:48 +00:00
Michael Kuperstein	82814f63c0	Allow PeepholeOptimizer to fold a few more cases The condition for clearing the folding candidate list was clamped together with the "uninteresting instruction" condition. This is too conservative, e.g. we don't need to clear the list when encountering an IMPLICIT_DEF. Differential Revision: http://reviews.llvm.org/D11591 llvm-svn: 244577	2015-08-11 08:19:43 +00:00
Michael Kuperstein	07f31d92ca	[GMR] Be a bit smarter about which globals don't alias when doing recursive lookups Should hopefully fix the remainder of PR24288. Differential Revision: http://reviews.llvm.org/D11900 llvm-svn: 244575	2015-08-11 08:06:44 +00:00
Lang Hames	0fd3610e6d	[RuntimeDyld][AArch64] Add explicit addends before calling relocationValueRef. relocationValueRef uses the addend, so it has to be set before the call. llvm-svn: 244574	2015-08-11 06:27:53 +00:00
Yaron Keren	4988786b0f	Enable five passing dsymutil tests on Windows. These tests pass with Windows 7 x64 + MSYS2. I'll see if the bots like them as well and disable the failing ones. llvm-svn: 244572	2015-08-11 06:05:27 +00:00
David Majnemer	85a549dbc8	[IR] Verify EH pad predecessors Make sure that an EH pad's predecessors are using their unwind edge to transfer control to the EH pad. llvm-svn: 244563	2015-08-11 02:48:30 +00:00
JF Bastien	ef172fc9f0	WebAssembly: add basic floating-point tests Summary: I somehow forgot to add these when I added the basic floating-point opcodes. Also remove ceil/floor/trunc/nearestint for now, and add them only when properly tested. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11927 llvm-svn: 244562	2015-08-11 02:45:15 +00:00
Tyler Nowicki	c94d6ad241	Print vectorization analysis when loop hint is specified. This patch and a relatec clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints. llvm-svn: 244555	2015-08-11 01:09:15 +00:00
JF Bastien	e73ce68225	WebAssembly: simply assert on SNaN and NaNs with payloads Summary: convertToHexString doesn't represent them correctly at this point in time. This is a follow-up to sunfish's suggestion in D11914. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11925 llvm-svn: 244551	2015-08-11 00:49:20 +00:00
Alex Lorenz	c483808785	MIR Serialization: Serialize UsedPhysRegMask from the machine register info. This commit serializes the UsedPhysRegMask register mask from the machine register information class. The mask is serialized as an inverted 'calleeSavedRegisters' mask to keep the output minimal. This commit also allows the MIR parser to infer this mask from the register mask operands if the machine function doesn't specify it. Reviewers: Duncan P. N. Exon Smith llvm-svn: 244548	2015-08-11 00:32:49 +00:00
Kostya Serebryany	2569118621	[libFuzzer] don't crash if the condition in a switch has unusual type (e.g. i72) llvm-svn: 244544	2015-08-11 00:24:39 +00:00
Sanjoy Das	7742b8ba15	Address post-commit review from r243378. This checks that bork_directive occurs exactly twice in the test output. llvm-svn: 244543	2015-08-11 00:20:24 +00:00
Alex Lorenz	c5d35ba009	MIR Parser: Report an error when a stack object is redefined. llvm-svn: 244536	2015-08-10 23:50:41 +00:00
Joerg Sonnenberger	ebe7bf44ec	Add lduw and lwua aliases for SPARCv9. llvm-svn: 244535	2015-08-10 23:47:22 +00:00
Alex Lorenz	1d9a303142	MIR Parser: Report an error when a fixed stack object is redefined. llvm-svn: 244534	2015-08-10 23:45:02 +00:00
Joerg Sonnenberger	2ee3d76737	Load/store for float registers from/to alternate space. llvm-svn: 244532	2015-08-10 23:33:17 +00:00
Alex Lorenz	b97c9ef4d0	MIR Serialization: Serialize the liveout register mask machine operands. llvm-svn: 244529	2015-08-10 23:24:42 +00:00
Sanjay Patel	d967a878fa	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244528	2015-08-10 23:07:26 +00:00
Tyler Nowicki	652b0dabe6	Extend late diagnostics to include late test for runtime pointer checks. This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that infroms the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization. llvm-svn: 244523	2015-08-10 23:01:55 +00:00
JF Bastien	4a6422562d	WebAssembly: print immediates Summary: For now output using C99's hexadecimal floating-point representation. This patch also cleans up how machine operands are printed: instead of special-casing per type of machine instruction, the code now handles operands generically. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11914 llvm-svn: 244520	2015-08-10 22:36:48 +00:00
Joerg Sonnenberger	6dce129051	Add support for the signx instrution alias of SPARCv9. llvm-svn: 244519	2015-08-10 22:32:25 +00:00
Alex Lorenz	e5101e2016	MachineVerifier: Handle the optional def operand in a PATCHPOINT instruction. The PATCHPOINT instructions have a single optional defined register operand, but the machine verifier can't verify the optional defined register operands. This commit makes sure that the machine verifier won't report an error when a PATCHPOINT instruction doesn't have its optional defined register operand. This change will allow us to enable the machine verifier for the code generation tests for the patchpoint intrinsics. Reviewers: Juergen Ributzka llvm-svn: 244513	2015-08-10 21:47:36 +00:00
Reid Kleckner	c25c7944f0	[llvm-symbolizer] Remove underscores and other C mangling on Windows Summary: This makes it so that reports symbolized after the fact with llvm-symbolizer are more similar to the ones we generate at runtime with in-process dbghelp. Reviewers: samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11785 llvm-svn: 244512	2015-08-10 21:47:11 +00:00
Alex Lorenz	2f43dd5a12	StackMap: FastISel: Add an appropriate number of immediate operands to the frame setup instruction. This commit ensures that the stack map lowering code in FastISel adds an appropriate number of immediate operands to the frame setup instruction. The previous code added just one immediate operand, which was fine for a target like AArch64, but on X86 the ADJCALLSTACKDOWN64 instruction needs two explicit operands. This caused the machine verifier to report an error when the old code added just one. Reviewers: Juergen Ributzka Differential Revision: http://reviews.llvm.org/D11853 llvm-svn: 244508	2015-08-10 21:27:03 +00:00
Tyler Nowicki	655e573dc5	Make fp vectorization test X86 specified to avoid cost-model related problems on arm-thumb and hexagon. llvm-svn: 244505	2015-08-10 21:14:38 +00:00
Rafael Espindola	3db2273861	Add a test showing that objdump (and so ObjectFIle) can handle shndx. It was already passing, we were just not testing the code. llvm-svn: 244504	2015-08-10 21:00:15 +00:00
JF Bastien	fa9746dc8d	x86: Emit LAHF/SAHF instead of PUSHF/POPF NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS, this commit now generated LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF. As with the previous patch this code generation is pretty bad because it occurs very later, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire. I did [[ https://github.com/jfbastien/benchmark-x86-flags \| a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are: \| Time per call (ms) \| Runtime (ms) \| Benchmark \| \| 0.000012514 \| 6257 \| sete.i386 \| \| 0.000012810 \| 6405 \| sete.i386-fast \| \| 0.000010456 \| 5228 \| sete.x86-64 \| \| 0.000010496 \| 5248 \| sete.x86-64-fast \| \| 0.000012906 \| 6453 \| lahf-sahf.i386 \| \| 0.000013236 \| 6618 \| lahf-sahf.i386-fast \| \| 0.000010580 \| 5290 \| lahf-sahf.x86-64 \| \| 0.000010304 \| 5152 \| lahf-sahf.x86-64-fast \| \| 0.000028056 \| 14028 \| pushf-popf.i386 \| \| 0.000027160 \| 13580 \| pushf-popf.i386-fast \| \| 0.000023810 \| 11905 \| pushf-popf.x86-64 \| \| 0.000026468 \| 13234 \| pushf-popf.x86-64-fast \| Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at least not for this purpose. Reviewers: rnk, jvoung, t.p.northover Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D6629 llvm-svn: 244503	2015-08-10 20:59:36 +00:00
Sanjay Patel	d09391c8cd	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244499	2015-08-10 20:45:44 +00:00
Sanjay Patel	178f8cba51	[x86, SSE]]add missing tests for load folding with partial register update The minsize case is wrong; that will be fixed in the next commit. llvm-svn: 244498	2015-08-10 20:34:34 +00:00
Simon Pilgrim	a3a72b41de	[InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations. Differential Revision: http://reviews.llvm.org/D11886 llvm-svn: 244495	2015-08-10 20:21:15 +00:00
Jonathan Roelofs	f45295c366	Fix a few more cases of 'CHECK[^:]*$'. NFCI llvm-svn: 244491	2015-08-10 19:56:39 +00:00
Tyler Nowicki	c1a86f5866	Late evaluation of the fast-math vectorization requirement. This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options would that allow floating-point commutativity. Specifically those are enableing fast-math or specifying a loop hint. llvm-svn: 244489	2015-08-10 19:51:46 +00:00
Jonathan Roelofs	5dcf157443	Fix another case of 'CHECK[^:]*$'. NFCI llvm-svn: 244486	2015-08-10 19:22:55 +00:00
Tyler Nowicki	4d62f2e039	Modify diagnostic messages to clearly indicate the why interleaving wasn't done. Sometimes interleaving is not beneficial, as determined by the cost-model and sometimes it is disabled by a loop hint (by the user). This patch modifies the diagnostic messages to make it clear why interleaving wasn't done. llvm-svn: 244485	2015-08-10 19:14:16 +00:00
James Y Knight	3994be87de	[Sparc] Implement i64 load/store support for 32-bit sparc. The LDD/STD instructions can load/store a 64bit quantity from/to memory to/from a consecutive even/odd pair of (32-bit) registers. They are part of SparcV8, and also present in SparcV9. (Although deprecated there, as you can store 64bits in one register). As recommended on llvmdev in the thread "How to enable use of 64bit load/store for 32bit architecture" from Apr 2015, I've modeled the 64-bit load/store operations as working on a v2i32 type, rather than making i64 a legal type, but with few legal operations. The latter does not (currently) work, as there is much code in llvm which assumes that if i64 is legal, operations like "add" will actually work on it. The same assumption does not hold for v2i32 -- for vector types, it is workable to support only load/store, and expand everything else. This patch: - Adds a new register class, IntPair, for even/odd pairs of registers. - Modifies the list of reserved registers, the stack spilling code, and register copying code to support the IntPair register class. - Adds support in AsmParser. (note that in asm text, you write the name of the first register of the pair only. So the parser has to morph the single register into the equivalent paired register). - Adds the new instructions themselves (LDD/STD/LDDA/STDA). - Hooks up the instructions and registers as a vector type v2i32. Adds custom legalizer to transform i64 load/stores into v2i32 load/stores and bitcasts, so that the new instructions can actually be generated, and marks all operations other than load/store on v2i32 as needing to be expanded. - Copies the unfortunate SelectInlineAsm hack from ARMISelDAGToDAG. This hack undoes the transformation of i64 operands into two arbitrarily-allocated separate i32 registers in SelectionDAGBuilder. and instead passes them in a single IntPair. (Arbitrarily allocated registers are not useful, asm code expects to be receiving a pair, which can be passed to ldd/std.) Also adds a bunch of test cases covering all the bugs I've added along the way. Differential Revision: http://reviews.llvm.org/D8713 llvm-svn: 244484	2015-08-10 19:11:39 +00:00
Jonathan Roelofs	49e46ce8e2	Fix a bunch of trivial cases of 'CHECK[^:]*$' in the tests. NFCI I looked into adding a warning / error for this to FileCheck, but there doesn't seem to be a good way to avoid it triggering on the instances of it in RUN lines. llvm-svn: 244481	2015-08-10 19:01:27 +00:00
Mark Heffernan	8939154a22	Add new llvm.loop.unroll.enable metadata. This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time. The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll". llvm-svn: 244466	2015-08-10 17:28:08 +00:00
Sanjay Patel	10294b59de	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244464	2015-08-10 17:15:17 +00:00
Sanjay Patel	0f12d71b49	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244463	2015-08-10 17:00:44 +00:00
Sanjay Patel	68b0325a9e	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244460	2015-08-10 16:47:47 +00:00
Sanjay Patel	9a9003d94c	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244458	2015-08-10 16:43:20 +00:00
Fraser Cormack	e29ab2bfab	Prevent the scalarizer from caching incorrect entries The scalarizer can cache incorrect entries when walking up a chain of insertelement instructions. This occurs when it encounters more than one instruction that it is not actively searching for, as it unconditionally caches every element it finds. The fix is to only cache the first element that it isn't searching for so we don't overwrite correct entries. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D11559 llvm-svn: 244448	2015-08-10 14:48:47 +00:00
Robert Lougher	11a44b78a3	Trace copies when checking for rematerializability in spill weight calculation PR24139 contains an analysis of poor register allocation. One of the findings was that when calculating the spill weight, a rematerializable interval once split is no longer rematerializable. This is because the isRematerializable check in CalcSpillWeights.cpp does not follow the copies introduced by live range splitting (after splitting, the live interval register definition is a copy which is not rematerializable). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D11686 llvm-svn: 244439	2015-08-10 11:59:44 +00:00
David Majnemer	4232fb3f8d	[PHITransAddr] Don't assume that instruction operands are translatable We can only PHI translate instructions. In our attempt to PHI translate a bitcast, we attempt to translate its operand; however, the operand might be an argument or a global instead of an instruction. Benignly bail out when this happens. This fixes PR24397. Differential Revision: http://reviews.llvm.org/D11879 llvm-svn: 244418	2015-08-09 15:43:02 +00:00
Sanjay Patel	e0178262d4	[x86] enable machine combiner reassociations for 128-bit vector single/double adds llvm-svn: 244403	2015-08-08 19:08:20 +00:00
Sanjay Patel	0f51d14957	add a missing regression test for a DAGCombiner FDIV optimization There's no test for this transform in any backend. Discovered while debugging fast-math-flag propagation in the DAG (r244053). llvm-svn: 244373	2015-08-07 23:19:41 +00:00
Tom Stellard	fd25395c72	AMDGPU: Add pass to lower OpenCL image and sampler arguments. The pass adds new kernel arguments for image attributes, and resolves calls to dummy attribute and resource id getter functions. Patch by: Zoltan Gilian llvm-svn: 244372	2015-08-07 23:19:30 +00:00
James Y Knight	0cab80c9b3	[SPARC] Disable unsupported ExecutionEngine tests, and XFAIL a couple of DebugInfo tests. llvm-svn: 244371	2015-08-07 23:01:16 +00:00
Tom Stellard	8ebad11ee9	AMDGPU/SI: Use InstAlias instead of MnemonicAlias for VOPC instructions Summary: With InstAlias, we don't need to print the _e32 portion of the mnemonic when we print the $dst operand. This change makes it possible to include vcc in the asm string when we switch VOPC over to having implicit vcc defs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11813 llvm-svn: 244362	2015-08-07 22:00:56 +00:00
Sanjay Patel	8d768b5b3a	redo r244360 (tighten checks...) after specifying triple llvm-svn: 244361	2015-08-07 21:42:24 +00:00
Sanjay Patel	0e1d731631	tighten checks using update_llc_test_checks.py llvm-svn: 244360	2015-08-07 21:38:53 +00:00
Alex Lorenz	61420f790d	MIR Serialization: Serialize the base alignment for the machine memory operands. llvm-svn: 244357	2015-08-07 20:48:30 +00:00
Alex Lorenz	83127739ff	MIR Serialization: Serialize the offsets for the machine memory operands. llvm-svn: 244356	2015-08-07 20:26:52 +00:00
Matt Arsenault	711b390a7c	AMDGPU: Assume SMRD access for constant address space Since r243294 these are selected to SMRD and moved later if required. llvm-svn: 244354	2015-08-07 20:18:34 +00:00
Chen Li	eafbc9dc47	[ConstantFoldTerminator] Preserve make.implicit metadata when converting SwitchInst to BranchInst Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion. Reviewers: sanjoy, weimingz, chenli Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11841 llvm-svn: 244348	2015-08-07 19:30:12 +00:00
Simon Pilgrim	3815c16bf8	[InstCombine] Fix SSE2/AVX2 vector logical shift by constant This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes. e.g. %1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>) In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) \| 15) - giving a zero result. In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821). Differential Revision: http://reviews.llvm.org/D11760 llvm-svn: 244341	2015-08-07 18:22:50 +00:00
Tom Stellard	c8733e805e	AMDGPU/SI: Use correct encoding of vopc for VI in the assembler Summary: We were using the SI encoding for VI. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11812 llvm-svn: 244332	2015-08-07 16:45:33 +00:00
Tom Stellard	d37631a8ac	AMDGPU/SI: Add VI checks to vop3 assembler tests Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11811 llvm-svn: 244331	2015-08-07 16:45:30 +00:00
Rafael Espindola	f7eb882176	add missing tests files llvm-svn: 244323	2015-08-07 15:35:49 +00:00
Rafael Espindola	e01f43bcc1	Add dynamic_table iterators back to ELF.h. In tree they are only used by llvm-readobj, but it is also used by https://github.com/mono/CppSharp. While at it, add some missing error checking. llvm-svn: 244320	2015-08-07 15:25:20 +00:00
Frederic Riss	a5e1453ac3	[dsymutil] Use the new MCDwarfLineTableParams customization to emit linetables llvm-dsymutil has to be able to process debug info produced by other compilers which use different line table settings. The testcase wasn't generated by another compiler, but by a modified clang. llvm-svn: 244319	2015-08-07 15:14:13 +00:00
Silviu Baranga	3e8e51c1a9	[ARM] Update ReconstructShuffle to handle mismatched types Summary: Port the ReconstructShuffle function from AArch64 to ARM to handle mismatched incoming types in the BUILD_VECTOR node. This fixes an outstanding FIXME in the ReconstructShuffle code. Reviewers: t.p.northover, rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11720 llvm-svn: 244314	2015-08-07 11:40:46 +00:00
John Brawn	64e5a66794	Revert "Make global aliases have symbol size equal to their type" This reverts r242520, as it caused pr24379. Also removes part of the test added by r243874 that checks the size of alias symbols. llvm-svn: 244313	2015-08-07 10:56:21 +00:00
NAKAMURA Takumi	b918a404c3	Tweak llvm/test/tools/dsymutil/arch-option.test to avoid globbing on mingw-w64. llvm-svn: 244311	2015-08-07 08:38:22 +00:00
JF Bastien	315cc06840	WebAssembly: textual emission uses expected opcode names Summary: WebAssembly's tablegen instructions have the names WebAssembly expects, but by LLVM convention they're uppercase and suffixed with their type after an underscore. Leave the C++ code that way, but print outt he names WebAssembly expects (lowercase, no type). We could teach tablegen to do this later, maybe by using `!cast<string>(node)` in the .td files. Reviewers: sunfish Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D11776 llvm-svn: 244305	2015-08-07 01:57:03 +00:00
Tom Stellard	f594fcad73	ELF: Add AMDGPU specific defintions Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11458 llvm-svn: 244303	2015-08-07 01:35:24 +00:00
Alex Lorenz	cba8c5fe31	MIR Serialization: Fix serialization of unnamed IR block references. The block address machine operands can reference IR blocks in other functions. This commit fixes a bug where the references to unnamed IR blocks in other functions weren't serialized correctly. llvm-svn: 244299	2015-08-06 23:57:04 +00:00
Juergen Ributzka	f09c7a3d0f	[AArch64][FastISel] Always use AND before checking the branch flag. When we are not emitting the condition for the branch, because the condition is in another BB or SDAG did the selection for us, then we have to mask the flag in the register with AND. This is required when the condition comes from a truncate, because SDAG only truncates down to a legal size of i32. This fixes rdar://problem/22161062. llvm-svn: 244291	2015-08-06 22:44:15 +00:00
Juergen Ributzka	9f54dbe7a1	Revert "[AArch64][FastISel] Add more truncation tests." and "[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types." This reverts commit r243198 and 243304. Turns out this wasn't the correct fix for this problem. It works only within FastISel, but fails when the truncate is selected by SDAG. llvm-svn: 244287	2015-08-06 22:13:48 +00:00
Sean Silva	c2b70bf999	[compatibility.ll] Cover explicitly named comdats. Patch by Vedant Kumar! <vsk@apple.com> llvm-svn: 244284	2015-08-06 22:04:21 +00:00
Rafael Espindola	8b3b09fdcf	Move to llvm-readobj code that is only used there. lld might end up using a small part of this, but it will be in a much refactored form. For now this unblocks avoiding the full section scan in the ELFFile constructor. This also has a (very small) error handling improvement. llvm-svn: 244282	2015-08-06 21:54:37 +00:00
Frederic Riss	dc5370b9cc	[dsymutil] Implement dSYM bundle creation A dSYM bundle is a file hierarchy that looks slike this: <bundle name>.dSYM/ Contents/ Info.plist Resources/ DWARF/ <DWARF file(s)> This is the default output mode of dsymutil. llvm-svn: 244270	2015-08-06 21:05:06 +00:00
Frederic Riss	0948db6064	[dsymutil] Add (unimplemented) --flat option dsymutil should by default generate dSYM bundles which are filesystem hierarchies containing the debug info and an additional Info.plist. Currently llvm-dsymutil emits raw binaries containing the debug info. This is what we call the 'flat mode'. Add a -f/-flat option that is supposed to enable that flat mode, but don't wire it for now, only pass it to the tests that will need it to stay functional once we do bundle generation by default. This basically makes this commit NFC and removes the noise from the actual commit that adds support for bundle generation. llvm-svn: 244269	2015-08-06 21:05:01 +00:00
Sanjoy Das	366acc175e	[IndVars] Fix PR24356. Unsigned predicates increase or decrease agnostic of the signs of their increments. llvm-svn: 244265	2015-08-06 20:43:41 +00:00
Rui Ueyama	b9583d22eb	Update comments. llvm-svn: 244259	2015-08-06 20:05:27 +00:00
Tom Stellard	217361c33f	AMDGPU/SI: Add support for 32-bit immediate SMRD offsets on CI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11604 llvm-svn: 244254	2015-08-06 19:28:38 +00:00
Tom Stellard	dee26a2876	AMDGPU/SI: Use ComplexPatterns for SMRD addressing modes Summary: This allows us to consolidate several of the TableGen patterns. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11602 llvm-svn: 244253	2015-08-06 19:28:30 +00:00
Kit Barton	a7bf96ab5c	Fix possible infinite loop in shrink wrapping when searching for save/restore points. There is an infinite loop that can occur in Shrink Wrapping while searching for the Save/Restore points. Part of this search checks whether the save/restore points are located in different loop nests and if so, uses the (post) dominator trees to find the immediate (post) dominator blocks. However, if the current block does not have any immediate (post) dominators then this search will result in an infinite loop. This can occur in code containing an infinite loop. The modification checks whether the immediate (post) dominator is different from the current save/restore block. If it is not, then the search terminates and the current location is not considered as a valid save/restore point for shrink wrapping. Phabricator: http://reviews.llvm.org/D11607 llvm-svn: 244247	2015-08-06 19:01:57 +00:00
Quentin Colombet	6443cce233	[Reassociation] Fix miscompile for va_arg arguments. iisUnmovableInstruction() had a list of instructions hardcoded which are considered unmovable. The list lacked (at least) an entry for the va_arg and cmpxchg instructions. Fix this by introducing a new Instruction::mayBeMemoryDependent() instead of maintaining another instruction list. Patch by Matthias Braun <matze@braunis.de>. Differential Revision: http://reviews.llvm.org/D11577 rdar://problem/22118647 llvm-svn: 244244	2015-08-06 18:44:34 +00:00
Alex Lorenz	e86d51533d	MIR Parser: Report an error when parsing duplicate memory operand flags. llvm-svn: 244240	2015-08-06 18:26:36 +00:00
Alex Lorenz	dc8de2a6b7	MIR Serialization: Serialize the 'invariant' machine memory operand flag. llvm-svn: 244230	2015-08-06 16:55:53 +00:00
Richard Diamond	bd753c9315	Fix an alignment error in `llvm::expandAtomicRMWToCmpXchg` without breaking the build where X86 isn't enabled. Summary: Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test. Reviewers: rengolin, jfb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11804 llvm-svn: 244229	2015-08-06 16:55:03 +00:00
Alex Lorenz	10fd03857f	MIR Serialization: Serialize the 'non-temporal' machine memory operand flag. llvm-svn: 244228	2015-08-06 16:49:30 +00:00
Douglas Katzman	63d64da0ce	[SPARC] Don't compare arch name as a string, use the enum instead. Fixes PR22695 llvm-svn: 244221	2015-08-06 15:44:12 +00:00
Renato Golin	a02ac60469	Revert "Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test." This reverts commit r244155, as it was breaking the buildbots for too long. Should be reapplied with proper fix. llvm-svn: 244205	2015-08-06 10:37:59 +00:00
Michael Kuperstein	868dc65444	[X86] Improve EmitLoweredSelect for contiguous CMOV pseudo instructions. This change improves EmitLoweredSelect() so that multiple contiguous CMOV pseudo instructions with the same (or exactly opposite) conditions get lowered using a single new basic-block. This eliminates unnecessary extra basic-blocks (and CFG merge points) when contiguous CMOVs are being lowered. Patch by: kevin.b.smith@intel.com Differential Revision: http://reviews.llvm.org/D11428 llvm-svn: 244202	2015-08-06 08:45:34 +00:00
Peter Collingbourne	e834f42073	COFF: Assign the correct symbol type to internal functions. The COFFSymbolRef::isFunctionDefinition() function tests for several conditions that are not related to whether a symbol is a function, but rather whether the symbol meets the requirements for a function definition auxiliary record, which excludes certain symbols such as internal functions and undefined references. The test we need to determine the symbol type is much simpler: we only need to compare the complex type against IMAGE_SYM_DTYPE_FUNCTION. llvm-svn: 244195	2015-08-06 05:26:35 +00:00
Alex Lorenz	49873a8382	MIR Serialization: Initial serialization of the machine operand target flags. This commit implements the initial serialization of the machine operand target flags. It extends the 'TargetInstrInfo' class to add two new methods that help to provide text based serialization for the target flags. This commit can serialize only the X86 target flags, and the target flags for the other targets will be serialized in the follow-up commits. Reviewers: Duncan P. N. Exon Smith llvm-svn: 244185	2015-08-06 00:44:07 +00:00
Frederic Riss	b5dce473e3	Revert "Make sure all temporary files get created under %T." This reverts commit r244163. The workaround shouldn't be necessary after r244172, and moreover the commit was slightly buggy as it dis a simple mkdir without removing the directory first, which could cause 'File exists' errors. llvm-svn: 244182	2015-08-05 23:53:38 +00:00
Frederic Riss	2a6e44518f	[dsymutil] Update source used to generate test binary. Forgot to include that in the last commit. llvm-svn: 244171	2015-08-05 23:33:45 +00:00
Bjarke Hammersholt Roune	5cbc7d2999	[NVPTX] Use LDG for pointer induction variables. More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables. Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads. llvm-svn: 244166	2015-08-05 23:11:57 +00:00
Artem Belevich	d41611cfdf	Make sure all temporary files get created under %T. llvm-svn: 244163	2015-08-05 22:54:36 +00:00
Frederic Riss	ae0d436545	[dsymutil] Add support for the -arch option. This option allows to select a subset of the architectures when performing a universal binary link. The filter is done completely in the mach-o specific part of the code. llvm-svn: 244160	2015-08-05 22:33:28 +00:00
Reid Kleckner	10cac7a190	Fix Windows test failure with triple instead of using the native OS llvm-svn: 244159	2015-08-05 22:27:08 +00:00
Reid Kleckner	12d2c12023	If the "CodeView" module flag is set, emit codeview instead of DWARF Summary: Emit both DWARF and CodeView if "CodeView" and "Dwarf Version" module flags are set. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11756 llvm-svn: 244158	2015-08-05 22:26:20 +00:00
Alex Lorenz	5672a893e5	MIR Serialization: Serialize the machine operand's offset. This commit serializes the offset for the following operands: target index, global address, external symbol, constant pool index, and block address. llvm-svn: 244157	2015-08-05 22:26:15 +00:00
Richard Diamond	559c1d72a9	Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test. llvm-svn: 244155	2015-08-05 22:10:57 +00:00
Chen Li	50efd9220a	[LoopUnswitch] Preserve make.implicit metadata for unswitched conditions Summary: This patch adds support to preserve make.implicit metadata for unswitched conditions in loop pre-header. Reviewers: sanjoy, weimingz Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11769 llvm-svn: 244132	2015-08-05 21:13:26 +00:00
JF Bastien	8662083770	x86 atomic: optimize a.store(reg op a.load(acquire), release) Summary: PR24191 finds that the expected memory-register operations aren't generated when relaxed { load ; modify ; store } is used. This is similar to PR17281 which was addressed in D4796, but only for memory-immediate operations (and for memory orderings up to acquire and release). This patch also handles some floating-point operations. Reviewers: reames, kcc, dvyukov, nadav, morisset, chandlerc, t.p.northover, pete Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11382 llvm-svn: 244128	2015-08-05 21:04:59 +00:00
JF Bastien	7c4218f49c	Revert "Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk." I mistakenly committed the patch for D6629, and was trying to commit another. Reverting until it gets proper signoff. llvm-svn: 244121	2015-08-05 20:53:56 +00:00
JF Bastien	ce5256f5c5	Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk. llvm-svn: 244120	2015-08-05 20:49:46 +00:00
Alex Lorenz	3f2058da16	MIR Parser: Report an error when parsing large immediate operands. llvm-svn: 244100	2015-08-05 19:03:42 +00:00
Alex Lorenz	05e3882e81	MIR Serialization: Serialize the typed immediate integer machine operands. llvm-svn: 244098	2015-08-05 18:52:21 +00:00
Frederic Riss	6e3278633d	[dsymutil] Fix test patterns. Depending on the filesystem paths, the YAML dump might quote paths. Account for that in the regex patterns. llvm-svn: 244094	2015-08-05 18:45:13 +00:00
Frederic Riss	4dd3e0c41e	[dsymutil] Implement support for handling mach-o universal binaries as main input/output. The DWARF linker isn't touched by this, the implementation links individual files and merges them together into a fat binary by calling out to the 'lipo' utility. The main change is that the MachODebugMapParser can now return multiple debug maps for a single binary. The test just verifies that lipo would be invoked correctly, but doesn't actually generate a binary. This mimics the way clang tests its external iplatform tools integration. llvm-svn: 244087	2015-08-05 18:27:44 +00:00
Alex Lorenz	2b3cf19332	MIR Parser: Report an error when parsing duplicate register flags. llvm-svn: 244081	2015-08-05 18:09:03 +00:00
Chandler Carruth	405e4f9051	[GMR] Teach the conservative path of GMR to catch even more easy cases. In PR24288 it was pointed out that the easy case of a non-escaping global and something that obviously required an escape sometimes is hidden behind PHIs (or selects in theory). Because we have this binary test, we can easily just check that all possible input values satisfy the requirement. This is done with a (very small) recursion through PHIs and selects. With this, the specific example from the PR is correctly folded by GVN. Differential Revision: http://reviews.llvm.org/D11707 llvm-svn: 244078	2015-08-05 17:58:30 +00:00
Alex Lorenz	01c1a5ee58	MIR Serialization: Serialize the 'early-clobber' register operand flag. llvm-svn: 244075	2015-08-05 17:49:03 +00:00
Alex Lorenz	9075258b6a	MIR Serialization: Serialize the 'debug-use' register operand flag. llvm-svn: 244071	2015-08-05 17:41:17 +00:00
James Y Knight	bce20afe0f	[Sparc] Fix disassembly of popc instruction. And add tests. Patch by David Wiberg! llvm-svn: 244064	2015-08-05 17:00:30 +00:00
Steven Wu	9927206f8c	Force the MachO generated for Darwin to have VERSION_MIN load command On Darwin, it is required to stamp the object file with VERSION_MIN load command. This commit will provide a VERSRION_MIN load command to the MachO file that doesn't specify the version itself by inferring from Target Triple. llvm-svn: 244059	2015-08-05 15:36:38 +00:00
Artyom Skrobov	6fbef2a780	ARMISelDAGToDAG.cpp had this self-contradictory code: return StringSwitch<int>(Flags) .Case("g", 0x1) .Case("nzcvq", 0x2) .Case("nzcvqg", 0x3) .Default(-1); ... // The _g and _nzcvqg versions are only valid if the DSP extension is // available. if (!Subtarget->hasThumb2DSP() && (Mask & 0x2)) return -1; ARMARM confirms that the comment is right, and the code was wrong. llvm-svn: 244029	2015-08-05 11:02:14 +00:00
Simon Pilgrim	42c611b9ae	[InstCombine] Added more specific SSE2/AVX2 vector shift tests. llvm-svn: 244022	2015-08-05 08:21:38 +00:00
Hal Finkel	17caf326e5	[MachineCombiner] Don't use the opcode-only form of computeInstrLatency In r242277, I updated the MachineCombiner to work with itineraries, but I missed a call that is scheduling-model-only (the opcode-only form of computeInstrLatency). Using the form that takes an MI* allows this to work with itineraries (and should be NFC for subtargets with scheduling models). llvm-svn: 244020	2015-08-05 07:45:28 +00:00
Hal Finkel	23cdeeea0f	[RuntimeDyld] Adapt PPC64 relocations to PPC32 Begin adapting some of the implemented PPC64 relocations for PPC32 (with a test case). Patch by Pierre-Andre Saulais! llvm-svn: 243991	2015-08-04 15:29:00 +00:00
Sanjay Patel	75ced2782b	[x86] machine combiner reassociation: mark EFLAGS operand as 'dead' In the commentary for D11660, I wasn't sure if it was alright to create new integer machine instructions without also creating the implicit EFLAGS operand. From what I can see, the implicit operand is always created by the MachineInstrBuilder based on the instruction type, so we don't have to do that explicitly. However, in reviewing the debug output, I noticed that the operand was not marked as 'dead'. The machine combiner should do that to preserve future optimization opportunities that may be checking for that dead EFLAGS operand themselves. Differential Revision: http://reviews.llvm.org/D11696 llvm-svn: 243990	2015-08-04 15:21:56 +00:00
Vasileios Kalintiris	044e172228	Revert r229675 - [mips] Avoid redundant sign extension of the result of binary bitwise instructions. It introduced two regressions on 64-bit big-endian targets running under N32 (MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and MultiSource/Applications/kimwitu++/kc) The issue is that on 64-bit targets comparisons such as BEQ compare the whole GPR64 but incorrectly tell the instruction selector that they operate on GPR32's. This leads to the elimination of i32->i64 extensions that are actually required by comparisons to work correctly. There's currently a patch under review that fixes this problem. llvm-svn: 243984	2015-08-04 14:26:35 +00:00
Simon Pilgrim	d19b9d8229	[InstCombine] Split off SSE2/AVX2 vector shift tests. These aren't vector demanded bits tests. More tests to follow. llvm-svn: 243963	2015-08-04 08:05:27 +00:00
Duncan P. N. Exon Smith	706f37e8df	Linker: Fix references to uniqued nodes after r243883 r243883 started moving 'distinct' nodes instead of duplicated them in lib/Linker. This had the side-effect of sometimes not cloning uniqued nodes that reference them. I missed a corner case: !named = !{!0} !0 = !{!1} !1 = distinct !{!0} !0 is the entry point for "remapping", and a temporary clone (say, !0-temp) is created and mapped in case we need to model a uniquing cycle. Recursive descent into !1. !1 is distinct, so we leave it alone, but update its operand to !0-temp. Pop back out to !0. Its only operand, !1, hasn't changed, so we don't need to use !0-temp. !0-temp goes out of scope, and we're finished remapping, but we're left with: !named = !{!0} !0 = !{!1} !1 = distinct !{null} ; uh oh... Previously, if !0 and !0-temp ended up with identical operands, then !0-temp couldn't have been referenced at all. Now that distinct nodes don't get duplicated, that assumption is invalid. We need to !0-temp->replaceAllUsesWith(!0) before freeing !0-temp. I found this while running an internal `-flto -g` bootstrap. Strangely, there was no case of this in the open source bootstrap I'd done before commit... llvm-svn: 243961	2015-08-04 06:42:31 +00:00
Mehdi Amini	c8d5783114	Update test suite to make "ninja check" succeed without native backend builtin Requires "native" feature in most places that were failing. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243960	2015-08-04 06:32:54 +00:00
Mehdi Amini	63c3989f6a	Move generic MIR tests in their own subdir, requires "native" as well These tests rely on the native backend to be built-in. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243959	2015-08-04 06:32:45 +00:00
Mehdi Amini	fc6b6983ac	Improve lit "native" feature to check if the native backend is builtin The goal is to have 'ninja check' passing even if the X86 backend is not built. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243958	2015-08-04 06:32:31 +00:00
Saleem Abdulrasool	0a2672bb43	ARM: support windows division routines This adds the software division routines for the Windows RTABI. These are not expected to be used often though as most modern Windows ARM capable targets support hardware division. In the case that the target CPU doesnt support hardware division, this will be the fallback. llvm-svn: 243952	2015-08-04 03:57:56 +00:00
Sanjoy Das	215df9ed98	Revert "[LSR] Generate and use zero extends" This reverts commit r243348 and r243357. They caused PR24347. llvm-svn: 243939	2015-08-04 01:52:05 +00:00
Ahmed Bougacha	e0e12db8c8	[AArch64] Add isel support for f16 indexed LD/ST. llvm-svn: 243935	2015-08-04 01:29:38 +00:00
Ahmed Bougacha	b0ae36f0d1	[AArch64] Vector FCOPYSIGN supports Custom-lowering: mark it as such. There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and is actually already vector-aware; let's use it instead of scalarizing! The only interesting change is that for v2f32, we previously always used use v4i32 as the integer vector type. Use v2i32 instead, and mark FCOPYSIGN as Custom. llvm-svn: 243926	2015-08-04 00:42:34 +00:00
Ahmed Bougacha	f65371a235	[CodeGen] Fix FCOPYSIGN legalization to account for mismatched types. We used to legalize it like it's any other binary operations. It's not, because it accepts mismatched operand types. Because of that, we used to hit various asserts and miscompiles. Specialize vector legalizations to, in the worst case, unroll, or, when possible, to just legalize the operand that needs legalization. Scalarization isn't covered, because I can't think of a target where some but not all of the 1-element vector types are to be scalarized. llvm-svn: 243924	2015-08-04 00:32:55 +00:00
Alex Lorenz	a518b79601	MIR Serialization: Serialize the 'volatile' machine memory operand flag. llvm-svn: 243923	2015-08-04 00:24:45 +00:00
Alex Lorenz	4af7e610c3	MIR Serialization: Initial serialization of the machine memory operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243915	2015-08-03 23:08:19 +00:00
Justin Bogner	45291391b2	lto: Avoid relying on the environment for this test It's better to pass libLTO to ld64 via the command line flag than rely on setting DYLD_LIBRARY_PATH. llvm-svn: 243911	2015-08-03 22:43:14 +00:00
Chandler Carruth	87adb7a2e2	[Unroll] Improve the brute force loop unroll estimate by propagating through PHI nodes across iterations. This patch teaches the new advanced loop unrolling heuristics to propagate constants into the loop from the preheader and around the backedge after simulating each iteration. This lets us brute force solve simple recurrances that aren't modeled effectively by SCEV. It also makes it more clear why we need to process the loop in-order rather than bottom-up which might otherwise make much more sense (for example, for DCE). This came out of an attempt I'm making to develop a principled way to account for dead code in the unroll estimation. When I implemented a forward-propagating version of that it produced incorrect results due to failing to propagate cost between loop iterations through the PHI nodes, and it occured to me we really should at least propagate simplifications across those edges, and it is quite easy thanks to the loop being in canonical and LCSSA form. Differential Revision: http://reviews.llvm.org/D11706 llvm-svn: 243900	2015-08-03 20:32:27 +00:00
Derek Schuff	b4c1c28c6e	Fix testing for end of stream in bitstream reader. This fixes a bug found while working on the bitcode reader. In particular, the method BitstreamReader::AtEndOfStream doesn't always behave correctly when processing a data streamer. The method fillCurWord doesn't properly set CurWord/BitsInCurWord if the data streamer was already at eof, but GetBytes had not yet set the ObjectSize field of the streaming memory object. This patch fixes this problem, and provides a test to show that this problem has been fixed. Patch by Karl Schimpf. Differential Revision: http://reviews.llvm.org/D11391 llvm-svn: 243890	2015-08-03 18:01:50 +00:00
Duncan P. N. Exon Smith	55ca964e94	DI: Disallow uniquable DICompileUnits Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s. The backend is liable to start relying on that (if it hasn't already), so make uniquable `DICompileUnit`s illegal and automatically upgrade old bitcode. This is a nice cleanup, since we can remove an unnecessary `DenseSet` (and the associated uniquing info) from `LLVMContextImpl`. Almost all the testcases were updated with this script: git grep -e '= !DICompileUnit' -l -- test \| grep -v test/Bitcode \| xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,' I imagine something similar should work for out-of-tree testcases. llvm-svn: 243885	2015-08-03 17:26:41 +00:00
Tim Northover	910dde7ab2	ARM: prefer allocating VFP regs at stride 4 on Darwin. This is necessary for WatchOS support, where the compact unwind format assumes this kind of layout. For now we only want this on Swift-like CPUs though, where it's been the Xcode behaviour for ages. Also, since it can expand the prologue we don't want it at -Oz. llvm-svn: 243884	2015-08-03 17:20:10 +00:00
Artur Pilipenko	17376c4e02	Currently string attributes on function arguments/return values can be generated using LLVM API. However they are not supported in parser. So, the following scenario will fail: * generate function with string attribute using API, * dump it in LL format, * try to parse. Add parser support for string attributes to fix the issue. Reviewed By: reames, hfinkel Differential Revision: http://reviews.llvm.org/D11058 llvm-svn: 243877	2015-08-03 14:31:49 +00:00
John Brawn	f3324cf1a5	[ARM] Make GlobalMerge merge extern globals by default Enabling merging of extern globals appears to be generally either beneficial or harmless. On some benchmarks suites (on Cortex-M4F, Cortex-A9, and Cortex-A57) it gives improvements in the 1-5% range, but in the rest the overall effect is zero. Differential Revision: http://reviews.llvm.org/D10966 llvm-svn: 243874	2015-08-03 12:13:33 +00:00
Alexander Kornienko	cd86c9791c	Don't use test inputs from other directories. The test/DebugInfo/dwarfdump-macho-universal.test test added in r243862 uses an input from another test's directory (test/tools/dsymutil/Inputs/fat-test.o) which breaks our test setup. Copying the required test input to the test's Input directory to fix the issue. llvm-svn: 243872	2015-08-03 11:59:45 +00:00
James Molloy	6967e5e4a3	Be less conservative about forming IT blocks. In http://reviews.llvm.org/rL215382, IT forming was made more conservative under the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M. But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for v7M, v7AR and v8AR it states that the semantics of such an instruction changes inside an IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block as long as the flags register is dead afterwards. This gives significant performance improvements in a variety of MPEG based workloads. Differential revision: http://reviews.llvm.org/D11680 llvm-svn: 243869	2015-08-03 09:24:48 +00:00
Alexander Kornienko	2d4bef8d8d	Fix the test added at r243777. When RUN: lines are split into multiple lines, each one must be prefixed with RUN:. llvm-svn: 243868	2015-08-03 09:13:19 +00:00
Frederic Riss	4c5d357b57	[dwarfdump] Add support for dumping mach-o universal objectfiles llvm-svn: 243862	2015-08-03 00:10:31 +00:00
JF Bastien	fda53373f2	WebAssembly: implement getScalarShiftAmountTy so we can shift by amount, with type Summary: This currently sets the shift amount RHS to the same type as the LHS, and assumes that the LHS is a simple type. This isn't currently the case e.g. with weird integers sizes, but will eventually be true and will assert if not. That's what you get for having an experimental backend: break it and you get to keep both pieces. Most backends either set the RHS to MVT::i32 or MVT::i64, but WebAssembly is a virtual ISA and tries to have regular-looking binary operations where both operands are the same type (even if a 64-bit RHS shifter is slightly silly, hey it's free!). Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11715 llvm-svn: 243860	2015-08-03 00:00:11 +00:00
Simon Pilgrim	87368fdd30	[X86][SSE] Refreshed sse2 vector shift tests llvm-svn: 243850	2015-08-02 15:23:53 +00:00
Jingyue Wu	ffa09be222	[NVPTX] allow register copy between float and int Summary: Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, across-class register copying (e.g. i32 to f32) becomes possible. Enhance NVPTXInstrInfo::copyPhysReg to handle these cases. Reviewers: jholewinski Subscribers: eliben, jholewinski, llvm-commits, bruno Differential Revision: http://reviews.llvm.org/D11622 llvm-svn: 243839	2015-08-01 18:02:12 +00:00
Simon Atanasyan	3644ee4aa8	[Mips] Support DT_MIPS_RLD_MAP_REL dynamic section tag in the llvm-readobj llvm-svn: 243833	2015-08-01 12:02:02 +00:00
Simon Pilgrim	503a2594c3	[DAGCombiner] Convert constant AND masks to shuffle clear masks down to the byte level The XformToShuffleWithZero method currently checks AND masks at the per-lane level for all-one and all-zero constants and attempts to convert them to legal shuffle clear masks. This patch generalises XformToShuffleWithZero, splitting and checking the sub-lanes of the constants down to the byte level to see if any legal shuffle clear masks are possible. This allows a lot of masks (often from legalization or truncation) to be folded into existing shuffle patterns and removes a lot of constant mask loading. There are a few examples of poor shuffle lowering that are exposed by this patch that will be cleaned up in future patches (e.g. merging shuffles that are separated by bitcasts, x86 legalized v8i8 zero extension uses PMOVZX+AND+AND instead of AND+PMOVZX, etc.) Differential Revision: http://reviews.llvm.org/D11518 llvm-svn: 243831	2015-08-01 10:01:46 +00:00
JF Bastien	8f9aea08d4	WebAssembly: handle more than int32 argument/return Summary: Also test 64-bit integers, except shifts for now which are broken because isel dislikes the 32-bit truncate that precedes them. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11699 llvm-svn: 243822	2015-08-01 04:48:44 +00:00
Alex Lorenz	b4d0d6a345	AMDGPU/SI: Add implicit register operands in the correct order. This commit fixes a bug in the class 'SIInstrInfo' where the implicit register machine operands were added to a machine instruction in an incorrect order - the implicit uses were added before the implicit defs. I found this bug while working on moving the implicit register operand verification code from the MIR parser to the machine verifier. This commit also makes the method 'addImplicitDefUseOperands' in the machine instruction class public so that it can be reused in the 'SIInstrInfo' class. Reviewers: Matt Arsenault Differential Revision: http://reviews.llvm.org/D11689 llvm-svn: 243799	2015-07-31 23:30:09 +00:00
Alex Lorenz	59ed5919cd	MIR Parser: Report an error when a jump table entry is redefined. llvm-svn: 243798	2015-07-31 23:13:23 +00:00
Jingyue Wu	cf70053b20	[NVPTX] convert pointers in byval kernel arguments to global Summary: For example, in struct S { int x; int y; }; __global__ void foo(S s) { int *b = s.y; // use b } "b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for accessing "b". Reviewers: jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11505 llvm-svn: 243790	2015-07-31 21:44:14 +00:00
JF Bastien	4a2d56044f	WebAssembly: handle `ret void`. Summary: Use -1 as numoperands for the return SDTypeProfile, denoting that return is variadic. Note that the patterns in InstrControl.td still need to match the inputs, so this ins't an "anything goes" variadic on ret! The next step will be to handle other local types (not just int32). Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11692 llvm-svn: 243783	2015-07-31 21:04:18 +00:00
Alex Lorenz	ad156fb6af	MIR Serialization: Serialize the floating point immediate machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243780	2015-07-31 20:49:21 +00:00
Duncan P. N. Exon Smith	375fa1381d	IR: Add a broad bitcode compatibility test Successive versions of LLVM should retain the ability to parse bitcode generated by old releases of the compiler. This adds a bitcode format compatibility test, which is intended to provide good (albeit not entirely exhaustive) coverage of the current LangRef. This also includes compatibility tests for LLVM 3.6. After every 3.X.0 release, the compatibility.ll file from the 3.X branch should be copied to compatibility-3.X.ll on trunk, and the 3.X.0 release used to generate a corresponding bitcode file. Patch by Vedant Kumar! llvm-svn: 243779	2015-07-31 20:44:32 +00:00
Frederic Riss	f6402bfe78	[dwarfdump] Ignore scattered relocations for mach-o. When encountering a scattered relocation, the code would assert trying to access an unexisting section. I couldn't find a way to expose the result of the processing of a scattered reloc, and I'm really unsure what the right thing to do is. This patch just skips them during the processing in DwarfContext and adds a mach-o file to the tests that exposed the asserting behavior. (This is a new failure that is being exposed by Rafael's recent work on the libObject interfaces. I think the wrong behavior has always happened, but now it's asserting) llvm-svn: 243778	2015-07-31 20:22:50 +00:00
Frederic Riss	8caf29932c	[dsymutil] Support multiple input files on the command line llvm-svn: 243777	2015-07-31 20:22:20 +00:00
Duncan P. N. Exon Smith	ed013cd221	DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variable Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags, using `DW_TAG_variable` in their place Stop exposing the `tag:` field at all in the assembly format for `DILocalVariable`. Most of the testcase updates were generated by the following sed script: find test/ -name ".ll" -o -name ".mir" \| xargs grep -l 'DILocalVariable' \| xargs sed -i '' \ -e 's/tag: DW_TAG_arg_variable, //' \ -e 's/tag: DW_TAG_auto_variable, //' There were only a handful of tests in `test/Assembly` that I needed to update by hand. (Note: a follow-up could change `DILocalVariable::DILocalVariable()` to set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable` (as appropriate), instead of having that logic magically in the backend in `DbgVariable`. I've added a FIXME to that effect.) llvm-svn: 243774	2015-07-31 18:58:39 +00:00
JF Bastien	d7fcc6f9c7	WebAssembly: handle unused function arguments. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11684 llvm-svn: 243770	2015-07-31 18:13:27 +00:00
David Majnemer	654e130b6e	New EH representation for MSVC compatibility This introduces new instructions neccessary to implement MSVC-compatible exception handling support. Most of the middle-end and none of the back-end haven't been audited or updated to take them into account. Differential Revision: http://reviews.llvm.org/D11097 llvm-svn: 243766	2015-07-31 17:58:14 +00:00
JF Bastien	600aee9805	WebAssembly: print basic integer assembly. Summary: This prints assembly for int32 integer operations defined in WebAssemblyInstrInteger.td only, with major caveats: - The operation names are currently incorrect. - Other integer and floating-point types will be added later. - The printer isn't factored out to handle recursive AST code yet, since it can't even handle control flow anyways. - The assembly format isn't full s-expressions yet either, this will be added later. - This currently disables PrologEpilogCodeInserter as well as MachineCopyPropagation becasue they don't like virtual registers, which WebAssembly likes quite a bit. This will be fixed by factoring out NVPTX's change (currently a fork of PrologEpilogCodeInserter). Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11671 llvm-svn: 243763	2015-07-31 17:53:38 +00:00
Sanjay Patel	9ff4626028	[x86] reassociate integer multiplies using machine combiner pass Add i16, i32, i64 imul machine instructions to the list of reassociation candidates. A new bit of logic is needed to handle integer instructions: they have an implicit EFLAGS operand, so we have to make sure it's dead in order to do any reassociation with integer ops. Differential Revision: http://reviews.llvm.org/D11660 llvm-svn: 243756	2015-07-31 16:21:55 +00:00
Reid Kleckner	47ea9ece1a	[COFF] Return symbol VAs instead of RVAs for PE files This makes llvm-nm consistent with binutils nm on executables and DLLs. For a vanilla hello world executable, the address of main should include the default image base of 0x400000. llvm-svn: 243755	2015-07-31 16:14:22 +00:00
Geoff Berry	8a7ef3b2ee	[AArch64] Favor extended reg patterns for sub Summary: Favor the extended reg patterns over the shifted reg patterns that match only the operand shift and not the full sign/zero extend and shift. Reviewers: jmolloy, t.p.northover Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11569 llvm-svn: 243753	2015-07-31 15:55:54 +00:00
Matt Arsenault	e1ce344b5a	AMDGPU: Fix v16i32 to v16i8 truncstore llvm-svn: 243731	2015-07-31 04:12:04 +00:00
Kostya Serebryany	fb7d8d9d06	[libFuzzer] trace switch statements and apply mutations based on the expected case values llvm-svn: 243726	2015-07-31 01:33:06 +00:00
Tom Stellard	e182e74c53	ELFYAML: Enable parsing of EM_AMDGPU Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11263 llvm-svn: 243724	2015-07-31 01:15:15 +00:00
Matt Arsenault	a7bc7db53c	TableGen: Support folding casts from bits to int This is to fix an incorrect error when trying to initialize DwarfNumbers with a !cast<int> of a bits initializer. getValuesAsListOfInts("DwarfNumbers") would not see an IntInit and instead the cast, so would give up. It seems likely that this could be generalized to attempt the convertInitializerTo for any type. I'm not really sure why the existing code seems to special case the string cast cases when convertInitializerTo seems like it should generally handle this sort of thing. llvm-svn: 243722	2015-07-31 01:12:06 +00:00
Sumanth Gundapaneni	532a13691c	[ARM] Lower modulo operation to generate __aeabi_divmod on Android For a modulo (reminder) operation, clang -target armv7-none-linux-gnueabi generates "__modsi3" clang -target armv7-none-eabi generates "__aeabi_idivmod" clang -target armv7-linux-androideabi generates "__modsi3" Android bionic libc doesn't provide a __modsi3, instead it provides a "__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate the correct call when ever there is a modulo operation. Differential Revision: http://reviews.llvm.org/D11661 llvm-svn: 243717	2015-07-31 00:45:12 +00:00
Alex Lorenz	60bf599607	MIR Parser: Report an error when a constant pool item is redefined. llvm-svn: 243696	2015-07-30 22:00:17 +00:00
Alex Lorenz	a06c0c6401	MIR Parser: Report an error when a virtual register is redefined. llvm-svn: 243695	2015-07-30 21:54:10 +00:00
Sanjay Patel	1166f2ff9f	fix memcpy/memset/memmove lowering when optimizing for size Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693	2015-07-30 21:41:50 +00:00
Wei Mi	d6f7252e2e	[SLP vectorizer]: Choose the best consecutive candidate to pair with a store instruction. The patch changes the SLPVectorizer::vectorizeStores to choose the immediate succeeding or preceding candidate for a store instruction when it has multiple consecutive candidates. In this way it has better chance to find more slp vectorization opportunities. Differential Revision: http://reviews.llvm.org/D10445 llvm-svn: 243666	2015-07-30 17:40:39 +00:00
Alex Lorenz	618b283cd9	MIR Serialization: Serialize the machine basic block's successor weights. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243659	2015-07-30 16:54:38 +00:00
Vasileios Kalintiris	3d9d81fbb1	[mips] Fix out-of-date debug information in test file. Update the debug info in the check-lines because the change in r243638 introduced a constant initialization before the prologue's end as part of a register spill. llvm-svn: 243640	2015-07-30 13:13:09 +00:00
Vasileios Kalintiris	2041b1dd0b	[mips][FastISel] Remove hidden mips-fast-isel option. Summary: This hidden option would disable code generation through FastISel by default. It was removed from the available options and from the Fast-ISel tests that required it in order to run the tests. Reviewers: dsanders Subscribers: qcolombet, llvm-commits Differential Revision: http://reviews.llvm.org/D11610 llvm-svn: 243638	2015-07-30 12:39:33 +00:00
Vasileios Kalintiris	77fb0a3dcf	[mips][FastISel] Apply only zero-extension to constants prior to their materialization. Summary: Previously, we would sign-extend non-boolean negative constants and zero-extend otherwise. This was problematic for PHI instructions with negative values that had a type with bitwidth less than that of the register used for materialization. More specifically, ComputePHILiveOutRegInfo() assumes the constants present in a PHI node are zero extended in their container and afterwards deduces the known bits. For example, previously we would materialize an i16 -4 with the following instruction: addiu $r, $zero, -4 The register would end-up with the 32-bit 2's complement representation of -4. However, ComputePHILiveOutRegInfo() would generate a constant with the upper 16-bits set to zero. The SelectionDAG builder would use that information to generate an AssertZero node that would remove any subsequent trunc & zero_extend nodes. In theory, we should modify ComputePHILiveOutRegInfo() to consult target-specific hooks about the way they prefer to materialize the given constants. However, git-blame reports that this specific code has not been touched since 2011 and it seems to be working well for every target so far. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11592 llvm-svn: 243636	2015-07-30 11:51:44 +00:00
Nick Lewycky	c3890d2969	Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.) Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h. llvm-svn: 243585	2015-07-29 22:32:47 +00:00
Frederic Riss	11ab2b8858	[dsymutil] Rename -v option to -verbose The dsymutil-classic -v option dumps the tool version rather than putting it in verbose mode. Rename -v to -verbose and update the tests that use it (in the process removing it from a few tests that didn't require it anymore since the -dump-debug-map option was introduced). A followup commit will reintroduce the -v option that dumps the version. llvm-svn: 243582	2015-07-29 22:29:34 +00:00
Simon Pilgrim	ba10f76705	[X86][SSE] Keep 32-bit target i64 vector shifts on SSE unit. This patch improves the 32-bit target i64 constant matching to detect the shuffle vector splats that are introduced by i64 vector shift vectorization (D8416). Differential Revision: http://reviews.llvm.org/D11327 llvm-svn: 243577	2015-07-29 21:44:27 +00:00
Tim Northover	2a9d801fd5	AArch64: use 32-bit MOV rather than UBFX to truncate registers. It's potentially more efficient on Cyclone, and from the optimization guides & schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd expect a MOV to be about the most efficient instruction with its semantics, even though the official "UXTW" alias is really a UBFX. llvm-svn: 243576	2015-07-29 21:34:32 +00:00
Alex Lorenz	a6f9a37d92	MIR Serialization: Serialize the frame info's save and restore points. This commit serializes the save and restore machine basic block references from the machine frame information class. llvm-svn: 243575	2015-07-29 21:09:09 +00:00
Simon Pilgrim	86478c6909	[X86][SSE] Vectorize i64 ASHR operations This patch vectorizes the v2i64/v4i64 ASHR shift operations - the last remaining integer vector shifts that are still being transferred to/from the scalar unit to be completed. Differential Revision: http://reviews.llvm.org/D11439 llvm-svn: 243569	2015-07-29 20:31:45 +00:00
Alexey Samsonov	869a5ff37f	[ASan] Disable dynamic alloca and UAR detection in presence of returns_twice calls. Summary: returns_twice (most importantly, setjmp) functions are optimization-hostile: if local variable is promoted to register, and is changed between setjmp() and longjmp() calls, this update will be undone. This is the reason why "man setjmp" advises to mark all these locals as "volatile". This can not be enough for ASan, though: when it replaces static alloca with dynamic one, optionally called if UAR mode is enabled, it adds a whole lot of SSA values, and computations of local variable addresses, that can involve virtual registers, and cause unexpected behavior, when these registers are restored from buffer saved in setjmp. To fix this, just disable dynamic alloca and UAR tricks whenever we see a returns_twice call in the function. Reviewers: rnk Subscribers: llvm-commits, kcc Differential Revision: http://reviews.llvm.org/D11495 llvm-svn: 243561	2015-07-29 19:36:08 +00:00
Colin LeMahieu	fcc32766bf	[llvm-objdump] Merging MachO DumpSections in to FilterSections. Simplifying some predicate logic. llvm-svn: 243556	2015-07-29 19:08:10 +00:00
Jingyue Wu	3a04dc6e78	Roll forward r242871 r242871 missed one place that should be guarded with isPhysicalReg. This patch fixes that. llvm-svn: 243555	2015-07-29 18:59:09 +00:00
Alex Lorenz	b139323f21	MIR Serialization: Serialize the '.cfi_def_cfa' CFI instruction. llvm-svn: 243554	2015-07-29 18:57:23 +00:00
Alex Lorenz	fbe9c04c5f	MIR Parser: Parse multiple LHS register machine operands. llvm-svn: 243553	2015-07-29 18:51:21 +00:00
Michael Zolotukhin	9f06ef76d3	[Unroll] Handle SwitchInst properly. Previously successor selection was simply wrong. llvm-svn: 243545	2015-07-29 18:10:33 +00:00
Michael Zolotukhin	3a7d55b623	[Unroll] Don't crash when simplified branch condition is undef. llvm-svn: 243544	2015-07-29 18:10:29 +00:00
Michael Zolotukhin	a2069d36ce	Rename test full-unroll-bad-geps.ll to full-unroll-crashers.ll. No reason to limit it only to GEP-related crashes. More tests are to come here. llvm-svn: 243543	2015-07-29 18:10:23 +00:00
Bruno Cardoso Lopes	38c0250679	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Reported to Broke some internal tests: PR24303 This reverts commit r243486. llvm-svn: 243540	2015-07-29 17:46:47 +00:00
Colin LeMahieu	77804bed85	[llvm-objdump] Added -j flag to filter sections that are operated on. llvm-svn: 243526	2015-07-29 15:45:39 +00:00
Jingyue Wu	7ec38530a5	Temporarily revert r242871 PR24299 llvm-svn: 243522	2015-07-29 15:26:11 +00:00
Bill Schmidt	42ddd71120	[PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask Given certain shuffle-vector masks, LLVM emits splat instructions which splat the wrong bytes from the source register. The issue is that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp does not ensure that the splat pattern found is requesting bytes that are aligned on an EltSize boundary. This patch detects this situation as not a valid splat mask, resulting in a permute being generated instead of a splat. Patch and test case by Tyler Kenney, cleaned up a bit by me. This is a simple bug fix that would be good to incorporate into 3.7. llvm-svn: 243519	2015-07-29 14:31:57 +00:00
Akira Hatanaka	f53b0403f8	[AArch64] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -aarch64-strict-align to decide whether strict alignment should be forced. rdar://problem/21529937 llvm-svn: 243516	2015-07-29 14:17:26 +00:00
Sanjoy Das	cfe41f050c	[Statepoints] Let patchable statepoints have a symbolic call target. Summary: As added initially, statepoints required their call targets to be a constant pointer null if ``numPatchBytes`` was non-zero. This turns out to be a problem ergonomically, since there is no way to mark patchable statepoints as calling a (readable) symbolic value. This change remove the restriction of requiring ``null`` call targets for patchable statepoints, and changes PlaceSafepoints to maintain the symbolic call target through its transformation. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11550 llvm-svn: 243502	2015-07-28 23:50:30 +00:00
Sanjay Patel	133e68b45c	ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141) PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141 contains a test case where we have duplicate entries in a node's uses() list. After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list. The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combine divisors logic entirely because there really is only one user of the divisor. Differential Revision: http://reviews.llvm.org/D11345 llvm-svn: 243500	2015-07-28 23:28:22 +00:00
Alex Lorenz	ef5c196fb0	MIR Serialization: Serialize the target index machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243497	2015-07-28 23:02:45 +00:00
Akira Hatanaka	2670f4a550	[ARM] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -arm-strict-align to decide whether strict alignment should be forced. Also, remove the logic that was checking the OS and architecture as clang is now responsible for setting strict-align based on the command line options specified and the target architecute and OS. rdar://problem/21529937 http://reviews.llvm.org/D11470 llvm-svn: 243493	2015-07-28 22:44:28 +00:00
Tim Northover	17ae83a25f	AArch64: be careful of large immediates when optimising cmps. llvm-svn: 243492	2015-07-28 22:42:32 +00:00
Davide Italiano	f75bf454e4	[tests] Use llvm-readobj instead of macho-dump. llvm-svn: 243487	2015-07-28 21:58:08 +00:00
Bruno Cardoso Lopes	3c235763e5	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply 243271 with more fixes; although we are not handling multiple sources with coalescable copies, we were not properly skipping this case. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243486	2015-07-28 21:45:50 +00:00
Vasileios Kalintiris	9876946aee	[mips][FastISel] Fix call lowering by bailing out on "fastcc" calls. Summary: Currently, we support only the MIPS O32 ABI calling convention for call lowering. With this change we avoid using the O32 calling convetion for lowering calls marked as using the fast calling convention. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11515 llvm-svn: 243485	2015-07-28 21:43:31 +00:00
Chih-Hung Hsieh	41169c5487	Fix typo. llvm-svn: 243475	2015-07-28 20:38:29 +00:00
Chih-Hung Hsieh	c5e53ca1b7	Limit this test only on linux. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243474	2015-07-28 20:31:10 +00:00
Vasileios Kalintiris	9ec6114860	[mips][FastISel] Fix generated code for IR's select instruction. Summary: Generate correct code for the select instruction by zero-extending it's boolean/condition operand to GPR-width. This is necessary because the conditional-move instructions operate on the whole register. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11506 llvm-svn: 243469	2015-07-28 19:57:25 +00:00
Matt Arsenault	7227cc1a48	AMDGPU: Don't try to use LDS/vector for private if pointer value stored If the pointer is the store's value operand, this would produce a broken module. Make sure the use is actually for the pointer operand. llvm-svn: 243462	2015-07-28 18:47:00 +00:00
Matt Arsenault	fdcd39a8ad	AMDGPU: Fix crash if called function is a bitcast getCalledFunction() is null, so this would crash. Replace crash with an error on unsupported call. llvm-svn: 243461	2015-07-28 18:29:14 +00:00
Jingyue Wu	42f1d67a45	[SCEV] Apply NSW and NUW flags via poison value analysis Summary: Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX. There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial. This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this: for (int i = 0; i < limit; ++i) { sum += ptr[i + offset]; } Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX. There are more details in this discussion on llvmdev. June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234 July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392 Patch by Bjarke Roune Reviewers: eliben, atrick, sanjoy Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits Differential Revision: http://reviews.llvm.org/D11212 llvm-svn: 243460	2015-07-28 18:22:40 +00:00
Alex Lorenz	305e8f6312	Add a test case for r242191 ([MMX] Use the appropriate instructions for GR64 <-> VR64 copies). This commit adds a MIR test case for the commit r242191, which was committed without one. This test case verifies that the ExpandPostRA pass expands the GR64 <-> VR64 copies into the appropriate MMX_MOV instructions. llvm-svn: 243457	2015-07-28 17:52:59 +00:00
Chih-Hung Hsieh	9843f406ec	Move unit tests to target specific directories. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243454	2015-07-28 17:32:49 +00:00
Alex Lorenz	deb534907e	MIR Serialization: Serialize the block address machine operands. llvm-svn: 243453	2015-07-28 17:28:03 +00:00
Sanjay Patel	94a7433cde	add tests to show broken current behavior of minsize attribute llvm-svn: 243451	2015-07-28 17:18:25 +00:00
Chih-Hung Hsieh	1e859582d6	Implement target independent TLS compatible with glibc's emutls.c. The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target//ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438	2015-07-28 16:24:05 +00:00
Geoff Berry	c573bf7a5f	[AArch64] Match float round and convert to int instructions. Summary: Add patterns for doing floating point round with various rounding modes followed by conversion to int as a single FCVT* instruction. Reviewers: t.p.northover, jmolloy Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11424 llvm-svn: 243422	2015-07-28 15:24:10 +00:00
Silviu Baranga	4825060059	[LAA] Add clarifying comments for the checking pointer grouping algorithm. NFC llvm-svn: 243416	2015-07-28 13:44:08 +00:00
Adhemerval Zanella	7bc3319d84	Implement __builtin_thread_pointer This path add the aarch64 lowering of __builtin_thread_pointer. It uses the already implemented AArch64ISD::THREAD_POINTER used in TLS generation. llvm-svn: 243412	2015-07-28 13:03:31 +00:00
Chandler Carruth	99ad7bb49c	[GMR] Teach GlobalsModRef to distinguish an important and safe case of no-alias with non-addr-taken globals: they cannot alias a captured pointer. If the non-global underlying object would have been a capture were it to alias the global, we can firmly conclude no-alias. It isn't reasonable for a transformation to introduce a capture in a way observable by an alias analysis. Consider, even if it were to temporarily capture one globals address into another global and then restore the other global afterward, there would be no way for the load in the alias query to observe that capture event correctly. If it observes it then the temporary capturing would have changed the meaning of the program, making it an invalid transformation. Even instrumentation passes or a pass which is synthesizing stores to global variables to expose race conditions in programs could not trigger this unless it queried the alias analysis infrastructure mid-transform, in which case it seems reasonable to return results from before the transform started. See the comments in the change for a more detailed outlining of the theory here. This should address the primary performance regression found when the non-conservatively-correct path of the alias query was disabled. Differential Revision: http://reviews.llvm.org/D11410 llvm-svn: 243405	2015-07-28 11:11:11 +00:00
Simon Pilgrim	df984f58ad	[X86][SSE] Use bitmasks instead of shuffles where possible. VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions. Split off from D11518. Differential Revision: http://reviews.llvm.org/D11541 llvm-svn: 243395	2015-07-28 08:54:41 +00:00
Igor Breger	47a7b95b1d	AVX512: Add encoding tests to vptestnm instructions Differential Revision: http://reviews.llvm.org/D11521 llvm-svn: 243391	2015-07-28 07:00:00 +00:00
Igor Breger	8352a0ddf2	AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructions Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11528 llvm-svn: 243390	2015-07-28 06:53:28 +00:00
Sanjoy Das	6c7a186599	FileCheck'ify some wc/grep based tests; NFCI. llvm-svn: 243378	2015-07-28 03:50:09 +00:00
Sanjay Patel	8c13e3680d	fix invalid load folding with SSE/AVX FP logical instructions (PR22371) This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ). I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy when targeting 32-bit x86 and causes a miscompile/crash in Wine: https://bugs.winehq.org/show_bug.cgi?id=38826 https://llvm.org/bugs/show_bug.cgi?id=22371#c25 The quick fix is to simply remove the scalar FP logical instructions from the load folding table in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs, fneg, fcopysign. So the majority of this patch is altering those lowerings to use vector FP logical instructions (because that's all x86 gives us anyway). That lets us do the load folding legally. Differential Revision: http://reviews.llvm.org/D11477 llvm-svn: 243361	2015-07-28 00:48:32 +00:00
Sanjoy Das	3895a57b32	[LSR] Move X86 specific test case to X86/ rL243348 added the test case in the wrong directory. llvm-svn: 243357	2015-07-28 00:13:42 +00:00
Adam Nemet	54f0b83ee2	[LAA] Split out a helper to print a collection of memchecks This is effectively an NFC but we can no longer print the index of the pointer group so instead I print its address. This still lets us cross-check the section that list the checks against the section that list the groups (see how I modified the test). E.g. before we printed this: Run-time memory checks: Check 0: Comparing group 0: %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc Against group 1: %arrayidxA = getelementptr i16, i16* %a, i64 %ind %arrayidxA1 = getelementptr i16, i16* %a, i64 %add ... Grouped accesses: Group 0: (Low: %c High: (78 + %c)) Member: {%c,+,4}<%for.body> Member: {(2 + %c),+,4}<%for.body> Now we print this (changes are underlined): Run-time memory checks: Check 0: Comparing group (0x7f9c6040c320): ~~~~~~~~~~~~~~ %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind Against group (0x7f9c6040c358): ~~~~~~~~~~~~~~ %arrayidxA1 = getelementptr i16, i16* %a, i64 %add %arrayidxA = getelementptr i16, i16* %a, i64 %ind ... Grouped accesses: Group 0x7f9c6040c320: ~~~~~~~~~~~~~~ (Low: %c High: (78 + %c)) Member: {(2 + %c),+,4}<%for.body> Member: {%c,+,4}<%for.body> llvm-svn: 243354	2015-07-27 23:54:41 +00:00
Sanjoy Das	93b3504aa8	[LSR] Generate and use zero extends Summary: If a scale or a base register can be rewritten as "Zext({A,+,1})" then LSR will now consider a formula of that form in its normal cost computation. Depends on D9180 Reviewers: qcolombet, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9181 llvm-svn: 243348	2015-07-27 23:27:51 +00:00
JF Bastien	088c47ee5b	WebAssembly: add a generic CPU Summary: WebAssemblySubtarget.cpp expects a default 'generic' CPU to exist, and this seems to be prevalent with other targets. It makes sense to have something between MVP and bleeding-edge, even though for now it's the same as MVP. This removes a warning that's currently generated. Subscribers: jfb, llvm-commits, sunfish Differential Revision: http://reviews.llvm.org/D11546 llvm-svn: 243345	2015-07-27 23:25:54 +00:00
NAKAMURA Takumi	69ed7170dc	Tweak llvm/test/CodeGen/X86/virtual-registers-cleared-in-machine-functions-liveins.ll not to fail for targeting win32. llvm-svn: 243341	2015-07-27 23:01:41 +00:00
Alex Lorenz	8a1915b04e	MIR Serialization: Serialize the unnamed basic block references. This commit serializes the references from the machine basic blocks to the unnamed basic blocks. This commit adds a new attribute to the machine basic block's YAML mapping called 'ir-block'. This attribute contains the actual reference to the basic block. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243340	2015-07-27 22:42:41 +00:00
Colin LeMahieu	fe36f83b11	[llvm-mc] Add --no-warn flag with -W alias to disable outputting warnings while assembling. llvm-svn: 243338	2015-07-27 22:39:14 +00:00
Colin LeMahieu	fe2c8b8015	[llvm-mc] Pushing plumbing through for --fatal-warnings flag. llvm-svn: 243334	2015-07-27 21:56:53 +00:00
Sanjoy Das	5dab205ced	[IndVars] Make loop varying predicates loop invariant. Summary: Was D9784: "Remove loop variant range check when induction variable is strictly increasing" This change re-implements D9784 with the two differences: 1. It does not use SCEVExpander and does not generate new instructions. Instead, it does a quick local search for existing `llvm::Value`s that it needs when modifying the `icmp` instruction. 2. It is more general -- it deals with both increasing and decreasing induction variables. I've added all of the tests included with D9784, and two more. As an example on what this change does (copied from D9784): Given C code: ``` for (int i = M; i < N; i++) // i is known not to overflow if (i < 0) break; a[i] = 0; } ``` This transformation produces: ``` for (int i = M; i < N; i++) if (M < 0) break; a[i] = 0; } ``` Which can be unswitched into: ``` if (!(M < 0)) for (int i = M; i < N; i++) a[i] = 0; } ``` I went back and forth on whether the top level logic should live in `SimplifyIndvar::eliminateIVComparison` or be put into its own routine. Right now I've put it under `eliminateIVComparison` because even though the `icmp` is not eliminated, it no longer is an IV comparison. I'm open to putting it in its own helper routine if you think that is better. Reviewers: reames, nicholas, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11278 llvm-svn: 243331	2015-07-27 21:42:49 +00:00
Simon Pilgrim	c363e7d8e0	[X86][SSE] Added shuffle tests to demonstrate missed bitmask. llvm-svn: 243324	2015-07-27 20:41:57 +00:00
Alex Lorenz	5b0d5f6f26	MIR Serialization: Serialize the '.cfi_def_cfa_register' CFI instruction. llvm-svn: 243322	2015-07-27 20:39:03 +00:00
Bruno Cardoso Lopes	b20841df44	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Still breaks some ARM buildbots. This reverts r243271. llvm-svn: 243318	2015-07-27 20:26:04 +00:00
Akira Hatanaka	2541e0241c	[AArch64] Remove check for Darwin that was needed to decide if x18 should be reserved. The decision to reserve x18 is going to be made solely by the front-end, so it isn't necessary to check if the OS is Darwin in the backend. llvm-svn: 243308	2015-07-27 19:18:47 +00:00
Juergen Ributzka	93d67463a3	[AArch64][FastISel] Add more truncation tests. This is a follow-up to r243198 and adds more truncation tests. llvm-svn: 243304	2015-07-27 19:00:23 +00:00
Simon Pilgrim	15c0a59463	[InstCombine][X86][SSE] Replace sign/zero extension intrinsics with native IR Now that we are generating sane codegen for vector sext/zext nodes on SSE targets, this patch uses instcombine to replace the SSE41/AVX2 pmovsx and pmovzx intrinsics with the equivalent native IR code. Differential Revision: http://reviews.llvm.org/D11503 llvm-svn: 243303	2015-07-27 18:52:15 +00:00
Matt Arsenault	95365ca482	Fix assert when inlining a constantexpr addrspacecast The pointer size of the addrspacecasted pointer might not have matched, so this would have hit an assert in accumulateConstantOffset. I think this was here to allow constant folding of a load of an addrspacecasted constant. Accumulating the offset through the addrspacecast doesn't make much sense, so something else is necessary to allow folding the load through this cast. llvm-svn: 243300	2015-07-27 18:31:03 +00:00
Marek Olsak	93df060871	AMDGPU: don't match vgpr loads for constant loads Author: Dave Airlie <airlied@redhat.com> In order to implement indirect sampler loads, we don't want to match on a VGPR load but an SGPR one for constants, as we cannot feed VGPRs to the sampler only SGPRs. this should be applicable for llvm 3.7 as well. llvm-svn: 243294	2015-07-27 18:16:08 +00:00
Alex Lorenz	10b23525cc	Reset the virtual registers in liveins when clearing the virtual registers. This commit zeroes out the virtual register references in the machine function's liveins in the class 'MachineRegisterInfo' when the virtual register definitions are cleared. Reviewers: Matthias Braun llvm-svn: 243290	2015-07-27 17:51:59 +00:00
Alex Lorenz	12045a4b59	MIR Serialization: Serialize the machine function's liveins. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243288	2015-07-27 17:42:45 +00:00
Silviu Baranga	de38070587	The tests added in r243270 require asserts to be enabled llvm-svn: 243274	2015-07-27 15:22:49 +00:00
Silviu Baranga	65bdb6788b	Fix the tests added in r243270. Use 2>&1 instead of \|& llvm-svn: 243273	2015-07-27 15:08:55 +00:00
Bruno Cardoso Lopes	669c921bfd	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply r242295 with fixes in the implementation. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243271	2015-07-27 14:39:46 +00:00
Silviu Baranga	7581d22512	[ARM/AArch64] Fix cost model for interleaved accesses Summary: Fix the cost of interleaved accesses for ARM/AArch64. We were calling getTypeAllocSize and using it to check the number of bits, when we should have called getTypeAllocSizeInBits instead. This would pottentially cause the vectorizer to generate loads/stores and shuffles which cannot be matched with an interleaved access instruction. No performance changes are expected for now since matching/generating interleaved accesses is still disabled by default. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11524 llvm-svn: 243270	2015-07-27 14:39:34 +00:00
Marek Olsak	1354b87695	AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround This is a candidate for 3.7. llvm-svn: 243263	2015-07-27 11:37:42 +00:00
Jingyue Wu	bfefff555e	Roll forward r243250 r243250 appeared to break clang/test/Analysis/dead-store.c on one of the build slaves, but I couldn't reproduce this failure locally. Probably a false positive as I saw this test was broken by r243246 or r243247 too but passed later without people fixing anything. llvm-svn: 243253	2015-07-26 19:10:03 +00:00
Jingyue Wu	84879b71a9	Revert r243250 breaks tests llvm-svn: 243251	2015-07-26 18:30:13 +00:00
Jingyue Wu	bf485f059c	[TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost Summary: This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider addressing modes. It now returns TCC_Free when the GEP can be completely folded to an addresing mode. I started this patch as I refactored SLSR. Function isGEPFoldable looks common and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost. Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best testing bed seems CostModel, but its getInstructionCost method invokes getAddressComputationCost for GEPs which provides very coarse estimation. So this patch also makes getInstructionCost call the updated getGEPCost for GEPs. This change inevitably breaks some tests because the cost model changes, but nothing looks seriously wrong -- if we believe the new cost model is the right way to go, these tests should be updated. This patch is not perfect yet -- the comments in some tests need to be updated. I want to know whether this is a right approach before fixing those details. Reviewers: chandlerc, hfinkel Subscribers: aschwaighofer, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D9819 llvm-svn: 243250	2015-07-26 17:28:13 +00:00
Simon Pilgrim	65d35a14b7	[X86][SSE] Refreshed vector bit count tests. llvm-svn: 243249	2015-07-26 17:02:25 +00:00
Simon Pilgrim	8a9c1d7d88	[X86][AVX2] Refreshed avx2 conversion tests llvm-svn: 243248	2015-07-26 17:01:16 +00:00
Igor Breger	f2460112ad	Implemented encoding and intrinsics of the following instructions vunpckhps/pd, vunpcklps/pd, vpunpcklbw, vpunpckhbw, vpunpcklwd, vpunpckhwd, vpunpckldq, vpunpckhdq, vpunpcklqdq, vpunpckhqdq Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11509 llvm-svn: 243246	2015-07-26 14:41:44 +00:00
Tobias Grosser	e692669759	Fix typo in comment llvm-svn: 243244	2015-07-26 11:37:05 +00:00
Simon Pilgrim	357b85c926	[InstCombine] Split off SSE4a tests. These aren't vector demanded bits tests. More tests to follow. llvm-svn: 243223	2015-07-25 17:14:01 +00:00
Simon Pilgrim	944a5777bb	[X86][SSE] Added additional vector sign/zero load extension tests. llvm-svn: 243216	2015-07-25 14:07:20 +00:00
Simon Pilgrim	20dc35aff6	[X86][SSE] Added additional vector sign/zero extension tests. llvm-svn: 243212	2015-07-25 11:17:35 +00:00
Chen Li	145c2f57ae	[LoopUnswitch] Improve loop unswitch pass to find trivial unswitch conditions more effectively Summary: This patch improves trivial loop unswitch. The current trivial loop unswitch only checks if loop header's terminator contains a trivial unswitch condition. But if the loop header only has one reachable successor (due to intentionally or unintentionally missed code simplification), we should consider the successor as part of the loop header. Therefore, instead of stopping at loop header's terminator, we should keep traversing its successors within loop until reach a real conditional branch or switch (whose condition can not be constant folded). This change will enable a single -loop-unswitch pass to unswitch multiple trivial conditions (unswitch one trivial condition could open opportunity to unswitch another one in the same loop), while the old implementation can unswitch only one per pass. Reviewers: reames, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11481 llvm-svn: 243203	2015-07-25 03:21:06 +00:00
Juergen Ributzka	6364985b58	[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types. When truncating to non-legal types (such as i16, i8 and i1) always use an AND instruction to mask out the upper bits. This was only done when the source type was an i64, but not when the source type was an i32. This commit fixes this and adds the missing i32 truncate tests. This fixes rdar://problem/21990703. llvm-svn: 243198	2015-07-25 02:16:53 +00:00
Eric Christopher	f0024d14f1	Fix PPCMaterializeInt to check the size of the integer based on the extension property we're requesting - zero or sign extended. This fixes cases where we want to return a zero extended 32-bit -1 and not be sign extended for the entire register. Also updated the already out of date comment with the current behavior. llvm-svn: 243192	2015-07-25 00:48:08 +00:00
Akira Hatanaka	0d4c9ea6e0	[AArch64] Define subtarget feature "reserve-x18", which is used to decide whether register x18 should be reserved. This change is needed because we cannot use a backend option to set cl::opt "aarch64-reserve-x18" when doing LTO. Out-of-tree projects currently using cl::opt option "-aarch64-reserve-x18" to reserve x18 should make changes to add subtarget feature "reserve-x18" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11463 llvm-svn: 243186	2015-07-25 00:18:31 +00:00
Duncan P. N. Exon Smith	56b893b364	DI/Verifier: Fix argument bitrot in DILocalVariable Add a verifier check that `DILocalVariable`s of tag `DW_TAG_arg_variable` always have a non-zero 'arg:' field, and those of tag `DW_TAG_auto_variable` always have a zero 'arg:' field. These are the only configurations that are properly understood by the backend. (Also, fix the bad examples in LangRef and test/Assembler, and fix the bug in Kaleidoscope Ch8.) A large number of testcases seem to have bitrotted their way forward from some ancient version of the debug info hierarchy that didn't have `arg:` parameters. If you have out-of-tree testcases that start failing in the verifier and you don't care enough to get the `arg:` right, you may have some luck just calling: sed -e 's/, arg: 0/, arg: 1/' or some such, but I hand-updated the ones in tree. llvm-svn: 243183	2015-07-24 23:59:25 +00:00
Alex Lorenz	1bb48de1f9	MIR Serialization: Serialize MachineFrameInfo's callee saved information. This commit serializes the callee saved information from the class 'MachineFrameInfo'. This commit extends the YAML mappings for the fixed and the ordinary stack objects and adds an optional 'callee-saved-register' attribute. This attribute is used to serialize the callee save information. llvm-svn: 243173	2015-07-24 22:22:50 +00:00
Lawrence Hu	dc8a83b53b	Handle loop with negtive induction variable increment This patch extend LoopReroll pass to hand the loops which is similar to the following: while (len > 1) { sum4 += buf[len]; sum4 += buf[len-1]; len -= 2; } llvm-svn: 243171	2015-07-24 22:01:49 +00:00
Alex Lorenz	ab4cbcfda7	MIR Serialization: Serialize the simple virtual register allocation hints. This commit serializes the virtual register allocations hints of type 0. These hints specify the preferred physical registers for allocations. llvm-svn: 243156	2015-07-24 20:35:40 +00:00
Philip Reames	fa2c630f79	[RewriteStatepointsForGC] Adjust naming scheme to be more stable The names for instructions inserted were previous dependent on iteration order. By deriving the names from the original instructions, we can avoid instability in tests without resorting to ordered traversals. It also makes the IR mildly easier to read at large scale. llvm-svn: 243140	2015-07-24 19:01:39 +00:00
Alex Lorenz	c7bf20403b	MIR Parser: Run the machine verifier after initializing machine functions. llvm-svn: 243128	2015-07-24 17:44:49 +00:00
Lang Hames	a8183e5c40	[RuntimeDyld] MachO: Add support for ARM scattered vanilla relocations. llvm-svn: 243126	2015-07-24 17:40:04 +00:00
Alex Lorenz	3905d9db97	MIR Tests: Add liveins and successors to make tests pass with machine verifier. This commit adds the liveins and successors properties to machine basic blocks in some of the MIR tests to ensure that the tests will pass when the MIR parser will run the machine verifier after initializing a machine function. llvm-svn: 243124	2015-07-24 17:36:55 +00:00
Alex Lorenz	55f95127bf	MIR Tests: Make the basic block successor test an X86 specific test. This commit moves and transforms the generic test 'CodeGen/MIR/successor-basic-blocks.mir' into an X86 specific test 'CodeGen/MIR/X86/successor-basic-blocks.mir'. This change is required in order to enable the machine verifier for the MIR parser, as the machine verifier verifies that the machine basic blocks contain instructions that actually determine the machine basic block successors. llvm-svn: 243123	2015-07-24 17:31:55 +00:00
Igor Breger	074a64e72c	AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation Added tests for DAG lowering ,encoding and intrinsic Differential Revision: http://reviews.llvm.org/D11218 llvm-svn: 243122	2015-07-24 17:24:15 +00:00
NAKAMURA Takumi	c9bc0b1e14	llvm/test/tools/dsymutil/ARM/lit.local.cfg: Fix possibly typo, s/X86/ARM/. llvm-svn: 243106	2015-07-24 11:55:11 +00:00
Luke Cheeseman	4d45ff2b87	[ARM] - Fix lowering of shufflevectors in AArch32 Some shufflevectors are currently being incorrectly lowered in the AArch32 backend as the existing checks for detecting the NEON operations from the shufflevector instruction expects the shuffle mask and the vector operands to be of the same length. This is not always the case as the mask may be twice as long as the operand; here only the lower half of the shufflemask gets checked, so provided the lower half of the shufflemask looks like a vector transpose (or even is just all -1 for undef) then the intrinsics may get incorrectly lowered into a vector transpose (VTRN) instruction. This patch fixes this by accommodating for both cases and adds regression tests. Differential Revision: http://reviews.llvm.org/D11407 llvm-svn: 243103	2015-07-24 09:57:05 +00:00
Luke Cheeseman	b5c627aba8	When lowering vector shifts a check is performed to see if the value to shift by is an immediate, in this check the value is negated and stored in and int64_t. The value can be -2^63 yet the result cannot be stored in an int64_t and this gives some undefined behaviour causing failures. The negation is only necessary when the values is within a certain range and so it should not need to negate -2^63, this patch introduces this and also a regression test. Differential Revision: http://reviews.llvm.org/D11408 llvm-svn: 243100	2015-07-24 09:31:48 +00:00
Frederic Riss	eb85c8fb09	[dsymutil] Implement support for universal mach-o object files. This patch allows llvm-dsymutil to read universal (aka fat) macho object files and archives. The patch touches nearly everything in the BinaryHolder, but it is fairly mechinical: the methods that returned MemoryBufferRefs or ObjectFiles now return a vector of those, and the high-level access function takes a triple argument to select the architecture. There is no support yet for handling fat executables and thus no support for writing fat object files. llvm-svn: 243096	2015-07-24 06:41:11 +00:00
Frederic Riss	65f0abf275	[dsymutil] Make the triple detection more strict. MachOObjectFile offers a method for detecting the correct triple, use it instead of the previous approximation. This doesn't matter right now, but it will become important for mach-o universal (fat) binaries. llvm-svn: 243095	2015-07-24 06:41:04 +00:00
Michael Zolotukhin	57776b8159	Handle resolvable branches in complete loop unroll heuristic. Summary: Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate. This patch is intended to be applied after D10205 (though it could be applied independently). Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10206 llvm-svn: 243084	2015-07-24 01:53:04 +00:00
Eric Christopher	1fb23395c3	Clean up function attributes on PPC fast-isel tests. llvm-svn: 243079	2015-07-24 01:07:50 +00:00
Alex Lorenz	8cfc68677c	MIR Serialization: Serialize the '.cfi_offset' CFI instruction. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243062	2015-07-23 23:09:07 +00:00
JF Bastien	8969666450	WebAssembly: test that valid -mcpu flags are accepted. Summary: AArch64 has a similar test. Subscribers: sunfish, aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11479 llvm-svn: 243058	2015-07-23 23:00:04 +00:00
Sanjay Patel	f2fa58e744	fix crash in machine trace metrics due to processing dbg_value instructions (PR24199) The test in PR24199 ( https://llvm.org/bugs/show_bug.cgi?id=24199 ) crashes because machine trace metrics was not ignoring dbg_value instructions when calculating data dependencies. The machine-combiner pass asks machine trace metrics to calculate an instruction trace, does some reassociations, and calls MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() along with MachineTraceMetrics::invalidate(). The dbg_value instructions have their operands invalidated, but the instructions are not expected to be deleted. On a subsequent loop iteration of the machine-combiner pass, machine trace metrics would be called again and die while accessing the invalid debug instructions. Differential Revision: http://reviews.llvm.org/D11423 llvm-svn: 243057	2015-07-23 22:56:53 +00:00
Colin LeMahieu	333a19f6c3	Moving tests in to X86 directory. llvm-svn: 243049	2015-07-23 21:55:26 +00:00
Colin LeMahieu	0e4386c43b	Using an input object file instead of trying to generate an object file. llvm-svn: 243044	2015-07-23 21:40:19 +00:00
Colin LeMahieu	88783e97f3	Specifying a test triple. llvm-svn: 243042	2015-07-23 21:24:52 +00:00
Colin LeMahieu	f34933e425	[llvm-objdump] Add -D and --disassemble-all flags that attempt disassembly on all sections instead of just text sections. llvm-svn: 243041	2015-07-23 20:58:49 +00:00
Matt Wala	878c144f8a	[Scalarizer] Fix potential for stale data in Scattered across invocations Summary: Scalarizer has two data structures that hold information about changes to the function, Gathered and Scattered. These are cleared in finish() at the end of runOnFunction() if finish() detects any changes to the function. However, finish() was checking for changes by only checking if Gathered was non-empty. The function visitStore() only modifies Scattered without touching Gathered. As a result, Scattered could have ended up having stale data if Scalarizer only scalarized store instructions. Since the data in Scattered is used during the execution of the pass, this introduced dangling pointer errors. The fix is to check whether both Scattered and Gathered are empty before deciding what to do in finish(). This also fixes a problem where the Function can be modified although the pass returns false. Reviewers: rnk Subscribers: rnk, srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D10459 llvm-svn: 243040	2015-07-23 20:53:46 +00:00
Weiming Zhao	b33a5557f4	This patch eanble register coalescing to coalesce the following: %vreg2<def> = MOVi32imm 1; GPR32:%vreg2 %W1<def> = COPY %vreg2; GPR32:%vreg2 into: %W1<def> = MOVi32imm 1 Patched by Lawrence Hu (lawrence@codeaurora.org) llvm-svn: 243033	2015-07-23 19:24:53 +00:00
Igor Laevsky	2ba8374612	NFC. Explicitly specify attributes in BasicAA/cs-cs.ll test. This will simplify verifying correctness for a changes which modify attributes. llvm-svn: 243016	2015-07-23 14:31:18 +00:00
Michael Kuperstein	454d145395	[X86] Allow load folding into PUSH instructions Adds pushes to the folding tables. This also required a fix to the TD definition, since the memory forms of the push instructions did not have the right mayLoad/mayStore flags. Differential Revision: http://reviews.llvm.org/D11340 llvm-svn: 243010	2015-07-23 12:23:45 +00:00
Kuba Brecka	45dbffdc3d	[asan] Rename the ABI versioning symbol to '__asan_version_mismatch_check' instead of abusing '__asan_init' We currently version `__asan_init` and when the ABI version doesn't match, the linker gives a `undefined reference to '__asan_init_v5'` message. From this, it might not be obvious that it's actually a version mismatch error. This patch makes the error message much clearer by changing the name of the undefined symbol to be `__asan_version_mismatch_check_xxx` (followed by the version string). We obviously don't want the initializer to be named like that, so it's a separate symbol that is used only for the purpose of version checking. Reviewed at http://reviews.llvm.org/D11004 llvm-svn: 243003	2015-07-23 10:54:06 +00:00
Michael Kuperstein	ffcc7663a2	[X86] Fix order of operands for ins and outs instructions when parsing intel syntax Patch by: marina.yatsina@intel.com Differential Revision: http://reviews.llvm.org/D11337 llvm-svn: 243001	2015-07-23 10:23:48 +00:00
Rafael Espindola	b82657d3f1	Support printing relocations in files with no section table. llvm-svn: 242998	2015-07-23 09:11:05 +00:00
Elena Demikhovsky	482b303254	X86: Fixed assertion failure in 32-bit mode The DAG Node "SCALAR_TO_VECTOR" may be created if the type of the scalar element is legal. Added a check for the scalar type before creating this node. Added a test that fails with assertion on the current version. Differential Revision: http://reviews.llvm.org/D11413 llvm-svn: 242994	2015-07-23 08:25:23 +00:00
Chandler Carruth	fe414353db	Revert r242990: "AVX-512: Implemented encoding , DAG lowering and ..." This commit broke the build. Numerous build bots broken, and it was blocking my progress so reverting. It should be trivial to reproduce -- enable the BPF backend and it should fail when running llvm-tblgen. llvm-svn: 242992	2015-07-23 08:03:44 +00:00
Igor Breger	da1b2ea955	AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation Added tests for DAG lowering ,encoding and intrinsic Differential Revision: http://reviews.llvm.org/D11218 llvm-svn: 242990	2015-07-23 07:39:21 +00:00
Igor Breger	87e6397fb1	AVX : Fix ISA disabling in case AVX512VL , some instructions should be disabled only if AVX512BW and AVX512VL present. Tests added. Differential Revision: http://reviews.llvm.org/D11414 llvm-svn: 242987	2015-07-23 07:11:14 +00:00
Rafael Espindola	6565ea7299	Refactor duplicated code and check for invalid symbol table size. llvm-svn: 242981	2015-07-23 03:24:22 +00:00
Frederic Riss	9ccfddc39d	[dsymutil] Check archive members timestamps. The debug map contains the timestamp of the object files in references. We do not check these in the general case, but it's really useful if you have archives where different versions of an object file have been appended. This allows llvm-dsymutil to find the right one. llvm-svn: 242965	2015-07-22 23:24:00 +00:00
David Majnemer	ed9abe119b	[ConstantFolding] Support folding loads from a GlobalAlias The MSVC ABI requires that we generate an alias for the vtable which means looking through a GlobalAlias which cannot be overridden improves our ability to devirtualize. Found while investigating PR20801. Patch by Andrew Zhogin! Differential Revision: http://reviews.llvm.org/D11306 llvm-svn: 242955	2015-07-22 22:29:30 +00:00
Rafael Espindola	41f0f108a5	Force the gnu archive format to fix the test on darwin. llvm-svn: 242949	2015-07-22 22:09:44 +00:00
JF Bastien	b9073fb20a	WebAssembly: basic bitcode → assembly CodeGen test Summary: Add a basic CodeGen bitcode test which (for now) only prints out the function name and nothing else. The current code merely implements the basic needed for the test run to not crash / assert. Getting to that point required: - Basic InstPrinter. - Basic AsmPrinter. - DiagnosticInfoUnsupported (not strictly required, but nice to have, duplicated from AMDGPU/BPF's ISelLowering). - Some SP and register setup in WebAssemblyTargetLowering. - Basic LowerFormalArguments. - GenInstrInfo. - Placeholder LowerFormalArguments. - Placeholder CanLowerReturn and LowerReturn. - Basic DAGToDAGISel::Select, which requiresGenDAGISel.inc as well as GET_INSTRINFO_ENUM with GenInstrInfo.inc. - Remove WebAssemblyFrameLowering::determineCalleeSaves and rely on default. - Implement WebAssemblyFrameLowering::hasFP, same as AArch64's implementation. Follow-up patches will implement a real AsmPrinter, which will require adding MI opcodes specific to WebAssembly. Reviewers: sunfish Subscribers: aemerson, jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D11369 llvm-svn: 242939	2015-07-22 21:28:15 +00:00
Alex Lorenz	46d760d161	MIR Serialization: Serialize the machine instruction's debug location. llvm-svn: 242938	2015-07-22 21:15:11 +00:00
Rafael Espindola	be9ab2682e	Fix fetching the symbol table of a thin archive. We were trying to read it as an external file. llvm-svn: 242926	2015-07-22 19:34:26 +00:00
Rafael Espindola	69ef2afaeb	Identify thin archives as archives. llvm-svn: 242921	2015-07-22 18:29:39 +00:00
Alex Lorenz	35e4446903	MIR Serialization: Serialize the metadata machine operands. llvm-svn: 242916	2015-07-22 17:58:46 +00:00
Quentin Colombet	48b772007f	[ARM] Make the frame lowering code ready for shrink-wrapping. Shrink-wrapping can now be tested on ARM with -enable-shrink-wrap. Related to <rdar://problem/20821730> llvm-svn: 242908	2015-07-22 16:34:37 +00:00
Asaf Badouh	a5b2e5e2a7	[X86][AVX512] add reduce/range/scalef/rndScale include encoding and intrinsics Differential Revision: http://reviews.llvm.org/D11222 llvm-svn: 242896	2015-07-22 12:00:43 +00:00
Michael Kuperstein	75216a8d14	Fix test from r242886 to use the right triple. llvm-svn: 242889	2015-07-22 11:19:22 +00:00
Michael Kuperstein	23d952b611	[X86] Add .intel_syntax noprefix directive to intel-syntax x86 asm output Patch by: michael.zuckerman@intel.com Differential Revision: http://reviews.llvm.org/D11223 llvm-svn: 242886	2015-07-22 10:49:44 +00:00
Michael Kuperstein	d72403636c	Fix mem2reg to correctly handle allocas only used in a single block Currently, a load from an alloca that is used in as single block and is not preceded by a store is replaced by undef. This is not always correct if the single block is inside a loop. Fix the logic so that: 1) If there are no stores in the block, replace the load with an undef, as before. 2) If there is a store (regardless of where it is in the block w.r.t the load), bail out, and let the rest of mem2reg handle this alloca. Patch by: gil.rapaport@intel.com Differential Revision: http://reviews.llvm.org/D11355 llvm-svn: 242884	2015-07-22 10:29:29 +00:00
Kuba Brecka	8ec94ead7d	[asan] Improve moving of non-instrumented allocas In r242510, non-instrumented allocas are now moved into the first basic block. This patch limits that to only move allocas that are present after the first instrumented one (i.e. only move allocas up). A testcase was updated to show behavior in these two cases. Without the patch, an alloca could be moved down, and could cause an invalid IR. Differential Revision: http://reviews.llvm.org/D11339 llvm-svn: 242883	2015-07-22 10:25:38 +00:00
Elena Demikhovsky	a26f10ce18	AVX-512: Added intrinsics for VCVT* instructions. All SKX forms. All VCVT instructions for float/double/int/long types. Differential Revision: http://reviews.llvm.org/D11343 llvm-svn: 242877	2015-07-22 08:56:00 +00:00
Chen Li	c0f3a158f0	[LoopUnswitch] Code refactoring to separate trivial loop unswitch and non-trivial loop unswitch in processCurrentLoop() Summary: The current code in LoopUnswtich::processCurrentLoop() mixes trivial loop unswitch and non-trivial loop unswitch together. It goes over all basic blocks in the loop and checks if a condition is trivial or non-trivial unswitch condition. However, trivial unswitch condition can only occur in the loop header basic block (where it controls whether or not the loop does something at all). This refactoring separate trivial loop unswitch and non-trivial loop unswitch. Before going over all basic blocks in the loop, it checks if the loop header contains a trivial unswitch condition. If so, unswitch it. Otherwise, go over all blocks like before but don't check trivial condition any more since they are not possible to be in the other blocks. This code has no functionality change. Reviewers: meheff, reames, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11276 llvm-svn: 242873	2015-07-22 05:26:29 +00:00
Jingyue Wu	20d73c6cc0	[BranchFolding] do not iterate the aliases of virtual registers Summary: MCRegAliasIterator only works for physical registers. So, do not run it on virtual registers. With this issue fixed, we can resurrect the BranchFolding pass in NVPTX backend. Reviewers: jholewinski, bkramer Subscribers: henryhu, meheff, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11174 llvm-svn: 242871	2015-07-22 04:16:52 +00:00
Chandler Carruth	ccffdaf7ed	[SROA] Fix a nasty pile of bugs to do with big-endian, different alloca types and loads, loads or stores widened past the size of an alloca, etc. This started off with a bug report about big-endian behavior with bitfields and loads and stores to a { i32, i24 } struct. An initial attempt to fix this was sent for review in D10357, but that didn't really get to the root of the problem. The core issue was that canConvertValue and convertValue in SROA were handling different bitwidth integers by doing a zext of the integer. It wouldn't do a trunc though, only a zext! This would in turn lead SROA to form an i24 load from an i24 alloca, zext it to i32, and then use it. This would at least produce the wrong value for big-endian systems. One of my many false starts here was to correct the computation for big-endian systems by shifting. But this doesn't actually work because the original code has a 64-bit store to the entire 8 bytes, and a 32-bit load of the last 4 bytes, and because the alloc size is 8 bytes, we can't lose that last (least significant if bigendian) byte! The real problem here is that we're forming an i24 load in SROA which is actually not sufficiently wide to load all of the necessary bits here. The source has an i32 load, and SROA needs to form that as well. The straightforward way to do this is to disable the zext logic in canConvertValue and convertValue, forcing us to actually load all 32-bits. This seems like a really good change, but it in turn breaks several other parts of SROA. First in the chain of knock-on failures, we had places where we were doing integer-widening promotion even though some of the integer loads or stores extended past the end of the alloca's memory! There was even a comment about preventing this, but it only prevented the case where the type had a different bit size from its store size. So I added checks to handle the cases where we actually have a widened load or store and to avoid trying to special integer widening promotion in those cases. Second, we actually rely on the ability to promote in the face of loads past the end of an alloca! This is important so that we can (for example) speculate loads around PHI nodes to do more promotion. The bits loaded are garbage, but as long as they aren't used and the alignment is suitable high (which it wasn't in the test case!) this is "fine". And we can't stop promoting here, lots of things stop working well if we do. So we need to add specific logic to handle the extension (and truncation) case, but only where that extension or truncation are over bytes that are outside the alloca's allocated storage and thus totally bogus to load or store. And of course, once we add back this correct handling of extension or truncation, we need to correctly handle bigendian systems to avoid re-introducing the exact bug that started us off on this chain of misery in the first place, but this time even more subtle as it only happens along speculated loads atop a PHI node. I've ported an existing test for PHI speculation to the big-endian test file and checked that we get that part correct, and I've added several more interesting big-endian test cases that should help check that we're getting this correct. Fun times. llvm-svn: 242869	2015-07-22 03:32:42 +00:00
Frederic Riss	1c65094d5b	[dsymutil] Implement ODR uniquing for C++ code. This optimization allows the DWARF linker to reuse definition of types it has emitted in previous CUs rather than reemitting them in each CU that references them. The size and link time gains are huge. For example when linking the DWARF for a debug build of clang, this generates a ~150M dwarf file instead of a ~700M one (the numbers date back a bit and must not be totally accurate these days). As with all the other parts of the llvm-dsymutil codebase, the goal is to keep bit-for-bit compatibility with dsymutil-classic. The code is littered with a lot of FIXMEs that should be addressed once we can get rid of the compatibilty goal. llvm-svn: 242847	2015-07-21 22:41:43 +00:00
Alex Lorenz	f4baeb51b2	MIR Serialization: Start serializing the CFI operands with .cfi_def_cfa_offset. This commit begins serialization of the CFI index machine operands by serializing one kind of CFI instruction - the .cfi_def_cfa_offset instruction. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242845	2015-07-21 22:28:27 +00:00
Jingyue Wu	d058ea927f	[MDA] change BlockScanLimit into a command line option. Summary: In the benchmark (https://github.com/vetter/shoc) we are researching, the duplicated load is not eliminated because MemoryDependenceAnalysis hit the BlockScanLimit. This patch change it into a command line option instead of a hardcoded value. Patched by Xuetian Weng. Test Plan: test/Analysis/MemoryDependenceAnalysis/memdep-block-scan-limit.ll Reviewers: jingyue, reames Subscribers: reames, llvm-commits Differential Revision: http://reviews.llvm.org/D11366 llvm-svn: 242842	2015-07-21 21:50:39 +00:00
Bruno Cardoso Lopes	e8640518a9	[AsmPrinter] Check for valid constants in handleIndirectSymViaGOTPCRel Check whether BaseCst is valid before extracting a GlobalValue. This fixes PR24163. Patch by David Majnemer. llvm-svn: 242840	2015-07-21 21:45:42 +00:00
Michael J. Spencer	402a4f1088	[Object][ELF] Handle files with no section header string table. llvm-svn: 242839	2015-07-21 21:40:33 +00:00
Bill Schmidt	2be8054b49	[PPC64LE] More vector swap optimization TLC This makes one substantive change and a few stylistic changes to the VSX swap optimization pass. The substantive change is to permit LXSDX and LXSSPX instructions to participate in swap optimization computations. The previous change to insert a swap following a SUBREG_TO_REG widening operation makes this almost trivial. I experimented with also permitting STXSDX and STXSSPX instructions. This can be done using similar techniques: we could insert a swap prior to a narrowing COPY operation, and then permit these stores to participate. I prototyped this, but discovered that the pattern of a narrowing COPY followed by an STXSDX does not occur in any of our test-suite code. So instead, I added commentary indicating that this could be done. Other TLC: - I changed SH_COPYSCALAR to SH_COPYWIDEN to more clearly indicate the direction of the copy. - I factored the insertion of swap instructions into a separate function. Finally, I added a new test case to check that the scalar-to-vector loads are working properly with swap optimization. llvm-svn: 242838	2015-07-21 21:40:17 +00:00
Reid Kleckner	2f907557c3	Re-land 242726 to use RAII to do cleanup The LooksLikeCodeInBug11395() codepath was returning without clearing the ProcessedAllocas cache. llvm-svn: 242809	2015-07-21 17:40:14 +00:00
Arnold Schwaighofer	3651233004	MergeFunc: Transfer the callee's attributes when replacing a direct caller We insert a bitcast which obfuscates the getCalledFunction for the utility function which looks up attributes from the called function. Loosing ABI changing parameter attributes is a bad thing. rdar://21516488 llvm-svn: 242807	2015-07-21 17:07:07 +00:00
Alex Lorenz	6ede37442d	MIR Serialization: Serialize the external symbol machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242806	2015-07-21 16:59:53 +00:00
Nico Weber	f00afcc79b	Revert 242726, it broke ASan on OS X. llvm-svn: 242792	2015-07-21 15:48:53 +00:00
Karthik Bhat	d818e38ff9	Constfold trunc,rint,nearbyint,ceil and floor using APFloat A patch by Chakshu Grover! This patch allows constfolding of trunc,rint,nearbyint,ceil and floor intrinsics using APFloat class. Differential Revision: http://reviews.llvm.org/D11144 llvm-svn: 242763	2015-07-21 08:52:23 +00:00
Igor Breger	f7fd547e27	AVX512 : Implemented VPMADDUBSW and VPMADDWD instruction , Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11351 llvm-svn: 242761	2015-07-21 07:11:28 +00:00
Akira Hatanaka	285815258c	[ARM] Define subtarget feature "reserve-r9", which is used to decide whether register r9 should be reserved. This recommits r242737, which broke bots because the number of subtarget features went over the limit of 64. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242756	2015-07-21 01:42:02 +00:00
Matthias Braun	a50d2203fa	ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code Re-apply of r241928 which had to be reverted because of the r241926 revert. This commit factors out common code from MergeBaseUpdateLoadStore() and MergeBaseUpdateLSMultiple() and introduces a new function MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a strd/ldrd instruction into an strd/ldrd instruction with writeback where possible. Differential Revision: http://reviews.llvm.org/D10676 llvm-svn: 242743	2015-07-21 00:19:01 +00:00
Matthias Braun	e40d89ef9b	ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2 Re-apply r241926 with an additional check that r13 and r15 are not used for LDRD/STRD. See http://llvm.org/PR24190. This also already includes the fix from r241951. Differential Revision: http://reviews.llvm.org/D10623 llvm-svn: 242742	2015-07-21 00:18:59 +00:00
Akira Hatanaka	42427d2c38	Revert r242737. This caused builds to fail with the following error message: error:Too many subtarget features! Bump MAX_SUBTARGET_FEATURES. llvm-svn: 242740	2015-07-20 23:51:12 +00:00
Akira Hatanaka	7482d40cd5	[ARM] Define subtarget feature "reserve-r9", which is used to decide whether register r9 should be reserved. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242737	2015-07-20 23:21:30 +00:00
Matthias Braun	731e359e70	Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2" This reverts commit r241926. This caused http://llvm.org/PR24190 llvm-svn: 242735	2015-07-20 23:17:20 +00:00
Matthias Braun	84e289702a	Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code" This reverts commit r241928. This caused http://llvm.org/PR24190 llvm-svn: 242734	2015-07-20 23:17:16 +00:00
JF Bastien	e4d22d59d1	Targets: commonize some stack realignment code This patch does the following: * Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`. * Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute. Multiple targets duplicated the same `needsStackRealignment` code: - Aarch64. - ARM. - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has. - PowerPC. - WebAssembly. - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has. The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects: - AMDGPU - BPF - CppBackend - MSP430 - NVPTX - Sparc - SystemZ - XCore - Out-of-tree targets This is a breaking change! `make check` passes. The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation. `needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone. Reviewers: sunfish Subscribers: aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11160 llvm-svn: 242727	2015-07-20 22:51:32 +00:00
Reid Kleckner	87d03450a5	Don't try to instrument allocas used by outlined SEH funclets Summary: Arguments to llvm.localescape must be static allocas. They must be at some statically known offset from the frame or stack pointer so that other functions can access them with localrecover. If we ever want to instrument these, we can use more indirection to recover the addresses of these local variables. We can do it during clang irgen or with the asan module pass. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11307 llvm-svn: 242726	2015-07-20 22:49:44 +00:00
Matthias Braun	e536f4f681	AArch64: Add aditional Cyclone macroop fusion opportunities Related to rdar://19205407 Differential Revision: http://reviews.llvm.org/D10746 llvm-svn: 242724	2015-07-20 22:34:47 +00:00
Matthias Braun	2bd6dd8d54	MachineScheduler: Restrict macroop fusion to data-dependent instructions. Before creating a schedule edge to encourage MacroOpFusion check that: - The predecessor actually writes a register that the branch reads. - The predecessor has no successors in the ScheduleDAG so we can schedule it in front of the branch. This avoids skewing the scheduling heuristic in cases where macroop fusion cannot happen. Differential Revision: http://reviews.llvm.org/D10745 llvm-svn: 242723	2015-07-20 22:34:44 +00:00
Quentin Colombet	71a71485f4	[ARM] Refactor the prologue/epilogue emission to be more robust. This is the first step toward supporting shrink-wrapping for this target. The changes could be summarized by these items: - Expand the tail-call return as part of the expand pseudo pass. - Get rid of the assumptions that the epilogue is the exit block: * Do not assume which registers are free in the epilogue. (This indirectly improve the lowering of the code for the segmented stacks, see the test cases.) * Take into account that the basic block can be empty. Related to <rdar://problem/20821730> llvm-svn: 242714	2015-07-20 21:42:14 +00:00
Jingyue Wu	48a9bdc6aa	[NVPTX] make load on global readonly memory to use ldg Summary: [NVPTX] make load on global readonly memory to use ldg Summary: As describe in [1], ld.global.nc may be used to load memory by nvcc when __restrict__ is used and compiler can detect whether read-only data cache is safe to use. This patch will try to check whether ldg is safe to use and use them to replace ld.global when possible. This change can improve the performance by 18~29% on affected kernels (ratt_kernel and rwdot_kernel) in S3D benchmark of shoc [2]. Patched by Xuetian Weng. [1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache [2] https://github.com/vetter/shoc Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll Reviewers: jholewinski, jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11314 llvm-svn: 242713	2015-07-20 21:28:54 +00:00
Rafael Espindola	b68a16c47c	Simplify iterating over the dynamic section and report broken ones. llvm-svn: 242712	2015-07-20 21:23:29 +00:00
Krzysztof Parzyszek	921722049d	[Hexagon] Generate MUX from conditional transfers when dot-new not possible llvm-svn: 242711	2015-07-20 21:23:25 +00:00
Alex Lorenz	ab98049947	MIR Serialization: Initial serialization of machine constant pools. This commit implements the initial serialization of machine constant pools and the constant pool index machine operands. The constant pool is serialized using a YAML sequence of YAML mappings that represent the constant values. The target-specific constant pool items aren't serialized by this commit. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242707	2015-07-20 20:51:18 +00:00
Sanjoy Das	93d608c3c3	[ImplicitNullChecks] Work with implicit defs. Summary: This change generalizes the implicit null checks pass to work with instructions that don't have any explicit register defs. This lets us use X86's `cmp` against memory as faulting load instructions. Reviewers: reames, JosephTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11286 llvm-svn: 242703	2015-07-20 20:31:39 +00:00
Alex Lorenz	b29554dab9	MIR Parser: Add support for quoted named global value operands. This commit extends the machine instruction lexer and implements support for the quoted global value tokens. With this change the syntax for the global value identifier tokens becomes identical to the syntax for the global identifier tokens from the LLVM's assembly language. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242702	2015-07-20 20:31:01 +00:00
Rafael Espindola	33f250931c	Remove Elf_Rela_Iter and Elf_Rel_Iter. Use just the pointers and check for invalid relocation sections. llvm-svn: 242700	2015-07-20 20:07:50 +00:00
Chad Rosier	3da0ea7f5d	[AArch64] Change EON pattern to match more often. Phabricator: http://reviews.llvm.org/D11359 Patch by Geoff Berry <gberry@codeaurora.org> llvm-svn: 242694	2015-07-20 18:42:27 +00:00
Bill Schmidt	19dbd6c6c2	Add missing test for r242296 (vec_sld) llvm-svn: 242680	2015-07-20 15:43:21 +00:00
Rafael Espindola	836f2e86e5	Report errors an invalid virtual addresses. llvm-svn: 242676	2015-07-20 14:45:03 +00:00
Tom Stellard	70580f83cc	AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 llvm-svn: 242673	2015-07-20 14:28:41 +00:00
Rafael Espindola	073624bb56	Simplify iterating over program headers and detect corrupt ones. We now use a simple pointer and have range loops. llvm-svn: 242669	2015-07-20 13:35:33 +00:00
Vasileios Kalintiris	974d409259	[mips] Added support for the ERETNC instruction. Summary: This required adding the instruction predicate HasMips32r5. Patch by Scott Egerton. Reviewers: dsanders, vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11136 llvm-svn: 242666	2015-07-20 12:28:56 +00:00
Rafael Espindola	00ddb1416d	llvm-readobj: Handle invalid references to the string table. llvm-svn: 242658	2015-07-20 03:38:17 +00:00
Rafael Espindola	c46ffb7a49	Move CHECKs closer to the RUN line. llvm-svn: 242657	2015-07-20 03:31:25 +00:00
Rafael Espindola	fb3acd6216	llvm-readobj: call exit(1) on error. llvm-readobj exists for testing llvm. We can safely stop the program the first time we know the input in corrupted. This is in preparation for making it handle a few more broken files. llvm-svn: 242656	2015-07-20 03:23:55 +00:00
Arnold Schwaighofer	764d6de823	Revert "MergeFuncs: Transfer the function parameter attributes to the call site" It is okay to not transfer parameter attributes. This reverts commit r242558. llvm-svn: 242646	2015-07-19 19:30:43 +00:00
Simon Pilgrim	23a29dafda	[X86][SSE] Tidied up vector CTLZ/CTTZ. NFCI. llvm-svn: 242645	2015-07-19 17:09:43 +00:00
Michael Kuperstein	69e40a4c85	[X86] Add support for tbyte memory operand size for Intel-syntax x86 assembly Differential Revision: http://reviews.llvm.org/D11257 Patch by: marina.yatsina@intel.com llvm-svn: 242639	2015-07-19 11:03:08 +00:00
Elena Demikhovsky	17b906058e	AVX-512: Floating point conversions for SKX - DAG Lowering. SKX supports conversion for all FP types. Integer types include doublewords and quardwords. I added "Legal" status for these nodes and a bunch of tests. I added "NoVLX" for AVX DAG selection to force VLX instructions selection when VLX is supported. Differential Revision: http://reviews.llvm.org/D11255 llvm-svn: 242637	2015-07-19 10:17:33 +00:00
Simon Pilgrim	59764dccfb	[X86][SSE] Updated SHL/LSHR i64 vectorization costs. This was missed in D8416. llvm-svn: 242621	2015-07-18 20:06:30 +00:00
Simon Pilgrim	8c81678dfa	[X86][SSE] Added additional fp/int tests. Demonstrates some shortfalls in subvector(cvt(x)) compared to cvt(subvector(x)) patterns - especially on AVX/AVX2 targets. llvm-svn: 242614	2015-07-18 17:05:39 +00:00
Simon Pilgrim	c77f72c4e4	Refreshed tests. llvm-svn: 242613	2015-07-18 16:53:51 +00:00
Simon Pilgrim	036db52673	Refreshed tests and reordered in descending integer size. llvm-svn: 242610	2015-07-18 16:14:56 +00:00
Simon Pilgrim	27e130f11c	Tidyup shufflevector calls - don't repeat inputs if you can avoid it. llvm-svn: 242609	2015-07-18 15:56:33 +00:00
Matthias Braun	9e85980658	ARM: Enable MachineScheduler and disable PostRAScheduler for swift. Reapply r242500 now that the swift schedmodel includes LDRLIT. This is mostly done to disable the PostRAScheduler which optimizes for instruction latencies which isn't a good fit for out-of-order architectures. This also allows to leave out the itinerary table in swift in favor of the SchedModel ones. This change leads to performance improvements/regressions by as much as 10% in some benchmarks, in fact we loose 0.4% performance over the llvm-testsuite for reasons that appear to be unknown or out of the compilers control. rdar://20803802 documents the investigation of these effects. While it is probably a good idea to perform the same switch for the other ARM out-of-order CPUs, I limited this change to swift as I cannot perform the benchmark verification on the other CPUs. Differential Revision: http://reviews.llvm.org/D10513 llvm-svn: 242588	2015-07-17 23:18:30 +00:00
Matthias Braun	141d1c9d8f	ARM: Add scheduling information for LDRLIT instructions to swift scheduling model These pseudo instructions are only lowered after register allocation and are therefore still present when the machine scheduler runs. Add a run: line to a testcase that uses the uncommon flags necessary to actually produce a LDRLIT instruction on swift. llvm-svn: 242587	2015-07-17 23:18:26 +00:00
Quentin Colombet	11922946fe	[RAGreedy] Add an experimental deferred spilling feature. The idea of deferred spilling is to delay the insertion of spill code until the very end of the allocation. A "candidate" to spill variable might not required to be spilled because of other evictions that happened after this decision was taken. The spirit is similar to the optimistic coloring strategy implemented in Preston and Briggs graph coloring algorithm. For now, this feature is highly experimental. Although correct, it would require much more modification to properly model the effect of spilling. Anyway, this early patch helps prototyping this feature. Note: The test case cannot unfortunately be reduced and is probably fragile. llvm-svn: 242585	2015-07-17 23:04:06 +00:00
Alex Lorenz	484903ecd2	MIR Parser: Allow the dollar characters in all of the identifier tokens. This commit modifies the machine instruction lexer so that it now accepts the '$' characters in identifier tokens. This change makes the syntax for unquoted global value tokens consistent with the syntax for the global idenfitier tokens in the LLVM's assembly language. llvm-svn: 242584	2015-07-17 22:48:04 +00:00
Arnold Schwaighofer	690cd87dcd	MergeFuncs: Transfer the function parameter attributes to the call site rdar://21516488 llvm-svn: 242558	2015-07-17 18:59:08 +00:00
Adam Nemet	5a6d5bc17b	Revert "ARM: Enable MachineScheduler and disable PostRAScheduler for swift." This reverts commit r242500. It broke some internal tests and Matthias asked me to revert it while he is investigating. llvm-svn: 242553	2015-07-17 18:14:19 +00:00
Peter Zotov	f1192cd7e4	[OCaml] Do not use -warn-error in tests. This -warn-error flag invariably gets into release tarballs and breaks builds on distributions that run tests as a part of release process. The OCaml binding tests are especially critical, since they often expose lingering toolchain bugs, and so it is replaced with -w +A (equivalent to -Wall). llvm-svn: 242550	2015-07-17 17:33:23 +00:00
Eli Bendersky	b09cfb51ca	Use inbounds GEPs for memcpy and memset lowering Follow-up on discussion in http://reviews.llvm.org/D11220 llvm-svn: 242542	2015-07-17 16:42:33 +00:00
Rafael Espindola	2bec8500ef	Add support for producing thin archives in llvm-lib. I will send an entry in docs/CommandGuide for review today. llvm-svn: 242533	2015-07-17 16:01:11 +00:00
John Brawn	9ca9ca2805	Make global aliases have symbol size equal to their type This is mainly for the benefit of GlobalMerge, so that an alias into a MergedGlobals variable has the same size as the original non-merged variable. Differential Revision: http://reviews.llvm.org/D10837 llvm-svn: 242520	2015-07-17 12:12:03 +00:00
Chandler Carruth	f55803f761	[PM/AA] Disable the core unsafe aspect of GlobalsModRef in the face of basic changes to the IR such as folding pointers through PHIs, Selects, integer casts, store/load pairs, or outlining. This leaves the feature available behind a flag. This flag's default could be flipped if necessary, but the real-world performance impact of this particular feature of GMR may not be sufficiently significant for many folks to want to run the risk. Currently, the risk here is somewhat mitigated by half-hearted attempts to update GlobalsModRef when the rest of the optimizer changes something. However, I am currently trying to remove that update mechanism as it makes migrating the AA infrastructure to a form that can be readily shared between new and old pass managers very challenging. Without this update mechanism, it is possible that this still unlikely failure mode will start to trip people, and so I wanted to try to proactively avoid that. There is a lengthy discussion on the mailing list about why the core approach here is flawed, and likely would need to look totally different to be both reasonably effective and resilient to basic IR changes occuring. This patch is essentially the first of two which will enact the result of that discussion. The next patch will remove the current update mechanism. Thanks to lots of folks that helped look at this from different angles. Especial thanks to Michael Zolotukhin for doing some very prelimanary benchmarking of LTO without GlobalsModRef to get a rough idea of the impact we could be facing here. So far, it looks very small, but there are some concerns lingering from other benchmarking. The default here may get flipped if performance results end up pointing at this as a more significant issue. Also thanks to Pete and Gerolf for reviewing! Differential Revision: http://reviews.llvm.org/D11213 llvm-svn: 242512	2015-07-17 06:58:24 +00:00
Kuba Brecka	37a5ffaca0	[asan] Fix invalid debug info for promotable allocas Since r230724 ("Skip promotable allocas to improve performance at -O0"), there is a regression in the generated debug info for those non-instrumented variables. When inspecting such a variable's value in LLDB, you often get garbage instead of the actual value. ASan instrumentation is inserted before the creation of the non-instrumented alloca. The only allocas that are considered standard stack variables are the ones declared in the first basic-block, but the initial instrumentation setup in the function breaks that invariant. This patch makes sure uninstrumented allocas stay in the first BB. Differential Revision: http://reviews.llvm.org/D11179 llvm-svn: 242510	2015-07-17 06:29:57 +00:00
Matthias Braun	2d8315f806	ARM: Enable MachineScheduler and disable PostRAScheduler for swift. This is mostly done to disable the PostRAScheduler which optimizes for instruction latencies which isn't a good fit for out-of-order architectures. This also allows to leave out the itinerary table in swift in favor of the SchedModel ones. This change leads to performance improvements/regressions by as much as 10% in some benchmarks, in fact we loose 0.4% performance over the llvm-testsuite for reasons that appear to be unknown or out of the compilers control. rdar://20803802 documents the investigation of these effects. While it is probably a good idea to perform the same switch for the other ARM out-of-order CPUs, I limited this change to swift as I cannot perform the benchmark verification on the other CPUs. Differential Revision: http://reviews.llvm.org/D10513 llvm-svn: 242500	2015-07-17 01:44:31 +00:00
Matt Arsenault	cabe02e141	Only do fmul (fadd x, x), c combine if the fadd only has one use This was increasing the instruction count if the fadd has multiple uses. llvm-svn: 242498	2015-07-17 01:14:35 +00:00
Rafael Espindola	494a381ede	Use small encodings for constants when possible. llvm-svn: 242493	2015-07-17 00:57:52 +00:00
Alex Lorenz	e5a44660dd	MIR Serialization: Serialize the frame setup machine instruction flag. llvm-svn: 242491	2015-07-17 00:24:15 +00:00
Alex Lorenz	7feaf7c60b	MIR Serialization: Serialize the frame index machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242487	2015-07-16 23:37:45 +00:00
Matthias Braun	da3d0d7342	Arm: Don't define a label twice with two setjmps in a function. Constructing a name based on the function name didn't give us a unique symbol if we had more than one setjmp in a function. Using MCContext::createTempSymbol() always gives us a unique name. Differential Revision: http://reviews.llvm.org/D9314 llvm-svn: 242482	2015-07-16 22:34:20 +00:00
Matthias Braun	3cd00c1739	Fix __builtin_setjmp in combination with sjlj exception handling. llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling style but is also used in clang to implement __builtin_setjmp. The ARM backend needs to output additional dispatch tables for the SjLj exception handling style, these tables however can't be emitted if llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual landing pad blocks exist. To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj exception handling lowering, so we can differentiate between the case where we actually need to setup a dispatch table and the case where we just need the __builtin_setjmp semantic. Differential Revision: http://reviews.llvm.org/D9313 llvm-svn: 242481	2015-07-16 22:34:16 +00:00
Tim Northover	ca0ffc3561	AArch64: make inexact signalling on round Darwin-specific C11 leaves the choice on whether round-to-integer operations set the inexact flag implementation-defined. Darwin does expect it to be set, but this seems to be against the intent of the IEEE document and slower to implement anyway. So it should be opt-in. llvm-svn: 242446	2015-07-16 21:30:21 +00:00
Simon Pilgrim	769d58f9df	[X86][SSE] Added nounwind attribute to vector shift tests. Stop i686 codegen from generating cfi directives. llvm-svn: 242443	2015-07-16 21:14:26 +00:00
Bill Schmidt	54cced54a6	[PowerPC] v4i32 is a VSRCRegClass I was looking at some vector code generation and kept seeing unnecessary vector copies into the Altivec half of the VSX registers. I discovered that we overlooked v4i32 when adding the register classes for VSX; we only added v4f32 and v2f64. This means that anything that canonicalizes into v4i32 (which is a LOT of stuff) ends up being forced into VRRC on its way to VSRC. The fix is one line. The rest of the patch is fixing up some test cases whose code generation has changed as a result. This seems like it would be a good candidate for backport to 3.7. llvm-svn: 242442	2015-07-16 21:14:07 +00:00
Simon Pilgrim	88cfb3a748	[X86][SSE] Updated vector conversion test names. I'll be adding further tests shortly so need a more thorough naming convention. llvm-svn: 242440	2015-07-16 21:00:57 +00:00
Jingyue Wu	e7981cee24	[NVPTX] enable SpeculativeExecution in NVPTX Summary: SpeculativeExecution enables a series straight line optimizations (such as SLSR and NaryReassociate) on conditional code. For example, if (...) ... b * s ... if (...) ... (b + 1) * s ... speculative execution can hoist b * s and (b + 1) * s from then-blocks, so that we have ... b * s ... if (...) ... ... (b + 1) * s ... if (...) ... Then, SLSR can rewrite (b + 1) * s to (b * s + s) because after speculative execution b * s dominates (b + 1) * s. The performance impact of this change is significant. It speeds up the benchmarks running EigenFloatContractionKernelInternal16x16 (`ba68f42fa6/unsupported/Eigen/CXX11/src/Tensor/TensorContractionCuda.h`?at=default#cl-526) by roughly 2%. Some internal benchmarks that have the above code pattern are improved by up to 40%. No significant slowdowns are observed on Eigen CUDA microbenchmarks. Reviewers: jholewinski, broune, eliben Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11201 llvm-svn: 242437	2015-07-16 20:13:48 +00:00
Matthias Braun	af7d7709d6	AArch64: Implement conditional compare sequence matching. This is a new iteration of the reverted r238793 / http://reviews.llvm.org/D8232 which wrongly assumed that any and/or trees can be represented by conditional compare sequences, however there are some restrictions to that. This version fixes this and adds comments that explain exactly what types of and/or trees can actually be implemented as conditional compare sequences. Related to http://llvm.org/PR20927, rdar://18326194 Differential Revision: http://reviews.llvm.org/D10579 llvm-svn: 242436	2015-07-16 20:02:37 +00:00
Tom Stellard	78655fcfdc	AMDPGU/SI: Negative offsets aren't allowed in MUBUF's vaddr operand Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11226 llvm-svn: 242434	2015-07-16 19:40:09 +00:00
Pete Cooper	f4ce569deb	Revert "Add missing load/store flags to thumb2 instructions." This reverts commit r242300. This is causing buildbot failures which we are investigating. I'll reapply once we know whats going on, but for now want to get the bots green. llvm-svn: 242428	2015-07-16 18:38:13 +00:00
Peter Collingbourne	9b0fe61047	Internalize: internalize comdat members as a group, and drop comdat on such members. Internalizing an individual comdat group member without also internalizing the other members of the comdat can break comdat semantics. For example, if a module contains a reference to an internalized comdat member, and the linker chooses a comdat group from a different object file, this will break the reference to the internalized member. This change causes the internalizer to only internalize comdat members if all other members of the comdat are not externally visible. Once a comdat group has been fully internalized, there is no need to apply comdat rules to its members; later optimization passes (e.g. globaldce) can legally drop individual members of the comdat. So we drop the comdat attribute from all comdat members. Differential Revision: http://reviews.llvm.org/D10679 llvm-svn: 242423	2015-07-16 17:42:21 +00:00
Eli Bendersky	f14af16219	Correct lowering of memmove in NVPTX This fixes https://llvm.org/bugs/show_bug.cgi?id=24056 Also a bit of refactoring along the way. Differential Revision: http://reviews.llvm.org/D11220 llvm-svn: 242413	2015-07-16 16:27:19 +00:00
James Molloy	7395a8182c	[Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation This adds new intrinsics "*absdiff" for absolute difference ops to facilitate efficient code generation for "sum of absolute differences" operation. The patch also contains the introduction of corresponding SDNodes and basic legalization support.Sanity of the generated code is tested on X86. This is 1st of the three patches. Patch by Shahid Asghar-ahmad! llvm-svn: 242409	2015-07-16 15:22:46 +00:00
Silviu Baranga	0e5804a6af	Fix memcheck interval ends for pointers with negative strides Summary: The checking pointer grouping algorithm assumes that the starts/ends of the pointers are well formed (start <= end). The runtime memory checking algorithm also assumes this by doing: start0 < end1 && start1 < end0 to detect conflicts. This check only works if start0 <= end0 and start1 <= end1. This change correctly orders the interval ends by either checking the stride (if it is constant) or by using min/max SCEV expressions. Reviewers: anemet, rengolin Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11149 llvm-svn: 242400	2015-07-16 14:02:58 +00:00
Michael Kuperstein	9d17a28164	[X86] Test for r242395 (Fix emitPrologue() to make less assumptions about pushes) llvm-svn: 242399	2015-07-16 13:55:39 +00:00
Michael Kuperstein	dadb847412	[X86] Reapply r240257 : "Allow more call sequences to use push instructions for argument passing" This allows more call sequences to use pushes instead of movs when optimizing for size. In particular, calling conventions that pass some parameters in registers (e.g. thiscall) are now supported. This should no longer cause miscompiles, now that a bug in emitPrologue was fixed in r242395. llvm-svn: 242398	2015-07-16 13:54:14 +00:00
Reid Kleckner	938bd6fc96	Revert "[X86] Allow more call sequences to use push instructions for argument passing" It miscompiles some code and a reduced test case has been sent to the author. This reverts commit r240257. llvm-svn: 242373	2015-07-16 01:30:00 +00:00
Alex Lorenz	e2041b68b1	Fix broken testcase from r242358. The testcase failed on non X86 targets, because I forgot to pass the '-march=x86-64' option into llc for one of the X86 specific tests. llvm-svn: 242370	2015-07-16 00:58:33 +00:00
Akira Hatanaka	024d91a00b	[ARM] Define a subtarget feature that is used to avoid using movt/movw pairs for 32-bit immediates. This change is needed to avoid emitting movt/movw pairs when doing LTO and do so on a per-function basis. Out-of-tree projects currently using cl::opt option -arm-use-movt=0 or false to avoid emitting movt/movw pairs should make changes to add subtarget feature "+no-movt" (see the changes made to clang in r242368). rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11026 llvm-svn: 242369	2015-07-16 00:58:23 +00:00
Rafael Espindola	e79b62d923	Trying to fix the windows bots. llvm-svn: 242367	2015-07-16 00:38:34 +00:00
Rafael Espindola	06d6d1905e	Fix handling of relative paths in thin archives. The member has to end up with a path relative to the archive. llvm-svn: 242362	2015-07-16 00:14:49 +00:00
Alex Lorenz	31d706836c	MIR Serialization: Serialize the jump table index operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242358	2015-07-15 23:38:35 +00:00
Alex Lorenz	6799e9b3e0	MIR Serialization: Serialize the jump table info. The jump table info is serialized using a YAML mapping that contains its kind and a YAML sequence of jump table entries. A jump table entry is a YAML mapping that has an ID and an inline YAML sequence of machine basic block references. The testcase 'CodeGen/MIR/X86/jump-table-info.mir' doesn't have any instructions because one of them contains a jump table index operand. The jump table index operands will be serialized in a follow up patch, and the appropriate instructions will be added to this testcase. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242357	2015-07-15 23:31:07 +00:00
Sean Silva	6dd10bade7	Add a test for r242281 from an old patch of mine. This isn't thorough, but should serve as a sanity check. llvm-svn: 242356	2015-07-15 23:23:02 +00:00
Rafael Espindola	57c0525d2c	llvm-ar: Don't write the directory in the string table. We were already doing the right thing for short file names, but not long ones. llvm-svn: 242354	2015-07-15 23:15:33 +00:00
Alex Lorenz	37643a04a4	MIR Serialization: Serialize references from the stack objects to named allocas. This commit serializes the references to the named LLVM alloca instructions from the stack objects in the machine frame info. This commit adds a field 'Name' to the struct 'yaml::MachineStackObject'. This new field is used to store the name of the alloca instruction when the alloca is present and when it has a name. llvm-svn: 242339	2015-07-15 22:14:49 +00:00
Paul Robinson	b9de106d04	Add a "debugger tuning" concept that allows us to fine-tune how we emit debug info, according to the preferences of the different debuggers used on various targets. Darwin and FreeBSD default to tuning for LLDB; PS4 defaults to tuning for the SCE (Sony Computer Entertainment) debugger. All others default to GDB. Differential Revision: http://reviews.llvm.org/D8506 llvm-svn: 242338	2015-07-15 22:04:54 +00:00
JF Bastien	7289f73b8d	Fix mergefunc infinite loop Self-referential constants containing references to a merged function no longer cause the MergeFunctions pass to infinite loop. Also adds a reproduction IR which would otherwise fail, which was isolated from a similar issue in Chromium. Author: jrkoenig Reviewers: nlewycky, jfb Subscribers: llvm-commits, nlewycky, jfb Differential Revision: http://reviews.llvm.org/D11208 llvm-svn: 242337	2015-07-15 21:51:33 +00:00
Rafael Espindola	449208d95b	Handle the error of trying to convert a regular archive to a thin one. While at it, test that we can add to a thin archive. llvm-svn: 242330	2015-07-15 20:45:56 +00:00
Tobias Edler von Koch	d8ce16b1e6	Analyze recursive PHI nodes in BasicAA Summary: This patch allows phi nodes like %x = phi [ %incptr, ... ] [ %var, ... ] %incptr = getelementptr %x, 1 to be analyzed by BasicAliasAnalysis. In aliasPHI, we can detect incoming values that are recursive GEPs with a constant offset. Instead of trying to analyze a recursive GEP (and failing), we now ignore it and instead set the size of the memory referenced by the PHINode to UnknownSize. This represents all the possible memory locations the pointer represented by the PHINode could be advanced to by the GEP. For now, this new behavior is turned off by default to allow debugging of performance degradations seen with SPEC/x86 and Hexagon benchmarks. The flag -basicaa-recphi turns it on. Reviewers: hfinkel, sanjoy Subscribers: tobiasvk_caf, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D10368 llvm-svn: 242320	2015-07-15 19:32:22 +00:00
Bruno Cardoso Lopes	ad61f34293	Revert "Look through PHIs to find additional register sources" Likely broke compilation on ARM: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/13054 This reverts commit 131ce4a838c081516cbfed039fc986b33e3979d6. llvm-svn: 242310	2015-07-15 18:10:35 +00:00
Adrian Prantl	ee5feafc0f	Debug Info: Add basic support for external types references. This is a necessary prerequisite for bootstrapping the emission of debug info inside modules. - Adds a FlagExternalTypeRef to DICompositeType. External types must have a unique identifier. - External type references are emitted using a forward declaration with a DW_AT_signature([DW_FORM_ref_sig8]) based on the UID. http://reviews.llvm.org/D9612 llvm-svn: 242302	2015-07-15 17:01:41 +00:00
Pete Cooper	21ca199cea	Add missing load/store flags to thumb2 instructions. These were the cause of a verifier error when building 7zip with -verify-machineinstrs. Running 'make check' with the verifier triggered the same error on the test here so i've updated the test to run the verifier on one of its runs instead of adding a new one. While looking at this code, there was a stale comment that these instructions were only used for disassembly. This probably used to be the case, but they are now used in the 'ARM load / store optimization pass' too. llvm-svn: 242300	2015-07-15 16:36:38 +00:00
Bruno Cardoso Lopes	fadd4fef2a	Look through PHIs to find additional register sources - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 242295	2015-07-15 15:35:23 +00:00
Benjamin Kramer	c11fd3e775	[PPC] Disassemble little endian ppc instructions in the right byte order PR24122. The test is simply a byte swapped version of ppc64-encoding.txt. llvm-svn: 242288	2015-07-15 12:56:19 +00:00
Alexey Bataev	b9288601a3	[SDAG] Optimize unordered comparison in soft-float mode (patch by Anton Nadolskiy) Current implementation handles unordered comparison poorly in soft-float mode. Consider (a ULE b) which is a <= b. It is lowered to (ledf2(a, b) <= 0 \|\| unorddf2(a, b) != 0) (in general). We can do better job by lowering it to (__gtdf2(a, b) <= 0). Such replacement is true for other CMP's (ult, ugt, uge). In general, we just call same function as for ordered case but negate comparison against zero. Differential Revision: http://reviews.llvm.org/D10804 llvm-svn: 242280	2015-07-15 08:39:35 +00:00
Hal Finkel	5d36b230b5	[PowerPC] Use the MachineCombiner to reassociate fadd/fmul This is a direct port of the code from the X86 backend (r239486/r240361), which uses the MachineCombiner to reassociate (floating-point) adds/muls to increase ILP, to the PowerPC backend. The rationale is the same. There is a lot of copy-and-paste here between the X86 code and the PowerPC code, and we should extract at least some of this into CodeGen somewhere. However, I don't want to do that until this code is enhanced to handle FMAs as well. After that, we'll be in a better position to extract the common parts. llvm-svn: 242279	2015-07-15 08:23:05 +00:00
Petr Pavlu	097adfb98c	[AArch64] Fix problems in decoding generic MSR instructions Bitpatterns rejected by the decoder method of `MSR (immediate)` should be decoded as the `extended MSR (register)` instruction. Differential Revision: http://reviews.llvm.org/D7174 llvm-svn: 242276	2015-07-15 08:10:30 +00:00
Petr Pavlu	182b05784a	[TableGen] Improve decoding options for non-orthogonal instructions When FixedLenDecoder matches an input bitpattern of form [01]+ with an instruction bitpattern of form [01?]+ (where 0/1 are static bits and ? are mixed/variable bits) it passes the input bitpattern to a specific instruction decoder method which then makes a final decision whether the bitpattern is a valid instruction or not. This means the decoder must handle all possible values of the variable bits which sometimes leads to opcode rewrites in the decoder method when the instructions are not fully orthogonal. The patch provides a way for the decoder method to say that when it returns Fail it does not necessarily mean the bitpattern is invalid, but rather that the bitpattern is definitely not an instruction that is recognized by the decoder method. The decoder can then try to match the input bitpattern with other possible instruction bitpatterns. For example, this allows to solve a situation on AArch64 where the `MSR (immediate)` instruction has form: 1101 0101 0000 0??? 0100 ???? ???1 1111 but not all values of the ? bits are allowed. The rejected values should be handled by the `extended MSR (register)` instruction: 1101 0101 000? ???? ???? ???? ???? ???? The decoder will first try to decode an input bitpattern that matches both bitpatterns as `MSR (immediate)` but currently this puts the decoder method of `MSR (immediate)` into a situation when it must be able to decode all possible values of the ? bits, i.e. it would need to rewrite the instruction to `MSR (register)` when it is not `MSR (immediate)`. The patch allows to specify that the decoder method cannot determine if the instruction is valid for all variable values. The decoder method can simply return Fail when it knows it is definitely not `MSR (immediate)`. The decoder will then backtrack the decoding and find that it can match the input bitpattern with the more generic `MSR (register)` bitpattern too. Differential Revision: http://reviews.llvm.org/D7174 llvm-svn: 242274	2015-07-15 08:04:27 +00:00
Simon Pilgrim	4582ca5ecb	[X86][SSE] Added i686/SSE2 vector shift tests. We were only testing on x86-64, but we should be ensuring decent code gen of i64 shifts on 32-bit targets. llvm-svn: 242273	2015-07-15 08:04:07 +00:00
Igor Breger	096e8b0995	AVX : Fix ISA disabling in case AVX512VL , some instructions should be disabled only if AVX512BW present. Tests added. Differential Revision: http://reviews.llvm.org/D11122 llvm-svn: 242270	2015-07-15 07:08:10 +00:00
Rafael Espindola	e649258272	Initial support for writing thin archives. llvm-svn: 242269	2015-07-15 05:47:46 +00:00
Michael Zolotukhin	1e8e7a8a43	Tidy-up test case from r242257. llvm-svn: 242268	2015-07-15 01:51:51 +00:00
Michael Zolotukhin	31b3eaaf28	[LoopUnrolling] Handle cast instructions. During estimation of unrolling effect we should be able to propagate constants through casts. Differential Revision: http://reviews.llvm.org/D10207 llvm-svn: 242257	2015-07-15 00:19:51 +00:00
JF Bastien	c8f48c19d3	WebAssembly: fix build breakage. Summary: processFunctionBeforeCalleeSavedScan was renamed to determineCalleeSaves and now takes a BitVector parameter as of rL242165, reviewed in http://reviews.llvm.org/D10909 WebAssembly is still marked as experimental and therefore doesn't build by default. It does, however, grep by default! I notice that processFunctionBeforeCalleeSavedScan is still mentioned in a few comments and error messages, which I also fixed. Reviewers: qcolombet, sunfish Subscribers: jfb, dsanders, hfinkel, MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D11199 llvm-svn: 242242	2015-07-14 23:06:07 +00:00
Hal Finkel	4012024fea	[PowerPC] Support symbolic targets in patchpoints Follow-up r235483, with the corresponding support in PPC. We use a regular call for symbolic targets (because they're much cheaper than indirect calls). llvm-svn: 242239	2015-07-14 22:53:11 +00:00
Rafael Espindola	142fc2d1c2	Accept lower case to handle windows error messages. llvm-svn: 242236	2015-07-14 22:42:21 +00:00
David Majnemer	33b6f82e72	[InstCombine] Generalize sub of selects optimization to all BinaryOperators This exposes further optimization opportunities if the selects are correlated. llvm-svn: 242235	2015-07-14 22:39:23 +00:00
Hal Finkel	9bbad03b98	[PowerPC] Use the ABI indirect-call protocol for patchpoints We used to take the address specified as the direct target of the patchpoint and did no TOC-pointer handling. This, however, as not all that useful, because MCJIT tends to create a lot of modules, and they have their own TOC sections. Thus, to call from the generated code to other generated code, you really need to switch TOC pointers. Make this work as expected, and under ELFv1, tread the address as the function descriptor address so that the correct TOC pointer can be loaded. llvm-svn: 242217	2015-07-14 22:26:06 +00:00
Rafael Espindola	4b83cb5390	Add support for reading members out of thin archives. For now the Archive owns the buffers of the thin archive members. This makes for a simple API, but all the buffers are destructed only when the archive is destructed. This should be fine since we close the files after mmap so we should not hit an open file limit. llvm-svn: 242215	2015-07-14 22:18:43 +00:00
Alex Lorenz	9fab370d79	MIR Serialization: Serialize the machine basic block live in registers. llvm-svn: 242204	2015-07-14 21:24:41 +00:00
Tim Northover	d5fdef016d	GVN: tolerate an instruction being replaced without existing in the leaderboard Sometimes an incidentally created instruction can duplicate a Value used elsewhere. It then often doesn't end up in the leader table. If it's later removed, we attempt to remove it from the leader table and segfault. Instead we should just ignore the removal request, which won't cause any problems. The reverse situation, where the original instruction is replaced by the new one (which you might think could leave the leader table empty) cannot occur, because the incidental instruction will never be found in the first place. llvm-svn: 242199	2015-07-14 21:03:18 +00:00
Hal Finkel	8acae5276e	[PowerPC] Fix the PPCInstrInfo::getInstrLatency implementation PowerPC uses itineraries to describe processor pipelines (and dispatch-group restrictions for P7/P8 cores). Unfortunately, the target-independent implementation of TII.getInstrLatency calls ItinData->getStageLatency, and that looks for the largest cycle count in the pipeline for any given instruction. This, however, yields the wrong answer for the PPC itineraries, because we don't encode the full pipeline. Because the functional units are fully pipelined, we only model the initial stages (there are no relevant hazards in the later stages to model), and so the technique employed by getStageLatency does not really work. Instead, we should take the maximum output operand latency, and that's what PPCInstrInfo::getInstrLatency now does. This caused some test-case churn, including two unfortunate side effects. First, the new arrangement of copies we get from function parameters now sometimes blocks VSX FMA mutation (a FIXME has been added to the code and the test cases), and we have one significant test-suite regression: SingleSource/Benchmarks/BenchmarkGame/spectral-norm 56.4185% +/- 18.9398% In this benchmark we have a loop with a vectorized FP divide, and it with the new scheduling both divides end up in the same dispatch group (which in this case seems to cause a problem, although why is not exactly clear). The grouping structure is hard to predict from the bottom of the loop, and there may not be much we can do to fix this. Very few other test-suite performance effects were really significant, but almost all weakly favor this change. However, in light of the issues highlighted above, I've left the old behavior available via a command-line flag. llvm-svn: 242188	2015-07-14 20:02:02 +00:00
Krzysztof Parzyszek	758744706a	[Hexagon] Generate instructions for operations on predicate registers Convert logical operations on general-purpose registers to the correspon- ding operations on predicate registers. llvm-svn: 242186	2015-07-14 19:30:21 +00:00
Keno Fischer	aff703a2ca	[CodeGen] Force emission of personality directive if explicitly specified Summary: Before this change, personality directives were not emitted if there was no invoke left in the function (of course until recently this also meant that we couldn't know what the personality actually was). This patch forces personality directives to still be emitted, unless it is known to be a noop in the absence of invokes, or the user explicitly specified `nounwind` (and not `uwtable`) on the function. Reviewers: majnemer, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D10884 llvm-svn: 242185	2015-07-14 19:22:51 +00:00
Matt Arsenault	24692118ba	AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32) This can be done only with moves which theoretically will optimize better later. Although this transform increases the instruction count, it should be code size / cycle count neutral in the worst VALU case. It also seems to slightly improve a couple of testcases due to other DAG combines this exposes. This is probably slightly worse for the SALU case, so it might be better to handle this during moveToVALU, although then you lose some simplifications like the load width reducing in the simple testcase. llvm-svn: 242177	2015-07-14 18:20:33 +00:00
Matt Arsenault	84db5d97b0	AMDGPU/SI: Fix read2 merging into a super register. If the read2 produced was supposed to be writing into a super register, it would use the wrong subregister indices. Fix this by inserting copies, so we only ever write to a vreg_64. Run the register coalescer again to clean this up, although this isn't ideal and often does result in an extra move. Also remove the assert that offset1 > offset0. There isn't a real reason to not allow this other than a minor convenience in the compiler, and it doesn't seem worth the effort of avoiding it. llvm-svn: 242174	2015-07-14 17:57:36 +00:00
Nemanja Ivanovic	984a3613b3	Add missing builtins to the PPC back end for ABI compliance (vol. 4) This patch corresponds to review: http://reviews.llvm.org/D11183 Back end portion of the fourth round of additions to altivec.h. llvm-svn: 242167	2015-07-14 17:25:20 +00:00
Tim Northover	feabe2e21e	ARM: add at least one real test for r242123. The ones committed were orthogonal to the change and would have passed before that revision. What it did do was prevent an assertion failure when generating object files. llvm-svn: 242166	2015-07-14 17:23:55 +00:00
Matthias Braun	0256486532	PrologEpilogInserter: Rewrite API to determine callee save regsiters. This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan(): - Rename the function to determineCalleeSaves() - Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo. - Without PhysRegUsed the implementation is fine tuned to not save physcial registers which are only read but never modified. Related to rdar://21539507 Differential Revision: http://reviews.llvm.org/D10909 llvm-svn: 242165	2015-07-14 17:17:13 +00:00
Tim Northover	c962d4f28b	AArch64: add rev64 alias for 64-bit rev instruction. It could be useful to assembly programmers and makes the permitted variants a little more uniform. llvm-svn: 242164	2015-07-14 17:07:29 +00:00
Krzysztof Parzyszek	a0ecf07c0b	[Hexagon] Generate "extract" instructions more aggressively Generate extract instructions (via intrinsics) before the DAG combiner folds shifts into unrecognizable forms. llvm-svn: 242163	2015-07-14 17:07:24 +00:00
Rafael Espindola	e549b8c259	llvm-ar: Don't try to extract from thin archives. This matches the gnu ar behavior. llvm-svn: 242162	2015-07-14 16:55:13 +00:00
Rafael Espindola	4ae784396c	Sleep for 2.1 seconds to see if that makes the test stable on windows. Might fix pr24106. llvm-svn: 242158	2015-07-14 16:34:23 +00:00
Rafael Espindola	c3eec458ab	llvm-ar: print an error when the requested member is not found. llvm-svn: 242156	2015-07-14 16:02:40 +00:00
Rafael Espindola	bcb440fb1f	Rename a test. NFC. llvm-svn: 242151	2015-07-14 15:06:18 +00:00
Tom Stellard	e48fe2a27a	AMDGPU/SI: Add support for shrinking v_cndmask_b32_e32 instructions Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11061 llvm-svn: 242146	2015-07-14 14:15:03 +00:00
Daniel Sanders	03f9c019d2	[mips] Fix li/la differences between IAS and GAS. Summary: - Signed 16-bit should have priority over unsigned. - For la, unsigned 16-bit must use ori+addu rather than directly use ori. - Correct tests on 32-bit immediates with 64-bit predicates by sign-extending the immediate beforehand. For example, isInt<16>(0xffff8000) should be true and use addiu. Also split li/la testing into separate files due to their size. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10967 llvm-svn: 242139	2015-07-14 12:24:22 +00:00
David Majnemer	62690b1952	[SROA] Don't de-atomic volatile loads and stores Volatile loads and stores are made visible in global state regardless of what memory is involved. It is not correct to disregard the ordering and synchronization scope because it is possible to synchronize with memory operations performed by hardware. This partially addresses PR23737. llvm-svn: 242126	2015-07-14 06:19:58 +00:00
Yaron Keren	d1ba2d9d8b	Generate correct asm info for mingw and cygwin ARM targets. http://reviews.llvm.org/D11075 Patch by Martell Malone Reviewed by Reid Kleckner llvm-svn: 242123	2015-07-14 05:51:05 +00:00
NAKAMURA Takumi	326c2c4cff	Give an explicit triple to llvm/test/CodeGen/X86/pr13577.ll. llvm-svn: 242111	2015-07-14 03:07:06 +00:00
Matthias Braun	75e668ea6e	Revert "LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization" Accidental commit, needs review first. This reverts commit r242107. llvm-svn: 242108	2015-07-14 02:09:57 +00:00
Matthias Braun	4ac4ecdadf	LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization - Factor out code to query and modify the sign bit of a floatingpoint value as an integer. This also works if none of the targets integer types is big enough to hold all bits of the floatingpoint value. - Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available, otherwise perform bit manipulation on the sign bit. The previous code used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also takes 34 instructions on ARM Cortex-M4. With this patch we only require 5: vldr d0, LCPI0_0 vmov r2, r3, d0 lsrs r2, r3, #31 bfi r1, r2, #31, #1 bx lr (This could be further improved if the compiler would recognize that r2, r3 is zero). - Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is available otherwise perform bit manipulation on the sign bit. - Perform the sign(x) test by masking out the sign bit and comparing with 0 rather than shifting the sign bit to the highest position and testing for "<s 0". For x86 copysignl (on 80bit values) this gets us: testl $32768, %eax rather than: shlq $48, %rax sets %al testb %al, %al llvm-svn: 242107	2015-07-14 02:08:26 +00:00
Matthias Braun	b457ed3312	X86: Check output of x86 copysignl testcase. This makes the changes in an upcoming patch visible. llvm-svn: 242106	2015-07-14 02:08:23 +00:00
Alex Lorenz	418f3ec17d	MIR Serialization: Serialize the variable sized stack objects. llvm-svn: 242095	2015-07-14 00:26:26 +00:00
Reid Kleckner	486fa3977a	Update enforceKnownAlignment after the isWeakForLinker semantic change Previously we would refrain from attempting to increase the linkage of available_externally globals because they were considered weak for the linker. Now they are treated more like a declaration instead of a weak definition. This was causing SSE alignment faults in Chromuim, when some code assumed it could increase the alignment of a dllimported global that it didn't control. http://crbug.com/509256 llvm-svn: 242091	2015-07-14 00:11:08 +00:00
Alex Lorenz	2eacca86ef	MIR Serialization: Serialize the sub register indices. This commit serializes the sub register indices from the register machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242084	2015-07-13 23:24:34 +00:00
Rafael Espindola	8c1a9102e6	Add missing file. Sorry about that. llvm-svn: 242083	2015-07-13 23:14:26 +00:00
Rafael Espindola	c60d0d2a15	Fix reading archive members with / in the name. This is important for thin archives. llvm-svn: 242082	2015-07-13 23:07:05 +00:00
Bill Schmidt	15deb803b4	[PPC64LE] More improvements to VSX swap optimization This patch allows VSX swap optimization to succeed more frequently. Specifically, it is concerned with common code sequences that occur when copying a scalar floating-point value to a vector register. This patch currently handles cases where the floating-point value is already in a register, but does not yet handle loads (such as via an LXSDX scalar floating-point VSX load). That will be dealt with later. A typical case is when a scalar value comes in as a floating-point parameter. The value is copied into a virtual VSFRC register, and then a sequence of SUBREG_TO_REG and/or COPY operations will convert it to a full vector register of the class required by the context. If this vector register is then used as part of a lane-permuted computation, the original scalar value will be in the wrong lane. We can fix this by adding a swap operation following any widening SUBREG_TO_REG operation. Additional COPY operations may be needed around the swap operation in order to keep register assignment happy, but these are pro forma operations that will be removed by coalescing. If a scalar value is otherwise directly referenced in a computation (such as by one of the many XS* vector-scalar operations), we currently disable swap optimization. These operations are lane-sensitive by definition. A MentionsPartialVR flag is added for use in each swap table entry that mentions a scalar floating-point register without having special handling defined. A common idiom for PPC64LE is to convert a double-precision scalar to a vector by performing a splat operation. This ensures that the value can be referenced as V[0], as it would be for big endian, whereas just converting the scalar to a vector with a SUBREG_TO_REG operation leaves this value only in V[1]. A doubleword splat operation is one form of an XXPERMDI instruction, which takes one doubleword from a first operand and another doubleword from a second operand, with a two-bit selector operand indicating which doublewords are chosen. In the general case, an XXPERMDI can be permitted in a lane-swapped region provided that it is properly transformed to select the corresponding swapped values. This transformation is to reverse the order of the two input operands, and to reverse and complement the bits of the selector operand (derivation left as an exercise to the reader ;). A new test case that exercises the scalar-to-vector and generalized XXPERMDI transformations is added as CodeGen/PowerPC/swaps-le-5.ll. The patch also requires a change to CodeGen/PowerPC/swaps-le-3.ll to use CHECK-DAG instead of CHECK for two independent instructions that now appear in reverse order. There are two small unrelated changes that are added with this patch. First, the XXSLDWI instruction was incorrectly omitted from the list of lane-sensitive instructions; this is now fixed. Second, I observed that the same webs were being rejected over and over again for different reasons. Since it's sufficient to reject a web only once, I added a check for this to speed up the compilation time slightly. llvm-svn: 242081	2015-07-13 22:58:19 +00:00
Pete Cooper	1a550c0a87	Remove unnecessary lines from the test in r242068. This test case was breaking the hexagon elf bot. The failing lines were actually unnecessary as checking that the store still reads the correct value demonstrates that everything is working fine now. llvm-svn: 242073	2015-07-13 21:50:35 +00:00
Pete Cooper	90d95edbb4	Loop idiom recognizer was replacing too many uses of popcount. When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop. The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed %tobool.5 = icmp ne i32 %num, 0 store i1 %tobool.5, i1* %ptr br i1 %tobool.5, label %for.body.lr.ph, label %for.end to store i1 %1, i1* %ptr %0 = call i32 @llvm.ctpop.i32(i32 %num) %1 = icmp ne i32 %0, 0 br i1 %1, label %for.body.lr.ph, label %for.end Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect. The store doesn’t want the result of ctpop. The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses. Reviewed by Hal Finkel. llvm-svn: 242068	2015-07-13 21:25:33 +00:00
Reid Kleckner	9a1a919465	[WinEH] Emit the LSDA even if no lpads remain but outlining occurred The outlined funclets call intrinsics which reference labels from the LSDA. This situation can easily arise in small functions with a single cleanup at -O0, where Clang marks a definition as nounwind, and then WinEHPrepare "discovers" that the landingpad is dead by accident and deletes it. We now need to ask the LLVM IR Function for it's personality directly, rather than going through MachineModuleInfo. Fixes PR23892. llvm-svn: 242063	2015-07-13 20:41:46 +00:00
Rafael Espindola	6a8e86f26e	Add support deterministic output in llvm-ar and make it the default. llvm-svn: 242061	2015-07-13 20:38:09 +00:00
David Majnemer	1305e2c0f5	[MC] Correctly escape .safeseh's symbol This fixes PR24107. llvm-svn: 242050	2015-07-13 18:51:15 +00:00
Mark Heffernan	d7ebc24112	Enable runtime unrolling with unroll pragma metadata Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N") and a runtime trip count. Also, do not unroll loops with unroll full metadata if the loop has a runtime loop count. Previously, such loops would be unrolled with a very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be enabled resulting in a very large (and likely unwise) unroll factor. llvm-svn: 242047	2015-07-13 18:26:27 +00:00
Alex Lorenz	de491f0515	MIR Serialization: Serialize the fixed stack objects. This commit serializes the fixed stack objects, including fixed spill slots. The fixed stack objects are serialized using a YAML sequence of YAML inline mappings. Each mapping has the object's ID, type, size, offset, and alignment. The objects that aren't spill slots also serialize the isImmutable and isAliased flags. The fixed stack objects are a part of the machine function's YAML mapping. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242045	2015-07-13 18:07:26 +00:00
Reid Kleckner	5f4dd92209	[WinEH] Strip the \01 character from the __CxxFrameHandler3 thunk name Add another C++ 32-bit EH table test. llvm-svn: 242044	2015-07-13 17:55:14 +00:00
James Y Knight	46f91c8457	Fix handling of the 'n' asm constraint with invalid operands. It had accidently accepted a symbol+offset value (and emitted incorrect code for it, keeping only the offset part) instead of properly reporting the constraint as invalid. Differential Revision: http://reviews.llvm.org/D11039 llvm-svn: 242040	2015-07-13 16:36:22 +00:00
Tom Stellard	db5a11f698	AMDGPU/SI: Select mad patterns to v_mac_f32 The two-address instruction pass will convert these back to v_mad_f32 if necessary. Differential Revision: http://reviews.llvm.org/D11060 llvm-svn: 242038	2015-07-13 15:47:57 +00:00
Logan Chien	0a43abc9f8	ARM: Fix cttz expansion on vector types. The 64/128-bit vector types are legal if NEON instructions are available. However, there was no matching patterns for @llvm.cttz.*() intrinsics and result in fatal error. This commit fixes the problem by lowering cttz to: a. ctpop((x & -x) - 1) b. width - ctlz(x & -x) - 1 llvm-svn: 242037	2015-07-13 15:37:30 +00:00
Scott Douglass	69bf1ce03a	[ARM] Handle commutativity when converting to tADDhirr in Thumb2 Also, run thumb_rewrite.s tests in Thumb2 now that they pass. Differential Revision: http://reviews.llvm.org/D11132 llvm-svn: 242036	2015-07-13 15:31:48 +00:00
Scott Douglass	d9d8d26458	[ARM] Add Thumb2 ADD with SP narrowing from 3 operand to 2 Differential Revision: http://reviews.llvm.org/D11131 llvm-svn: 242035	2015-07-13 15:31:40 +00:00
Scott Douglass	039f768c42	[ARM] Small refactor of tryConvertingToTwoOperandForm (nfc) Also, add more Thumb2 ADD tests requested during review of http://reviews.llvm.org/D11053. Differential Revision: http://reviews.llvm.org/D11130 llvm-svn: 242034	2015-07-13 15:31:33 +00:00
Silviu Baranga	a647c30f88	Cleanup after r241809 - remove uncessary call to std::sort Summary: The iteration order within a member of DepCands is deterministic and therefore we don't have to sort the accesses within a member. We also don't have to copy the indices of the pointers into a vector, since we can iterate over the members of the class. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11145 llvm-svn: 242033	2015-07-13 14:48:24 +00:00
Rafael Espindola	237c3a6def	Don't change the visibility when converting a definition to a declaration. llvm-svn: 242030	2015-07-13 14:18:22 +00:00
Rafael Espindola	7068cbbc1a	Print the visibility of available_externally functions. We were already printing it for declarations, but not available_externally. llvm-svn: 242027	2015-07-13 13:55:18 +00:00
Manuel Klimek	779cf85a4f	Revert r241981 "Revert "Revert r236894 "[BasicAA] Fix zext & sext handling""" The repros from PR23626 still fail. llvm-svn: 242025	2015-07-13 13:50:55 +00:00
Elena Demikhovsky	0f370936a0	AVX-512: Added all AVX-512 forms of Vector Convert for Float/Double/Int/Long types. In this patch I have only encoding. Intrinsics and DAG lowering will be in the next patch. I temporary removed the old intrinsics test (just to split this patch). Half types are not covered here. Differential Revision: http://reviews.llvm.org/D11134 llvm-svn: 242023	2015-07-13 13:26:20 +00:00
Jingyue Wu	9a92d4fb04	[LSR] don't attempt to promote ephemeral values to indvars Summary: This at least saves compile time. I also encountered a case where ephemeral values affect whether other variables are promoted, causing performance issues. It may be a bug in LSR, but I didn't manage to reduce it yet. Anyhow, I believe it's in general not worth considering ephemeral values in LSR. Reviewers: atrick, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11115 llvm-svn: 242011	2015-07-13 03:28:53 +00:00
David Majnemer	599ca4426c	[InstSimplify] Teach InstSimplify how to simplify extractelement llvm-svn: 242008	2015-07-13 01:15:53 +00:00
David Majnemer	25a796e148	[InstSimplify] Teach InstSimplify how to simplify extractvalue llvm-svn: 242007	2015-07-13 01:15:46 +00:00
Renato Golin	1ef7a0f7c0	[ARM] Add support for nest attribute using r12 Register r12 ('ip') is used by GCC for this purpose and hence is used here. As discussed on the GCC mailing list, the register choice is an ABI issue and so choosing the same register as GCC means __builtin_call_with_static_chain is compatible. A similar patch has just gone in the AArch64 backend, so this is just the ARM counterpart, following the same discussion. Patch by Stephen Cross. llvm-svn: 241996	2015-07-12 18:16:40 +00:00
Simon Pilgrim	268ef6af0b	[X86][SSE] Tidied up vector extend/truncation tests. NFCI. llvm-svn: 241995	2015-07-12 17:40:49 +00:00
Simon Pilgrim	64cc4ad0a2	[X86][SSE] Vectorized v4i32 non-uniform shifts. While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr opeations are still scalarized. This patch adds vectorization support for non-uniform v4i32 shift operations - it splats constant shift amounts to allow them to use the immediate sse shift instructions, or extracts/zero-extends non-constant shift amounts. The individual results are then blended together. Differential Revision: http://reviews.llvm.org/D11063 llvm-svn: 241989	2015-07-12 11:15:19 +00:00
David Majnemer	6bc83e0f43	[LICM] Don't try to sink values out of loops without any exits There is no suitable basic block to sink instructions in loops without exits. The only way an instruction in a loop without exits can be used is as an incoming value to a PHI. In such cases, the incoming block for the corresponding value is unreachable. This fixes PR24013. Differential Revision: http://reviews.llvm.org/D10903 llvm-svn: 241987	2015-07-12 03:53:05 +00:00
Hal Finkel	cbf08925ef	[PowerPC] Make use of the TargetRecip system r238842 added the TargetRecip system for controlling use of reciprocal estimates for sqrt and division using a set of parameters that can be set by the frontend. Clang now supports a sophisticated -mrecip option, and this will allow that option to effectively control the relevant code-generation functionality of the PPC backend. llvm-svn: 241985	2015-07-12 02:33:57 +00:00
Hal Finkel	965cea5670	[PowerPC] Support the nest parameter attribute This adds support for the 'nest' attribute, which allows the static chain register to be set for functions calls under non-Darwin PPC/PPC64 targets. r11 is the chain register (which the PPC64 ELF ABI calls the "environment pointer"). For indirect calls under PPC64 ELFv1, this would normally be loaded from the function descriptor, but providing an explicit 'nest' parameter will override that process and use the value provided. This allows __builtin_call_with_static_chain to work as expected on PowerPC. llvm-svn: 241984	2015-07-12 00:37:44 +00:00
Hal Finkel	ef28aad9f4	Revert "Revert r236894 "[BasicAA] Fix zext & sext handling"" r236894 caused PR23626 (Clang miscompiles webkit's base64 decoder), and was reverted in r237984. This reapplies the patch with an additional test case for PR23626 and the associated fix (both scales and offsets in the BasicAliasAnalysis::constantOffsetHeuristic should initially be zero). Patch by Nick White, thanks! llvm-svn: 241981	2015-07-11 11:04:54 +00:00
Igor Laevsky	39d662f7ba	Add argmemonly attribute. This change adds new attribute called "argmemonly". Function marked with this attribute can only access memory through it's argument pointers. This attribute directly corresponds to the "OnlyAccessesArgumentPointees" ModRef behaviour in alias analysis. Differential Revision: http://reviews.llvm.org/D10398 llvm-svn: 241979	2015-07-11 10:30:36 +00:00
Tyler Nowicki	3960d85262	Renamed some uses of unroll to interleave in the vectorizer. llvm-svn: 241971	2015-07-11 00:31:11 +00:00
Alex Lorenz	53464510cc	MIR Serialization: Serialize the virtual register operands. Reviewers: Duncan P. N. Exon Smith Differential Revision: http://reviews.llvm.org/D11005 llvm-svn: 241959	2015-07-10 22:51:20 +00:00
Bjorn Steinbrink	a6b929dfe2	[InstCombine] Actually combine AA metadata when replacing one load with another Fixes PR24083 llvm-svn: 241955	2015-07-10 22:30:17 +00:00
Reid Kleckner	7ea7708d92	[SEH] Push reloads of the SEH code past phi nodes This in turn would sometimes introduce new cleanupblocks that didn't previously exist. The uses were being introduced by SSA value demotion. We actually want to promote uses of EH pointers and selectors, so I added some spcecial casing to avoid demoting such instructions. This is getting overly complicated, but hopefully we'll come along and delete it in the new representation. llvm-svn: 241950	2015-07-10 22:21:54 +00:00
Matt Arsenault	f54dc2384d	DAGCombiner: Assume invariant load cannot alias a store The motivation is to allow GatherAllAliases / FindBetterChain to not give up on dependent loads of a pointer from constant memory. This is important for AMDGPU, because most loads are pointers derived from a load of a kernel argument from constant memory. llvm-svn: 241948	2015-07-10 22:17:40 +00:00
Quentin Colombet	8b984d19f2	[ShrinkWrap][PEI] Do not insert epilogue for unreachable blocks. Although this is not incorrect to insert such code, it is useless and it hurts the binary size. llvm-svn: 241946	2015-07-10 22:09:55 +00:00
Evgeniy Stepanov	00b3020453	Fix AArch64 prologue for empty frame with dynamic allocas. Fixes PR23804: assertion failure in emitPrologue in the case of a function with an empty frame and a dynamic alloca that needs stack realignment. This is a typical case for AddressSanitizer. llvm-svn: 241943	2015-07-10 21:24:07 +00:00
Michael J. Spencer	3569a84598	[Object][ELF] Handle the dynamic string table in files without a section table. llvm-svn: 241937	2015-07-10 20:11:57 +00:00

... 12 13 14 15 16 ...

32219 Commits