llvm-project

Commit Graph

Author	SHA1	Message	Date
Peter Collingbourne	a0f371a106	Bitcode: Add a string table to the bitcode format. Add a top-level STRTAB block containing a string table blob, and start storing strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in the string table. This change allows us to share names between globals and comdats as well as between modules, and improves the efficiency of loading bitcode files by no longer using a bit encoding for symbol names. Once we start writing the irsymtab to the bitcode file we will also be able to share strings between it and the module. On my machine, link time for Chromium for Linux with ThinLTO decreases by about 7% for no-op incremental builds or about 1% for full builds. Total bitcode file size decreases by about 3%. As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html Differential Revision: https://reviews.llvm.org/D31838 llvm-svn: 300464	2017-04-17 17:51:36 +00:00
Tim Northover	879a0b2e1b	AArch64: support nonlazybind It's almost certainly not a good idea to actually use it in most cases (there's a pretty large code size overhead on AArch64), but we can't do those experiments until it's supported. llvm-svn: 300462	2017-04-17 17:27:56 +00:00
Matt Arsenault	7205f3c2e4	AMDGPU: SimplifyDemandedElts for image intrinsics Causes some VGPR usage improvements in shaderdb, but introduces some SGPR spilling regressions due to random scheduling changes later. llvm-svn: 300453	2017-04-17 15:12:44 +00:00
Max Kazantsev	751579cac0	[LoopPeeling] Get rid of Phis that become invariant after N steps This patch is a generalization of the improvement introduced in rL296898. Previously, we were able to peel one iteration of a loop to get rid of a Phi that becomes an invariant on the 2nd iteration. In the more general case, if a Phi becomes invariant after N iterations, we can peel N times and turn it into an invariant. To do this, for every Phi in the loop's header we define the Invariant Depth value, which is calculated as follows: Given %x = phi <Inputs from above the loop>, ..., [%y, %back.edge]. If %y is a loop invariant, then Depth(%x) = 1. If %y is a Phi from the loop header, Depth(%x) = Depth(%y) + 1. Otherwise, Depth(%x) is infinite. Notice that if we peel a loop, all Phis with Depth = 1 become invariants, and the depth of every other Phi with finite depth decreases by 1. Thus, peeling the first N iterations allows us to turn all Phis with Depth <= N into invariants. Reviewers: reames, apilipenko, mkuper, skatkov, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31613 llvm-svn: 300446	2017-04-17 09:52:02 +00:00
Max Kazantsev	8ed6b66d85	[LoopPeeling] Fix condition for phi-eliminating peeling When peeling loops based on phis becoming invariants, we make a wrong loop size check. UP.Threshold should be compared against the total number of instructions after the transformation, which is equal to 2 * LoopSize in case of peeling one iteration. We should also check that the maximum allowed number of peeled iterations is not zero. Reviewers: sanjoy, anna, reames, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31753 llvm-svn: 300441	2017-04-17 05:38:28 +00:00
Serguei Katkov	2616bbb16d	[BPI] Use metadata info before any other heuristics Metadata is potentially more precise than any heuristic we use, so it makes sense to use metadata info first if it is available. However, it makes sense to check it against other strong heuristics such as the unreachable one. If an edge coming to an unreachable block has a higher probability than is expected by the unreachable heuristic, then we use the heuristic, and the remaining probability is distributed equally among the other reachable blocks. An example where metadata might be stronger than the unreachable heuristic is as follows: it is possible that there are two branches, and for branch A metadata says that its probability is (0, 2^25). For branch B the probability is (1, 2^25). So the expectation is that the first edge of B is hotter than the first edge of A, because the first edge of A was not executed even once. If the first edge of A points to the unreachable block, then using the unreachable heuristic we'll set the probability for A to (1, 2^20), and now the edge of A becomes hotter than the edge of B. This is unexpected behavior. This fixes the biggest part of https://bugs.llvm.org/show_bug.cgi?id=32214 Reviewers: sanjoy, junbuml, vsk, chandlerc Reviewed By: chandlerc Subscribers: llvm-commits, reames, davidxl Differential Revision: https://reviews.llvm.org/D30631 llvm-svn: 300440	2017-04-17 04:33:04 +00:00
Craig Topper	218a359fbd	[InstCombine] Simplify 1/X for vectors. llvm-svn: 300439	2017-04-17 03:41:47 +00:00
Craig Topper	eee53c030a	[InstCombine] Add test cases for missing support for simplifying 1/X for vectors. NFC llvm-svn: 300438	2017-04-17 03:41:44 +00:00
Craig Topper	1a18a7c51e	[InstCombine] Add support for vector srem->urem. llvm-svn: 300437	2017-04-17 01:51:24 +00:00
Craig Topper	b60f300afb	[InstCombine] Add missing testcases for srem->urem conversion. The vector version isn't currently supported. NFC llvm-svn: 300436	2017-04-17 01:51:21 +00:00
Craig Topper	f248468359	[InstCombine] Add support for turning vector sdiv into udiv. llvm-svn: 300435	2017-04-17 01:51:19 +00:00
Craig Topper	43b012b1b3	[InstCombine] Add test cases for missing support for turning vector sdiv into udiv. NFC llvm-svn: 300434	2017-04-17 01:51:16 +00:00
Benjamin Kramer	f5f593b674	[X86] Remove special handling for 16 bit for A asm constraints. Our 16 bit support is assembler-only + the terrible hack that is .code16gcc. Simply using 32 bit registers does the right thing for the latter. Fixes PR32681. llvm-svn: 300429	2017-04-16 20:13:08 +00:00
Michael Zuckerman	16b20d2fc5	[X86][X86 intrinsics] Folding cmp(sub(a,b),0) into cmp(a,b) optimization This patch adds a new optimization (folding cmp(sub(a,b),0) into cmp(a,b)) to the instCombineCall pass and was written specifically for X86 CMP intrinsics. Differential Revision: https://reviews.llvm.org/D31398 llvm-svn: 300422	2017-04-16 13:26:08 +00:00
Dimitry Andric	909b3376ba	Use correct registers for "A" inline asm constraint Summary: In PR32594, inline assembly using the 'A' constraint on x86_64 causes llvm to crash with a "Cannot select" stack trace. This is because `X86TargetLowering::getRegForInlineAsmConstraint` hardcodes that 'A' means the EAX and EDX registers. However, on x86_64 it means the RAX and RDX registers, and on 16-bit x86 (ia16?) it means the old AX and DX registers. Add new register classes in `X86RegisterInfo.td` to support these cases, and amend the logic in `getRegForInlineAsmConstraint` to cope with different subtargets. Also add a test case, derived from PR32594. Reviewers: craig.topper, qcolombet, RKSimon, ab Reviewed By: ab Subscribers: ab, emaste, royger, llvm-commits Differential Revision: https://reviews.llvm.org/D31902 llvm-svn: 300404	2017-04-15 22:15:01 +00:00
Sanjay Patel	ef9f586bb2	[InstCombine] allow (X != C1 && X != C2) and similar patterns to match splat vector constants llvm-svn: 300402	2017-04-15 17:55:06 +00:00
Sanjay Patel	c8405b82a1	[InstCombine] add tests to show missing transforms for vectors; NFC llvm-svn: 300401	2017-04-15 17:50:45 +00:00
Sam Clegg	135a4b8ea1	[WebAssembly] Improve readobj and nm support for wasm Now that the libObject support for wasm is better we can have readobj and nm produce more useful output too. Differential Revision: https://reviews.llvm.org/D31514 llvm-svn: 300365	2017-04-14 19:50:44 +00:00
Sanjay Patel	7cfe41659c	[InstCombine] (X != C1 && X != C2) --> (X | (C1 ^ C2)) != C2 ...when C1 differs from C2 by one bit and C1 <u C2: http://rise4fun.com/Alive/Vuo And move related folds to a helper function. This reduces code duplication and will make it easier to remove the scalar-only restriction as a follow-up step. llvm-svn: 300364	2017-04-14 19:23:50 +00:00
Craig Topper	fb71b7d3e0	[InstCombine] Support folding a subtract with a constant LHS into a phi node We currently only support folding a subtract into a select but not a PHI. This fixes that. I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This code is similar to what we do for selects. Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards. Differential Revision: https://reviews.llvm.org/D31686 llvm-svn: 300363	2017-04-14 19:20:12 +00:00
Stanislav Mekhanoshin	eff0bc7839	[AMDGPU] set read_only access qualifier for pointers If a kernel's pointer argument is known to be readonly set access qualifier accordingly. This allows RT not to flush caches before dispatches. Differential Revision: https://reviews.llvm.org/D32091 llvm-svn: 300362	2017-04-14 19:11:40 +00:00
Sam Clegg	dd2d7bf100	[Test commit] Cleanup some whitespace in a test file llvm-svn: 300361	2017-04-14 18:43:57 +00:00
Craig Topper	d61ccd735e	[InstCombine] Regenerate test checks using script. NFC llvm-svn: 300360	2017-04-14 18:42:55 +00:00
Sanjay Patel	9d39a9d860	[InstCombine] add/move tests for and/or-of-icmps equality folds; NFC llvm-svn: 300357	2017-04-14 18:19:27 +00:00
Alexey Bataev	9c27d79520	Update tests for the patch. llvm-svn: 300351	2017-04-14 17:47:07 +00:00
Simon Pilgrim	5a22eaa2bf	[X86][SSE] Update MOVNTDQA non-temporal loads to generic implementation (LLVM) MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics. Clang companion patch: D31766. Differential Revision: https://reviews.llvm.org/D31767 llvm-svn: 300325	2017-04-14 15:05:35 +00:00
Dmitry Preobrazhensky	e6ef099dcd	[AMDGPU][MC] Corrected ds_write_src2_* to require one offset instead of two. Fixed bug 32551: https://bugs.llvm.org//show_bug.cgi?id=32551 Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31809 llvm-svn: 300319	2017-04-14 12:28:07 +00:00
Dmitry Preobrazhensky	5714860ee4	[AMDGPU][MC] Enabled constants for src operands of s_cbranch_g_fork Fixed bug 32619: https://bugs.llvm.org//show_bug.cgi?id=32619 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D31973 llvm-svn: 300318	2017-04-14 11:52:26 +00:00
Andrew V. Tischenko	4e7bcd5216	Fix for PR#30562: Selection DAG error: Detected cycle in SelectionDAG. Patch by Dinar Temirbulatov llvm-svn: 300314	2017-04-14 09:17:09 +00:00
Andrew V. Tischenko	75745d0c3e	This patch closes PR#32216: Better testing of schedule model instruction latencies/throughputs. The details are here: https://reviews.llvm.org/D30941 llvm-svn: 300311	2017-04-14 07:44:23 +00:00
Peter Collingbourne	8446f1fe6a	Object, LTO: Add target triple to irsymtab and LTO API. Start using it in LLD to avoid needing to read bitcode again just to get the target triple, and in llvm-lto2 to avoid printing symbol table information that is inappropriate for the target. Differential Revision: https://reviews.llvm.org/D32038 llvm-svn: 300300	2017-04-14 02:55:06 +00:00
Daniel Berlin	2f72b19b05	NewGVN: Don't propagate over phi backedges where undef causes us to have >1 value, unless we can prove the phi node is cycle free. Fixes PR 32607. llvm-svn: 300299	2017-04-14 02:53:37 +00:00
Stanislav Mekhanoshin	86b0a5465b	[AMDGPU] added SIInstrInfo::getAddNoCarry() helper Addressed rest of post submit comments from D31993. Differential Revision: https://reviews.llvm.org/D32057 llvm-svn: 300288	2017-04-14 00:33:44 +00:00
Xinliang David Li	57dea2d359	[Profile] PE binary coverage bug fix PR/32584 Differential Revision: https://reviews.llvm.org/D32023 llvm-svn: 300277	2017-04-13 23:37:12 +00:00
Adam Nemet	c5779460f4	[AArch64] Avoid partial register writes on lane 0 of BUILD_VECTOR for i8/i16/f16 This further improves Ahmed's change in rL299482. See the new comment for the rationale. The patch recovers most of the regression for bzip2 after D31965. We're down to +2.68% from +6.97%. Differential Revision: https://reviews.llvm.org/D32028 llvm-svn: 300276	2017-04-13 23:32:47 +00:00
Konstantin Zhuravlyov	d24aeb20fc	AMDGPU/GFX9: Do not use v_pack_b32_f16 when packing Differential Revision: https://reviews.llvm.org/D31819 llvm-svn: 300275	2017-04-13 23:17:00 +00:00
Alexei Starovoitov	56db145164	[bpf] Fix memory offset check for loads and stores If the offset cannot fit into the instruction, an addition to the pointer is emitted before the actual access. However, BPF offsets are 16-bit, but for the purposes of this check LLVM treats them as 32-bit. This causes the following program: int bpf_prog1(void *ign) { volatile unsigned long t = 0x8983984739ull; return *(unsigned long *)((0xffffffff8fff0002ull) + t); } To generate the following (wrong) code: 0: 18 01 00 00 39 47 98 83 00 00 00 00 89 00 00 00 r1 = 590618314553ll 2: 7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1 3: 79 a1 f8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 8) 4: 79 10 02 00 00 00 00 00 r0 = *(u64 *)(r1 + 2) 5: 95 00 00 00 00 00 00 00 exit Fix it by changing the offset check to 16-bit. Patch by Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Differential Revision: https://reviews.llvm.org/D32055 llvm-svn: 300269	2017-04-13 22:24:13 +00:00
Zachary Turner	4dc4f01a86	[llvm-pdbdump] Recursively dump class layout. llvm-svn: 300258	2017-04-13 21:11:00 +00:00
Richard Smith	6c2615177b	Revert accidentally-committed files in r300252. llvm-svn: 300253	2017-04-13 20:31:21 +00:00
Richard Smith	55bd375b69	Remove all allocation and divisions from GreatestCommonDivisor Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place. Differential Revision: https://reviews.llvm.org/D31968 llvm-svn: 300252	2017-04-13 20:29:59 +00:00
Reid Kleckner	257cb4e099	[InstCombine] Fix !prof metadata preservation for invokes Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251	2017-04-13 20:26:38 +00:00
Dehao Chen	2c7ca9b5df	SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and its inlined instance. This patch adds callee_name to the key of the callsite sample map. And added helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote correct targets and inline it before annotation, and ensures all indirect call targets to be annotated correctly. Reviewers: davidxl, dnovillo Reviewed By: davidxl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D31950 llvm-svn: 300240	2017-04-13 19:52:10 +00:00
Anna Thomas	dcdb325fee	[LV] Fix the vector code generation for first order recurrence Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396. Reviewers: mssimpso, mkuper, anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31979 llvm-svn: 300238	2017-04-13 18:59:25 +00:00
Sanjay Patel	445d03bf00	[InstCombine] fold X == 0 || X == -1 to one compare (PR32524) This is effectively a retry of: https://reviews.llvm.org/rL299851 but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again. I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time. This should fix: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 300236	2017-04-13 18:47:06 +00:00
Reid Kleckner	3a1150352d	[ArgPromotion] Don't drop !prof metadata on promoted calls Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them. Add a test case for this. llvm-svn: 300229	2017-04-13 18:10:30 +00:00
Stanislav Mekhanoshin	d026f79bd3	[AMDGPU] Combine DS operations with offsets bigger than byte In many cases ds operations can be combined even if offsets do not fit into 8 bit encoding. What it takes is to adjust base address. Differential Revision: https://reviews.llvm.org/D31993 llvm-svn: 300227	2017-04-13 17:53:07 +00:00
Brian Gesiak	0a7894d99c	[Analysis] Support bitreverse in -demanded-bits pass Summary: * Add a bitreverse case in the demanded bits analysis pass. * Add tests for the bitreverse (and bswap) intrinsic in the demanded bits pass. * Add a test case to the BDCE tests: that manipulations to high-order bits are eliminated once the bits are reversed and then right-shifted. Reviewers: mkuper, jmolloy, hfinkel, trentxintong Reviewed By: jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31857 llvm-svn: 300215	2017-04-13 16:44:25 +00:00
Tobias Edler von Koch	90df1f48d5	LTO: Pass SF_Executable flag through to InputFile::Symbol Summary: The linker needs to be able to determine whether a symbol is text or data to handle the case of a common being overridden by a strong definition in an archive. If the archive contains a text member of the same name as the common, that function is discarded. However, if the archive contains a data member of the same name, that strong definition overrides the common. This is a behavior of ld.bfd, which the Qualcomm linker also supports in LTO. Here's a test case to illustrate: #### cat > 1.c << \! int blah; ! cat > 2.c << \! int blah() { return 0; } ! cat > 3.c << \! int blah = 20; ! clang -c 1.c clang -c 2.c clang -c 3.c ar cr lib.a 2.o 3.o ld 1.o lib.a -t #### The correct output is: 1.o (lib.a)3.o Thanks to Shankar Easwaran and Hemant Kulkarni for the test case! Reviewers: mehdi_amini, rafael, pcc, davide Reviewed By: pcc Subscribers: davide, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D31901 llvm-svn: 300205	2017-04-13 16:24:14 +00:00
Krzysztof Parzyszek	c87c48019c	[Hexagon] Unxfail passing tests r300198 fixed a problem that caused two tests to be xfailed. Unxfail these tests now, since they are passing. llvm-svn: 300203	2017-04-13 16:05:35 +00:00
Sanjay Patel	104e36a0e9	[InstCombine] add/move tests for or-of-icmps; NFC If we had these tests, the bug caused by https://reviews.llvm.org/rL299851 would have been caught sooner. There's also an assert in the code that should have caught that bug, but the assert line itself has a bug. llvm-svn: 300201	2017-04-13 15:46:39 +00:00
Geoff Berry	85a530fb59	Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline." This reverts commit r296872 now that PR32153 has been fixed. llvm-svn: 300200	2017-04-13 15:36:25 +00:00
NAKAMURA Takumi	9763a3249f	llvm/test/BugPoint/compile-custom.ll: Use %/s for its path not to be mis-escaped. llvm-svn: 300193	2017-04-13 11:40:32 +00:00
Craig Topper	e70dffeb54	[InstCombine] Add vector version of a test to show missing optimization. llvm-svn: 300161	2017-04-13 01:31:40 +00:00
Wei Ding	74da350b85	AMDGPU : Fix common dominator of two incoming blocks terminates with uniform branch issue. Differential Revision: http://reviews.llvm.org/D31350 llvm-svn: 300142	2017-04-12 23:51:47 +00:00
Zachary Turner	9e7dda3c6d	[llvm-pdbdump] Minor preparatory refactor of Class Def Dumper. In a followup patch I intend to introduce an additional dumping mode which dumps a graphical representation of a class's layout. In preparation for this, the text-based layout printer needs to be split out from the graphical layout printer, and both need to be able to use the same code for printing the intro and outro of a class's definition (e.g. base class list, etc). This patch does so, and in the process introduces a skeleton definition for the graphical printer, while currently making the graphical printer just print nothing. NFC llvm-svn: 300134	2017-04-12 23:18:51 +00:00
Zachary Turner	c883a8c6dc	[llvm-pdbdump] More advanced class definition dumping. Previously the dumping of class definitions was very primitive, and it made it hard to do more than the most trivial of output formats when dumping. As such, we would only dump one line for each field, and then dump non-layout items like nested types and enums. With this patch, we do a complete analysis of the object hierarchy including aggregate types, bases, virtual bases, vftable analysis, etc. The only immediately visible effects of this are that a) we can now dump a line for the vfptr where before we would treat that as padding, and b) we now don't treat virtual bases that come at the end of a class as padding since we have a more detailed analysis of the class's storage usage. In subsequent patches, we should be able to use this analysis to display a complete graphical view of a class's layout including recursing arbitrarily deep into an object's base class / aggregate member hierarchy. llvm-svn: 300133	2017-04-12 23:18:21 +00:00
Matt Arsenault	0d0d6c2f25	AMDGPU: Fix invalid copies when copying i1 to phys reg Insert a VReg_1 virtual register so the i1 workaround pass can handle it. llvm-svn: 300113	2017-04-12 21:58:23 +00:00
Stanislav Mekhanoshin	c90347d760	[AMDGPU] Generate range metadata for workitem id If workgroup size is known inform llvm about range returned by local id and local size queries. Differential Revision: https://reviews.llvm.org/D31804 llvm-svn: 300102	2017-04-12 20:48:56 +00:00
Piotr Padlewski	04aee46779	Remove readnone from invariant.group.barrier Summary: Readnone attribute would cause CSE of two barriers with the same argument, which is invalid by example: struct Base { virtual int foo() { return 42; } }; struct Derived1 : Base { int foo() override { return 50; } }; struct Derived2 : Base { int foo() override { return 100; } }; void foo() { Base *x = new Base{}; new (x) Derived1{}; int a = std::launder(x)->foo(); new (x) Derived2{}; int b = std::launder(x)->foo(); } Here 2 calls of std::launder will produce @llvm.invariant.group.barrier, which would be merged into one call, causing devirtualization to devirtualize second call into Derived1::foo() instead of Derived2::foo() Reviewers: chandlerc, dberlin, hfinkel Subscribers: llvm-commits, rsmith, amharc Differential Revision: https://reviews.llvm.org/D31531 llvm-svn: 300101	2017-04-12 20:45:12 +00:00
Sanjay Patel	6e41018942	[InstCombine] fix wrong undef handling when converting select to shuffle As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32486 ...the canonicalization of vector select to shufflevector does not hold up when undef elements are present in the condition vector. Try to make the undef handling clear in the code and the LangRef. Differential Revision: https://reviews.llvm.org/D31980 llvm-svn: 300092	2017-04-12 18:39:53 +00:00
Peter Collingbourne	94baec6ee8	llvm-lto2: Add a dump-symtab subcommand. This allows us to test the symbol table APIs for LTO input files. Differential Revision: https://reviews.llvm.org/D31920 llvm-svn: 300086	2017-04-12 18:27:00 +00:00
Craig Topper	9a51c7f343	[InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant. Currently if we reach an instruction with multiples uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user. This might then reduce the instruction to having a single use and allow additional optimizations on the other path. This picks up an additional case that r300075 didn't catch. Differential Revision: https://reviews.llvm.org/D31552 llvm-svn: 300084	2017-04-12 18:17:46 +00:00
Renato Golin	af3bc2089e	[SystemZ] Fix more target specific tests llvm-svn: 300081	2017-04-12 18:03:09 +00:00
Renato Golin	ab85113a93	[SystemZ] Fix target specific tests llvm-svn: 300078	2017-04-12 17:14:46 +00:00
Dmitry Preobrazhensky	14104e0d0f	[AMDGPU][MC] Added support for several VI-specific opcodes (s_wakeup, etc) Added support for VI: - s_endpgm_saved - s_wakeup - s_rfe_restore_b64 - v_perm_b32 Enabled for VI: - v_mov_fed_b32 - v_mov_fed_b32_e64 See bug 32593: https://bugs.llvm.org//show_bug.cgi?id=32593 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D31931 llvm-svn: 300076	2017-04-12 17:10:07 +00:00
Craig Topper	845033a6c9	Teach SimplifyDemandedUseBits that adding or subtracting 0s from every bit below the highest demanded bit can be simplified If we are adding/subtracting 0s below the highest demanded bit we can just use the other operand and remove the operation. My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove the add and a subsequent iteration to be run. With this change we bypass the add in the first iteration and prevent the second iteration from changing anything. Differential Revision: https://reviews.llvm.org/D31120 llvm-svn: 300075	2017-04-12 16:49:59 +00:00
Dmitry Preobrazhensky	5ac9fd64a3	[AMDGPU][MC] Corrected parsing of v_cmp_class* and v_cmpx_class* Fixed bug 32565: https://bugs.llvm.org//show_bug.cgi?id=32565 Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31820 llvm-svn: 300073	2017-04-12 16:31:18 +00:00
Dmitry Preobrazhensky	3bff0c8c59	[AMDGPU][MC] Corrected encoding of V_MQSAD_U32_U8 for CI Corrected encoding of V_MQSAD_U32_U8 for CI See bug 32552: https://bugs.llvm.org//show_bug.cgi?id=32552 Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31810 llvm-svn: 300070	2017-04-12 15:36:09 +00:00
Sanjay Patel	33439f982b	[InstCombine] morph an existing instruction instead of creating a new one One potential way to make InstCombine (very slightly?) faster is to recycle instructions when possible instead of creating new ones. It's not explicitly stated AFAIK, but we don't consider this an "InstSimplify". We could, however, make a new layer to house transforms like this if that makes InstCombine more manageable (just throwing out an idea; not sure how much opportunity is actually here). Differential Revision: https://reviews.llvm.org/D31863 llvm-svn: 300067	2017-04-12 15:11:33 +00:00
Dmitry Preobrazhensky	7184c44d66	[AMDGPU][MC] Corrected ds_wrxchg2* to support two offsets Fixed bug 28227: https://bugs.llvm.org//show_bug.cgi?id=28227 Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31808 llvm-svn: 300066	2017-04-12 14:29:45 +00:00
Jonas Paulsson	4707015d46	Fix a RUN line in new test. Use '2>&1 |' and not '|&' to pipe debug output to FileCheck Hopefully handles a "shell parser error" on llvm-clang-x86_64-expensive-checks-win test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll llvm-svn: 300064	2017-04-12 14:25:08 +00:00
Jonas Paulsson	22776892c9	[SLPVectorizer] Pass the right type argument to getCmpSelInstrCost() In getEntryCost(), make the scalar type for a compare instruction that of the operands, not i1. This is needed in order to call getCmpSelInstrCost() for a compare in a sensible way, the same way as the LoopVectorizer does. New test: test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll Review: Matthew Simpson https://reviews.llvm.org/D31601 llvm-svn: 300061	2017-04-12 13:29:25 +00:00
Jonas Paulsson	592dbea779	[LoopVectorizer] Improve handling of branches during cost estimation. The cost for a branch after vectorization is very different depending on if the vectorizer will if-convert the block (branch is eliminated), or if scalarized and predicated blocks will be produced (branch duplicated before each block). There is also the case of remaining scalar branches, such as the back-edge branch. This patch handles these cases differently with TTI based cost estimates. Review: Matthew Simpson https://reviews.llvm.org/D31175 llvm-svn: 300058	2017-04-12 13:13:15 +00:00
Igor Breger	3b97ea39e7	[GlobalIsel][X86] support G_CONSTANT selection. Summary: [GlobalISel][X86] support G_CONSTANT selection. Add regbank select tests. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: llvm-commits, dberris, rovka, kristof.beyls Differential Revision: https://reviews.llvm.org/D31974 llvm-svn: 300057	2017-04-12 12:54:54 +00:00
Jonas Paulsson	da74ed42da	[LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore() Since SystemZ supports vector element load/store instructions, there is no need for extracts/inserts if a vector load/store gets scalarized. This patch lets Target specify that it supports such instructions by means of a new TTI hook that defaults to false. The use for this is in the LoopVectorizer getScalarizationOverhead() method, which will with this patch produce a smaller sum for a vector load/store on SystemZ. New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll Review: Adam Nemet https://reviews.llvm.org/D30680 llvm-svn: 300056	2017-04-12 12:41:37 +00:00
Dmitry Preobrazhensky	12194e9bec	[AMDGPU][MC] Corrected src0 size for s_cbranch_join Fix for bug 28159: https://bugs.llvm.org//show_bug.cgi?id=28159 Reviewers: vpykhtin, arsenm Differential Revision: https://reviews.llvm.org/D31595 llvm-svn: 300055	2017-04-12 12:40:19 +00:00
Jonas Paulsson	9b4875434e	[SystemZ] Updated test fp-cast.ll This did not get included in the previous commit for SystemZ cost functions. llvm-svn: 300053	2017-04-12 12:11:41 +00:00
Jonas Paulsson	fccc7d66c3	[SystemZ] TargetTransformInfo cost functions implemented. getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052	2017-04-12 11:49:08 +00:00
Sam Kolton	aff8341da2	[AMDGPU] SDWA: make pass global Summary: Remove checks for basic blocks. Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31935 llvm-svn: 300040	2017-04-12 09:36:05 +00:00
Daniel Sanders	0ed2882fd4	[globalisel][tablegen] Add experimental support for OperandWithDefaultOps, PredicateOperand, and OptionalDefOperand Summary: As far as instruction selection is concerned, all three appear to be same thing. Support for these operands is experimental since AArch64 doesn't make use of them and the in-tree targets that do use them (AMDGPU for OperandWithDefaultOps, AMDGPU/ARM/Hexagon/Lanai for PredicateOperand, and ARM for OperandWithDefaultOps) are not using tablegen-erated GlobalISel yet. Reviewers: rovka, aditya_nandakumar, t.p.northover, qcolombet, ab Reviewed By: rovka Subscribers: inglorion, aemerson, rengolin, mehdi_amini, dberris, kristof.beyls, igorb, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D31135 llvm-svn: 300037	2017-04-12 08:23:08 +00:00
Bjorn Pettersson	4af0593ecc	[LoadCombine] Avoid analysing dead basic blocks Summary: Dead basic blocks may be forming a loop, for which SSA form is fulfilled, but with a circular def-use chain. LoadCombine could enter an infinite loop when analysing such dead code. This patch solves the problem by simply avoiding to analyse all basic blocks that aren't forward reachable, from function entry, in LoadCombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=27065 Reviewers: mehdi_amini, chandlerc, grosser, Bigcheese, davide Reviewed By: davide Subscribers: dberlin, zzheng, bjope, grandinj, Ka-Ka, materi, jholewinski, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31032 llvm-svn: 300034	2017-04-12 08:07:55 +00:00
Serguei Katkov	ecebc3db72	[BPI] Refactor post domination calculation and simple fix for ColdCall Collection of PostDominatedByUnreachable and PostDominatedByColdCall has been split out of the heuristics themselves. The data is now updated for each basic block (previously, the update for PostDominatedByColdCall might be skipped if the unreachable or metadata heuristic handled this basic block). This separation allows re-ordering of heuristics without losing the post-domination information. Reviewers: sanjoy, junbuml, vsk, chandlerc, reames Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31701 llvm-svn: 300029	2017-04-12 05:42:14 +00:00
Kannan Narayanan	acb089e12a	[AMDGPU] Add a new pass to insert waitcnts. Leave under an option for testing. Based on comments in https://reviews.llvm.org/D31161. llvm-svn: 300023	2017-04-12 03:25:12 +00:00
Bob Haarman	4075ccc717	ThinLTOBitcodeWriter: keep comdats together, rename if leader is renamed Summary: COFF requires that every comdat contain a symbol with the same name as the comdat. ThinLTOBitcodeWriter renames symbols, which may cause this requirement to be violated. This change avoids such violations by renaming comdats if their leaders are renamed. It also keeps comdats together when splitting modules. Reviewers: pcc, mehdi_amini, tejohnson Reviewed By: pcc Subscribers: rnk, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31963 llvm-svn: 300019	2017-04-12 01:43:07 +00:00
Matt Arsenault	9ac40026dd	AMDGPU: Insert wait at start of callee functions llvm-svn: 300000	2017-04-11 22:29:31 +00:00
Matt Arsenault	efa9f4b210	AMDGPU: Refactor SIMachineFunctionInfo slightly Prepare for handling non-entry functions. llvm-svn: 299999	2017-04-11 22:29:28 +00:00
Matt Arsenault	e622dc3803	AMDGPU: Refactor argument lowering Split into smaller functions and prepare for handling non-entry functions. llvm-svn: 299998	2017-04-11 22:29:24 +00:00
Matt Arsenault	fe78ffba92	AMDGPU: Fix folding reg_sequence into copy to phys reg This was producing an illegal reg_sequence defining a physical register with virtual register inputs. llvm-svn: 299997	2017-04-11 22:29:19 +00:00
Evgeniy Stepanov	90fd87303c	[asan] Give global metadata private linkage. Internal linkage preserves names like "__asan_global_foo" which may account to 2% of unstripped binary size. llvm-svn: 299995	2017-04-11 22:28:13 +00:00
Zvi Rackover	30efd24d78	InstSimplify: A shuffle of a splat is always the splat itself Summary: Fold: shuffle (splat-shuffle), undef, M --> splat-shuffle Reviewers: spatel, RKSimon, craig.topper Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31527 llvm-svn: 299990	2017-04-11 21:37:02 +00:00
Zvi Rackover	f720c036f4	[DAGCombine] Add more test cases for shuffle of splat. NFC. Tests added contain splat-masks with undef elements. llvm-svn: 299988	2017-04-11 21:16:59 +00:00
Easwaran Raman	ddb9ae192a	[x86] Relax the check in areLoadsFromSameBasePtr Check if the scale operand is identical (doesn't have to be 1) and do not check the chain operand. Differential revision: https://reviews.llvm.org/D31833 llvm-svn: 299986	2017-04-11 21:05:02 +00:00
Anna Thomas	00dc1b74b7	[LV] Avoid vectorizing first order recurrence when phi uses are outside loop In the vectorization of first order recurrence, we vectorize such that the last element in the vector will be the one extracted to pass into the scalar remainder loop. However, this is not true when a phi (other than the primary induction variable) is used outside the loop. In such a case, we need the value from the second last iteration (i.e. the phi value), not the last iteration (which would be the phi update). I've added a test case for this. Also see PR32396. A follow up patch would generate the correct code gen for such cases, and turn this vectorization on. Differential Revision: https://reviews.llvm.org/D31910 Reviewers: mssimpso llvm-svn: 299985	2017-04-11 21:02:00 +00:00
Sanjay Patel	f0cb5a80ad	[InstSimplify] add tests for chains of shuffles; NFC llvm-svn: 299984	2017-04-11 20:54:57 +00:00
Daniel Berlin	554dcd8c89	MemorySSA: Move to Analysis, from Transforms/Utils. It's used as Analysis, it has Analysis passes, and once NewGVN is made an Analysis, this removes the cross dependency from Analysis to Transform/Utils. NFC. llvm-svn: 299980	2017-04-11 20:06:36 +00:00
Justin Bogner	20dd36a48a	MIR: Allow parsing of empty machine functions If you run llc -stop-after=codegenprepare and feed the resulting MIR to llc -start-after=codegenprepare, you'll have an empty machine function since we haven't run any isel yet. Of course, this only works if the MIRParser believes you that this is okay. This is essentially a revert of r241862 with a fix for the problem it was papering over. llvm-svn: 299975	2017-04-11 19:32:41 +00:00
Davide Italiano	8455f7d623	[X86] Create the correct ADC/SBB SDNode when lowering add. Differential Revision: https://reviews.llvm.org/D31911 llvm-svn: 299973	2017-04-11 19:11:20 +00:00
Andrea Di Biagio	8e26936bfd	[AddDiscriminators] Assign discriminators to MemIntrinsic calls. Before this patch, pass AddDiscriminators always avoided assigning discriminators to intrinsic calls. This was done mainly for two reasons: 1) We wanted to minimize the number of based discriminators used. 2) We wanted to avoid non-deterministic discriminator assignment for different debug levels. Unfortunately, that approach was problematic for MemIntrinsic calls. MemIntrinsic calls can be split by SROA into loads and stores, and each new load/store instruction would obtain the debug location from the original intrinsic call. If we don't assign a discriminator to MemIntrinsic calls, then we cannot correctly set the discriminator for the newly created loads and stores. This may have a negative impact on the basic block weight computation performed by the SampleLoader. This patch fixes the issue by letting MemIntrinsic calls have a discriminator. Differential Revision: https://reviews.llvm.org/D31900 llvm-svn: 299972	2017-04-11 19:07:30 +00:00
Craig Topper	9eac2717c6	[InstCombine] Add testcases for (B&A)^A -> ~B & A and (B|A)^A -> B & ~A llvm-svn: 299971	2017-04-11 18:50:48 +00:00
Anna Thomas	98cbb067ce	[LV] Move first order recurrence test to common folder. NFC llvm-svn: 299969	2017-04-11 18:31:42 +00:00
Peter Collingbourne	7faa60c406	llvm-lto2: Move the LTO::run() action behind a subcommand. Move LTO::run() to a "run" subcommand so that we can introduce new subcommands for testing different parts of the LTO implementation. This doesn't use llvm::cl subcommands because it doesn't appear to be currently possible to pass an argument not associated with a subcommand to a subcommand (e.g. -lto-use-new-pm, -mcpu=yonah). Differential Revision: https://reviews.llvm.org/D31410 llvm-svn: 299967	2017-04-11 18:12:00 +00:00
Yaxun Liu	e95df719e1	[AMDGPU] Add A5 to data layout for amdgiz environment Differential Revision: https://reviews.llvm.org/D31589 llvm-svn: 299964	2017-04-11 17:18:13 +00:00
Reid Kleckner	6e545ffc4e	[PDB] Emit index/offset pairs for TPI and IPI streams Summary: This lets PDB readers look up type record data by type index in O(log n) time. It also makes `cvdump -t` work on PDBs produced by LLD. cvdump will not dump a PDB that doesn't have an index-to-offset table. The table is sorted by type index, and has an entry every 8KB. Looking up a type record by index is a binary search of this table, followed by a scan of at most 8KB. Reviewers: ruiu, zturner, inglorion Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31636 llvm-svn: 299958	2017-04-11 16:26:15 +00:00
Sanjay Patel	28611acef9	revert r299851 - [InstCombine] fix matching of or-of-icmps constants (PR32524) This is a candidate culprit for multiple bot fails, so reverting pending investigation. llvm-svn: 299955	2017-04-11 15:57:32 +00:00
Geoff Berry	9d597adde4	[GVNHoist] Re-enable GVNHoist by default Turn GVNHoist back on by default now that PR32153 has been fixed. llvm-svn: 299944	2017-04-11 14:36:30 +00:00
Keno Fischer	30779772cf	[StripDeadDebug/DIFinder] Track inlined SPs Summary: In rL299692 I improved strip-dead-debug-info's ability to drop CUs that are not referenced from the current module. However, in doing so I neglected to realize that some SPs could be referenced entirely from inlined functions. It appears I was not the only one to make this mistake, because DebugInfoFinder doesn't find those SPs either. Fix this in DebugInfoFinder and then use that to make sure not to drop those CUs in strip-dead-debug-info. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31904 llvm-svn: 299936	2017-04-11 13:32:11 +00:00
Diana Picus	1314a2889c	GlobalISel: Allow legalizing G_FADD to a libcall Use the same handling in the generic legalizer code as for the other libcalls (G_FREM, G_FPOW). Enable it on ARM for float and double so we can test it. llvm-svn: 299931	2017-04-11 10:52:34 +00:00
Volkan Keles	64ad85f8ba	[GlobalISel] LegalizerInfo: Enable legalization of non-power-of-2 types Summary: Legalize only if the type is marked as Legal or Custom. If not, return Unsupported as LegalizerHelper is not able to handle non-power-of-2 types right now. Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, kristof.beyls, javed.absar, ab Reviewed By: kristof.beyls, ab Subscribers: dberris, rovka, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D31711 llvm-svn: 299929	2017-04-11 10:10:14 +00:00
Sam Parker	4fc5f3c02e	[SelectionDAG] Check CALLSEQ_BEGIN nodes in DelayForLiveRegs A fix for the bug reported in PR30911. The issue arises when multiple CALLSEQ_BEGIN nodes are unscheduled as the last node to be unscheduled will gain access to the CallResource register. But when a node is being picked, only CALLSEQ_END nodes are checked against the CallResource and have their chains evaluated. This then means that other CALLSEQ_BEGIN nodes can be scheduled before the existing call sequence has been finalised. This patch adds a check against the FrameSetup nodes in DelayForLiveRegs to prevent this from happening. Differential Revision: https://reviews.llvm.org/D31536 llvm-svn: 299926	2017-04-11 08:43:32 +00:00
Craig Topper	18f9e424e7	[InstCombine] Support weird size element types in dyn_castNegVal. llvm-svn: 299915	2017-04-11 05:42:47 +00:00
Sanjoy Das	92ce1e76c5	[LoopUnswitch] Fix a test case (h/t to Chandler for pointing this out) The test in question was not at all testing what it was supposed to test. We do not //care// about placing `!make.implicit` in inner constant branch (since it will be folded away anyway). We care about placing `!make.implicit` in the outer branch that switches between either version of the loop. Having said that, it is _correct_ to leave behind the `!make.implicit` in the inner branch, but there is no need to do so. llvm-svn: 299912	2017-04-11 04:11:47 +00:00
Hal Finkel	b63ed91549	[LICM] Hoist fp division from the loops and replace by a reciprocal When allowed, we can hoist a division out of a loop in favor of a multiplication by the reciprocal. Fixes PR32157. Patch by vit9696! Differential Revision: https://reviews.llvm.org/D30819 llvm-svn: 299911	2017-04-11 02:22:54 +00:00
Hal Finkel	cef9e52736	[PowerPC] multiply-with-overflow might use the CTR register Check the legality of ISD::[US]MULO to see whether Intrinsic::[us]mul_with_overflow will legalize into a function call (and, thus, will use the CTR register). Fixes PR32485. Patch by Tim Neumann! Differential Revision: https://reviews.llvm.org/D31790 llvm-svn: 299910	2017-04-11 02:03:17 +00:00
Sanjay Patel	8f2001164a	[ARM, x86] add tests to show possible improvement for bool math; NFC llvm-svn: 299897	2017-04-10 23:26:31 +00:00
Kyle Butt	7e8be28661	CodeGen: BlockPlacement: Don't always tail-duplicate with no other successor. The math works out where it can actually be counter-productive. The probability calculations correctly handle the case where the alternative is 0 probability, rely on those calculations. Includes a test case that demonstrates the problem. llvm-svn: 299892	2017-04-10 22:28:22 +00:00
Kyle Butt	ee51a20164	CodeGen: BlockPlacement: Minor probability changes. Qin may be large, and Succ may be more frequent than BB. Take these both into account when deciding if tail-duplication is profitable. llvm-svn: 299891	2017-04-10 22:28:18 +00:00
Kyle Butt	a12bd756e4	CodeGen: BranchFolding: Merge identical blocks, even if they are short. Merging identical blocks when it doesn't reduce fallthrough. It is common for the blocks created from critical edge splitting to be identical. We would like to merge these blocks whenever doing so would not reduce fallthrough. llvm-svn: 299890	2017-04-10 22:28:12 +00:00
Matt Arsenault	3c1fc768ed	Allow DataLayout to specify addrspace for allocas. LLVM makes several assumptions about address space 0. However, alloca is presently constrained to always return this address space. There's no real way to avoid using alloca, so without this there is no way to opt out of these assumptions. The problematic assumptions include: - That the pointer size used for the stack is the same size as the code size pointer, which is also the maximum sized pointer. - That 0 is an invalid, non-dereferencable pointer value. These are problems for AMDGPU because alloca is used to implement the private address space, which uses a 32-bit index as the pointer value. Other pointers are 64-bit and behave more like LLVM's notion of generic address space. By changing the address space used for allocas, we can change our generic pointer type to be LLVM's generic pointer type which does have similar properties. llvm-svn: 299888	2017-04-10 22:27:50 +00:00
Dehao Chen	d4a3397861	Emit less compiler optimization remarks in samplepgo to reduce a call to findCalleeFunctionSamples which is going to be refactored. Summary: Now the SamplePGO support is more stable, we do not need so many verbose optimization remarks emitted. Reviewers: dnovillo, davidxl Reviewed By: davidxl Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D31826 llvm-svn: 299883	2017-04-10 20:49:16 +00:00
Geoff Berry	635e505675	[GVNHoist] Call isGuaranteedToTransferExecutionToSuccessor on each instruction w.r.t. https://bugs.llvm.org/show_bug.cgi?id=32153 The consensus seems to be isGuaranteedToTransferExecutionToSuccessor should be called for each function. Patch by Aditya Kumar Differential Revision: https://reviews.llvm.org/D31035 llvm-svn: 299882	2017-04-10 20:45:17 +00:00
Evgeniy Stepanov	ed7fce7c84	Revert "[asan] Put ctor/dtor in comdat." This reverts commit r299696, which is causing mysterious test failures. llvm-svn: 299880	2017-04-10 20:36:36 +00:00
Evgeniy Stepanov	ba7c2e9661	Revert "[asan] Fix dead stripping of globals on Linux." This reverts commit r299697, which caused a big increase in object file size. llvm-svn: 299879	2017-04-10 20:36:30 +00:00
Matt Arsenault	f10061ec70	Add address space mangling to lifetime intrinsics In preparation for allowing allocas to have non-0 addrspace. llvm-svn: 299876	2017-04-10 20:18:21 +00:00
Zachary Turner	0c990bbe09	[llvm-pdbdump] Display padding bytes on record layout When dumping classes, show where padding occurs, and at the end of the class print statistics about how many bytes total of padding exist in a class. Since PDB doesn't specifically contain information about padding, we have to mimic this by sort of reversing a small portion of the record layout algorithm (e.g. looking at offsets and sizes and trying to determine whether something is part of the same field or a new field). Differential Revision: https://reviews.llvm.org/D31800 llvm-svn: 299869	2017-04-10 19:33:29 +00:00
Matt Arsenault	daa08875b3	[MemCpyOpt] Only replace memcpy with bitcast if address spaces match Patch by James Price llvm-svn: 299866	2017-04-10 19:00:25 +00:00
Daniel Berlin	74603a68ef	MemorySSA: Make lifetime starts defs for mustaliased pointers Summary: While we don't want them aliasing with other pointers, there seems to be no point in not having them clobber must-aliased pointers. If some day we split the aliasing and ordering chains, we'd make this not aliasing but an ordering barrier (i.e. it doesn't affect its memory, but we can't hoist it above it). Reviewers: hfinkel, george.burgess.iv Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31865 llvm-svn: 299865	2017-04-10 18:46:00 +00:00
Matthew Simpson	1468d3e04e	[ARM/AArch64] Ensure valid vector element types for interleaved accesses This patch refactors and strengthens the type checks performed for interleaved accesses. The primary functional change is to ensure that the interleaved accesses have valid element types. The added test cases previously failed because the element type is f128. Differential Revision: https://reviews.llvm.org/D31817 llvm-svn: 299864	2017-04-10 18:34:37 +00:00
Craig Topper	0d830ff7bf	[InstCombine] Use commutable matchers and m_OneUse in visitSub to shorten code. Add missing test cases. In one case I removed commute handling for a multiply with a constant since we'll eventually get the constant on the right hand side. llvm-svn: 299863	2017-04-10 18:09:25 +00:00
Matt Arsenault	678e111e11	AMDGPU: Fix crash when disassembling VOP3 mac The unused dummy src2_modifiers is missing, so it crashes when trying to print it. I tried to fully remove src2_modifiers, but there are some irritations in the places where it is converted to mad since it starts to require modifying use lists while iterating over them. llvm-svn: 299861	2017-04-10 17:58:06 +00:00
Craig Topper	98851adc2a	[InstCombine] Use m_c_Add to shorten some code. Add testcases for this fold since they were missing. NFC llvm-svn: 299853	2017-04-10 16:59:40 +00:00
Simon Pilgrim	b6702eaec3	[X86][MMX] Add fast-isel support for MMX non-temporal writes Differential Revision: https://reviews.llvm.org/D31754 llvm-svn: 299852	2017-04-10 16:58:07 +00:00
Sanjay Patel	570e35c157	[InstCombine] fix matching of or-of-icmps constants (PR32524) Also, make the same change in and-of-icmps and remove a hack for detecting that case. Finally, add some FIXME comments because the code duplication here is awful. This should fix the remaining IR problem noted in: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 299851	2017-04-10 16:55:57 +00:00
Adrian McCarthy	08eb343cce	Improves pretty printing of variable types in llvm-pdbdump * Adds support for pointers to arrays, which was missing * Adds some tests * Improves consistency of const and volatile qualifiers * Eliminates non-composable special case code for arrays and function by using a more general recursive approach * Has a hack for getting the calling convention into the right spot for pointer-to-functions Given the rapid changes happening in llvm-pdbdump, this may be difficult to merge. Differential Revision: https://reviews.llvm.org/D31832 llvm-svn: 299848	2017-04-10 16:43:09 +00:00
Craig Topper	3eec73e20b	[InstCombine] Support folding of add instructions with vector constants into select operations We currently only fold scalar add of constants into selects. This improves this to support vectors too. Differential Revision: https://reviews.llvm.org/D31683 llvm-svn: 299847	2017-04-10 16:40:00 +00:00
Sanjay Patel	8c1cc5abbb	[InstCombine] add test for PR32524; NFC llvm-svn: 299846	2017-04-10 16:28:08 +00:00
Diana Picus	3ff82c8cb7	[ARM] GlobalISel: Support G_FPOW for float and double Legalize to a libcall. llvm-svn: 299841	2017-04-10 09:27:39 +00:00
Craig Topper	838d13e7ee	[InstCombine] Make sure we preserve fast math flags when folding fp instructions into phi nodes Summary: I noticed in the select folding code that we copied fast math flags, but did not do the same for the similar handling in phi nodes. This patch fixes that to do the same thing as select Reviewers: spatel, davide, majnemer, hfinkel Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31690 llvm-svn: 299838	2017-04-10 07:00:10 +00:00
Craig Topper	d8840d7b10	[InstCombine] use m_c_And and m_c_Xor to handle commuted versions of a transform. llvm-svn: 299837	2017-04-10 06:53:28 +00:00
Craig Topper	7260e2f159	[InstCombine] Add test cases demonstrating missing handling for the commuted version of a transform. NFC. llvm-svn: 299836	2017-04-10 06:53:25 +00:00
Xin Tong	34888c08bc	[SCCP] Resolve indirect branch target when possible. Summary: Resolve indirect branch target when possible. This potentially eliminates more basic blocks and results in better evaluation for phi and other things. Reviewers: davide, efriedma, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30322 llvm-svn: 299830	2017-04-10 00:33:25 +00:00
Sanjay Patel	72fbb7868a	[InstCombine] remove duplicate test; NFC I moved this test to 'not.ll' in r299824 but accidentally added a copy here. llvm-svn: 299828	2017-04-09 21:45:52 +00:00
Sanjay Patel	2824927e8a	[SimplifyCFG] auto-generate better checks; NFC llvm-svn: 299825	2017-04-09 16:16:32 +00:00
Sanjay Patel	c5f963c2e5	[InstCombine] auto-generate better checks; NFC Also, move a test next to its sibling to eliminate a file with just one test. llvm-svn: 299824	2017-04-09 15:44:59 +00:00
Hal Finkel	a9d67cf601	[MemorySSA] Fix use of pointsToConstantMemory in isUseTriviallyOptimizableToLiveOnEntry In isUseTriviallyOptimizableToLiveOnEntry, pointsToConstantMemory needs to be called on the load's pointer operand, not on the result of the load (which might not even be a pointer). llvm-svn: 299823	2017-04-09 12:57:50 +00:00
Craig Topper	afa07c5ef6	[InstCombine] Extend some OR combines to support vectors. This adds support for these combines for vectors (X^C)|Y -> (X|Y)^C iff Y&C == 0 Y|(X^C) -> (X|Y)^C iff Y&C == 0 llvm-svn: 299822	2017-04-09 06:12:41 +00:00
Craig Topper	e63c21b1ba	[InstCombine] Extend a canonicalization check to apply to vector constants too. llvm-svn: 299821	2017-04-09 06:12:39 +00:00
Craig Topper	1c5af0d400	[InstCombine] Add test cases to show missing support for vectors in an OR combine. Also add the commuted versions. NFC llvm-svn: 299820	2017-04-09 06:12:36 +00:00
Matt Arsenault	dd8fd9dcfd	AMDGPU: Actually write nops for writeNopData Before this was just writing 0s, which ends up looking like a v_cndmask_b32 v0, s0, v0, vcc. Write out an encoded s_nop instead. llvm-svn: 299816	2017-04-08 21:28:38 +00:00
Coby Tayree	bedaae0d06	[AsmParser] Emit an error if a macro has two (or more) parameters sharing the same name Introduce a new error in macro parameter parsing: currently, llvm-mc won't complain if a macro has two (or more) named params with the same name. This behavior is wrong, as there's no merit in having params share a name. Now, instead of tolerating this, emit an appropriate error. Differential Revision: https://reviews.llvm.org/D31674 llvm-svn: 299815	2017-04-08 20:29:03 +00:00
Gor Nishanov	bfb2a9db31	[coroutines] Make CoroSplit pass deterministic coro-split-after-phi.ll test was flaky due to non-determinism in the coroutine frame construction that was sorting the spill vector using a pointer to a def as a part of the key. The sorting was intended to make sure that spills for the same def are kept together, however, we populate the vector by processing defs in order, so the spill entries will end up together anyways. This change removes spill sorting and restores the determinism in the test. llvm-svn: 299809	2017-04-08 00:49:46 +00:00

44269 Commits