llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kuperstein	2ee911e985	Revert r274613 because it breaks the test suite with AVX512 This reverts most of r274613 (AKA r274626) and its follow-ups (r276347, r277289), due to miscompiles in the test suite. The FastISel change was left in, because it apparently fixes an unrelated issue. (Recommit of r279782 which was broken due to a bad merge.) This fixes 4 out of the 5 test failures in PR29112. llvm-svn: 279788	2016-08-25 22:48:11 +00:00
Michael Kuperstein	6e271f4ce8	Revert r279782 due to debug buildbot breakage. llvm-svn: 279785	2016-08-25 22:14:45 +00:00
Michael Kuperstein	a6ccc8d365	Revert r274613 because it breaks the test suite with AVX512 This reverts most of r274613 and its follow-ups (r276347, r277289), due to miscompiles in the test suite. The FastISel change was left in, because it apparently fixes an unrelated issue. This fixes 4 out of the 5 test failures in PR29112. llvm-svn: 279782	2016-08-25 21:55:41 +00:00
Tim Shen	3ad8b43cc2	[MemCpy] Add comments for r279769 Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279778	2016-08-25 21:03:46 +00:00
Tim Northover	3495647d0d	ARM: by default don't set the Thumb bit on MachO relocated values. Its existence is largely historical, apparently we tried to make ARM object files look maybe-almost-possibly runnable by putting our best guess at the actual value into relocated locations. Of course, the real linker then comes along and can completely change things. But it should only be there for word-sized and movw/movt relocations. It can't be encoded in branch relocations, and I've seen it mess up validity calculations twice in the last couple of weeks so the default is clearly problematic. llvm-svn: 279773	2016-08-25 20:41:30 +00:00
Hemant Kulkarni	5b60f63b32	llvm-objdump: ELF: Handle code and data mix in all scenarios Differential Revision: https://reviews.llvm.org/D23621 llvm-svn: 279770	2016-08-25 19:41:08 +00:00
Tim Shen	a3dbead2d6	[MemCpy] Check for alias in performMemCpyToMemSetOptzn, instead of the identity of two operands Summary: This fixes pr29105. The reason is that lifetime marks creates new aliasing pointers the original ones, but before this patch aliases were not checked in performMemCpyToMemSetOptzn. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279769	2016-08-25 19:27:26 +00:00
Tim Northover	fe880a8801	GlobalISel: mark simple ops legal even on types < 32-bit. The 32-bit variants of these operations don't depend on the bits not being operated on, so they also naturally model operations narrower than the actual register width. llvm-svn: 279760	2016-08-25 17:37:39 +00:00
Tim Northover	7a1ec0141a	GlobalISel: mark pointer constants as legal on AArch64. llvm-svn: 279759	2016-08-25 17:37:35 +00:00
Tim Northover	438c77ca1a	GlobalISel: perform multi-step legalization llvm-svn: 279758	2016-08-25 17:37:32 +00:00
Tim Northover	2c4a838e24	GlobalISel: mark small extends as legal on AArch64 llvm-svn: 279757	2016-08-25 17:37:25 +00:00
Michael Kuperstein	40887c5566	[X86] 512-bit VPAVG requires AVX512BW Fix VPAVG detection to require AVX512BW, not AVX512F for 512-bit widths, and change associated asserts to assert in the right direction... This fixes PR29111. llvm-svn: 279755	2016-08-25 17:17:46 +00:00
Wei Mi	59ca96636d	[UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded when unroll runtime iteration loop. In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner loop inside a loop nest, the scalar evolution needs to be dropped for its parent loop which is done by ScalarEvolution::forgetLoop. However, we can postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC expansion can still reuse existing value. Differential Revision: https://reviews.llvm.org/D23572 llvm-svn: 279748	2016-08-25 16:17:18 +00:00
Ron Lieberman	c93d123b86	[Hexagon] vector store print tracing. Add vector store print tracing option for hexagon vector instructions. https://reviews.llvm.org/D23870 llvm-svn: 279739	2016-08-25 13:35:48 +00:00
Simon Pilgrim	3125501bba	[X86][AVX] Improved AVX512F/AVX512VL SubVectorBroadcast tests llvm-svn: 279736	2016-08-25 12:50:13 +00:00
Simon Pilgrim	0ad9f3e93b	[X86][AVX] Provide SubVectorBroadcast fallback if load fold fails (PR29133) Fix for PR29133, matching the approach that was taken for AVX1 scalar broadcasts. llvm-svn: 279735	2016-08-25 12:45:16 +00:00
Sebastian Pop	5f0d0e60d1	GVN-hoist: fix hoistingFromAllPaths for loops (PR29034) It is invalid to hoist stores or loads if they are not executed on all paths from the hoisting point to the exit of the function. In the testcase, there are paths in the loop that do not execute the stores or the loads, and so hoisting them within the loop is unsafe. The problem is that the current implementation of hoistingFromAllPaths is incomplete: it walks all blocks dominated by the hoisting point, and does not return false when the loop contains a path on which the hoisted ld/st is not executed. Differential Revision: https://reviews.llvm.org/D23843 llvm-svn: 279732	2016-08-25 11:55:47 +00:00
Matthias Braun	1eb473680a	MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of running after register and simply describes that no vregs are used in a machine function. With that we can simply compute the property and do not need to dump/parse it in .mir files. Differential Revision: http://reviews.llvm.org/D23850 llvm-svn: 279698	2016-08-25 01:27:13 +00:00
Xinliang David Li	cad3a995a4	[Profile] Propagate branch metadata properly in instcombine Differential Revision: http://reviews.llvm.org/D23590 llvm-svn: 279693	2016-08-25 00:26:32 +00:00
Kyle Butt	90e51b1bef	Test: Add REQUIRES: asserts to test that now requires stats. Test was modified in r279670 llvm-svn: 279690	2016-08-25 00:06:52 +00:00
Evgeny Stupachenko	d7f9c3564a	The patch improves ValueTracking on left shift with nsw flag. Summary: The patch fixes PR28946. Reviewers: majnemer, sanjoy Differential Revision: http://reviews.llvm.org/D23296 From: Li Huang llvm-svn: 279684	2016-08-24 23:01:33 +00:00
Krzysztof Parzyszek	6dff336ad1	[Hexagon] Check for block end when skipping debug instructions llvm-svn: 279681	2016-08-24 22:36:35 +00:00
Matthias Braun	a319e2cae0	MIRParser/MIRPrinter: Compute HasInlineAsm instead of printing/parsing it llvm-svn: 279680	2016-08-24 22:34:06 +00:00
Matthias Braun	5dce48e0a7	Missed a test in my last commit llvm-svn: 279679	2016-08-24 22:32:11 +00:00
Sanjay Patel	d398d4a39e	[InstCombine] use m_APInt to allow icmp eq/ne (shr X, C2), C folds for splat constant vectors llvm-svn: 279677	2016-08-24 22:22:06 +00:00
Matthias Braun	f1b20c5225	MachineRegisterInfo/MIR: Initialize tracksSubRegLiveness early, do not print/parser it tracksSubRegLiveness only depends on the Subtarget and a cl::opt, there is not need to change it or save/parse it in a .mir file. Make the field const and move the initialization LiveIntervalAnalysis to the MachineRegisterInfo constructor. Also cleanup some code and fix some instances which better use MachineRegisterInfo::subRegLivenessEnabled() instead of TargetSubtargetInfo::enableSubRegLiveness(). llvm-svn: 279676	2016-08-24 22:17:45 +00:00
Kyle Butt	a8c7371d16	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 279671	2016-08-24 21:34:27 +00:00
Kyle Butt	6262ca3448	IfConversion: Rescan diamonds. The cost of predicating a diamond is only the instructions that are not shared between the two branches. Additionally If a predicate clobbering instruction occurs in the shared portion of the branches (e.g. a cond move), it may still be possible to if convert the sub-cfg. This change handles these two facts by rescanning the non-shared portion of a diamond sub-cfg to recalculate both the predication cost and whether both blocks are pred-clobbering. Fixed 2 bugs before recommitting. Branch instructions must be compared and found identical before diamond conversion. Also, predicate-clobbering instructions in the shared prefix disqualifies a potential diamond conversion. Includes tests for both. llvm-svn: 279670	2016-08-24 21:34:24 +00:00
Tim Northover	9c3633f516	ARM: don't diagnose cbz/cbnz to Thumb functions. A branch-distance to a Thumb function shouldn't be forced to be odd for CBZ/CBNZ instructions because (assuming it's within range), it's going to be a valid, even offset. llvm-svn: 279665	2016-08-24 21:21:29 +00:00
Changpeng Fang	75f0968b39	AMDGCN/SI: Implement readlane/readfirstlane intrinsics Summary: This patch implements readlane/readfirstlane intrinsics. TODO: need to define a new register class to consider the case that the source could be a vector register or M0. Reviewed by: arsenm and tstellarAMD Differential Revision: http://reviews.llvm.org/D22489 llvm-svn: 279660	2016-08-24 20:35:23 +00:00
Rafael Espindola	70c6a3976b	Use isTargetMachO instead of isTargetDarwin. llvm-svn: 279655	2016-08-24 19:02:29 +00:00
Simon Pilgrim	e14653e17d	[X86][SSE] Add MINSD/MAXSD/MINSS/MAXSS intrinsic scalar load folding support These are no different in load behaviour to the existing ADD/SUB/MUL/DIV scalar ops but were missing from isNonFoldablePartialRegisterLoad llvm-svn: 279652	2016-08-24 18:40:53 +00:00
David Blaikie	a01f295322	DebugInfo: Add flag to CU to disable emission of inline debug info into the skeleton CU In cases where .dwo/.dwp files are guaranteed to be available, skipping the extra online (in the .o file) inline info can save a substantial amount of space - see the original r221306 for more details there. llvm-svn: 279650	2016-08-24 18:29:49 +00:00
Matthew Simpson	abd2be1e2e	[LV] Unify vector and scalar maps This patch unifies the data structures we use for mapping instructions from the original loop to their corresponding instructions in the new loop. Previously, we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap. WidenMap maintained the vector values each instruction from the old loop was represented with, and ScalarIVMap maintained the scalar values each scalarized induction variable was represented with. With this patch, all values created for the new loop are maintained in VectorLoopValueMap. The change allows for several simplifications. Previously, when an instruction was scalarized, we had to insert the scalar values into vectors in order to maintain the mapping in WidenMap. Then, if a user of the scalarized value was also scalar, we had to extract the scalar values from the temporary vector we created. We now aovid these unnecessary scalar-to-vector-to-scalar conversions. If a scalarized value is used by a scalar instruction, the scalar value is used directly. However, if the scalarized value is needed by a vector instruction, we generate the needed insertelement instructions on-demand. A common idiom in several locations in the code (including the scalarization code), is to first get the vector values an instruction from the original loop maps to, and then extract a particular scalar value. This patch adds getScalarValue for this purpose along side getVectorValue as an interface into VectorLoopValueMap. These functions work together to return the requested values if they're available or to produce them if they're not. The mapping has also be made less permissive. Entries can be added to VectorLoopValue map with the new initVector and initScalar functions. getVectorValue has been modified to return a constant reference to the mapped entries. There's no real functional change with this patch; however, in some cases we will generate slightly different code. For example, instead of an insertelement sequence following the definition of an instruction, it will now precede the first use of that instruction. This can be seen in the test case changes. Differential Revision: https://reviews.llvm.org/D23169 llvm-svn: 279649	2016-08-24 18:23:17 +00:00
Sanjoy Das	ff855b6020	[SCCP] Don't delete side-effecting instructions I'm not sure if the `!isa<CallInst>(Inst) && !isa<TerminatorInst>(Inst))` bit is correct either, but this fixes the case we know is broken. llvm-svn: 279647	2016-08-24 18:10:21 +00:00
Simon Pilgrim	941bd6bbae	[X86][SSE] Add support for combining VZEXT_MOVL target shuffles Includes adding more general support for the pattern: VZEXT_MOVL(VZEXT_LOAD(ptr)) -> VZEXT_LOAD(ptr) This has unearthed a couple of latent poor codegen issues (MINSS/MAXSS scalar load folding and MOVDDUP/BROADCAST load folding patterns), which will be fixed shortly. Its also reduced a couple of tests so that they no longer reach the instruction threshold necessary to be combined to PSHUFB (see PR26183). llvm-svn: 279646	2016-08-24 18:07:53 +00:00
Tim Northover	65f6336ff9	GlobalISel: fix cmp test to be in SSA form llvm-svn: 279633	2016-08-24 15:37:51 +00:00
Teresa Johnson	57891a50a8	[ThinLTO/gold] Add caching support to gold-plugin Summary: With support now in the new LTO API for caching (r279576), add optional ThinLTO caching in the gold-plugin. Reviewers: mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23836 llvm-svn: 279631	2016-08-24 15:11:47 +00:00
Simon Pilgrim	2725217680	[X86][SSE] Regenerate scalar math load folding tests for 32 and 64 bit targets llvm-svn: 279630	2016-08-24 15:07:11 +00:00
Wei Ding	1041a646a9	AMDGPU : Add V_SAD_U32 instruction pattern. Differential Revision: http://reviews.llvm.org/D23069 llvm-svn: 279629	2016-08-24 14:59:47 +00:00
Ying Yi	84dc971ee2	[llvm-cov] Add the project summary to each source file coverage report. This patch includes the following changes: - Included header "Code coverage report" and include the date that the report was created. - Included title (as specified in a command line option, (i.e llvm-cov -project-title="Simple Test") - In the summary, list the elf files that the source code file has contributed to. - Used column heading for "Line No.", "Count No.", Source". Differential Revision: https://reviews.llvm.org/D23345 llvm-svn: 279628	2016-08-24 14:27:23 +00:00
Krzysztof Parzyszek	a7ed090bba	Create subranges for new intervals resulting from live interval splitting The register allocator can split a live interval of a register into a set of smaller intervals. After the allocation of registers is complete, the rewriter will modify the IR to replace virtual registers with the corres- ponding physical registers. At this stage, if a register corresponding to a subregister of a virtual register is used, the rewriter will check if that subregister is undefined, and if so, it will add the <undef> flag to the machine operand. The function verifying liveness of the subregis- ter would assume that it is undefined, unless any of the subranges of the live interval proves otherwise. The problem is that the live intervals created during splitting do not have any subranges, even if the original parent interval did. This could result in the <undef> flag placed on a register that is actually defined. Differential Revision: http://reviews.llvm.org/D21189 llvm-svn: 279625	2016-08-24 13:37:55 +00:00
Simon Dardis	f114820912	[mips] Preparatory work for a generic scheduler Extend instruction definitions from nearly all ISAs to include appropriate instruction itineraries. Change MIPS16s gp prologue generation to use real instructions instead of using a pseudo instruction. Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23548 llvm-svn: 279623	2016-08-24 13:00:47 +00:00
Simon Pilgrim	7a50c8c2ba	[X86][AVX2] Ensure on 32-bit targets that we broadcast f64 types not i64 (PR29101) llvm-svn: 279622	2016-08-24 12:42:31 +00:00
Simon Pilgrim	3c8cd3df5e	[X86][F16C] Regenerated f16c tests llvm-svn: 279621	2016-08-24 11:56:15 +00:00
Gil Rapaport	550148b2f6	[Loop Vectorizer] Support predication of div/rem div/rem instructions in basic blocks that require predication currently prevent vectorization. This patch extends the existing mechanism for predicating stores to handle other instructions and leverages it to predicate divs and rems. Differential Revision: https://reviews.llvm.org/D22918 llvm-svn: 279620	2016-08-24 11:37:57 +00:00
Chandler Carruth	8882346842	[PM] Introduce basic update capabilities to the new PM's CGSCC pass manager, including both plumbing and logic to handle function pass updates. There are three fundamentally tied changes here: 1) Plumbing some mechanism for updating the CGSCC pass manager as the CG changes while passes are running. 2) Changing the CGSCC pass manager infrastructure to have support for the underlying graph to mutate mid-pass run. 3) Actually updating the CG after function passes run. I can separate them if necessary, but I think its really useful to have them together as the needs of #3 drove #2, and that in turn drove #1. The plumbing technique is to extend the "run" method signature with extra arguments. We provide the call graph that intrinsically is available as it is the basis of the pass manager's IR units, and an output parameter that records the results of updating the call graph during an SCC passes's run. Note that "...UpdateResult" isn't a great name here... suggestions very welcome. I tried a pretty frustrating number of different data structures and such for the innards of the update result. Every other one failed for one reason or another. Sometimes I just couldn't keep the layers of complexity right in my head. The thing that really worked was to just directly provide access to the underlying structures used to walk the call graph so that their updates could be informed by the particular nature of the change to the graph. The technique for how to make the pass management infrastructure cope with mutating graphs was also something that took a really, really large number of iterations to get to a place where I was happy. Here are some of the considerations that drove the design: - We operate at three levels within the infrastructure: RefSCC, SCC, and Node. In each case, we are working bottom up and so we want to continue to iterate on the "lowest" node as the graph changes. Look at how we iterate over nodes in an SCC running function passes as those function passes mutate the CG. We continue to iterate on the "lowest" SCC, which is the one that continues to contain the function just processed. - The call graph structure re-uses SCCs (and RefSCCs) during mutation events for the highest entry in the resulting new subgraph, not the lowest. This means that it is necessary to continually update the current SCC or RefSCC as it shifts. This is really surprising and subtle, and took a long time for me to work out. I actually tried changing the call graph to provide the opposite behavior, and it breaks EVERYTHING. The graph update algorithms are really deeply tied to this particualr pattern. - When SCCs or RefSCCs are split apart and refined and we continually re-pin our processing to the bottom one in the subgraph, we need to enqueue the newly formed SCCs and RefSCCs for subsequent processing. Queuing them presents a few challenges: 1) SCCs and RefSCCs use wildly different iteration strategies at a high level. We end up needing to converge them on worklist approaches that can be extended in order to be able to handle the mutations. 2) The order of the enqueuing need to remain bottom-up post-order so that we don't get surprising order of visitation for things like the inliner. 3) We need the worklists to have set semantics so we don't duplicate things endlessly. We don't need a persistent set though because we always keep processing the bottom node!!!! This is super, super surprising to me and took a long time to convince myself this is correct, but I'm pretty sure it is... Once we sink down to the bottom node, we can't re-split out the same node in any way, and the postorder of the current queue is fixed and unchanging. 4) We need to make sure that the "current" SCC or RefSCC actually gets enqueued here such that we re-visit it because we continue processing a new, bottom SCC/RefSCC. - We also need the ability to skip SCCs and RefSCCs that get merged into a larger component. We even need the ability to skip nodes from an SCC that are no longer part of that SCC. This led to the design you see in the patch which uses SetVector-based worklists. The RefSCC worklist is always empty until an update occurs and is just used to handle those RefSCCs created by updates as the others don't even exist yet and are formed on-demand during the bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and we push new SCCs onto it and blacklist existing SCCs on it to get the desired processing. We then directly update these when updating the call graph as I was never able to find a satisfactory abstraction around the update strategy. Finally, we need to compute the updates for function passes. This is mostly used as an initial customer of all the update mechanisms to drive their design to at least cover some real set of use cases. There are a bunch of interesting things that came out of doing this: - It is really nice to do this a function at a time because that function is likely hot in the cache. This means we want even the function pass adaptor to support online updates to the call graph! - To update the call graph after arbitrary function pass mutations is quite hard. We have to build a fairly comprehensive set of data structures and then process them. Fortunately, some of this code is related to the code for building the cal graph in the first place. Unfortunately, very little of it makes any sense to share because the nature of what we're doing is so very different. I've factored out the one part that made sense at least. - We need to transfer these updates into the various structures for the CGSCC pass manager. Once those were more sanely worked out, this became relatively easier. But some of those needs necessitated changes to the LazyCallGraph interface to make it significantly easier to extract the changed SCCs from an update operation. - We also need to update the CGSCC analysis manager as the shape of the graph changes. When an SCC is merged away we need to clear analyses associated with it from the analysis manager which we didn't have support for in the analysis manager infrsatructure. New SCCs are easy! But then we have the case that the original SCC has its shape changed but remains in the call graph. There we need to invalidate the analyses associated with it. - We also need to invalidate analyses after we finish processing an SCC. But the analyses we need to invalidate here are only those for the newly updated SCC!!! Because we only continue processing the bottom SCC, if we split SCCs apart the original one gets invalidated once when its shape changes and is not processed farther so its analyses will be correct. It is the bottom SCC which continues being processed and needs to have the "normal" invalidation done based on the preserved analyses set. All of this is mostly background and context for the changes here. Many thanks to all the reviewers who helped here. Especially Sanjoy who caught several interesting bugs in the graph algorithms, David, Sean, and others who all helped with feedback. Differential Revision: http://reviews.llvm.org/D21464 llvm-svn: 279618	2016-08-24 09:37:14 +00:00
Mehdi Amini	dfa0c53885	Tentatively fix gold-plugin test: ThinLTO objects start at offset 0 now. Annoyingly, incremental builds don't detect these kind of issue. llvm-svn: 279612	2016-08-24 05:50:07 +00:00
Gor Nishanov	241b041fba	[Coroutines] Part 8: Coroutine Frame Building algorithm Summary: This patch adds coroutine frame building algorithm. Now, simple coroutines such as ex0.ll and ex1.ll (first examples from docs\Coroutines.rst can be compiled). Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) ... 7. Split coroutine into subfunctions. (https://reviews.llvm.org/D23461) 8. Coroutine Frame Building algorithm <= we are here 9. Add f.cleanup subfunction. 10+. The rest of the logic Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23586 llvm-svn: 279609	2016-08-24 04:44:35 +00:00
Matthias Braun	733fe3676c	CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses Re-apply this patch, hopefully I will get away without any warnings in the constructor now. This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279602	2016-08-24 01:52:46 +00:00
Matthias Braun	79f85b3b8f	MIRParser/MIRPrinter: Compute isSSA instead of printing/parsing it. Specifying isSSA is an extra line at best and results in invalid MI at worst. Compute the value instead. Differential Revision: http://reviews.llvm.org/D22722 llvm-svn: 279600	2016-08-24 01:32:41 +00:00
Richard Smith	8c3fbdc6c4	Revert r279564. It introduces undefined behavior (binding a reference to a dereferenced null pointer) in MachineModuleInfo::MachineModuleInfo that causes -Werror builds (including several buildbots) to fail. llvm-svn: 279580	2016-08-23 22:08:27 +00:00
Tim Northover	d0cfb7344e	GlobalISel: add some G_TRUNCs to make icmp test valid MIR. llvm-svn: 279579	2016-08-23 22:07:31 +00:00
Petr Hosek	731bb9cf1e	[MC] Support .dc directives in assembler parser While these directives are mostly aliases for the existing integer and float value directives, some of them like .dc.a have no direct equivalents and are sometimes being used for convenience. Differential Revision: https://reviews.llvm.org/D23810 llvm-svn: 279577	2016-08-23 21:34:53 +00:00
Mehdi Amini	adc0e26bef	[ThinLTO] Add caching to the new LTO API Add the ability to plug a cache on the LTO API. I tried to write such that a linker implementation can control the cache backend. This is intrusive and I'm not totally happy with it, but I can't figure out a better design right now. Differential Revision: https://reviews.llvm.org/D23599 llvm-svn: 279576	2016-08-23 21:30:12 +00:00
Tim Northover	4bdf473590	GlobalISel: add forgotten test-case for G_ICMP llvm-svn: 279569	2016-08-23 21:11:36 +00:00
Tim Northover	bdf67c9a00	GlobalISel: make truncate/extend casts uniform They really should have both types represented, but early variants were created before MachineInstrs could have multiple types so they're rather ambiguous. llvm-svn: 279567	2016-08-23 21:01:33 +00:00
Tim Northover	b3a0be4d38	GlobalISel: legalize conditional branches on AArch64. llvm-svn: 279565	2016-08-23 21:01:20 +00:00
Matthias Braun	4c1f1f120c	CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses Re-apply this commit with the deletion of a MachineFunction delegated to a separate pass to avoid use after free when doing this directly in AsmPrinter. This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279564	2016-08-23 20:58:29 +00:00
Matthew Simpson	df2ab917ad	[SLP] Avoid signed integer overflow The test case included with r279125 exposed an existing signed integer overflow. Since getTreeCost can return INT_MAX, we can't sum this cost together with other costs, such as getReductionCost. This patch removes the possibility of assigning a cost of INT_MAX. Since we were previously using INT_MAX as an indicator for "should not vectorize", we now explicitly check this condition with "isTreeTinyAndNotFullyVectorizable" before computing a cost. This patch adds a run-line to the test case used for r279125 that ensures we don't vectorize. Previously, this line would vectorize the test case by chance due to undefined behavior in the cost calculation. Differential Revision: https://reviews.llvm.org/D23723 llvm-svn: 279562	2016-08-23 20:48:50 +00:00
Mehdi Amini	fd49c73f11	[LTO] Fix test following r279550 The output name changed, but it was passing locally using the old output still present in the build dir. llvm-svn: 279556	2016-08-23 19:32:41 +00:00
Tim Northover	456a3c03ac	GlobalISel: mark pointer casts legal on AArch64. llvm-svn: 279553	2016-08-23 19:30:38 +00:00
Mehdi Amini	2a1d15fad7	[ThinLTO] Add a llvm-lto2 test to check that ODR type uniquing is enabled (NFC) This adds a test for r279532, thanks David Li for noticing :) Recommit r279545 after committing first a dependent patch. llvm-svn: 279551	2016-08-23 18:39:15 +00:00
Mehdi Amini	359be8858a	Revert "[ThinLTO] Add a llvm-lto2 test to check that ODR type uniquing is enabled (NFC)" This reverts commit r279545, test is failing, my Output dir was dirty and making the test pass. llvm-svn: 279549	2016-08-23 18:25:59 +00:00
Tim Northover	3c73e367c0	GlobalISel: legalize 1-bit load/store and mark 8/16 bit variants legal on AArch64. llvm-svn: 279548	2016-08-23 18:20:09 +00:00
Mehdi Amini	c3ea5e1afc	[ThinLTO] Add a llvm-lto2 test to check that ODR type uniquing is enabled (NFC) This adds a test for r279532, thanks David Li for noticing :) llvm-svn: 279545	2016-08-23 18:12:55 +00:00
Sanjay Patel	6946e2ade3	[InstSimplify] allow icmp with constant folds for splat vectors, part 2 Completes the m_APInt changes for simplifyICmpWithConstant(). Other commits in this series: https://reviews.llvm.org/rL279492 https://reviews.llvm.org/rL279530 https://reviews.llvm.org/rL279534 https://reviews.llvm.org/rL279538 llvm-svn: 279543	2016-08-23 18:00:51 +00:00
Sanjay Patel	200e3cbfb0	[InstSimplify] allow icmp with constant folds for splat vectors, part 1 llvm-svn: 279538	2016-08-23 17:30:56 +00:00
Sanjay Patel	ada2bb3d5d	[InstSimplify] add tests to show missing vector icmp folds llvm-svn: 279534	2016-08-23 17:13:38 +00:00
Sanjay Patel	5c269d0b7a	[InstSimplify] move icmp with constant tests to another file; NFC ...because like the corresponding code, this is just too big to keep adding to. And the next step is to add a vector version of each of these tests to show missed folds. Also, auto-generate CHECK lines and add comments for the tests that correspond to the source code. llvm-svn: 279530	2016-08-23 16:46:53 +00:00
Simon Pilgrim	95580d6ed2	[X86][SSE] Demonstrate inability to recognise that (v)cvtpd2dq & (v)cvttpd2dq intrinsics implicitly zeroes the upper half of the xmm llvm-svn: 279527	2016-08-23 16:11:21 +00:00
Krzysztof Parzyszek	38e2ccc8d0	[Hexagon] Packetize return value setup with the return instruction Commit r279241 unintentionally reverted that ability. llvm-svn: 279526	2016-08-23 16:01:01 +00:00
Simon Pilgrim	04b99fcded	[X86][AVX] Updated fptosi_2f64_to_4i32 test to show missed opportunity to implicit zero the upper elements llvm-svn: 279521	2016-08-23 15:10:39 +00:00
Simon Pilgrim	22c415a696	[X86][AVX] Add v2i32 fp to int conversion tests llvm-svn: 279520	2016-08-23 15:00:52 +00:00
Simon Pilgrim	cb96142fb8	[X86][AVX] Add AVX2/AVX512 fp to int conversion tests llvm-svn: 279518	2016-08-23 14:37:35 +00:00
Simon Pilgrim	2ed547513d	[X86][SSE] Demonstrate inability to recognise that (v)cvtpd2ps intrinsics implicitly zeroes the upper half of the xmm llvm-svn: 279511	2016-08-23 11:26:28 +00:00
Simon Pilgrim	9eb978b47b	[X86][SSE] Demonstrate inability to recognise that (v)cvtpd2ps implicitly zeroes the upper half of the xmm llvm-svn: 279508	2016-08-23 10:35:24 +00:00
Oliver Stannard	9aa6f010a4	[ARM] Generate consistent frame records for Thumb2 There is not an official documented ABI for frame pointers in Thumb2, but we should try to emit something which is useful. We use r7 as the frame pointer for Thumb code, which currently means that if a function needs to save a high register (r8-r11), it will get pushed to the stack between the frame pointer (r7) and link register (r14). This means that while a stack unwinder can follow the chain of frame pointers up the stack, it cannot know the offset to lr, so does not know which functions correspond to the stack frames. To fix this, we need to push the callee-saved registers in two batches, with the first push saving the low registers, fp and lr, and the second push saving the high registers. This is already implemented, but previously only used for iOS. This patch turns it on for all Thumb2 targets when frame pointers are required by the ABI, and the frame pointer is r7 (Windows uses r11, so this isn't a problem there). If frame pointer elimination is enabled we still emit a single push/pop even if we need a frame pointer for other reasons, to avoid increasing code size. We must also ensure that lr is pushed to the stack when using a frame pointer, so that we end up with a complete frame record. Situations that could cause this were rare, because we already push lr in most situations so that we can return using the pop instruction. Differential Revision: https://reviews.llvm.org/D23516 llvm-svn: 279506	2016-08-23 09:19:22 +00:00
Matthias Braun	7f66202d38	Revert "(HEAD -> master, origin/master, origin/HEAD) CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses" Reverting while tracking down a use after free. This reverts commit r279502. llvm-svn: 279503	2016-08-23 05:17:11 +00:00
Matthias Braun	fd936841eb	CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279502	2016-08-23 03:20:09 +00:00
Matt Arsenault	567631bdd4	BranchRelaxation: Fix handling of blocks with multiple conditional branches Looping over all terminators exposed AArch64 tests hitting an assert from analyzeBranch failing. I believe these cases were miscompiled before. e.g. fcmp s0, s1 b.ne LBB0_1 b.vc LBB0_2 b LBB0_2 LBB0_1: ; Large block LBB0_2: ; ... Both of the individual conditional branches need to be expanded, since neither can reach the final block. Split the original block into ones which analyzeBranch will be able to understand. llvm-svn: 279499	2016-08-23 01:30:30 +00:00
Sanjay Patel	a392049419	[InstCombine] use m_APInt to allow icmp (shr exact X, Y), 0 folds for splat constant vectors llvm-svn: 279472	2016-08-22 20:45:06 +00:00
Matt Arsenault	78fc9daf8d	AMDGPU: Split SILowerControlFlow into two pieces Do most of the lowering in a pre-RA pass. Keep the skip jump insertion late, plus a few other things that require more work to move out. One concern I have is now there may be COPY instructions which do not have the necessary implicit exec uses if they will be lowered to v_mov_b32. This has a positive effect on SGPR usage in shader-db. llvm-svn: 279464	2016-08-22 19:33:16 +00:00
James Molloy	5bf2114265	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd [Recommitting now an unrelated assertion in SROA is sorted out] The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279460	2016-08-22 19:07:15 +00:00
Jun Bum Lim	ec8b8cc595	[InstCombine] Allow sinking from unique predecessor with multiple edges Summary: We can allow sinking if the single user block has only one unique predecessor, regardless of the number of edges. Note that a switch statement with multiple cases can have the same destination. Reviewers: mcrosier, majnemer, spatel, reames Subscribers: reames, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23722 llvm-svn: 279448	2016-08-22 18:21:56 +00:00
James Molloy	475f4a763f	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r279443. It caused buildbot failures. llvm-svn: 279447	2016-08-22 18:13:12 +00:00
James Molloy	353052698a	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279443	2016-08-22 17:40:23 +00:00
Simon Pilgrim	c8ad5c069c	[X86][AVX] Don't use SubVectorBroadcast if there are additional users of the chain (PR29088) We could improve on this by making X86SubVBroadcast a full memory intrinsic similar to X86vzload llvm-svn: 279441	2016-08-22 16:47:55 +00:00
Simon Atanasyan	eb9ed61021	[mips][ias] Support .dtprel[d]word and .tprel[d]word directives Assembler directives .dtprelword, .dtpreldword, .tprelword, and .tpreldword generates relocations R_MIPS_TLS_DTPREL32, R_MIPS_TLS_DTPREL64, R_MIPS_TLS_TPREL32, and R_MIPS_TLS_TPREL64 respectively. The main motivation for this patch is to be able to write test cases for checking correctness of the LLD linker's behaviour. Differential Revision: https://reviews.llvm.org/D23669 llvm-svn: 279439	2016-08-22 16:18:42 +00:00
Artur Pilipenko	a1d9a67496	Remove missing file from r279433 reversal llvm-svn: 279434	2016-08-22 13:18:19 +00:00
Artur Pilipenko	bc76ecada0	Revert -r278267 [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers This change cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See https://reviews.llvm.org/D18777 for details. llvm-svn: 279433	2016-08-22 13:14:07 +00:00
Artur Pilipenko	b78ad9d41f	Revert -r278269 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative This change needs to be reverted in order to revert -r278267 which cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See comments on https://reviews.llvm.org/D18777 for details. llvm-svn: 279432	2016-08-22 13:12:07 +00:00
Balaram Makam	a927aa4ad0	[PM] Port LoopDataPrefetch AArch64 tests to new pass manager Reviewers: mcrosier, tejohnson Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23724 llvm-svn: 279431	2016-08-22 12:59:58 +00:00
Simon Pilgrim	2279e59573	[X86][SSE] Avoid specifying unused arguments in SHUFPD lowering As discussed on PR26491, we are missing the opportunity to make use of the smaller MOVHLPS instruction because we set both arguments of a SHUFPD when using it to lower a single input shuffle. This patch sets the lowered argument to UNDEF if that shuffle element is undefined. This in turn makes it easier for target shuffle combining to decode UNDEF shuffle elements, allowing combines to MOVHLPS to occur. A fix to match against MOVHPD stores was necessary as well. This builds on the improved MOVLHPS/MOVHLPS lowering and memory folding support added in D16956 Adding similar support for SHUFPS will have to wait until have better support for target combining of binary shuffles. Differential Revision: https://reviews.llvm.org/D23027 llvm-svn: 279430	2016-08-22 12:56:54 +00:00
Hrvoje Varga	f0ed16eae5	[mips][microMIPS] Implement BLTZC, BLEZC, BGEZC and BGTZC instructions, fix disassembly and add operand checking to existing B<cond>C implementations Differential Revision: https://reviews.llvm.org/D22667 llvm-svn: 279429	2016-08-22 12:17:59 +00:00
Simon Pilgrim	8738786bc1	[ThinLTO][X86] Fix windows build Windows 'rm' complains about non-existent files if a wildcard is used. Be more explicit about the files deleted to avoid this. llvm-svn: 279426	2016-08-22 10:49:37 +00:00
Mehdi Amini	35edfc5f13	Add REQUIRES:X86 to test/tools/llvm-lto2/common.ll llvm-svn: 279418	2016-08-22 06:37:41 +00:00
Mehdi Amini	dc4c8cf9ac	[LTO] Handles commons in monolithic LTO The gold-plugin was doing this internally, now the API is handling commons correctly based on the given resolution. Differential Revision: https://reviews.llvm.org/D23739 llvm-svn: 279417	2016-08-22 06:25:46 +00:00
Simon Pilgrim	89e375a95e	[CostModel][X86] Removed shift tests There are more thorough tests found in vshift-*-cost.ll llvm-svn: 279406	2016-08-21 19:56:02 +00:00
Simon Pilgrim	6ad12ec629	[CostModel][X86] Added costs for vXi16 and vXi8 vectors for add/sub/mul/and/or/xor tests llvm-svn: 279405	2016-08-21 19:44:44 +00:00
Simon Pilgrim	b0a0576ffc	[CostModel][X86] Replaced SSSE3 with SSE2 costs to create a better baseline llvm-svn: 279404	2016-08-21 19:14:48 +00:00
Simon Pilgrim	07d7a21ea1	[CostModel][X86] Added fsqrt and fma costs llvm-svn: 279403	2016-08-21 19:06:25 +00:00
Simon Pilgrim	3cd61a084f	[CostModel][X86] Split off float arithmetic cost tests llvm-svn: 279402	2016-08-21 18:34:47 +00:00
Sanjay Patel	643d21a62c	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 4 This concludes the fixes for icmp+shl in this series: https://reviews.llvm.org/rL279339 https://reviews.llvm.org/rL279398 https://reviews.llvm.org/rL279399 llvm-svn: 279401	2016-08-21 17:10:07 +00:00
Sanjay Patel	163a5ab799	remove FIXME comment; fixed by previous commit llvm-svn: 279400	2016-08-21 16:40:42 +00:00
Sanjay Patel	7ffcde7422	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 3 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279399	2016-08-21 16:35:34 +00:00
Sanjay Patel	7e09f13fed	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 2 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279398	2016-08-21 16:28:22 +00:00
Guy Blank	9ae797a798	[AVX512][FastISel] Do not use K registers in TEST instructions In some cases, FastIsel was emitting TEST instruction with K reg input, which is illegal. Changed to using KORTEST when dealing with K regs. Differential Revision: https://reviews.llvm.org/D23163 llvm-svn: 279393	2016-08-21 08:02:27 +00:00
Duncan P. N. Exon Smith	8f44c98d04	ARM: Avoid dereferencing end() in ARMFrameLowering::emitEpilogue This fixes the crash from PR29072, where the MachineBasicBlock::iterator wasn't being properly checked against MachineBasicBlock::end() before iterating. This was another bug exposed by the new ilist::iterator::operator*() assertion from r279314. This testcase is poor quality. bugpoint couldn't reduce any further, and I haven't had time to dig into what's going on so I can't invent a better one. I didn't even get good CHECK lines in: this is just a crasher. I'm committing anyway since this is a real crash with an obvious fix, but I'll leave PR29072 open and ask an ARM maintainer to help improve the testcase. llvm-svn: 279391	2016-08-21 00:08:10 +00:00
Simon Pilgrim	636422a898	[X86][SSE] Regenerate 32-bit buildvector test llvm-svn: 279389	2016-08-20 23:09:57 +00:00
Simon Pilgrim	ead5076753	[X86][SSE] Regenerate subvector extraction widening test llvm-svn: 279388	2016-08-20 22:00:53 +00:00
Simon Pilgrim	b65d476549	[X86] Regenerate fp truncate tests llvm-svn: 279387	2016-08-20 21:56:33 +00:00
Simon Pilgrim	8275583be6	Regenerate test llvm-svn: 279386	2016-08-20 21:37:30 +00:00
Simon Pilgrim	a1142579f1	Regenerate test llvm-svn: 279385	2016-08-20 21:35:45 +00:00
Simon Pilgrim	ae9b81e684	[X86][XOP] Tweak vpermil2pd test to stop it being combined away llvm-svn: 279384	2016-08-20 21:07:41 +00:00
Vitaly Buka	1f9e135023	[asan] Minimize code size by using __asan_set_shadow_* for large blocks Summary: We can insert function call instead of multiple store operation. Current default is blocks larger than 64 bytes. Changes are hidden behind -asan-experimental-poisoning flag. PR27453 Differential Revision: https://reviews.llvm.org/D23711 llvm-svn: 279383	2016-08-20 20:23:50 +00:00
Vitaly Buka	3455b9b8bc	[asan] Initialize __asan_set_shadow_* callbacks Summary: Callbacks are not being used yet. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23634 llvm-svn: 279380	2016-08-20 18:34:39 +00:00
Vitaly Buka	186280daa5	[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones Summary: Reduce store size to avoid leading and trailing zeros. Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23648 llvm-svn: 279379	2016-08-20 18:34:36 +00:00
Vitaly Buka	5b4f12176c	[asan] Cleanup instrumentation of dynamic allocas Summary: Extract instrumenting dynamic allocas into separate method. Rename asan-instrument-allocas -> asan-instrument-dynamic-allocas Differential Revision: https://reviews.llvm.org/D23707 llvm-svn: 279376	2016-08-20 17:22:27 +00:00
Simon Pilgrim	cb0ba1067f	[X86][SSE] Added vector interleave test (PR21281) llvm-svn: 279375	2016-08-20 17:07:38 +00:00
Matthew Simpson	235e479984	Reapply "[SLP] Initialize VectorizedValue when gathering" The test case included in r279125 exposed existing undefined behavior in the SLP vectorizer that it did not introduce. This patch reapplies the original patch, but modifies the test case to avoid hitting the undefined behavior. This allows us to close PR28330 while keeping the UBSan bot happy. The undefined behavior the original test uncovered will be addressed in a follow-on patch. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 llvm-svn: 279370	2016-08-20 14:49:02 +00:00
Vitaly Buka	cc7db13bf0	Revert "[SLP] Initialize VectorizedValue when gathering" to fix ubsan bot. This reverts commit r279125. https://reviews.llvm.org/D23410 llvm-svn: 279363	2016-08-20 07:09:39 +00:00
Xinliang David Li	3bf2d58a21	[Profile] add test with large counts llvm-svn: 279361	2016-08-20 05:28:42 +00:00
Chandler Carruth	8abdf75d6b	[PM] Introduce an abstraction for all the analyses over a particular IR unit for use in the PreservedAnalyses set. This doesn't have any important functional change yet but it cleans things up and makes the analysis substantially more efficient by avoiding querying through the type erasure for every analysis. I also think it makes it much easier to reason about how analyses are preserved when walking across pass managers and across IR unit abstractions. Thanks to Sean and Mehdi both for the comments and suggestions. Differential Revision: https://reviews.llvm.org/D23691 llvm-svn: 279360	2016-08-20 04:57:28 +00:00
Teresa Johnson	765941a841	[gold/ThinLTO] Restore ThinLTO file management in gold plugin Summary: The gold-plugin changes added along with the new LTO API in r278338 had the effect of removing the management of the PluginInputFile that ensured the files weren't released back to gold until the backend threads were complete. Add back the old file handling. Fixes PR29020. Reviewers: mehdi_amini Subscribers: mehdi_amini, llvm-commits, hjl.tools Differential Revision: https://reviews.llvm.org/D23721 llvm-svn: 279356	2016-08-20 01:24:07 +00:00
Teresa Johnson	1f76caf3dd	[gold] Fix new gold test to specify emulation mode Add emulation mode option for new test added in r279023. llvm-svn: 279355	2016-08-20 01:22:10 +00:00
Mehdi Amini	458f805468	[LTO] Add the ability to test -thinlto-emit-imports-files through llvm-lto2 Summary: Start bringing llvm-lto2 to a level where we can test the LTO API a bit deeper. Reviewers: tejohnson Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23681 llvm-svn: 279349	2016-08-19 23:54:40 +00:00
Tim Northover	a11be04769	GlobalISel: support legalization of G_FCONSTANTs llvm-svn: 279341	2016-08-19 22:40:08 +00:00
Tim Northover	ea904f9424	GlobalISel: teach legalizer how to handle integer constants. llvm-svn: 279340	2016-08-19 22:40:00 +00:00
Sanjay Patel	fa7de606c4	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 1 This is a partial enablement (move the ConstantInt guard down) because there are many different folds here and one of the later ones will require reworking 'isSignBitCheck'. llvm-svn: 279339	2016-08-19 22:33:26 +00:00
Matthias Braun	a7d6fc9618	MachineFunction: Cleanup/simplify MachineFunctionProperties::print() - Always compile print() regardless of LLVM_ENABLE_DUMP. (We usually only gard dump() functions with that). - Only show the set properties to reduce output clutter. - Remove the unused variant that even shows the unset properties. - Fix comments llvm-svn: 279338	2016-08-19 22:31:45 +00:00
Tim Northover	b78e4cafde	GlobalISel: translate floating-point round/extend llvm-svn: 279320	2016-08-19 20:48:23 +00:00
Tim Northover	d5c23bcfc9	GlobalISel: translate floating-point comparisons llvm-svn: 279319	2016-08-19 20:48:16 +00:00
Justin Lebar	d13880a750	[NVPTX] Switch nvptx-use-infer-addrspace to true. Summary: This switches us to use a different, more powerful algorithm for address space inference. I've tested this locally and it seems to work great. Once we're more confident in it, we can remove the old pass altogether. Reviewers: jingyue Subscribers: llvm-commits, tra, jholewinski Differential Revision: https://reviews.llvm.org/D23694 llvm-svn: 279317	2016-08-19 20:46:45 +00:00
Reid Kleckner	98a48afa5d	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r279229. It breaks intrinsic function calls in diamonds. llvm-svn: 279313	2016-08-19 20:22:39 +00:00
Tim Northover	b16734fbaa	GlobalISel: translate floating-point constants llvm-svn: 279311	2016-08-19 20:09:15 +00:00
Tim Northover	d3761cd165	GlobalISel: translate float/int conversion instructions. llvm-svn: 279310	2016-08-19 20:09:11 +00:00
Tim Northover	5a28c3642f	GlobalISel: support translating select instructions. llvm-svn: 279309	2016-08-19 20:09:07 +00:00
Tim Northover	bbbfb1cfb8	GlobalISel: translate insertvalue instructions. This adds a G_INSERT instruction, which technically makes G_SEQUENCE redundant (it's equivalent to a G_INSERT into an IMPLICIT_DEF). We'll leave G_SEQUENCE for now though: it's likely to be far more common as it's a fundamental part of legalization, so avoiding the mess and bloat of the extra IMPLICIT_DEFs is probably worthwhile. llvm-svn: 279306	2016-08-19 20:08:55 +00:00
Krzysztof Parzyszek	021151d6c1	[Hexagon] Add RUN line to test llvm-svn: 279304	2016-08-19 19:36:35 +00:00
Krzysztof Parzyszek	505eb498bd	[Hexagon] Allow i1 values for 'r' constraint in inline-asm llvm-svn: 279302	2016-08-19 19:17:28 +00:00
Simon Pilgrim	054e7d2ec1	[CostModel][X86] Added sub, or, and, fadd and fsub costs and missing 512-bit mul costs llvm-svn: 279301	2016-08-19 19:07:10 +00:00
Tim Northover	26b76f2c59	GlobalISel: improve representation of G_SEQUENCE and G_EXTRACT First, make sure all types involved are represented, rather than being implicit from the register width. Second, canonicalize all types to scalar. These operations just act in bits and don't care about vectors. Also standardize spelling of Indices in the MachineIRBuilder (NFC here). llvm-svn: 279294	2016-08-19 18:32:14 +00:00
Simon Pilgrim	fbfa3ee4f6	[CostModel][X86] Added some AVX512 and 512-bit vector cost tests llvm-svn: 279291	2016-08-19 18:24:10 +00:00
Kyle Butt	ce0196de3f	Revert "CodeGen: If Convert blocks that would form a diamond when tail-merged." This reverts commit 0fda93481c4231c06b838ef476c0c404c51ff875. llvm-svn: 279288	2016-08-19 18:17:04 +00:00
Tim Northover	2fa5fa391f	GlobalISel: allow extractvalue to extract an aggregate. llvm-svn: 279287	2016-08-19 18:09:41 +00:00
Krzysztof Parzyszek	3d9946eb23	[Hexagon] Fixes for new-value jump formation - Recognize C2_cmpgtui, S2_tstbit_i, and S4_ntstbit_i. - Avoid creating new-value instructions with both source operands equal. llvm-svn: 279286	2016-08-19 17:54:49 +00:00
Tim Northover	6f80b08c64	GlobalISel: support translation of extractvalue instructions. llvm-svn: 279285	2016-08-19 17:47:05 +00:00
Simon Pilgrim	e309d2d0c3	[CostModel][X86] Add fdiv + frem cost tests llvm-svn: 279283	2016-08-19 17:39:00 +00:00
Tim Northover	91c8173093	GlobalISel: support overflow arithmetic intrinsics. Unsigned addition and subtraction can reuse the instructions created to legalize large width operations (i.e. both produce and consume a carry flag). Signed operations and multiplies get a dedicated op-with-overflow instruction. Once this is produced the two values are combined into a struct register (which will almost always be merged with a corresponding G_EXTRACT as part of legalization). llvm-svn: 279278	2016-08-19 17:17:06 +00:00
Vitaly Buka	170dede75d	Revert "[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones" This reverts commit r279178. Speculative revert in hope to fix asan crash on arm. llvm-svn: 279277	2016-08-19 17:15:38 +00:00
Vitaly Buka	c8f4d69c82	Revert "[asan] Fix size of shadow incorrectly calculated in r279178" This reverts commit r279222. Speculative revert in hope to fix asan crash on arm. llvm-svn: 279276	2016-08-19 17:15:33 +00:00
Lang Hames	6e9f0309e9	[RuntimeDyld] Revert r279182 and 279201 -- they broke some ARM bots. llvm-svn: 279275	2016-08-19 17:06:39 +00:00
Michael Kuperstein	41898f0396	[AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large. Repeated inserts into AliasSetTracker have quadratic behavior - inserting a pointer into AST is linear, since it requires walking over all "may" alias sets and running an alias check vs. every pointer in the set. We can avoid this by tracking the total number of pointers in "may" sets, and when that number exceeds a threshold, declare the tracker "saturated". This lumps all pointers into a single "may" set that aliases every other pointer. (This is a stop-gap solution until we migrate to MemorySSA) This fixes PR28832. Differential Revision: https://reviews.llvm.org/D23432 llvm-svn: 279274	2016-08-19 17:05:22 +00:00
Simon Pilgrim	d7a3782ae4	[X86][SSE] Generalised combining to VZEXT_MOVL to any vector size This doesn't change tests codegen as we already combined to blend+zero which is what we lower VZEXT_MOVL to on SSE41+ targets, but it does put us in a better position when we improve shuffling for optsize. llvm-svn: 279273	2016-08-19 17:02:00 +00:00
Krzysztof Parzyszek	639545b4d8	[Hexagon] Enforce LLSC packetization rules Ensure that load locked and store conditional instructions are only packetized with ALU32 instructions. Patch by Ben Craig. llvm-svn: 279272	2016-08-19 16:57:05 +00:00
Reid Kleckner	a871d3872a	Fix regression in InstCombine introduced by r278944 The intended transform is: // Simplify icmp eq (or (ptrtoint P), (ptrtoint Q)), 0 // -> and (icmp eq P, null), (icmp eq Q, null). P and Q are both pointer types, but may have different types. We need two calls to getNullValue() to make the icmps. llvm-svn: 279271	2016-08-19 16:53:18 +00:00
David Majnemer	5554edabef	[CloneFunction] Don't remove unrelated nodes from the CGSSC CGSCC use a WeakVH to track call sites. RAUW a call within a function can result in that WeakVH getting confused about whether or not the call site is still around. llvm-svn: 279268	2016-08-19 16:37:40 +00:00
Krzysztof Parzyszek	9335bf0ec5	[Hexagon] Fix incorrect generation of S4_subi_asl_ri Patch by Jyotsna Verma. llvm-svn: 279267	2016-08-19 16:35:05 +00:00
Sanjay Patel	a867afe094	[InstCombine] use m_APInt to allow icmp (shl 1, Y), C folds for splat constant vectors llvm-svn: 279266	2016-08-19 16:12:16 +00:00
Sanjay Patel	57b12d3876	[InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectors Of course, we really need to refactor and fix all of the cmp predicates, but this one is interesting because without it, we later perform an information-losing transform of icmp (shl 1, Y), C, and we can't recover the better fold. llvm-svn: 279263	2016-08-19 15:40:44 +00:00
Sanjay Patel	78111a7617	[InstCombine] add tests for missing vector icmp folds llvm-svn: 279259	2016-08-19 15:27:28 +00:00
Sanjay Patel	14cdf1968f	[InstCombine] add missing tests for basic icmp folds These are implicitly included as part of larger test cases, but they don't exist stand-alone (and don't happen for vectors...). llvm-svn: 279257	2016-08-19 15:21:45 +00:00
Krzysztof Parzyszek	7d200668e4	Unxfail passing tests on Hexagon llvm-svn: 279252	2016-08-19 15:07:58 +00:00
Krzysztof Parzyszek	0ba9754584	[Hexagon] Allow tail-call optimization when mixing C and fast calling conv Patch by Arnold Schwaighofer. llvm-svn: 279251	2016-08-19 15:02:18 +00:00
Krzysztof Parzyszek	66dd6797e8	[Hexagon] Check for empty live interval Patch by Brendon Cahoon. llvm-svn: 279249	2016-08-19 14:29:43 +00:00
Anton Korobeynikov	b38195c1a8	Revert r279242 - it's failing the tests llvm-svn: 279247	2016-08-19 14:18:34 +00:00
Anton Korobeynikov	2aae31a945	Fix PR27500: on MSP430 the branch destination offset is measured in words, not bytes. In addition, the branch instructions will have proper BB destinations, not offsets, like before. Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D20162 llvm-svn: 279242	2016-08-19 14:07:10 +00:00
Krzysztof Parzyszek	6421b934ec	[Hexagon] Mark PS_jumpret as pseudo-instruction, expand it into J2_jumpr llvm-svn: 279241	2016-08-19 14:04:45 +00:00
Krzysztof Parzyszek	bd8ef4b8ce	[Hexagon] Improvements to handling and generation of FP instructions Improved handling of fma, floating point min/max, additional load/store instructions for floating point types. Patch by Jyotsna Verma. llvm-svn: 279239	2016-08-19 13:34:31 +00:00
Simon Pilgrim	f1b8fdc074	[X86][SSE] Add support for matching commuted insertps patterns INSERTPS doesn't fit well with our shuffle mask canonicalization, so we need to attempt both the original mask and the commuted mask to more likely get a match llvm-svn: 279230	2016-08-19 10:31:53 +00:00
James Molloy	11a1936b70	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. llvm-svn: 279229	2016-08-19 10:10:27 +00:00
James Molloy	7ee640f9b6	[CodeGen] Fix a trivial type conversion bug dating back to pre-2008 The heuristic above this code is incredibly suspect, but disregarding that it mutates the cast opcode so we need to check the mutated opcode later to see if we need to emit an AssertSext or AssertZext node. Fixes PR29041. llvm-svn: 279223	2016-08-19 08:38:50 +00:00
Vitaly Buka	b81960a6c8	[asan] Fix size of shadow incorrectly calculated in r279178 Summary: r279178 generates 8 times more stores than necessary. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23708 llvm-svn: 279222	2016-08-19 08:33:53 +00:00
NAKAMURA Takumi	a535636759	Fix tests in llvm/test/tools/gold/X86 to satisfy r279014. They would unexpectedly pass if test/tools/gold/X86/Output had outputs of previous tests. llvm-svn: 279214	2016-08-19 06:44:44 +00:00
Dean Michael Berris	1dd1ca9727	[XRay] Synthesize a reference to the xray_instr_map Without the synthesized reference to a symbol in the xray_instr_map, linker section garbage collection will helpfully remove the whole xray_instr_map section from the final executable (or archive). This will cause the runtime to not be able to identify the sleds and hot-patch the calls/jumps into the runtime trampolines. This change adds a reference from the text section at the end of the function to keep around the associated xray_instr_map section as well. We also make sure that we catch this reference in the test. Reviewers: chandlerc, echristo, majnemer, mehdi_amini Subscribers: mehdi_amini, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D23398 llvm-svn: 279204	2016-08-19 04:44:30 +00:00
Lang Hames	e2ca3b65fc	[RuntimeDyld][MCJIT] Un-XFAIL some tests that were fixed by r279182. llvm-svn: 279201	2016-08-19 03:12:16 +00:00
Matthias Braun	fdc4c6b426	Revert "RegScavenging: Add scavengeRegisterBackwards()" The ppc64 multistage bot fails on this. This reverts commit r279124. Also Revert "CodeGen: Add/Factor out LiveRegUnits class; NFCI" because it depends on the previous change This reverts commit r279171. llvm-svn: 279199	2016-08-19 03:03:24 +00:00
Vitaly Buka	aa654292bd	[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones Summary: Reduce store size to avoid leading and trailing zeros. Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23648 llvm-svn: 279178	2016-08-18 23:51:15 +00:00
Kyle Butt	780b517d6b	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. Regression on self-hosting bots with no obvious explanation. Tidied up range handling to be more obviously correct, but there was no smoking gun. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 279168	2016-08-18 22:09:27 +00:00
Hemant Kulkarni	e77a0a9a3b	llvm-objdump: Add Hexagon printer changes for -S/-l options Differential Revision: https://reviews.llvm.org/D23521 llvm-svn: 279161	2016-08-18 21:50:13 +00:00
Zhan Jun Liau	cf2f4b3251	[SystemZ] Use valid base/index regs for inline asm Summary: Inline asm memory constraints can have the base or index register be assigned to %r0 right now. Make sure that we assign only ADDR64 registers to the base and index. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23367 llvm-svn: 279157	2016-08-18 21:44:15 +00:00
Tom Stellard	a1619cd9aa	AMDGPU/SI: Fix a test in wqm.ll to always use s_cbranch_vcc* Summary: We need to use floating-point compares to ensure that s_cbranch_vcc* instructions are always generated. With integer compares, future optimizations could cause s_cbranch_scc* to be generated instead. Reviewers: arsenm, nhaehnle Subscribers: llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23401 llvm-svn: 279148	2016-08-18 21:21:53 +00:00
Amaury Sechet	763c59dc9a	Make cltz and cttz zero undef when the operand cannot be zero in InstCombine Summary: Also add popcount(n) == bitsize(n) -> n == -1 transformation. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23134 llvm-svn: 279141	2016-08-18 20:43:50 +00:00
Sanjay Patel	40e8ca46ad	[InstCombine] use m_APInt to allow icmp (trunc X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 https://reviews.llvm.org/rL279101 llvm-svn: 279133	2016-08-18 20:28:54 +00:00
Wei Ding	52bb661dec	AMDGPU : Fix QSAD and MQSAD instructions' incorrect data type. Differential Revision: http://reviews.llvm.org/D23689 llvm-svn: 279126	2016-08-18 19:51:14 +00:00
Matthew Simpson	11db6b6b8c	[SLP] Initialize VectorizedValue when gathering We abort building vectorizable trees in some cases (e.g., if the maximum recursion depth is reached, if the region size is too large, etc.). If this happens for a reduction, we can be left with a root entry that needs to be gathered. For these cases, we need make sure we actually set VectorizedValue to the resulting vector. This patch ensures we properly set VectorizedValue, and it also ensures the insertelement sequence generated for the gathers is inserted at the correct location. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 Differential Revison: https://reviews.llvm.org/D23410 llvm-svn: 279125	2016-08-18 19:50:32 +00:00
Matthias Braun	075d0c23d5	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044 with off-by-1 instruction fix for the reload placement. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 279124	2016-08-18 19:47:59 +00:00
Simon Pilgrim	99fd9c5f56	[X86][SSE] Missed insertps shuffle patterns llvm-svn: 279111	2016-08-18 18:19:28 +00:00
Vitaly Buka	0596387ad3	[asan] Extend test Summary: PR27453 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23647 llvm-svn: 279109	2016-08-18 18:17:19 +00:00
Valery Pykhtin	609c2f8137	[AMDGPU] add s_incperflevel/s_decperflevel intrinsics. Differential revision: https://reviews.llvm.org/D23666 llvm-svn: 279106	2016-08-18 18:06:20 +00:00
Elliot Colp	687691aeac	Fix SystemZ compilation abort caused by negative AND mask Normally, when an AND with a constant is lowered to NILL, the constant value is truncated to 16 bits. However, since r274066, ANDs whose results are used in a shift are caught by a different pattern that does not truncate. The instruction printer expects a 16-bit unsigned immediate operand for NILL, so this results in an abort. This patch adds code to manually truncate the constant in this situation. The rest of the bits are then set, so we will detect a case for NILL "naturally" rather than using peephole optimizations. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 279105	2016-08-18 18:04:26 +00:00
Duncan P. N. Exon Smith	84c2da47f9	AArch64: Don't call getIterator() on iterators Remove an unnecessary round-trip: iterator => operator->() => getIterator() In some cases, the iterator is end(), so the dereference of operator-> is invalid (UB). The testcase only crashes with r278974 (currently reverted to investigate this), which adds an assertion for invalid dereferences of ilist nodes. Fixes PR29035. llvm-svn: 279104	2016-08-18 17:58:09 +00:00
Sanjay Patel	fa5ca2bf46	[InstCombine] use m_APInt to allow icmp (udiv X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 llvm-svn: 279101	2016-08-18 17:55:59 +00:00
Dan Gohman	c9623db884	[WebAssembly] Disable the store-results optimization. The WebAssemly spec removing the return value from store instructions, so remove the associated optimization from LLVM. This patch leaves the store instruction operands in place for now, so stores now always write to "$drop"; these will be removed in a seperate patch. llvm-svn: 279100	2016-08-18 17:51:27 +00:00
Zachary Turner	ac5763eca4	Resubmit "Write the TPI stream from a PDB to Yaml." The original patch was breaking some buildbots due to an incorrect ordering of function definitions which caused some compilers to recognize a definition but others to not. llvm-svn: 279089	2016-08-18 16:49:29 +00:00
Saleem Abdulrasool	c6bf547564	llvm-objdump: add coff import library symbol listing support This adds behaviour similar to binutils' objdump which can show symbols in an import library. Differences from that stem around the fact that we do not create section symbols nor the all import import descriptor symbol reference. However, this does mean that the tool can serve as a possible replacement for the existing tool. llvm-svn: 279088	2016-08-18 16:39:19 +00:00
Artur Pilipenko	615b820af6	CVP. Turn marking adds as no wrap (introduced by r278107) off by default It causes a regression on our internal benchmark. Introduce cvp-dont-process flag and set it off by default while investigating the regression. llvm-svn: 279082	2016-08-18 16:08:35 +00:00
Ahmed Bougacha	33e19fe1c4	[AArch64][GlobalISel] Select floating-point binary ops. There is no FREM instruction, but the others are straightforward. llvm-svn: 279081	2016-08-18 16:05:11 +00:00
Ahmed Bougacha	71d033a17f	[GlobalISel] Add floating-point binary ops. llvm-svn: 279080	2016-08-18 16:05:06 +00:00
Sanjay Patel	6347807f87	[InstCombine] use m_APInt to allow icmp (mul X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 llvm-svn: 279077	2016-08-18 15:44:44 +00:00
Derek Schuff	ccdceda128	[WebAssembly] Refactor WebAssemblyLowerEmscriptenException pass for setjmp/longjmp This patch changes the code structure of WebAssemblyLowerEmscriptenException pass to support both exception handling and setjmp/longjmp. It also changes the name of the pass and the source file. 1. Change the file/pass name to WebAssemblyLowerEmscriptenExceptions -> WebAssemblyLowerEmscriptenEHSjLj to make it clear that it supports both EH and SjLj 2. List function / global variable names at the top so they can be changed easily 3. Some cosmetic changes Patch by Heejin Ahn Differential Revision: https://reviews.llvm.org/D23588 llvm-svn: 279075	2016-08-18 15:27:25 +00:00
Ahmed Bougacha	1d0560b14d	[AArch64][GlobalISel] Select G_SDIV/G_UDIV. There is no REM instruction; that will require an expansion. It's not obvious that should be done in select, rather than as a (custom?) legalization. llvm-svn: 279074	2016-08-18 15:17:13 +00:00
Ahmed Bougacha	13db94540c	[GlobalISel] Add support for DIV/REM. llvm-svn: 279073	2016-08-18 15:17:01 +00:00
Saleem Abdulrasool	3780b3a9eb	llvm-readobj: handle import libraries with -coff-exports `link -dump -exports` lists exported symbols from import libraries as well as normal dlls. Ensure that we can handle import libraries as well in llvm-readobj. llvm-svn: 279069	2016-08-18 14:32:11 +00:00
Krzysztof Parzyszek	b1b0372337	[Hexagon] Create vcombine in HexagonCopyToCombine llvm-svn: 279067	2016-08-18 14:12:34 +00:00
Sanjay Patel	4c5e60d95c	[InstCombine] use m_APInt to allow icmp (xor X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 llvm-svn: 279066	2016-08-18 14:10:48 +00:00
Simon Pilgrim	ab7c46eccf	[X86][SSE] Add SSE1 tests to make sure we don't merge loads on illegal types llvm-svn: 279065	2016-08-18 13:41:26 +00:00
Simon Dardis	ea3431598e	[mips] Correct tail call encoding for MIPSR6 r277708 enabled tails calls for MIPS but used the 'jr' instruction when the jump target was held in a register. For MIPSR6, 'jalr $zero, $reg' should have been used. Additionally, add missing patterns for external and global symbols for tail calls. Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23301 llvm-svn: 279064	2016-08-18 13:22:43 +00:00
Chad Rosier	83f6bbc154	[Reassociate] Add test for PR28367. llvm-svn: 279063	2016-08-18 13:22:37 +00:00
Matthias Braun	c9130ea6a3	Testcase for r279022 llvm-svn: 279031	2016-08-18 02:21:54 +00:00
Kostya Serebryany	524c3f32e7	[sanitizer-coverage/libFuzzer] instrument comparisons with __sanitizer_cov_trace_cmp[1248] instead of __sanitizer_cov_trace_cmp, don't pass the comparison type to save a bit performance. Use these new callbacks in libFuzzer llvm-svn: 279027	2016-08-18 01:25:28 +00:00
Teresa Johnson	f1844daac9	Fix bot failure due to new test I had updated the output file name but not the corresponding nm based check before submitting as r279023. This should fix the bot failures llvm-svn: 279025	2016-08-18 01:18:15 +00:00
Teresa Johnson	f2b5ec6ef4	[ThinLTO] Keep common symbols in ThinLTO modules Summary: Skip the merging of common symbols for ThinLTO modules, they will be merged by the final native object link. Trying to merge the symbols and add to a combined module will incorrectly enable the common symbol to be internalized in the ThinLTO module. Additionally, we will not want to create a combined module for ThinLTO distributed builds. This fixes failures in 7 cpu2006 benchmarks from the new LTO API in ThinLTO mode. Reviewers: mehdi_amini Subscribers: pcc, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23637 llvm-svn: 279023	2016-08-18 01:08:50 +00:00
Mehdi Amini	8ac7b32207	[LTO] Promote before performing weak resolution Summary: This was reversed compared to ThinLTOCodeGenerator for some reason, and lead to an increased code-size on my tests. I figured that the weak resolution may internalize a linkonce function, which will be promoted immediately (and renamed), before being internalized again. Reviewers: tejohnson Subscribers: pcc, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23632 llvm-svn: 279021	2016-08-18 00:59:24 +00:00
Mehdi Amini	eccffada33	[LTO] Change addSaveTemps API: do not add dot to the supplied prefix path Summary: It does not play well with directories (end up with a bunch of hidden files). Also, do not strip the 0 suffix for the first task, especially since 0 can be used by ThinLTO as well now. Reviewers: tejohnson Subscribers: mehdi_amini, pcc, llvm-commits Differential Revision: https://reviews.llvm.org/D23612 llvm-svn: 279014	2016-08-18 00:12:33 +00:00
Dominic Chen	a8a638292c	[WebAssembly] Handle debug information and virtual registers without crashing (reland r278967) Summary: Currently, enabling debug information when compiling for WebAssembly crashes the backend. This commit fixes these by skipping debug values in backend passes. Reviewers: jfb, aprantl, dschuff, echristo Subscribers: llvm-commits, dschuff, jfb, MatzeB, dexonsmith, yurydelendik, mehdi_amini Differential Revision: https://reviews.llvm.org/D23635 llvm-svn: 279011	2016-08-17 23:42:27 +00:00
Hans Wennborg	3879035e66	SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932) Differential Revision: https://reviews.llvm.org/D23594 llvm-svn: 278999	2016-08-17 22:50:18 +00:00
Sanjay Patel	3c92db7560	[InstCombine] add test for missing vector icmp fold Also, add a scalar test to demonstrate one of the intermediate folds that is necessary to accomplish the existing, multi-step test. And simplify the vector tests to only check the final piece of that multi-step transform. llvm-svn: 278995	2016-08-17 22:18:57 +00:00
Chris Bieneman	432ba9d89a	[macho2yaml] Don't write empty linkedit data Since I stopped writing empty export tries it causes LinkEdit to potentially be completely empty which results in invalid yaml being generated. To prevent this we skip linkedit data if it is empty. llvm-svn: 278985	2016-08-17 21:46:04 +00:00
Duncan P. N. Exon Smith	afdd8e541b	Revert "[WebAssembly] Handle debug information and virtual registers without crashing" This reverts commit r278967, since the new test is failing when you don't build the WebAssembly target (most people, since it's off-by-default). llvm-svn: 278973	2016-08-17 20:41:50 +00:00
Tim Northover	de3aea0412	GlobalISel: support irtranslation of icmp instructions. llvm-svn: 278969	2016-08-17 20:25:25 +00:00
Dominic Chen	4326167a37	[WebAssembly] Handle debug information and virtual registers without crashing Summary: Currently, enabling debug information when compiling for WebAssembly crashes the backend. This commit fixes these by skipping debug values in backend passes. Reviewers: jfb, aprantl, dschuff, echristo Subscribers: mehdi_amini, yurydelendik, dexonsmith, MatzeB, jfb, dschuff, llvm-commits Differential Revision: https://reviews.llvm.org/D21808 llvm-svn: 278967	2016-08-17 20:11:03 +00:00
Sanjay Patel	84ff18ba92	[InstCombine] minimize tests and autogenerate checks llvm-svn: 278960	2016-08-17 19:56:10 +00:00
Marina Yatsina	53ce3f9d02	Fix for PR29010 This is a fix for https://llvm.org/bugs/show_bug.cgi?id=29010 Root cause of the bug is that the register class of the machine instruction operand does not fully reflect if this registers that can be allocated. Both for i386 and x86_64 the operand's register class is VR128RegClass and thus contains xmm0-xmm15, though in i386 we can only use xmm0-xmm8. In order to get the actual allocable registers of the class we need to use RegisterClassInfo. Differential Revision: https://reviews.llvm.org/D23613 llvm-svn: 278954	2016-08-17 19:07:40 +00:00
Adrian Prantl	ccd546e953	Move tests to the appropriate subdirectory. llvm-svn: 278948	2016-08-17 16:55:56 +00:00
Sanjay Patel	63e14a07e8	[InstCombine] use m_APInt to allow icmp (or X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 llvm-svn: 278945	2016-08-17 16:38:57 +00:00
Sanjay Patel	f636d762ed	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278943	2016-08-17 16:23:15 +00:00
Adrian Prantl	c19dee734f	Support the DW_AT_noreturn DWARF flag. This is used to mark functions with the C++11 [[ noreturn ]] or C11 _Noreturn attributes. Patch by Victor Leschuk! https://reviews.llvm.org/D23167 llvm-svn: 278940	2016-08-17 16:02:43 +00:00
Chad Rosier	ea7e4647db	Revert "Reassociate: Reprocess RedoInsts after each inst". This reverts commit r258830, which introduced a bug described in PR28367. PR28367 llvm-svn: 278938	2016-08-17 15:54:39 +00:00
Sanjay Patel	4f7eb2aa95	[InstCombine] use m_APInt to allow icmp (add X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 llvm-svn: 278935	2016-08-17 15:24:30 +00:00
Simon Dardis	ac96ec7906	[mips] Add l.[sd] and s.[sd] instruction aliases Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23121 llvm-svn: 278930	2016-08-17 14:45:09 +00:00
Chad Rosier	a6822f64f3	Revert "[Reassociate] Avoid iterator invalidation when negating value." This reverts commit r278928 due to lit test failures. llvm-svn: 278929	2016-08-17 14:31:34 +00:00
Chad Rosier	cf3e8121a6	[Reassociate] Avoid iterator invalidation when negating value. Differential Revision: https://reviews.llvm.org/D23464 PR28367 llvm-svn: 278928	2016-08-17 14:16:45 +00:00
Jonas Paulsson	7a79422536	[LoopStrenghtReduce] Refactoring and addition of a new target cost function. Refactored so that a LSRUse owns its fixups, as oppsed to letting the LSRInstance own them. This makes it easier to rate formulas for LSRUses, since the fixups are available directly. The Offsets vector has been removed since it was no longer necessary. New target hook isFoldableMemAccessOffset(), which is used during formula rating. For SystemZ, this is useful to express that loads and stores with float or vector types with a big/negative offset should be avoided in loops. Without this, LSR will generate a lot of negative offsets that would require extra instructions for loading the address. Updated tests: test/CodeGen/SystemZ/loop-01.ll Reviewed by: Quentin Colombet and Ulrich Weigand. https://reviews.llvm.org/D19152 llvm-svn: 278927	2016-08-17 13:24:19 +00:00
Sam Kolton	c05d7784a6	[AMDGPU] llvm-objdump: Skip amd_kernel_code_t only at the begining of kernel symbol. Summary: This change fix bug in AMDGPU disassembly. Previously, presence of symbols other than kernel symbols caused objdump to skip begining of those symbols. Reviewers: tstellarAMD, vpykhtin, Bigcheese, ruiu Subscribers: kzhuravl, arsenm Differential Revision: http://reviews.llvm.org/D21966 llvm-svn: 278921	2016-08-17 10:17:57 +00:00
Ayman Musa	71b43c5c1d	Fix bug in DAGBuilder for getelementptr with expanded vector. Replacing the usage of MVT with EVT in case the vector type is expanded. Differential Revision: https://reviews.llvm.org/D23306 llvm-svn: 278913	2016-08-17 07:52:15 +00:00
Chuang-Yu Cheng	f7ba716bcb	[ppc64] Don't apply sibling call optimization if callee has any byval arg This is a quick work around, because in some cases, e.g. caller's stack size > callee's stack size, we are still able to apply sibling call optimization even callee has any byval arg. This patch fix: https://llvm.org/bugs/show_bug.cgi?id=28328 Reviewers: hfinkel kbarton nemanjai amehsan Subscribers: hans, tjablin https://reviews.llvm.org/D23441 llvm-svn: 278900	2016-08-17 03:17:44 +00:00
Chandler Carruth	67fc52f067	[PM] Port the always inliner to the new pass manager in a much more minimal and boring form than the old pass manager's version. This pass does the very minimal amount of work necessary to inline functions declared as always-inline. It doesn't support a wide array of things that the legacy pass manager did support, but is alse ... about 20 lines of code. So it has that going for it. Notably things this doesn't support: - Array alloca merging - To support the above, bottom-up inlining with careful history tracking and call graph updates - DCE of the functions that become dead after this inlining. - Inlining through call instructions with the always_inline attribute. Instead, it focuses on inlining functions with that attribute. The first I've omitted because I'm hoping to just turn it off for the primary pass manager. If that doesn't pan out, I can add it here but it will be reasonably expensive to do so. The second should really be handled by running global-dce after the inliner. I don't want to re-implement the non-trivial logic necessary to do comdat-correct DCE of functions. This means the -O0 pipeline will have to be at least 'always-inline,global-dce', but that seems reasonable to me. If others are seriously worried about this I'd like to hear about it and understand why. Again, this is all solveable by factoring that logic into a utility and calling it here, but I'd like to wait to do that until there is a clear reason why the existing pass-based factoring won't work. The final point is a serious one. I can fairly easily add support for this, but it seems both costly and a confusing construct for the use case of the always inliner running at -O0. This attribute can of course still impact the normal inliner easily (although I find that a questionable re-use of the same attribute). I've started a discussion to sort out what semantics we want here and based on that can figure out if it makes sense ta have this complexity at O0 or not. One other advantage of this design is that it should be quite a bit faster due to checking for whether the function is a viable candidate for inlining exactly once per function instead of doing it for each call site. Anyways, hopefully a reasonable starting point for this pass. Differential Revision: https://reviews.llvm.org/D23299 llvm-svn: 278896	2016-08-17 02:56:20 +00:00
Justin Bogner	39eec466a2	Revert "Write the TPI stream from a PDB to Yaml." This is hitting a "use of undeclared identifier 'skipPadding' error locally and on some bots. This reverts r278869. llvm-svn: 278871	2016-08-16 23:37:10 +00:00
Zachary Turner	8321ba5437	Write the TPI stream from a PDB to Yaml. Reviewed By: ruiu, rnk Differential Revision: https://reviews.llvm.org/D23226 llvm-svn: 278869	2016-08-16 23:28:54 +00:00
Sanjay Patel	7ad324b396	[InstCombine] add tests for fold with no coverage and missing vector fold llvm-svn: 278867	2016-08-16 23:18:42 +00:00
Kyle Butt	07d61425e3	Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough. If AnalyzeBranch can't analyze a block and it is possible to fallthrough, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough. Submitted wit a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging. llvm-svn: 278866	2016-08-16 22:56:14 +00:00
Sanjay Patel	e47df1ac62	[InstCombine] use m_APInt to allow icmp (sub X, Y), C folds for splat constant vectors llvm-svn: 278859	2016-08-16 21:53:19 +00:00
Sanjay Patel	904cd39b05	[x86] Allow merging multiple instances of an immediate within a basic block for code size savings, for 64-bit constants. This patch handles 64-bit constants which can be encoded as 32-bit immediates. It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants. Patch by Sunita Marathe! Differential Revision: https://reviews.llvm.org/D23391 llvm-svn: 278857	2016-08-16 21:35:16 +00:00
Reid Kleckner	b99b709068	Revert "Enhance SCEV to compute the trip count for some loops with unknown stride." This reverts commit r278731. It caused http://crbug.com/638314 llvm-svn: 278853	2016-08-16 21:02:04 +00:00
Haicheng Wu	9780df5385	[BranchFolding] Change a test case of r278575. Rename the operands to make the test less brittle. llvm-svn: 278841	2016-08-16 20:06:25 +00:00
Sjoerd Meijer	15c81b05ea	[MBP] do not reorder and move up loop latch block Do not reorder and move up a loop latch block before a loop header when optimising for size because this will generate an extra unconditional branch. Differential Revision: https://reviews.llvm.org/D22521 llvm-svn: 278840	2016-08-16 19:50:33 +00:00
David Majnemer	00940fb854	Make MDNode::intersect faster than O(n * m) It is pretty easy to get it down to O(nlogn + mlogm). This implementation has the added benefit of automatically deduplicating entries between the two sets. llvm-svn: 278837	2016-08-16 18:48:37 +00:00
David Majnemer	fa0f1e660b	Don't passively concatenate MDNodes I have audited all the callers of concatenate and none require duplicate entries to service concatenation. These duplicates serve no purpose but to needlessly embiggen the IR. N.B. Layering getMostGenericAliasScope on top of concatenate makes it O(nlogn + mlogm) instead of O(n*m). llvm-svn: 278836	2016-08-16 18:48:34 +00:00
Gor Nishanov	74309fa014	[Coroutines] Part 7: Split coroutine into subfunctions Summary: This patch adds simple coroutine splitting logic to CoroSplit pass. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) ... 7. Split coroutine into subfunctions <= we are here 8. Coroutine Frame Building algorithm 9. Handle coroutine with unwinds 10+. The rest of the logic Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23461 llvm-svn: 278830	2016-08-16 18:04:14 +00:00
Simon Dardis	4893aff94e	[mips] Enforce compact branch restrictions Check both operands for use of the $zero register which cannot be used with a compact branch instruction. Reviewers: dsanders, vkalintris Differential Review: https://reviews.llvm.org/D23547 llvm-svn: 278824	2016-08-16 17:16:11 +00:00
Wolfgang Pieb	8df58f48dd	When the inline spiller rematerializes an instruction, take the debug location from the instruction that immediately follows the rematerialization point. Patch by Andrea DiBiagio. Differential Revision: http://reviews.llvm.org/D23539 llvm-svn: 278822	2016-08-16 17:12:50 +00:00
Wei Mi	db68c9adbd	Remove a stale comment from the test, NFC. llvm-svn: 278821	2016-08-16 16:57:15 +00:00
Vitaly Buka	1ce73ef11c	[Asan] Unpoison red zones even if use-after-scope was disabled with runtime flag Summary: PR27453 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23481 llvm-svn: 278818	2016-08-16 16:24:10 +00:00
Ahmed Bougacha	e4c03abddd	[AArch64][GlobalISel] Select G_MUL. llvm-svn: 278810	2016-08-16 14:37:46 +00:00
Brendon Cahoon	65b6ebccad	[Pipeliner] Fix an asssert due to invalid Phi in the epilog The pipeliner was generating an invalid Phi name for an operand in the epilog block, which caused an assert in the live variable analysis pass. The fix is to the code that generates new Phis in the epilog block. In this case, there is an existing Phi that needs to be reused rather than creating a new Phi instruction. Differential Revision: https://reviews.llvm.org/D23513 llvm-svn: 278805	2016-08-16 14:29:24 +00:00
Ahmed Bougacha	2ac5bf94bc	[AArch64][GlobalISel] Select (variable) shifts. For now, no support for immediates. llvm-svn: 278804	2016-08-16 14:02:47 +00:00
Ahmed Bougacha	7e508a8fcd	[AArch64][GlobalISel] Robustize select tests. NFC. Using the same register means nothing was checking for operand order. llvm-svn: 278803	2016-08-16 14:02:44 +00:00
Ahmed Bougacha	0306b5ef07	[AArch64][GlobalISel] Select p0 G_FRAME_INDEX. And mark it as legal. llvm-svn: 278802	2016-08-16 14:02:42 +00:00
Simon Pilgrim	25d2506029	[X86][AVX] Fixed typo in zero element insertion llvm-svn: 278798	2016-08-16 13:33:33 +00:00
Ron Lieberman	a481c7db93	[Hexagon] Improve test to check for @PCREL, only run llc, not opt -> llc. llvm-svn: 278796	2016-08-16 13:10:09 +00:00
Simon Pilgrim	cc316f013a	[X86][SSE] Add support for combining v2f64 target shuffles to VZEXT_MOVL byte rotations The combine was only matching v2i64 as it assumed lowering to MOVQ - but we have v2f64 patterns that match in a similar fashion llvm-svn: 278794	2016-08-16 12:52:06 +00:00
Simon Pilgrim	d2d3202532	[X86][AVX512BW] Updated tests to demonstrate AVX512BW's inability to vectorize v64i8 shifts llvm-svn: 278790	2016-08-16 11:05:47 +00:00
Prakhar Bahuguna	a27c4a0e66	Correct the upper bound for a CBZ/CBNZ branch target. Summary: Fix for the upper bound check that was causing a build failure. Reviewers: olista01, rengolin, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23501 llvm-svn: 278789	2016-08-16 10:41:56 +00:00
Prakhar Bahuguna	15ed7ec5aa	[Thumb] Validate branch target for CBZ/CBNZ instructions. Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D23312 llvm-svn: 278788	2016-08-16 10:41:52 +00:00
Simon Pilgrim	f16cd361d4	[X86][SSE] Add support for combining target shuffles to PALIGNR byte rotations llvm-svn: 278787	2016-08-16 10:03:23 +00:00
Guy Blank	722caebdae	[X86] Add xgetbv/xsetbv intrinsics to non-windows platforms Differential Revision: https://reviews.llvm.org/D21958 llvm-svn: 278782	2016-08-16 06:41:00 +00:00
David Majnemer	5c5df6283a	[InstSimplify] Fold gep (gep V, C), (xor V, -1) to C-1 llvm-svn: 278779	2016-08-16 06:13:46 +00:00
Sanjay Patel	46a68ba618	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278768	2016-08-16 00:48:38 +00:00
Sanjay Patel	f1bf21c56b	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278765	2016-08-16 00:27:12 +00:00
Teresa Johnson	c44a12244f	[ThinLTO] Fix temp file dumping, enable via llvm-lto and test it Summary: Fixed a bug in ThinLTOCodeGenerator's temp file dumping. The Twine needs to be passed directly as an argument, or a copy saved into a std::string. It doesn't seem there are any consumers of this, so I added a new option to llvm-lto to enable saving of temp files during ThinLTO, and augmented a test to use it to check post-import but pre-opt bitcode. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23525 llvm-svn: 278761	2016-08-15 23:24:57 +00:00
Reid Kleckner	a7b04a589e	Don't use %llc_dwarf with -mtriple, they don't combine llvm-svn: 278758	2016-08-15 22:54:26 +00:00
Sanjay Patel	df77a4dbb0	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278757	2016-08-15 22:43:52 +00:00
Wolfgang Pieb	12cd6ddef3	Adding the triple for test comitted with r278703. llvm-svn: 278755	2016-08-15 22:39:39 +00:00
Mike Aizatsky	5086417e9f	[sancov] extracting AArch64 test to a separate file. llvm-svn: 278754	2016-08-15 22:30:37 +00:00
Sanjay Patel	41520e1712	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278751	2016-08-15 21:47:50 +00:00
Eli Friedman	98151d6440	Fix typo in lowering for fp128 ueq. Regression from r259791. Differential Revision: https://reviews.llvm.org/D23374 llvm-svn: 278750	2016-08-15 21:46:19 +00:00
Jan Vesely	0486f739a4	AMDGPU/R600: Convert buffer id to VTX_READ input Use patterns instead of multiple instructions Add buffer id to asm string https://reviews.llvm.org/D22650 llvm-svn: 278749	2016-08-15 21:38:30 +00:00
Hemant Kulkarni	533aa25e1c	Really fix the issue with 502957cc9cf805dc6093950e8cdcd0db4969d933. Windows %p and FileCheck limitations makes the test linux only llvm-svn: 278748	2016-08-15 21:38:23 +00:00
Sanjay Patel	638b613101	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278747	2016-08-15 21:37:24 +00:00
Tim Northover	28fdc4272d	GlobalISel: support loads and stores of strange types. Before we mischaracterized structs and i1 types as a scalar with size 0 in various ways. llvm-svn: 278744	2016-08-15 21:13:17 +00:00
Teresa Johnson	ad71543972	Remove unnecessary flag from new test Remove -disable-inlining flag that snuck into the test I added for r278739. It doesn't have an effect in ThinLTO mode (something that should be fixed), but in any case the checks depend on inlining currently. llvm-svn: 278743	2016-08-15 21:07:57 +00:00
Sanjay Patel	55d87a88cc	update tests to use FileCheck and exact checking llvm-svn: 278741	2016-08-15 21:02:25 +00:00
Sanjoy Das	78db2963f6	Revert "[ValueTracking] Improve ValueTracking on left shift with nsw flag" This reverts commit r278172. It causes PR28946. llvm-svn: 278740	2016-08-15 21:01:31 +00:00
Teresa Johnson	6107a4195d	[ThinLTO] Remove functions resolved to available_externally from comdats Summary: thinLTOResolveWeakForLinkerModule needs to drop any preempted weak symbols that were converted to available_externally from comdats, otherwise we will get a verification failure (since available_externally is a declaration for the linker, and no declarations can be in a comdat). Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23015 llvm-svn: 278739	2016-08-15 21:00:04 +00:00
Sanjay Patel	3e9acec2fa	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278737	2016-08-15 20:56:11 +00:00
Hemant Kulkarni	5b140cdd17	Fix a test that failed due to: https://llvm.org/svn/llvm-project/llvm/trunk@278725 91177308-0d34-0410-b5e6-96231b3b80d8 llvm-svn: 278732	2016-08-15 20:36:16 +00:00
David L Kreitzer	7fe18251a5	Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 llvm-svn: 278731	2016-08-15 20:21:41 +00:00
Sanjay Patel	b37bd6d7b7	[InstCombine] add test for missing vector icmp fold llvm-svn: 278727	2016-08-15 20:02:40 +00:00
Sanjay Patel	7d98be81cc	[InstCombine] add tests for vector icmp folds llvm-svn: 278726	2016-08-15 19:58:21 +00:00
Hemant Kulkarni	8dfc0b5541	llvm-objdump: Implement source[line numbers] interleaving Differential Revsion: https://reviews.llvm.org/D22932 llvm-svn: 278725	2016-08-15 19:49:24 +00:00
Sanjay Patel	b860859611	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278717	2016-08-15 19:16:33 +00:00
Sanjay Patel	6866b82a05	update test to use FileCheck and autogenerated checks llvm-svn: 278714	2016-08-15 18:56:10 +00:00
Reid Kleckner	bb8652312a	Fix WAsm test after LSR change in r278658 Now the increment is done in a different location llvm-svn: 278713	2016-08-15 18:51:42 +00:00
Matthias Braun	b948c52416	Revert "[Thumb] Validate branch target for CBZ/CBNZ instructions." This currently breaks the greendragon clang-stage1-configure-RA/ and brotli. It is probably just uncovering a pre-existing problem. Reverting temporarily to get the buildbots green again. A reduced testcase will follow shortly. This reverts commit r278659. llvm-svn: 278711	2016-08-15 18:50:13 +00:00
Sanjay Patel	2044a8eba9	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278709	2016-08-15 18:45:10 +00:00
Sanjay Patel	aaf34d1bfc	[InstCombine] add test for missing vector icmp fold llvm-svn: 278708	2016-08-15 18:39:54 +00:00
Sanjay Patel	195eb9340a	minimize test llvm-svn: 278707	2016-08-15 18:35:44 +00:00
Sanjay Patel	3f506daf8c	remove unnecessary IR comments about uses llvm-svn: 278705	2016-08-15 18:32:50 +00:00
Sanjay Patel	d391b0d69e	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278704	2016-08-15 18:26:56 +00:00
Wolfgang Pieb	dfad9b20c9	Local variables whose address is taken and passed on to a call are described in debug info using their stack slots instead of as an indirection of param reg + 0 offset. This is done by detecting FrameIndexSDNodes in SelectionDAG and generating FrameIndexDbgValues for them. This ultimately generates DBG_VALUEs with stack location operands. Differential Revision: http://reviews.llvm.org/D23283 llvm-svn: 278703	2016-08-15 18:18:26 +00:00
Sanjay Patel	cbd62a082c	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278689	2016-08-15 17:55:39 +00:00
Sanjay Patel	566b348987	[InstCombine] auto-generate exact checks Note that several of these tests belong in InstSimplify rather than InstCombine because they return existing operands or constants. llvm-svn: 278684	2016-08-15 17:19:07 +00:00
Sanjay Patel	a7b9bb3785	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278683	2016-08-15 17:10:35 +00:00
Matt Arsenault	3661e90e71	AMDGPU: Don't fold subregister extracts into tied operands llvm-svn: 278676	2016-08-15 16:18:36 +00:00
Reid Kleckner	70a600b8bb	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r278660. It causes downstream assertion failure in InstCombine on shuffle instructions. Comes up in __mm_swizzle_epi32. llvm-svn: 278672	2016-08-15 15:42:31 +00:00
Valery Pykhtin	c761675ef4	[AMDGPU] fix failure on printing of non-existing instruction operands. Differential revision: https://reviews.llvm.org/D23323 llvm-svn: 278665	2016-08-15 10:56:48 +00:00
James Molloy	9a3c82f5cf	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. llvm-svn: 278660	2016-08-15 08:04:56 +00:00
Prakhar Bahuguna	a305a435a6	[Thumb] Validate branch target for CBZ/CBNZ instructions. Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D23312 llvm-svn: 278659	2016-08-15 07:57:44 +00:00
James Molloy	196ad0823e	[LSR] Don't try and create post-inc expressions on non-rotated loops If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs! Motivating testcase: void f(float a, float b, float c, int n) { while (n-- > 0) c++ = a++ + b++; } It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage. llvm-svn: 278658	2016-08-15 07:53:03 +00:00
Sanjay Patel	52fe9ae990	[InstCombine] add test for missing vector icmp fold llvm-svn: 278639	2016-08-14 22:56:46 +00:00
Sanjay Patel	7e57b00274	[InstCombine] add tests for vector icmp folds llvm-svn: 278637	2016-08-14 22:44:10 +00:00
Sanjay Patel	8554f70c07	[InstCombine] add test for potentially missing vector icmp fold llvm-svn: 278636	2016-08-14 22:30:07 +00:00
Sanjay Patel	beebe05af1	[InstCombine] add test for missing vector icmp fold llvm-svn: 278635	2016-08-14 22:29:27 +00:00
Sanjay Patel	ba1f9fbddc	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278634	2016-08-14 22:28:50 +00:00
Sanjay Patel	f6559404d5	[InstCombine] remove unnecessary function attributes from tests llvm-svn: 278633	2016-08-14 21:48:21 +00:00
Sanjay Patel	b44ca3bfa9	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278632	2016-08-14 21:36:22 +00:00
Sanjay Patel	bbb3dffd0a	[InstCombine] add test for missing vector icmp fold llvm-svn: 278631	2016-08-14 21:05:08 +00:00
Sanjay Patel	66a3457a4c	[InstCombine] add test for missing vector icmp fold llvm-svn: 278630	2016-08-14 20:39:42 +00:00
Igor Breger	505f2cc468	[AVX512] Fix VFPCLASSSD/VFPCLASSSS intrinsic lowering. The i1 result should be zero extended according to SPEC. Differential Revision: http://reviews.llvm.org/D23489 llvm-svn: 278626	2016-08-14 13:58:57 +00:00
Igor Breger	6fc00b0acf	autogenerate checks llvm-svn: 278624	2016-08-14 09:34:39 +00:00
Igor Breger	8672408db0	[AVX512] Fix insertelement i1 lowering. 1. Use shuffle to insert element i1 into vector. The previous implementation was incorrect ( dest_bit OR src_bit , it doesn't clear the bit if src_bit=0 ) 2. Improve shuffle i1 vector, use CVT2MASK if supported instead TRUNCATE. Differential Revision: http://reviews.llvm.org/D23347 llvm-svn: 278623	2016-08-14 05:25:07 +00:00
Diana Picus	68be1eb885	Revert "CodeGen: If Convert blocks that would form a diamond when tail-merged." This reverts commit r278287. This commit broke the clang-cmake-thumbv7-a15-full-sh bot. See https://llvm.org/bugs/show_bug.cgi?id=28949 llvm-svn: 278621	2016-08-14 02:10:18 +00:00
Diana Picus	35ccf53e75	Revert "Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough." This reverts commit r278288. r278287 broke the clang-cmake-thumbv7-a15-full-sh bot. Revert this so we can get to r278287. llvm-svn: 278620	2016-08-14 02:10:12 +00:00
Sanjoy Das	2143447c73	[IRCE] Create llvm::Loop instances for cloned out loops llvm-svn: 278618	2016-08-14 01:04:46 +00:00
Sanjoy Das	7a18a238c6	[IRCE] Don't iterate on loops that were cloned out IRCE has the ability to further version pre-loops and post-loops that it created, but this isn't useful at all. This change teaches IRCE to leave behind some metadata in the loops it creates (by cloning the main loop) so that these new loops are not re-processed by IRCE. Today this bug is hidden by another bug -- IRCE does not update LoopInfo properly so the loop pass manager does not re-invoke IRCE on the loops it split out. However, once the latter is fixed the bug addressed in this change causes IRCE to infinite-loop in some cases (e.g. it splits out a pre-loop, a pre-pre-loop from that, a pre-pre-pre-loop from that and so on). llvm-svn: 278617	2016-08-14 01:04:36 +00:00
Mehdi Amini	a71002e7f1	Fix bitcode auto-upgrade when using bitcode lazy loading The auto-upgrade path could be called before the VST (global names) was fully parsed, and thus intrinsic names were not available and the autoupgrade logic could not operate. Fix link failures with ThinLTO. This is a recommit of r278610 with a different fix. llvm-svn: 278615	2016-08-14 00:01:27 +00:00
Ron Lieberman	822ee88ab8	Fix unsupported relocation type R_HEX_6_X' for symbol .rodata LowerTargetConstantPool is not properly setting the TargetFlag to indicate desired relocation. Coding error, the offset parameter was omitted, so the TargetFlag was used as the offset, and the TargetFlag defaulted to zero. This only affects -fpic compilation, and only those items created in a Constant Pool, for example a vector of constants. Halide ran into this issue. llvm-svn: 278614	2016-08-13 23:41:11 +00:00
Mehdi Amini	466a64e298	Revert "Fix bitcode auto-upgrade when using bitcode lazy loading" This reverts commit r278610. Tests are broken llvm-svn: 278613	2016-08-13 23:39:14 +00:00
Sanjoy Das	1b1272f515	[IRCE] Fix test case; NFC The (negative) test case is supposed to check that IRCE does not muck with range checks it cannot handle, not that it does the right thing in the absence of profiling information. llvm-svn: 278612	2016-08-13 23:36:40 +00:00
Sanjoy Das	2a2f14d7ab	[IRCE] Be resilient in the face of non-simplified loops Loops containing `indirectbr` may not be in simplified form, even after running LoopSimplify. Reject then gracefully, instead of tripping an assert. llvm-svn: 278611	2016-08-13 23:36:35 +00:00
Mehdi Amini	e62aaf2303	Fix bitcode auto-upgrade when using bitcode lazy loading The auto-upgrade path could be called before the VST (global names) was fully parsed, and thus intrinsic names were not available and the autoupgrade logic could not operate. Fix link failures with ThinLTO. llvm-svn: 278610	2016-08-13 23:31:53 +00:00
Mehdi Amini	8c629ecf3a	Revert "Revert "Invariant start/end intrinsics overloaded for address space"" This reverts commit 32fc6488e48eafc0ca1bac1bd9cbf0008224d530. llvm-svn: 278609	2016-08-13 23:31:24 +00:00
Mehdi Amini	164ac651da	Revert "Invariant start/end intrinsics overloaded for address space" This reverts commit r276447. llvm-svn: 278608	2016-08-13 23:27:32 +00:00
Mehdi Amini	7079efe1ce	Add missing REQUIRES in sancov/print_coverage_pcs.test: it requires aarch64 as well now llvm-svn: 278601	2016-08-13 19:44:02 +00:00
Sanjay Patel	08c876673e	[x86] add tests to show missed 64-bit immediate merging Tests are slightly modified versions of those written by Sunita Marathe in D23391. llvm-svn: 278599	2016-08-13 18:42:14 +00:00
Craig Topper	3f8126e6fa	[AVX-512] Remove an AddedComplexity that was prioritizing basic vzmovl patterns over more complex ones that produce better code. llvm-svn: 278593	2016-08-13 05:43:20 +00:00
Craig Topper	600685d510	[AVX-512] Add patterns to support VZEXT_MOVL from 512-bit vectors with 64-bit and 32-bit elements. Fixes PR28961. llvm-svn: 278592	2016-08-13 05:33:12 +00:00
Teresa Johnson	1eca6bc6a7	[PM] Port LoopDataPrefetch to new pass manager Summary: Refactor the existing support into a LoopDataPrefetch implementation class and a LoopDataPrefetchLegacyPass class that invokes it. Add a new LoopDataPrefetchPass for the new pass manager that utilizes the LoopDataPrefetch implementation class. Reviewers: mehdi_amini Subscribers: sanjoy, mzolotukhin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23483 llvm-svn: 278591	2016-08-13 04:11:27 +00:00
Matt Arsenault	3cc1e0066d	AMDGPU: Fix missing test for addressing mode with odd offsets Add test if the constant offset looks unaligned. llvm-svn: 278589	2016-08-13 01:43:51 +00:00
Sanjoy Das	3502511548	[IndVars] Ignore (s\|z)exts that don't extend the induction variable `IVVisitor::visitCast` used to have the invariant that if the instruction it was passed was a sext or zext instruction, the result of the instruction would be wider than the induction variable. This is no longer true after rL275037, so this change teaches `IndVarSimplify` s implementation of `IVVisitor::visitCast` to work with the relaxed invariant. A corresponding change to SimplifyIndVar to preserve the said invariant after rL275037 would also work, but given how `IVVisitor::visitCast` is spelled (no indication of said invariant), I figured the current fix is cleaner. Fixes PR28935. llvm-svn: 278584	2016-08-13 00:58:31 +00:00
Dominic Chen	4a9b99ee92	[WebAssembly] Re-enable disabled debug value test Summary: This test was resulting in asan/valgrind failures due to undefined DWARF register mappings for WebAssembly, and was disabled in r278495. These have been resolved. Reviewers: sunfish, dschuff Subscribers: bkramer, llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D23459 llvm-svn: 278576	2016-08-12 23:14:18 +00:00
Haicheng Wu	7c4535d1e7	Reapply [BranchFolding] Restrict tail merging loop blocks after MBP Fixed a bug in the test case. To fix PR28104, this patch restricts tail merging to blocks that belong to the same loop after MBP. llvm-svn: 278575	2016-08-12 23:13:38 +00:00
Tim Shen	c9c0d2dcb5	[LoopVectorize] Detect loops in the innermost loop before creating InnerLoopVectorizer InnerLoopVectorizer shouldn't handle a loop with cycles inside the loop body, even if that cycle isn't a natural loop. Fixes PR28541. Differential Revision: https://reviews.llvm.org/D22952 llvm-svn: 278573	2016-08-12 22:47:13 +00:00
Reid Kleckner	6ee00a2602	[Inliner] Don't treat inalloca allocas as static They aren't static, and moving them to the entry block across something else will only result in tears. Root cause of http://crbug.com/636558. llvm-svn: 278571	2016-08-12 22:23:04 +00:00
Artem Belevich	2f0a3dfe64	[NVPTX] Use untyped (.b) integer registers in PTX. This bring LLVM-generated PTX closer to what nvcc generates and avoids triggering issues in ptxas. For instance, ptxas does not accept .s16 (or .u16) registers as operands for .fp16 instructions. Differential Revision: https://reviews.llvm.org/D23460 llvm-svn: 278568	2016-08-12 22:02:19 +00:00
Eli Friedman	f184e4befc	[AArch64LoadStoreOptimizer] Check aliasing correctly when creating paired loads/stores. The existing code accidentally skipped the aliasing check in edge cases. Differential revision: https://reviews.llvm.org/D23372 llvm-svn: 278562	2016-08-12 20:39:51 +00:00
Mike Aizatsky	f4fdb5ddf3	[AArch64] Registering default MCInstrAnalysis Even in this form it is useful: it can detect branch instructions. https://github.com/google/sanitizers/issues/706 Subscribers: aemerson, rengolin Differential Revision: https://reviews.llvm.org/D23426 llvm-svn: 278560	2016-08-12 20:28:05 +00:00
Eli Friedman	8585e9d33d	[AArch64LoadStoreOpt] Handle offsets correctly for post-indexed paired loads. Trunk would try to create something like "stp x9, x8, [x0], #512", which isn't actually a valid instruction. Differential revision: https://reviews.llvm.org/D23368 llvm-svn: 278559	2016-08-12 20:28:02 +00:00
Kevin Enderby	c614d283b7	Next set of additional error checks for invalid Mach-O files. This contains the two missing checks for LC_SEGMENT load command fields. And checks for the Mach-O sections fields that would make them invalid. With the new checks, some of the existing malformed file checks now trips one of these instead of the issue it was having before so those tests were adjusted. llvm-svn: 278557	2016-08-12 20:10:25 +00:00
Mike Aizatsky	17a907588c	[sancov] test file cleanup llvm-svn: 278556	2016-08-12 20:06:32 +00:00
Mike Aizatsky	3c4d60ad89	[sancov] MachO indirect symbols support. Differential Revision: https://reviews.llvm.org/D23338 llvm-svn: 278551	2016-08-12 19:25:59 +00:00
Michael Kuperstein	31b8399beb	[PM] Port LowerInvoke to the new pass manager llvm-svn: 278531	2016-08-12 17:28:27 +00:00
Dehao Chen	c0a1e432c7	Fine tuning of sample profile propagation algorithm. Summary: The refined propagation algorithm is more accurate and robust. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23224 llvm-svn: 278522	2016-08-12 16:22:12 +00:00
Artur Pilipenko	87e4038a91	[x86] X86ISelLowering zext(add_nuw(x, C)) --> add(zext(x), C_zext) Currently X86ISelLowering has a similar transformation for sexts: sext(add_nsw(x, C)) --> add(sext(x), C_sext) In this change I extend this code to handle zexts as well. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D23359 llvm-svn: 278520	2016-08-12 16:08:30 +00:00
Artur Pilipenko	2e8f82d962	[LVI] Take guards into account Teach LVI to gather control dependant constraints from guards. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23358 llvm-svn: 278518	2016-08-12 15:52:23 +00:00
Artur Pilipenko	b623088abe	[LVI] Fix potential memory corruption in getValueFromCondition Rewrite Visited[Cond] = getValueFromConditionImpl(..., Visited) statement which can lead to a memory corruption since getValueFromConditionImpl changes Visited map and invalidates the iterators. llvm-svn: 278514	2016-08-12 15:08:15 +00:00
James Y Knight	2cc9da9a65	Revert "[Sparc] Leon errata fix passes." ...and the two followup commits: Revert "[Sparc][Leon] Missed resetting option flags from check-in 278489." Revert "[Sparc][Leon] Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor." This reverts commit r274856, r278489, and r278492. llvm-svn: 278511	2016-08-12 14:48:09 +00:00
Teresa Johnson	4223dd8559	[PM] Port NameAnonFunction pass to new pass manager Summary: Port the NameAnonFunction pass and add a test. Depends on D23439. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23440 llvm-svn: 278509	2016-08-12 14:03:36 +00:00
Teresa Johnson	f93b246f8b	[PM] Port ModuleSummaryIndex analysis to new pass manager Summary: Port the ModuleSummaryAnalysisWrapperPass to the new pass manager. Use it in the ported BitcodeWriterPass (similar to how we use the legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass). Also, pass the -module-summary opt flag through to the new pass manager pipeline and through to the bitcode writer pass, and add a test that uses it. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23439 llvm-svn: 278508	2016-08-12 13:53:02 +00:00
Simon Pilgrim	687d71e877	[X86][SSE] Add support for combining target shuffles to PSLLDQ/PSRLDQ byte shifts llvm-svn: 278502	2016-08-12 11:24:34 +00:00
Artur Pilipenko	6669f253d5	[LVI] Take range metadata into account while calculating icmp condition constraints Take range metadata into account for conditions like this: %length = load i32, i32* %length_ptr, !range !{i32 0, i32 2147483647} %cmp = icmp ult i32 %a, %length This is a common pattern for range checks where the length of the array is dynamically loaded. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23267 llvm-svn: 278496	2016-08-12 10:14:11 +00:00
Benjamin Kramer	05e760ec4b	[Webassembly] disable unstable test. It reads uninitialized memory and crashes randomly. llvm-svn: 278495	2016-08-12 10:13:45 +00:00
Simon Pilgrim	ed96b9adfb	[X86][SSE] Fixed PALIGNR target shuffle decode The PALIGNR target shuffle decode was not taking into account that DecodePALIGNRMask (rather oddly) expects the operands to be in reverse order, nor was it detecting unary patterns, causing combines to combine with the incorrect input. The cgbuiltin, auto upgrade and instruction comments code correctly swap the operands so are not affected. llvm-svn: 278494	2016-08-12 10:10:51 +00:00
Artur Pilipenko	635625855f	[LVI] Handle any predicate in comparisons like icmp <pred> (add Val, Offset), ... Currently LVI can only gather value constraints from comparisons like: * icmp <pred> Val, ... * icmp ult (add Val, Offset), ... In fact we can handle any predicate in latter comparisons. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23357 llvm-svn: 278493	2016-08-12 10:05:11 +00:00
Chris Dewhurst	829f8efe55	[Sparc][Leon] Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor. The nature of the errata are listed in the comments preceding the errata fix passes. Relevant unit tests are implemented for each of these. These changes update older versions of these errata fixes with improvements to code and unit tests. Differential Revision: https://reviews.llvm.org/D21960 llvm-svn: 278489	2016-08-12 09:34:26 +00:00
Haicheng Wu	d9cbb1608f	Revert "[BranchFolding] Restrict tail merging loop blocks after MBP" This reverts commit r278463 because it hits the bot. llvm-svn: 278484	2016-08-12 08:40:24 +00:00
Gor Nishanov	0f303accde	[Coroutines]: Part6b: Add coro.id intrinsic. Summary: 1. Make coroutine representation more robust against optimization that may duplicate instruction by introducing coro.id intrinsics that returns a token that will get fed into coro.alloc and coro.begin. Due to coro.id returning a token, it won't get duplicated and can be used as reliable indicator of coroutine identify when a particular coroutine call gets inlined. 2. Move last three arguments of coro.begin into coro.id as they will be shared if coro.begin will get duplicated. 3. doc + test + code updated to support the new intrinsic. Reviewers: mehdi_amini, majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23412 llvm-svn: 278481	2016-08-12 05:45:49 +00:00
Wei Mi	7e103d92cc	Recommit 'Remove the restriction that MachineSinking is now stopped by "insert_subreg, subreg_to_reg, and reg_sequence" instructions' after adjusting some unittest checks. This is to solve PR28852. The restriction was added at 2010 to make better register coalescing. We assumed that it was not necessary any more. Testing results on x86 supported the assumption. We will look closely to any performance impact it will bring and will be prepared to help analyzing performance problem found on other architectures. Differential Revision: https://reviews.llvm.org/D23210 llvm-svn: 278466	2016-08-12 03:33:22 +00:00
Haicheng Wu	ea02372059	[BranchFolding] Restrict tail merging loop blocks after MBP To fix PR28014, this patch restricts tail merging to blocks that belong to the same loop after MBP. Differential Revision: https://reviews.llvm.org/D23191 llvm-svn: 278463	2016-08-12 03:30:23 +00:00
Eli Friedman	a6707f56b5	[DSE] Don't remove stores made live by a call which unwinds. Issue exposed by noalias or more aggressive alias analysis. Fixes http://llvm.org/PR25422. Differential revision: https://reviews.llvm.org/D21007 llvm-svn: 278451	2016-08-12 01:09:53 +00:00
Piotr Padlewski	332b3b2210	Don't import variadic functions Summary: This patch adds IsVariadicFunction bit to summary in order to not import variadic functions. Inliner doesn't inline variadic functions because it is hard to reason about it. This one small fix improves Importer by about 16% (going from 86% to 100% of imported functions that are inlined anywhere) on some spec benchmarks like 'int' and others. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23339 llvm-svn: 278432	2016-08-11 22:13:57 +00:00
Vyacheslav Klochkov	6daefcf626	X86-FMA3: Implemented commute transformation for EVEX/AVX512 FMA3 opcodes. This helped to improved memory-folding and register coalescing optimizations. Also, this patch fixed the tracker #17229. Reviewer: Craig Topper. Differential Revision: https://reviews.llvm.org/D23108 llvm-svn: 278431	2016-08-11 22:07:33 +00:00
Tim Northover	8e0c53a018	GlobalISel: support 'null' constant in translation. It's sharing the integer G_CONSTANT for now since I don't think it creates any ambiguity (even on weird archs). If that turns out wrong we can create a G_PTRCONSTANT or something. llvm-svn: 278423	2016-08-11 21:40:55 +00:00
Ehsan Amiri	dbcfea9811	Extend trip count instead of truncating IV in LFTR, when legal When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because (1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant). (2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change). I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere. To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence. This commit contains changes in a newly added testcase which was not included in the previous commit (which was reverted later on). https://reviews.llvm.org/D23075 llvm-svn: 278421	2016-08-11 21:31:40 +00:00
Krzysztof Parzyszek	1b689da04e	[Hexagon] Allow non-returning calls in hardware loops llvm-svn: 278416	2016-08-11 21:14:25 +00:00
Geoff Berry	d01828096f	[SCEV] Update interface to handle SCEVExpander insert point motion. Summary: This is an extension of the fix in r271424. That fix dealt with builder insert points being moved by SCEV expansion, but only for the lifetime of the expand call. This change modifies the interface so that LSR can safely call expand multiple times at the same insert point and do the right thing if one of the expansions decides to move the original insert point. This is a fix for PR28719. Reviewers: sanjoy Subscribers: llvm-commits, mcrosier, mzolotukhin Differential Revision: https://reviews.llvm.org/D23342 llvm-svn: 278413	2016-08-11 21:05:17 +00:00
Tim Northover	da6f5f2d0a	Remove empty file left by partial reversion. llvm-svn: 278411	2016-08-11 21:01:15 +00:00
Tim Northover	30e67ce793	GlobalISel: add translation support for shift operations. llvm-svn: 278410	2016-08-11 21:01:13 +00:00
Tim Northover	f1f7bf1279	GlobalISel: support zext & sext during translation phase. llvm-svn: 278409	2016-08-11 21:01:10 +00:00
Daniel Berlin	2698cbb4f1	Move GVNHoist tests into their own directory since it is a separate pass llvm-svn: 278404	2016-08-11 20:35:07 +00:00
Wei Ding	70cda07526	AMDGPU : Add intrinsic for instruction v_cvt_pk_u8_f32 Differential Revision: http://reviews.llvm.org/D23336 llvm-svn: 278403	2016-08-11 20:34:48 +00:00
Wei Mi	3ab5816000	Revert rL278384 which caused several buildbot failures (like check failures in CodeGen/X86/clz.ll). llvm-svn: 278402	2016-08-11 20:33:37 +00:00
Daniel Berlin	f75fd1b58b	Fix PR 28933 Summary: This fixes PR 28933 by making sure GVNHoist does not try to recreate memory accesses when it has not actually moved them. Reviewers: sebpop Subscribers: llvm-commits, george.burgess.iv Differential Revision: https://reviews.llvm.org/D23411 llvm-svn: 278401	2016-08-11 20:32:43 +00:00
Ivan Krasin	f3403fd2c8	WholeProgramDevirt: generate more detailed and accurate remarks. Summary: Keep track of all methods for which we have devirtualized at least one call and then print them sorted alphabetically. That allows to avoid duplicates and also makes the order deterministic. Add optimization names into the remarks, so that it's easier to understand how has each method been devirtualized. Fix a bug when wrong methods could have been reported for tryVirtualConstProp. Reviewers: kcc, mehdi_amini Differential Revision: https://reviews.llvm.org/D23297 llvm-svn: 278389	2016-08-11 19:09:02 +00:00
Wei Mi	ec19b35179	Remove the restriction that MachineSinking is now stopped by "insert_subreg, subreg_to_reg, and reg_sequence" instructions. This is to solve PR28852. The restriction was added at 2010 to make better register coalescing. We assumed that it was not necessary any more. Testing results on x86 supported the assumption. We will look closely to any performance impact it will bring and will be prepared to help analyzing performance problem found on other architectures. Differential Revision: https://reviews.llvm.org/D23210 llvm-svn: 278384	2016-08-11 18:42:56 +00:00
Krzysztof Parzyszek	a003b76391	If-conversion incorrectly calculates liveness of redefined registers Differential Revision: https://reviews.llvm.org/D23207 llvm-svn: 278383	2016-08-11 18:42:06 +00:00
Andrew Kaylor	7cdf01ef58	Target independent codesize heuristics for Loop Idiom Recognition Patch by Sunita Marathe Differential Revision: https://reviews.llvm.org/D21449 llvm-svn: 278378	2016-08-11 18:28:33 +00:00
Krzysztof Parzyszek	60f0b51485	[Hexagon] Skip byval arguments when checking parameter attributes From the point of view of register assignment, byval parameters are ignored: a byval parameter is not going to be assigned to a register, and it will not affect the assignments of subsequent parameters. When matching registers with parameters in the bit tracker, make sure to skip byval parameters before advancing the registers. llvm-svn: 278375	2016-08-11 18:15:16 +00:00
Dominic Chen	6ba19659cb	Improve virtual register handling when computing debug information Summary: Some backends, like WebAssembly, use virtual registers instead of physical registers. This crashes the DbgValueHistoryCalculator pass, which assumes that all registers are physical. Instead, skip virtual registers when iterating aliases, and assume that they are clobbered. Reviewers: dexonsmith, dschuff, aprantl Subscribers: yurydelendik, llvm-commits, jfb, sunfish Differential Revision: https://reviews.llvm.org/D22590 llvm-svn: 278371	2016-08-11 17:52:40 +00:00
Michael Kuperstein	e36d7716c3	Make TwoAddressInstructionPass::rescheduleMIBelowKill subreg-aware This fixes PR28824. Differential Revision: https://reviews.llvm.org/D23220 llvm-svn: 278370	2016-08-11 17:38:33 +00:00
Matt Arsenault	56684d4538	AMDGPU: Fix crashes on memory functions llvm-svn: 278369	2016-08-11 17:31:42 +00:00
Wei Ding	d3344378c6	AMDGPU : Fix SAD related instruction LIT tests function atttibute issues. Differential Revision: http://reviews.llvm.org/D23133 llvm-svn: 278360	2016-08-11 17:14:17 +00:00
Wei Ding	34e1753585	AMDGPU : Add LLVM intrinsics for SAD related instructions. Differential Revision: http://reviews.llvm.org/D23133 llvm-svn: 278354	2016-08-11 16:33:53 +00:00
Tim Northover	0d51044b69	GlobalISel: clear vreg mapping after translating each function Otherwise we only materialize (shared) constants in the first function they appear in. This doesn't go well. llvm-svn: 278351	2016-08-11 16:21:29 +00:00
Teresa Johnson	9ba95f99f3	Restore "Resolution-based LTO API." This restores commit r278330, with fixes for a few bot failures: - Fix a late change I had made to the save temps output file that I missed due to existing files sitting on my disk - Fix a bunch of Windows bot failures with "ambiguous call to overloaded function" due to confusion between llvm::make_unique vs std::make_unique (preface the new make_unique calls with "llvm::") - Attempt to fix a modules bot failure by adding a missing include to LTO/Config.h. Original change: Resolution-based LTO API. Summary: This introduces a resolution-based LTO API. The main advantage of this API over existing APIs is that it allows the linker to supply a resolution for each symbol in each object, rather than the combined object as a whole. This will become increasingly important for use cases such as ThinLTO which require us to process symbol resolutions in a more complicated way than just adjusting linkage. Patch by Peter Collingbourne. Reviewers: rafael, tejohnson, mehdi_amini Subscribers: lhames, tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D20268 llvm-svn: 278338	2016-08-11 14:58:12 +00:00
Ehsan Amiri	3818f1b38a	revert 278334 llvm-svn: 278337	2016-08-11 14:51:14 +00:00
Valery Pykhtin	82c73bee2b	Revert "[AMDGPU] fix failure on printing of non-existing instruction operands." This reverts revision 278333, newly added test failed. llvm-svn: 278336	2016-08-11 14:22:05 +00:00
Ehsan Amiri	b9fcc2b171	Extend trip count instead of truncating IV in LFTR, when legal When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because (1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant). (2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change). I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere. To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence. https://reviews.llvm.org/D23075 llvm-svn: 278334	2016-08-11 13:51:20 +00:00
Valery Pykhtin	3048ff6ec3	[AMDGPU] fix failure on printing of non-existing instruction operands. Differential revision: https://reviews.llvm.org/D23323 llvm-svn: 278333	2016-08-11 13:49:46 +00:00
Teresa Johnson	cbf684e6c6	Revert "Resolution-based LTO API." This reverts commit r278330. I made a change to the save temps output that is causing issues with the bots. Didn't realize this because I had older output files sitting on disk in my test output directory. llvm-svn: 278331	2016-08-11 13:03:56 +00:00
Teresa Johnson	f99573b3ee	Resolution-based LTO API. Summary: This introduces a resolution-based LTO API. The main advantage of this API over existing APIs is that it allows the linker to supply a resolution for each symbol in each object, rather than the combined object as a whole. This will become increasingly important for use cases such as ThinLTO which require us to process symbol resolutions in a more complicated way than just adjusting linkage. Patch by Peter Collingbourne. Reviewers: rafael, tejohnson, mehdi_amini Subscribers: lhames, tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D20268 Address review comments llvm-svn: 278330	2016-08-11 12:56:40 +00:00
Igor Breger	a77b14d02c	[AVX512] Fix extractelement i1 lowering. The previous implementation (not custom) doesn't enforce zeroing off upper bits. The assumption is that i1 PRODUCER (truncate and extractelement) must zero all upper bits, so i1 CONSUMER instructions ( test, zext, save, etc) can be done without additional zeroing. Make extractelement i1 lowering custom for all vector i1. Differential Revision: http://reviews.llvm.org/D23246 llvm-svn: 278328	2016-08-11 12:13:46 +00:00
Marina Yatsina	88f0c31f13	Avoid false dependencies of undef machine operands This patch helps avoid false dependencies on undef registers by updating the machine instructions' undef operand to use a register that the instruction is truly dependent on, or use a register with clearance higher than Pref. Pseudo example: loop: xmm0 = ... xmm1 = vcvtsi2sdl eax, xmm0<undef> ... = inst xmm0 jmp loop In this example, selecting xmm0 as the undef register creates false dependency between loop iterations. This false dependency cannot be solved by inserting an xor before vcvtsi2sdl because xmm0 is alive at the point of the vcvtsi2sdl instruction. Selecting a different register instead of xmm0, especially a register that is not used in the loop, will eliminate this problem. Differential Revision: https://reviews.llvm.org/D22466 llvm-svn: 278321	2016-08-11 07:32:08 +00:00
Amjad Aboud	b83e73bceb	[Debug Info] Added a LIT test that covers the fix committed in rL277290. Differential Revision: http://reviews.llvm.org/D23056 llvm-svn: 278320	2016-08-11 07:22:53 +00:00
Craig Topper	a78b768ed4	[AVX-512] Promote 512-bit integer loads to v8i64 similar to what is done for 128/256-bit vectors for overall consistency. llvm-svn: 278318	2016-08-11 06:04:07 +00:00
Craig Topper	14aa2665d3	[AVX-512] Add patterns to allow EVEX encoded stores of v16i16/v8i16/v16i8/v32i8 even when BWI is not supported. llvm-svn: 278317	2016-08-11 06:04:04 +00:00
Craig Topper	3563d0f622	[AVX-512] Fix the 128-bit and 256-bit nontemporal load patterns with elements type other than i64. These loads have all been promoted to v2i64/v4i64 loads so we need bitcasts or we end up selecting VMOVDQA32/VMOVDQU32 instead. llvm-svn: 278316	2016-08-11 06:04:00 +00:00
Xinliang David Li	76a0108be4	[Profile] improve warning control option Change --no-pgo-warn-missing to -pgo-warn-missing-function and negate the default. /NFC Add more test to make sure the warning is off by default llvm-svn: 278314	2016-08-11 05:09:30 +00:00
Tim Northover	357f1be2ca	GlobalISel: support same ConstantExprs as Instructions. It's more than just inttoptr, but the others can't be tested until we have support for non-trivial constants (they currently get unavoidably folded to a ConstantInt). llvm-svn: 278303	2016-08-10 23:02:41 +00:00
Tim Northover	2ff5935a95	GlobalISel: add tests forgotten in r278293. llvm-svn: 278296	2016-08-10 22:13:48 +00:00
Changpeng Fang	fb9c3818dd	AMDGPU/SI: Implement amdgcn image intrinsics with sampler Summary: This patch define and implement amdgcn image intrinsics with sampler. 1. define vdata type to be llvm_anyfloat_ty, address type to be llvm_anyfloat_ty, and rsrc type to be llvm_anyint_ty. As a result, we expect the intrinsics name to have three suffixes to overload each of these three types; 2. D128 as well as two other flags are implied in the three types, for example, if you use v8i32 as resource type, then r128 is 0! 3. don't expose TFE flag, and other flags are exposed in the instruction order: unrm, glc, slc, lwe and da. Differential Revision: http://reviews.llvm.org/D22838 Reviewed by: arsenm and tstellarAMD llvm-svn: 278291	2016-08-10 21:15:30 +00:00
Kyle Butt	81d32846b0	Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough. If AnalyzeBranch can't analyze a block and it is possible to fallthrough, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough. Submitted wit a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging. llvm-svn: 278288	2016-08-10 21:03:27 +00:00
Kyle Butt	e1c931b171	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 278287	2016-08-10 20:45:56 +00:00
Reid Kleckner	7cbd6b74b4	Disable sancov tests failing due to apparent endianness issues Undoes some of the effect of r278271 llvm-svn: 278285	2016-08-10 20:11:35 +00:00
Reid Kleckner	0881472ac4	[sancov] Port sancov -print-coverage-pcs to COFF The export table is not considered part of the object file symbol table, so we have to look through it separately. Reviewers: kcc Differential Revision: https://reviews.llvm.org/D23321 llvm-svn: 278284	2016-08-10 20:08:19 +00:00
Matt Arsenault	57431c9680	AMDGPU: Change insertion point of si_mask_branch Insert before the skip branch if one is created. This is a somewhat more natural placement relative to the skip branches, and makes it possible to implement analyzeBranch for skip blocks. The test changes are mostly due to a quirk where the block label is not emitted if there is a terminator that is not also a branch. llvm-svn: 278273	2016-08-10 19:11:42 +00:00
Reid Kleckner	260ac88cd4	[sancov] Run more sancov tests on non-x86-Linux machines Add the $arch-registered-target features that clang uses to disable tests that require a registered backend, so that we can run the sancov tests on Windows. LLVM's lit suite did not appear to have a per-test way to do this, and I would rather not split up the sancov tests into architecture directories. Split out of https://reviews.llvm.org/D23321 llvm-svn: 278271	2016-08-10 19:03:18 +00:00
Sanjay Patel	5ccc85fe83	[x86, AVX] allow FP vector select folding to bitwise logic ops (PR28895) This handles the case in: https://llvm.org/bugs/show_bug.cgi?id=28895 ...but we are not getting all of the possibilities yet. Eg, we use 'X86::FANDN' for scalar FP select combines. That enhancement is filed as: https://llvm.org/bugs/show_bug.cgi?id=28925 Differential Revision: https://reviews.llvm.org/D23337 llvm-svn: 278270	2016-08-10 19:00:11 +00:00
Andrew Kaylor	498d3113c3	[IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18867 llvm-svn: 278269	2016-08-10 18:56:35 +00:00
Nicolai Haehnle	02d784172c	LiveIntervalAnalysis: fix a crash in repairOldRegInRange Summary: See the new test case for one that was (non-deterministically) crashing on trunk and deterministically hit the assertion that I added in D23302. Basically, the machine function contains a sequence DS_WRITE_B32 %vreg4, %vreg14:sub0, ... DS_WRITE_B32 %vreg4, %vreg14:sub0, ... %vreg14:sub1<def> = COPY %vreg14:sub0 and SILoadStoreOptimizer::mergeWrite2Pair merges the two DS_WRITE_B32 instructions into one before calling repairIntervalsInRange. Now repairIntervalsInRange wants to repair %vreg14, in particular, and ends up trying to repair %vreg14:sub1 as well, but that only becomes active _after_ the range that is to be repaired, hence the crash due to LR.find(...) == LR.begin() at the start of repairOldRegInRange. I believe that just skipping those subrange is fine, but again, not too familiar with that code. Reviewers: MatzeB, kparzysz, tstellarAMD Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D23303 llvm-svn: 278268	2016-08-10 18:51:14 +00:00
Andrew Kaylor	b10f6876cd	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 llvm-svn: 278267	2016-08-10 18:47:19 +00:00
Kyle Butt	71b1ca1be4	Codegen: Tail Merge: Be less aggressive with special cases. This change makes it possible for tail-duplication and tail-merging to be disjoint. By being less aggressive when merging during layout, there are no overlapping cases between tail-duplication and tail-merging, provided the thresholds are disjoint. There is a remaining TODO to benchmark the succ_size() test for non-layout tail merging. llvm-svn: 278265	2016-08-10 18:36:18 +00:00
Krzysztof Parzyszek	0bbad0fc86	[Hexagon] Simplify the SplitConst32/64 pass llvm-svn: 278256	2016-08-10 18:05:47 +00:00
Krzysztof Parzyszek	3b946c90ef	[Hexagon] Add extra patterns for single-precision min/max instructions llvm-svn: 278252	2016-08-10 17:56:24 +00:00
Tim Northover	7552ef5a00	GlobalISel: avoid inserting redundant COPYs for bitcasts. If the value produced by the bitcast hasn't been referenced yet, we can simply reuse the input register avoiding an unnecessary COPY instruction. llvm-svn: 278245	2016-08-10 16:51:14 +00:00
Krzysztof Parzyszek	a3386501af	[Hexagon] Use integer instructions for floating point immediates Floating point instructions use general purpose registers, so the few instructions that can put floating point immediates into registers are, in fact, integer instruction. Use them explicitly instead of having pseudo-instructions specifically for dealing with floating point values. Simplify the constant loading instructions (from sdata) to have only two: one for 32-bit values and one for 64-bit values: CONST32 and CONST64. llvm-svn: 278244	2016-08-10 16:46:36 +00:00
Gor Nishanov	b2a9c02521	[Coroutines] Part 6: Elide dynamic allocation of a coroutine frame when possible Summary: A particular coroutine usage pattern, where a coroutine is created, manipulated and destroyed by the same calling function, is common for coroutines implementing RAII idiom and is suitable for allocation elision optimization which avoid dynamic allocation by storing the coroutine frame as a static `alloca` in its caller. coro.free and coro.alloc intrinsics are used to indicate which code needs to be suppressed when dynamic allocation elision happens: ``` entry: %elide = call i8* @llvm.coro.alloc() %need.dyn.alloc = icmp ne i8* %elide, null br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc dyn.alloc: %alloc = call i8* @CustomAlloc(i32 4) br label %coro.begin coro.begin: %phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ] %hdl = call i8* @llvm.coro.begin(i8* %phi, i32 0, i8* null, i8* bitcast ([2 x void (%f.frame)]* @f.resumers to i8)) ``` and ``` %mem = call i8 @llvm.coro.free(i8* %hdl) %need.dyn.free = icmp ne i8* %mem, null br i1 %need.dyn.free, label %dyn.free, label %if.end dyn.free: call void @CustomFree(i8* %mem) br label %if.end if.end: ... ``` If heap allocation elision is performed, we replace coro.alloc with a static alloca on the caller frame and coro.free with null constant. Also, we need to make sure that if there are any tail calls referencing the coroutine frame, we need to remote tail call attribute, since now coroutine frame lives on the stack. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) 3.Add empty coroutine passes. (https://reviews.llvm.org/D22847) 4.Add coroutine devirtualization + tests. ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998) c) Do devirtualization (https://reviews.llvm.org/D23229) 5.Add CGSCC restart trigger + tests. (https://reviews.llvm.org/D23234) 6.Add coroutine heap elision + tests. <= we are here 7.Add the rest of the logic (split into more patches) Reviewers: mehdi_amini, majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23245 llvm-svn: 278242	2016-08-10 16:40:39 +00:00
Simon Pilgrim	b204f03004	[X86][XOP] Tweak vpermil2pd test to stop it being combined away The target shuffle combined to a BLENDPD pattern which we will shortly add support for. llvm-svn: 278233	2016-08-10 15:15:56 +00:00
Simon Pilgrim	f1f55198c1	[X86][SSE] Regenerate vector shift lowering tests llvm-svn: 278232	2016-08-10 15:13:49 +00:00
Artur Pilipenko	fd223d5d25	[LVI] Handle conditions in the form of (cond1 && cond2) Teach LVI how to gather information from conditions in the form of (cond1 && cond2). Our out-of-tree front-end emits range checks in this form. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23200 llvm-svn: 278231	2016-08-10 15:13:15 +00:00
Sanjay Patel	2c677a9306	use different comparison predicates for better test coverage llvm-svn: 278229	2016-08-10 15:06:11 +00:00
Simon Pilgrim	ac8fa6c2c6	[X86][SSE] Add support for combining target shuffles to MOVSS/MOVSD Only do this on pre-SSE41 targets where we should be lowering to BLENDPS/BLENDPD instead llvm-svn: 278228	2016-08-10 14:15:41 +00:00
Artur Pilipenko	e896325ca3	Add a test case for r278217 "[LVI] Relax the assertion about LVILatticeVal type in getConstantRange" llvm-svn: 278226	2016-08-10 13:51:01 +00:00
Artur Pilipenko	e171ea8a33	Teach CorrelatedValuePropagation to mark adds as no wrap This is a resubmission of previously reverted r277592. It was hitting overly strong assertion in getConstantRange which was relaxed in r278217. Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23059 llvm-svn: 278220	2016-08-10 13:08:34 +00:00
Simon Pilgrim	d99242c44d	[X86][SSE] Regenerate SSE1 tests Properly demonstrate the nasty codegen we get for vselect without integer vectors llvm-svn: 278215	2016-08-10 12:26:40 +00:00
Simon Pilgrim	cb5a189b90	Regenerate test llvm-svn: 278214	2016-08-10 12:24:19 +00:00
Simon Pilgrim	85c7ea86ae	[DAGCombine] Avoid INSERT_SUBVECTOR reinsertions (PR28678) If the input vector to INSERT_SUBVECTOR is another INSERT_SUBVECTOR, and this inserted subvector replaces the last insertion, then insert into the common source vector. i.e. INSERT_SUBVECTOR( INSERT_SUBVECTOR( Vec, SubOld, Idx ), SubNew, Idx ) --> INSERT_SUBVECTOR( Vec, SubNew, Idx ) Differential Revision: https://reviews.llvm.org/D23330 llvm-svn: 278211	2016-08-10 10:50:53 +00:00
Sam Parker	62965c96df	[ARM] Improve sxta{b\|h} and uxta{b\|h} tests Created a Thumb2 predicated pattern matcher that uses Thumb2 and HasT2ExtractPack and used it to redefine the patterns for sxta{b\|h} and uxta{b\|h}. Also used the similar patterns to fill in isel pattern gaps for the corresponding instructions in the ARM backend. The patch is mainly changes to tests since most of this functionality appears not to have been tested. Differential Revision: https://reviews.llvm.org/D23273 llvm-svn: 278207	2016-08-10 09:34:34 +00:00
Davide Italiano	873219c406	[SimplifyLibCalls] Restore the old behaviour, emit a libcall. Hal pointed out that the semantic of our intrinsic and the libc call are slightly different. Add a comment while I'm here to explain why we can't emit an intrinsic. Thanks Hal! llvm-svn: 278200	2016-08-10 06:33:32 +00:00
Adam Nemet	896c09bd10	[Inliner,OptDiag] Add hotness attribute to opt diagnostics Summary: The inliner not being a function pass requires the work-around of generating the OptimizationRemarkEmitter and in turn BFI on demand. This will go away after the new PM is ready. BFI is only computed inside ORE if the user has requested hotness information for optimization diagnostitics (-pass-remark-with-hotness at the 'opt' level). Thus there is no additional overhead without the flag. Reviewers: hfinkel, davidxl, eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22694 llvm-svn: 278185	2016-08-10 00:44:44 +00:00
Tim Northover	d403a3d8ee	GlobalISel: support 'undef' constant. llvm-svn: 278174	2016-08-09 23:01:30 +00:00
Michael Zolotukhin	aae168f993	[LoopSimplify] Rebuild LCSSA for the inner loop after separating nested loops. Summary: This hopefully fixes PR28825. The problem now was that a value from the original loop was used in a subloop, which became a sibling after separation. While a subloop doesn't need an lcssa phi node, a sibling does, and that's where we broke LCSSA. The most natural way to fix this now is to simply call formLCSSA on the original loop: it'll do what we've been doing before plus it'll cover situations described above. I think we don't need to run formLCSSARecursively here, and we have an assert to verify this (I've tried testing it on LLVM testsuite + SPECs). I'd be happy to be corrected here though. I also changed a run line in the test from '-lcssa -loop-unroll' to '-lcssa -loop-simplify -indvars', because it exercises LCSSA preservation to the same extent, but also makes less unrelated transformation on the CFG, which makes it easier to verify. Reviewers: chandlerc, sanjoy, silvas Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23288 llvm-svn: 278173	2016-08-09 22:44:56 +00:00
Andrew Kaylor	3c05edfd5e	[ValueTracking] Improve ValueTracking on left shift with nsw flag Patch by Li Huang Differential Revison: https://reviews.llvm.org/D23296 llvm-svn: 278172	2016-08-09 22:41:35 +00:00
Derek Schuff	66641322ce	[WebAssembly] Add -emscripten-cxx-exceptions-whitelist option This patch adds -emscripten-cxx-exceptions-whitelist option to WebAssemblyLowerEmscriptenExceptions pass. This options is the list of function names in which Emscripten-style exception handling is enabled. This is to support emscripten's EXCEPTION_CATCHING_WHITELIST which exists because of the performance impact of emscripten's non-zero-cost EH method. Patch by Heejin Ahn Differential Revision: https://reviews.llvm.org/D23292 llvm-svn: 278171	2016-08-09 22:37:00 +00:00
Tim Northover	5ed648e509	GlobalISel: first translation support for Constants. For now put them all in the entry block. This should be correct but may give poor runtime performance. Hopefully MachineSinking combined with isReMaterializable can solve those issues, but if not the interface is sound enough to support alternatives. llvm-svn: 278168	2016-08-09 21:28:04 +00:00
Sanjay Patel	d34b128fbc	add test cases for missed vselect optimizations (PR28895) llvm-svn: 278165	2016-08-09 21:07:17 +00:00
Lang Hames	ae73b0a932	[ExecutionEngine] Disable weak symbol tests for COFF. COFF doesn't support weak linkage on functions. llvm-svn: 278162	2016-08-09 20:48:22 +00:00
Wei Mi	575435012c	Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The patch is to fix the bug in PR28705. It was caused by setting wrong return value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion have different meanings when the function is used in different ways so it is easy to make mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion, and specifies where each interface is expected to be used. Differential Revision: https://reviews.llvm.org/D22942 llvm-svn: 278161	2016-08-09 20:40:03 +00:00

... 7 8 9 10 11 ...

39519 Commits