llvm-project

Commit Graph

Author	SHA1	Message	Date
Bjorn Pettersson	807f732ce8	Fix memory issue in AttrBuilder::removeAttribute uses. Summary: Found when running Valgrind. This removes two unnecessary assignments when using AttrBuilder::removeAttribute. AttrBuilder::removeAttribute returns a reference to the object. As the LHSes were the same as the callees, the assignments resulted in memcpy calls where dst = src. Commited on behalf-of: dstenb (David Stenberg) Reviewers: mkuper, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25460 llvm-svn: 285298	2016-10-27 14:48:09 +00:00
Krzysztof Parzyszek	046da74699	[Hexagon] Do not expand ISD::SELECT for HVX vectors llvm-svn: 285297	2016-10-27 14:30:16 +00:00
Simon Pilgrim	01e755eab1	[DAGCombiner] Add vector demanded elements support to computeKnownBits Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used. I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course. DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit. Differential Revision: https://reviews.llvm.org/D25691 llvm-svn: 285296	2016-10-27 14:29:28 +00:00
Alexey Bataev	46c0278e7d	[SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer. After successfull horizontal reduction vectorization attempt for PHI node vectorizer tries to update root binary op by combining vectorized tree and the ReductionPHI node. But during vectorization this ReductionPHI can be vectorized itself and replaced by the `undef` value, while the instruction itself is marked for deletion. This 'marked for deletion' PHI node then can be used in new binary operation, causing "Use still stuck around after Def is destroyed" crash upon PHI node deletion. Also the test is fixed to make it perform actual testing. Differential Revision: https://reviews.llvm.org/D25671 llvm-svn: 285286	2016-10-27 12:02:28 +00:00
Sam Parker	e7d9505c08	[ARM] Predicate UMAAL selection on hasDSP. UMAAL is a DSP instruction and it is not available on thumbv7m (Cortex-M3) and thumbv6m (Cortex-M0+1) targets. Also fix wrong CHECK prefix in longMAC.ll test. Patch by Vadzim Dambrouski. Differential Revision: https://reviews.llvm.org/D25890 llvm-svn: 285278	2016-10-27 09:47:10 +00:00
Dylan McKay	dd680cc753	[AVR] Generate all of the TableGen files we need This enables generation of all of the TableGen files that are used downstream. llvm-svn: 285274	2016-10-27 08:20:47 +00:00
Nicolai Haehnle	7b0e25b7ad	AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due register dependencies Summary: When finding a match for a merge and collecting the instructions that must be moved, keep in mind that the instruction we merge might actually use one of the defs that are being moved. Fixes piglit spec/arb_enhanced_layouts/execution/component-layout/vs-tcs-load-output[-indirect]. The fact that the ds_read in the test case is not eliminated suggests that there might be another problem related to alias analysis, but that's a separate problem: this pass should still work correctly even when earlier optimization passes missed something or were disabled. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25829 llvm-svn: 285273	2016-10-27 08:15:07 +00:00
Dylan McKay	00009d4824	[AVR] Compile the disassembler This also updates references of 'TheAVRTarget' to the new 'getTheAVRTarget()' method. llvm-svn: 285272	2016-10-27 08:09:15 +00:00
Dylan McKay	ec47065795	[AVR] Add AVRISelDAGToDAG.cpp Summary: This pulls the AVR instruction selector in-tree. Reviewers: arsenm, kparzysz Subscribers: llvm-commits, wdng, beanz, japaric, mgorny Differential Revision: https://reviews.llvm.org/D25278 llvm-svn: 285270	2016-10-27 07:03:47 +00:00
Dylan McKay	6eaa4e4bcc	[AVR] Add the machine code emitter Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, japaric, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D25388 llvm-svn: 285269	2016-10-27 06:56:46 +00:00
Nemanja Ivanovic	32b5fed639	[PowerPC] - No SExt/ZExt needed for count trailing zeros This patch corresponds to review: https://reviews.llvm.org/D25896 It just eliminates the redundant ZExt after a count trailing zeros instruction. llvm-svn: 285267	2016-10-27 05:17:58 +00:00
Kostya Serebryany	94c427c23e	[libFuzzer] speculatively trying to fix the Mac build; second attempt llvm-svn: 285262	2016-10-27 00:36:38 +00:00
Kostya Serebryany	3d945f6247	[libFuzzer] revert 285259 -- hit commit too soon llvm-svn: 285260	2016-10-27 00:24:34 +00:00
Kostya Serebryany	15cd6b4b10	[libFuzzer] speculatively trying to fix the Mac build llvm-svn: 285259	2016-10-27 00:22:39 +00:00
Evandro Menezes	ca8370396a	[AArch64] Create feature set for Samsung Exynos-M2 Since Exynos-M2 improved the FP square root unit a bit over the one in Exynos-M1, it does not benefit from using the Newton series for such operations. llvm-svn: 285246	2016-10-26 22:06:20 +00:00
Victor Leschuk	a37660c669	DebugInfo: fix incorrect alignment type (NFC) Change type of some missed DebugInfo-related alignment variables, that are still uint64_t, to uint32_t. Original change introduced in r284482. llvm-svn: 285242	2016-10-26 21:32:29 +00:00
Tim Northover	a9cc385664	ARM: don't rely on push/pop reglists being in order when folding SP adjust. It would be a very nice invariant to rely on, but unfortunately it doesn't necessarily hold (and the causes of mis-sorted reglists appear to be quite varied) so to be robust the frame lowering code can't assume that the first register in the list is also the first one that actually gets pushed. Should fix an issue where we were turning something like: push {r8, r4, r7, lr} sub sp, #24 into nonsense like: push {r2, r3, r4, r5, r6, r7, r8, r4, r7, lr} llvm-svn: 285232	2016-10-26 20:01:00 +00:00
Nemanja Ivanovic	275853e777	Do not assume that FP vector operands are never legalized by expanding This patch ensures that if a floating point vector operand is legalized by expanding, it is legalized through the stack rather than by calling DAGTypeLegalizer::IntegerToVector which will cause a failure since the operand is a non-integer type. This fixes PR 30715. llvm-svn: 285231	2016-10-26 19:51:35 +00:00
Sanjoy Das	01969218a4	Simplify `x >=u x >> y` and `x >=u x udiv y` Summary: Extends InstSimplify to handle both `x >=u x >> y` and `x >=u x udiv y`. This is a folloup of rL258422 and https://github.com/rust-lang/rust/pull/30917 where llvm failed to optimize away the bounds checking in a binary search. Patch by Arthur Silva! Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25941 llvm-svn: 285228	2016-10-26 19:18:43 +00:00
Chad Rosier	4447d7a816	Revert "[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory." This reverts commit r285191. LICM appears to rely on the Alias Set Tracker hitting lifetime markers to prevent code from being moved outside of the original scope. llvm-svn: 285227	2016-10-26 19:18:19 +00:00
Nemanja Ivanovic	0f45998bc6	[PowerPC] Implement vec_insert_exp builtins - llvm portion This revision corresponds to review: https://reviews.llvm.org/D25957. Committing on behalf of Zaara Syeda. llvm-svn: 285225	2016-10-26 19:03:40 +00:00
Kostya Serebryany	2fabecaee3	[libFuzzer] simplify TracePC::HandleTrace even further. Also, when dealing with -exit_on_src_pos, symbolize every PC only once llvm-svn: 285223	2016-10-26 18:52:04 +00:00
Chad Rosier	0c621fda0d	[AArch64] Avoid materializing constant 1 when generating cneg instructions. Instead of cmp w0, #1 orr w8, wzr, #0x1 cneg w0, w8, ne we now generate cmp w0, #1 csinv w0, w0, wzr, eq PR28965 llvm-svn: 285217	2016-10-26 18:15:32 +00:00
Dan Gohman	68a423bf84	[WebAssembly] Update the README.txt. Update the README.txt with newer information, add a link to the Emscripten page explaining the current easiest way to use the LLVM wasm backend, and mention that other ways of using the LLVM wasm backend are in development. llvm-svn: 285215	2016-10-26 17:44:09 +00:00
Nirav Dave	e2369aafb9	[MC] Fix comma typo in .loc parsing llvm-svn: 285214	2016-10-26 17:28:58 +00:00
Robert Lougher	660f2f9560	Reapply: "Remove debug location from common tail when tail-merging" This reapplies revision 285093. Original commit message: The branch folding pass tail merges blocks into a common-tail. However, the tail retains the debug information from one of the original inputs to the merge (chosen randomly). This is a problem for sampled-based PGO, as hits on the common-tail will be attributed to whichever block was chosen, irrespective of which path was actually taken to the common-tail. This patch fixes the issue by nulling the debug location for the common-tail. Differential Revision: https://reviews.llvm.org/D25742 llvm-svn: 285212	2016-10-26 17:01:47 +00:00
Yaxun Liu	94add85adb	AMDGPU: Refactor processor definition to use ISA version features Add missing ISA versions 7.0.2/8.0.4/8.1.0. to backend. Refactor processor definition to use ISA version features. Fixed ISA version for stoney. Based on Laurent Morichetti's patch. Differential Revision: https://reviews.llvm.org/D25919 llvm-svn: 285210	2016-10-26 16:37:56 +00:00
Dehao Chen	e713000eb6	Introduce updateDiscriminator interface to DILocation to make it cleaner assigning discriminators. Summary: This patch introduces updateDiscriminator to DILocation so that it can be directly called by AddDiscriminator. It also makes it easier to update the discriminator later. Reviewers: dnovillo, dblaikie, aprantl, echristo Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25959 llvm-svn: 285207	2016-10-26 15:48:45 +00:00
Matt Arsenault	39787bdcbb	Reapply "AMDGPU: Don't use offen if it is 0" This reverts r283003 llvm-svn: 285203	2016-10-26 15:08:16 +00:00
Matt Arsenault	1110f14b42	AMDGPU: Fix counting si_mask_branch as 4 bytes llvm-svn: 285202	2016-10-26 14:53:54 +00:00
Matt Arsenault	8fac501602	Fix nondeterministic output in local stack slot alloc pass This finds all of the references to a frame index in a function, and sorts by the offset. If multiple instructions use the same offset, nothing was breaking the tie for sorting. This avoids the test failures the reverted r282999 introduced. llvm-svn: 285201	2016-10-26 14:53:50 +00:00
Sanjay Patel	8d7196bfde	[InstCombine] clean up commonCastTransforms; NFC 1. Use 'auto' with dyn_cast. 2. Variables start with a capital letter. 3. Use proper punctuation in comments. llvm-svn: 285200	2016-10-26 14:52:35 +00:00
Tom Stellard	284cf32ab4	LegalizeDAG: Support promoting [US]DIV and [US]REM operations Summary: AMDGPU will need this one i16 is added as a legal type. This is tested by: test/CodeGen/AMDGPU/sdiv.ll test/CodeGen/AMDGPU/sdivrem24.ll test/CodeGen/AMDGPU/udiv.ll test/CodeGen/AMDGPU/udivrem24.ll Reviewers: bogner, efriedma Subscribers: efriedma, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D25699 llvm-svn: 285199	2016-10-26 14:52:25 +00:00
Tom Stellard	f8e6eaff6e	AMDGPU/SI: Don't emit multi-dword flat memory ops when they might access scratch Summary: A single flat memory operations that might access the scratch buffer can only access MaxPrivateElementSize bytes. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25788 llvm-svn: 285198	2016-10-26 14:38:47 +00:00
Zvi Rackover	aa3402b41e	[X86] AVX512 fallback for floating-point scalar selects Summary: In the case where of 'select i1 , f32, f32' or select i1, f64, f64 prefer lowering to masked-moves over branches. Fixes pr30561 Reviewers: igorb, aymanmus, delena Differential Revision: https://reviews.llvm.org/D25310 llvm-svn: 285196	2016-10-26 14:12:46 +00:00
Chad Rosier	1408628ffa	[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory. Differential Revision: https://reviews.llvm.org/D25969 llvm-svn: 285191	2016-10-26 12:42:11 +00:00
Victor Leschuk	3c9899842b	DebugInfo: support for DWARFv5 DW_AT_alignment attribute * Assume that clang passes non-zero alignment value to DIBuilder only in case when it was forced by C++11 'alignas', C11 '_Alignas' or compiler attribute '__attribute__((aligned (N)))'. * Emit DW_AT_alignment if alignment is specified for type/object. Differential Revision: https://reviews.llvm.org/D24425 llvm-svn: 285189	2016-10-26 11:59:03 +00:00
Andrea Di Biagio	9bcb064f19	[IndVarSimplify][DebugLoc] When widening the exit loop condition, correctly reuse the debug location of the original comparison. When the loop exit condition is canonicalized as a != compaison, reuse the debug location of the original (non canonical) comparison. Before this patch, the debug location of the new icmp was obtained from the loop latch terminator. This patch fixes the issue by correctly setting the IRBuilder's "current debug location" to the location of the original compare. Differential Revision: https://reviews.llvm.org/D25953 llvm-svn: 285185	2016-10-26 10:28:32 +00:00
Vassil Vassilev	df5042ab61	Revert r285181 "DebugInfo: support for DWARFv5 DW_AT_alignment attribute". The commit broke the builds. llvm-svn: 285183	2016-10-26 10:13:47 +00:00
Victor Leschuk	e398c6afa9	DebugInfo: support for DWARFv5 DW_AT_alignment attribute * Assume that clang passes non-zero alignment value to DIBuilder only in case when it was forced by C++11 'alignas', C11 '_Alignas' or compiler attribute '__attribute__((aligned (N)))'. * Emit DW_AT_alignment if alignment is specified for type/object. Differential Revision: https://reviews.llvm.org/D24425 llvm-svn: 285181	2016-10-26 08:55:27 +00:00
Craig Topper	812d3d30ae	[AVX-512] Add scalar vfmsub/vfnmsub mask3 intrinsics Summary: Clang's intrinsic header currently tries to negate the third operand of a vfmadd mask3 in order to create vfmsub, but this fails isel. This patch adds scalar vfmsub and vfnmsub mask3 that we can use instead to avoid the negate. This is consistent with the packed instructions. Reviewers: igorb, delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25933 llvm-svn: 285173	2016-10-26 04:59:58 +00:00
Peter Collingbourne	7b7bac367c	Cloning: Also clone global variable attached metadata. llvm-svn: 285161	2016-10-26 02:57:33 +00:00
Kostya Serebryany	06b8757b57	[libFuzzer] simplify the code in TracePC::HandleTrace a bit more llvm-svn: 285147	2016-10-26 00:42:52 +00:00
Kostya Serebryany	a5b2e54fcb	[libFuzzer] simplify the code to print new PCs llvm-svn: 285145	2016-10-26 00:20:51 +00:00
Evgeniy Stepanov	ea6d49d3ee	Utility functions for appending to llvm.used/llvm.compiler.used. llvm-svn: 285143	2016-10-25 23:53:31 +00:00
Kostya Serebryany	275e260258	[libFuzzer] simplify the code in TracePC::HandleTrace llvm-svn: 285142	2016-10-25 23:52:25 +00:00
Kostya Serebryany	117976818e	[libFuzzer] add StandaloneFuzzTargetMain.c and a test for it llvm-svn: 285135	2016-10-25 22:30:34 +00:00
James Y Knight	2e64b8b79e	[Sparc] Don't overlap variable-sized allocas with other stack variables. On SparcV8, it was previously the case that a variable-sized alloca might overlap by 4-bytes the last fixed stack variable, effectively because 92 (the number of bytes reserved for the register spill area) != 96 (the offset added to SP for where to start a DYNAMIC_STACKALLOC). It's not as simple as changing 96 to 92, because variables that should be 8-byte aligned would then be misaligned. For now, simply increase the allocation size by 8 bytes for each dynamic allocation -- wastes space, but at least doesn't overlap. As the large comment says, doing this more efficiently will require larger changes in llvm. Also adds some test cases showing that we continue to not support dynamic stack allocation and over-alignment in the same function. llvm-svn: 285131	2016-10-25 22:13:28 +00:00
Bob Haarman	26a87bd030	[codeview] support emitting indirect virtual base class information Summary: Fixes PR28281. MSVC lists indirect virtual base classes in the field list of a class, using LF_IVBCLASS records. This change makes LLVM emit such records when processing DW_TAG_inheritance tags with the DIFlagVirtual and (newly introduced) DIFlagIndirect tags. Reviewers: rnk, ruiu, zturner Differential Revision: https://reviews.llvm.org/D25578 llvm-svn: 285130	2016-10-25 22:11:52 +00:00
Simon Pilgrim	de86241a09	[DAGCombiner] Enable (urem x, (shl pow2, y)) -> (and x, (add (shl pow2, y), -1)) combine for splatted vectors llvm-svn: 285129	2016-10-25 22:01:09 +00:00
Rong Xu	33308f92eb	[PGO] Fix select instruction annotation Summary: Select instruction annotation in IR PGO uses the edge count to infer the branch count. It's currently placed in setInstrumentedCounts() where no all the BB counts have been computed. This leads to wrong branch weights. Move the annotation after all BB counts are populated. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25961 llvm-svn: 285128	2016-10-25 21:47:24 +00:00
Simon Pilgrim	f534573e8c	[DAGCombiner] Enable srem(x.y) -> urem(x,y) combine for vectors SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927 llvm-svn: 285123	2016-10-25 21:20:18 +00:00
Simon Pilgrim	4ebb04510a	[DAGCombiner] Enable sdiv(x.y) -> udiv(x,y) combine for vectors SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927 llvm-svn: 285118	2016-10-25 20:56:42 +00:00
Guozhi Wei	ae541f6a71	[InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996 The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996. The problem is with following code xB = load (type B); xA = load (type A); +yA = (A)xB; B -> A +zAn = PHI[yA, xA]; PHI +zBn = (B)zAn; // A -> B store zAn; store zBn; optimizeBitCastFromPhi generates +zBn = (B)zAn; // A -> B and expects it will be combined with the following store instruction to another store zAn Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely. optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions. Differential Revision: https://reviews.llvm.org/D23896 llvm-svn: 285116	2016-10-25 20:43:42 +00:00
Robert Lougher	3080d71fc8	revert: "Remove debug location from common tail when tail-merging" This reverts r285093, as it caused unexpected buildbot failures on clang-ppc64le-linux, clang-ppc64be-linux, clang-ppc64be-linux-multistage and clang-ppc64be-linux-lnt. Failing test ubsan/TestCases/TypeCheck/vptr.cpp. llvm-svn: 285110	2016-10-25 20:17:58 +00:00
Kostya Serebryany	c48c93184a	[libFuzzer] when mutating based on CMP traces also try adding +/- 1 to the desired bytes. Add another test for use_cmp llvm-svn: 285109	2016-10-25 20:15:15 +00:00
Sanjay Patel	f3dda13bd2	[InstCombine] Ensure that truncated int types are legal. Fixes the FIXMEs in D25952 and rL285075. Patch by bryant! Differential Revision: https://reviews.llvm.org/D25955 llvm-svn: 285108	2016-10-25 20:11:47 +00:00
Evandro Menezes	7696dc0685	[AArch64] Adjust the cost model for Exynos M1. Modify the maximum jump table size. llvm-svn: 285106	2016-10-25 20:05:42 +00:00
Tim Shen	85de51db85	[APFloat] Make APFloat an interface class to the internal IEEEFloat. NFC. Summary: The intention is to make APFloat an interface class, so that later I can add a second implementation class DoubleAPFloat to correctly implement PPCDoubleDouble semantic. The interface of IEEEFloat is not public, and can be simplified (currently it's exactly the same as the old APFloat), but that belongs to a separate patch. DoubleAPFloat should look like: class DoubleAPFloat { const fltSemantics *Semantics; std::unique_ptr<APFloat> APFloats; // Two heap-allocated APFloats. }; There is no functional change, nor public interface change. Reviewers: hfinkel, chandlerc, iteratee, echristo, kbarton Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D25536 llvm-svn: 285105	2016-10-25 19:55:59 +00:00
Evandro Menezes	eb97e3554c	Add option to specify minimum number of entries for jump tables Add an option to allow easier experimentation by target maintainers with the minimum number of entries to create jump tables. Also clarify the name of the other existing option governing the creation of jump tables. Differential revision: https://reviews.llvm.org/D25883 llvm-svn: 285104	2016-10-25 19:53:51 +00:00
Evandro Menezes	601f4cb9f7	Switch lowering: improve partitioning of jump tables When there's a tie between partitionings of jump tables, consider also cases that result in no jump tables, but in one or a few cases. The motivation is that many contemporary processors typically perform case switches fairly quickly. Differential revision: https://reviews.llvm.org/D25212 llvm-svn: 285099	2016-10-25 19:11:43 +00:00
Matthew Simpson	c62266d680	[LV] Sink scalar operands of predicated instructions When we predicate an instruction (div, rem, store) we place the instruction in its own basic block within the vectorized loop. If a predicated instruction has scalar operands, it's possible to recursively sink these scalar expressions into the predicated block so that they might avoid execution. This patch sinks as much scalar computation as possible into predicated blocks. We previously were able to sink such operands only if they were extractelement instructions. Differential Revision: https://reviews.llvm.org/D25632 llvm-svn: 285097	2016-10-25 18:59:45 +00:00
Michael Ilseman	e542804343	Add -strip-nonlinetable-debuginfo capability This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. Thanks to Adrian Prantl for stewarding this patch! llvm-svn: 285094	2016-10-25 18:44:13 +00:00
Robert Lougher	e32564774c	Remove debug location from common tail when tail-merging The branch folding pass tail merges blocks into a common-tail. However, the tail retains the debug information from one of the original inputs to the merge (chosen randomly). This is a problem for sampled-based PGO, as hits on the common-tail will be attributed to whichever block was chosen, irrespective of which path was actually taken to the common-tail. This patch fixes the issue by nulling the debug location for the common-tail. Differential Revision: https://reviews.llvm.org/D25742 llvm-svn: 285093	2016-10-25 18:44:07 +00:00
Michael Kuperstein	cffedc4a94	Fix 80-char violations. NFC. llvm-svn: 285092	2016-10-25 18:31:23 +00:00
Dan Gohman	f50d964bdb	[WebAssembly] Add immediate fields to call_indirect and memory operators. call_indirect, grow_memory, and current_memory now have immediate operands in the 0xd binary encoding. llvm-svn: 285085	2016-10-25 16:55:52 +00:00
Dehao Chen	c1472b5092	Move discriminator assignment to where it is used. (NFC) llvm-svn: 285084	2016-10-25 16:50:27 +00:00
Andrea Di Biagio	824cabd06d	[IndVarSimplify][Dwarf] When widening the IV increment, correctly set the debug loc. When indvars widened an induction variable, the debug location for the loop increment computation was incorrectly set equal to the debug loc of the loop latch terminator. This patch fixes the issue by propagating the correct location from the original loop increment instruction to the new widened increment. Differential Revision: https://reviews.llvm.org/D25872 llvm-svn: 285083	2016-10-25 16:45:17 +00:00
Pavel Labath	ec534e6dfe	Replace TimeValue by TimePoint in LegacyPassManager. NFC. llvm-svn: 285081	2016-10-25 16:20:07 +00:00
Geoff Berry	91e9a5cc23	[EarlyCSE] Make MemorySSA memory dependency check more aggressive. Now that MemorySSA keeps track of whether MemoryUses are optimized, use getClobberingMemoryAccess() to check MemoryUse memory dependencies since it should no longer be so expensive. This is a follow-up change to https://reviews.llvm.org/D25881 llvm-svn: 285080	2016-10-25 16:18:47 +00:00
Sanjay Patel	e3de152530	fix formatting; NFC llvm-svn: 285078	2016-10-25 16:12:31 +00:00
Ulrich Weigand	7bdb485e18	[SystemZ] Do not use LOC(G) for volatile loads It is not safe to use LOAD ON CONDITION to implement access to a memory location marked "volatile", since the architecture leaves it unspecified whether or not an access happens if the condition is false. The current code already appears to care about that: def LOC : CondUnaryRSY<"loc", 0xEBF2, nonvolatile_load, GR32, 4>; Unfortunately, that "nonvolatile_load" operator is simply ignored by the CondUnaryRSY class, and there was no test to catch it. llvm-svn: 285077	2016-10-25 15:39:15 +00:00
Sanjay Patel	d59f7f9047	[InstCombine] add test and code comment to show potentially misguided icmp trunc transform llvm-svn: 285075	2016-10-25 15:16:39 +00:00
Simon Pilgrim	5c3c9707c3	[X86][SSE] Add support for (V)PMOVSX* constant folding We already have (V)PMOVZX* combining support, this is the beginning of handling (V)PMOVSX* similarly - other combines in combineVSZext can be generalized in future patches. This unearthed an interesting bug in that we were generating illegal build vectors on 32-bit targets - it was proving difficult to create a test for it from PMOVZX, but it fired immediately with PMOVSX. I've created a more general form of the existing getConstVector to handle these cases - ideally this should be handled in non-target-specific code but I couldn't find an equivalent. Differential Revision: https://reviews.llvm.org/D25874 llvm-svn: 285072	2016-10-25 14:29:25 +00:00
Rafael Espindola	20aa1779d0	fix warning llvm-svn: 285064	2016-10-25 12:28:26 +00:00
Zvi Rackover	124470a202	[DAGCombine] Preserve shuffles when one of the vector operands is constant Summary: Do not perform combines such as: vector_shuffle<4,1,2,3>(build_vector(Ud, C0, C1 C2), scalar_to_vector(X)) -> build_vector(X, C0, C1, C2) Keeping the shuffle allows lowering the constant build_vector to a materialized constant vector (such as a vector-load from the constant-pool or some other idiom). Reviewers: delena, igorb, spatel, mkuper, andreadb, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25524 llvm-svn: 285063	2016-10-25 12:14:19 +00:00
Rafael Espindola	7912110ddc	Make the LTO comdat api more symbol table friendly. In an IR symbol table I would expect the comdats to be represented as: - A table of strings, one for each comdat name. - Each symbol has an optional index into that table. The natural api for accessing that would be InputFile: ArrayRef<StringRef> getComdatTable() const; Symbol: int getComdatIndex() const; This patch implements an API as close to that as possible. The implementation on top of the current IRObjectFile is a bit hackish, but should map just fine over a symbol table and is very convenient to use. llvm-svn: 285061	2016-10-25 12:02:03 +00:00
Benjamin Kramer	7df3043db3	Fix an unused warning in WebAssemblyInstPrinter with NDEBUG. Patch by Sam McCall! Differential Revision: https://reviews.llvm.org/D25934 llvm-svn: 285055	2016-10-25 09:08:50 +00:00
Craig Topper	01e4667e02	[AVX-512] Add support for creating SIGN_EXTEND_VECTOR_INREG and ZERO_EXTEND_VECTOR_INREG for 512-bit vectors to support vpmovzxbq and vpmovsxbq. Summary: The one tricky thing about this is that the sign/zero_extend_inreg uses v64i8 as an input type which isn't legal without BWI support. Though the vpmovsxbq and vpmovzxbq instructions themselves don't require BWI. To support this we need to add custom lowering for ZERO_EXTEND_VECTOR_INREG with v64i8 input. This can mostly reuse the existing sign extend code with a couple checks for sign extend vs zero extend added. Reviewers: delena, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25594 llvm-svn: 285053	2016-10-25 04:00:29 +00:00
Peter Collingbourne	4f3b2df9bb	GlobalDCE: Restore a statement accidentally removed in r285048. llvm-svn: 285052	2016-10-25 02:57:27 +00:00
Matthias Braun	c8440dddb2	MachineInstrBundle: Pass iterators to getBundle(Start\|End); NFC This is a function to go backwards in a block to find the first instruction in a bundle, so iterator is a more natural choice for parameter/return rather than a reference to a MachineInstruction. llvm-svn: 285051	2016-10-25 02:55:17 +00:00
Peter Collingbourne	ca7664e761	IR: Deduplicate getParent() functions on derived classes of GlobalValue into the base class. NFCI. llvm-svn: 285050	2016-10-25 02:54:08 +00:00
Kostya Serebryany	3364f90783	[libFuzzer] simplify the code for use_cmp, also use the position hint when available, add a test llvm-svn: 285049	2016-10-25 02:04:43 +00:00
Peter Collingbourne	7695cb6da8	GlobalDCE: Deduplicate code. NFCI. llvm-svn: 285048	2016-10-25 01:58:26 +00:00
Dan Gohman	48abaa9c74	[WebAssembly] Reorder load/store operands to match binary encoding. The p2align operand of a load/store is encoded before the offset operand; reorder the MachineInstr operands accordingly. llvm-svn: 285044	2016-10-25 00:17:11 +00:00
Dan Gohman	3acb187d95	[WebAssembly] Implement more WebAssembly binary encoding. This changes locals from being declared by the emitLocal hook in WebAssemblyTargetStreamer, rather than with an instruction. After exploring the infastructure in LLVM more, this seems to make more sense since declaring locals doesn't use an encoded opcode. This also adds more 0xd opcodes, type encodings, and miscellaneous binary encoding bits. llvm-svn: 285040	2016-10-24 23:27:49 +00:00
Matthias Braun	8b38ffaa98	CodeGen/Passes: Pass MachineFunction as functor arg; NFC Passing a MachineFunction as argument is more natural and avoids an unnecessary round-trip through the logic determining the correct Subtarget because MachineFunction already has a reference anyway. llvm-svn: 285039	2016-10-24 23:23:02 +00:00
Eli Friedman	c5b7262073	Fix regression from my recent GlobalsAA fix. There are two fixes here: one, AnalyzeUsesOfPointer can't return false until it has checked all the uses of the pointer. Two, if a global uses another global, we have to assume the address of the first global escapes. Fixes https://llvm.org/bugs/show_bug.cgi?id=30707 . Differential Revision: https://reviews.llvm.org/D25798 llvm-svn: 285034	2016-10-24 21:47:44 +00:00
Simon Pilgrim	e3e6585c2d	[SelectionDAG] Update ComputeNumSignBits SRA/SHL handlers to accept scalar or vector splats Use isConstOrConstSplat helper. Also use APInt instead of getZExtValue directly to avoid out of range issues. llvm-svn: 285033	2016-10-24 21:47:19 +00:00
Matthias Braun	fc371558a0	Use MachineInstr::mop_iterator instead of MIOperands; NFC (Const)?MIOperands is equivalent to the C++ style MachineInstr::mop_iterator. Use the latter for consistency except for a few callers of MIOperands::analyzePhysReg(). llvm-svn: 285029	2016-10-24 21:36:43 +00:00
Kevin Enderby	79d6c63f61	nother additional error check for an invalid Mach-O file when contained in a Mach-O universal file and the cputypes in both headers don’t match. llvm-svn: 285026	2016-10-24 21:15:11 +00:00
Simon Pilgrim	d8ec09c74f	Use SDValue::getConstantOperandVal() helper. NFCI. llvm-svn: 285025	2016-10-24 20:56:52 +00:00
Dan Gohman	5d3391f859	[WebAssembly] Fix a broken URL. llvm-svn: 285017	2016-10-24 20:35:17 +00:00
Dan Gohman	4becc58587	[WebAssembly] Define the `end` opcode value. CFGStackify differentiates between END_LOOP and END_BLOCK, but wasm itself doesn't. For now, just use the same opcode for both. llvm-svn: 285016	2016-10-24 20:32:04 +00:00
Dan Gohman	c968297b95	[WebAssembly] Update opcode values according to recent spec changes. This corresponds to the "0xd" opcode renumbering. llvm-svn: 285014	2016-10-24 20:21:49 +00:00
Dan Gohman	4fc4e42dea	[WebAssembly] Add an option to make get_local/set_local explicit. This patch adds a pass, controlled by an option and off by default for now, for making implicit get_local/set_local explicit. This simplifies emitting wasm with MC. Differential Revision: https://reviews.llvm.org/D25836 llvm-svn: 285009	2016-10-24 19:49:43 +00:00
Davide Italiano	c3e0ce8f85	Merge two if conditions into one. NFCI. llvm-svn: 285008	2016-10-24 19:41:47 +00:00
Peter Collingbourne	6733564e5a	Target: Change various section classifiers in TargetLoweringObjectFile to take a GlobalObject. These functions are about classifying a global which will actually be emitted, so it does not make sense for them to take a GlobalValue which may for example be an alias. Change the Mach-O object writer and the Hexagon, Lanai and MIPS backends to look through aliases before using TargetLoweringObjectFile interfaces. These are functional changes but all appear to be bug fixes. Differential Revision: https://reviews.llvm.org/D25917 llvm-svn: 285006	2016-10-24 19:23:39 +00:00
Peter Collingbourne	16e9b944e9	CodeGen: Do not add a global's address space to the folding set profile. It is already part of the type (which is part of the global, which is already being added), so there's no need to do it. llvm-svn: 285002	2016-10-24 18:56:09 +00:00
Adrian Prantl	28d2d281e7	add-discriminators: Fix handling of lexical scopes. This fixes a bug in the handling of lexical scopes, when more than one scope is defined on the same line or functions are inlined into call sites that are on the same line as the function definition. This situation can easily happen in macro expansions. The problem is solved by introducing a SmallDenseMap<DIScope , DILexicalBlockFile , 1> that keeps track of all the different lexical scopes that share a line/file location. Fixes PR30681. llvm-svn: 284998	2016-10-24 18:23:51 +00:00
Krzysztof Parzyszek	eb6172404d	Revert r284972 and remove other defaulted copy/move constructors/= David Blaikie pointed out that we get them for free without having to write anything. llvm-svn: 284996	2016-10-24 17:40:46 +00:00
Ehsan Amiri	c90b02cf50	[PPC] Generate positive FP zero using xor insn instead of loading from constant area https://reviews.llvm.org/D23614 Currently we load +0.0 from constant area. That can change to be generated using XOR instruction. llvm-svn: 284995	2016-10-24 17:31:09 +00:00
Eli Friedman	b37864b58d	Revert r284580+r284917. ("Synthesize TBB/TBH instructions") The optimization has correctness issues, so reverting for now to fix tests on thumb1 targets. llvm-svn: 284993	2016-10-24 17:20:50 +00:00
Simon Pilgrim	a072e375b5	Removed FIXME from include ordering comment Nothing to fix, it's just the way it has to be. llvm-svn: 284991	2016-10-24 17:15:05 +00:00
Rong Xu	b05bac940d	Check the number of Args in LibCallsShrinkWrap. Some library fucntions can have no argument. llvm-svn: 284989	2016-10-24 16:50:12 +00:00
Evandro Menezes	eff2bd9d4f	[AArch64] Optionally use the Newton series for reciprocal estimation Add support for estimating the square root or its reciprocal and division or reciprocal using the combiner generic Newton series. Differential revision: https://reviews.llvm.org/D25291 llvm-svn: 284986	2016-10-24 16:14:58 +00:00
Geoff Berry	6815468768	[EarlyCSE] Optimize MemoryPhis and reduce memory clobber queries w/ MemorySSA Summary: When using MemorySSA, re-optimize MemoryPhis when removing a store since this may create MemoryPhis with all identical arguments. Also, when using MemorySSA to check if two MemoryUses are reading from the same version of the heap, use the defining access instead of calling getClobberingAccess, since the latter can currently result in many more AA calls. Once the MemorySSA use optimization tracking changes are done, we can remove this limitation, which should result in more loads being CSE'd. Reviewers: dberlin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D25881 llvm-svn: 284984	2016-10-24 15:54:00 +00:00
Ehsan Amiri	1f31e9157d	[PPC] Better codegen for AND, ANY_EXT, SRL sequence https://reviews.llvm.org/D24924 This improves the code generated for a sequence of AND, ANY_EXT, SRL instructions. This is a targetted fix for this special pattern. The pattern is generated by target independet dag combiner and so a more general fix may not be necessary. If we come across other similar cases, some ideas for handling it are discussed on the code review. llvm-svn: 284983	2016-10-24 15:46:58 +00:00
Nicolai Haehnle	a785209bc2	AMDGPU: Fix Two Address problems with v_movreld Summary: The v_movreld machine instruction is used with three operands that are in a sense tied to each other (the explicit VGPR_32 def and the implicit VGPR_NN def and use). There is no way to express that using the currently available operand bits, and indeed there are cases where the Two Address instructions pass does the wrong thing. This patch introduces a new set of pseudo instructions that are identical in intended semantics as v_movreld, but they only have two tied operands. Having to add a new set of pseudo instructions is admittedly annoying, but it's a fairly straightforward and solid approach. The only alternative I see is to try to teach the Two Address instructions pass about Three Address instructions, and I'm afraid that's trickier and is going to end up more fragile. Note that v_movrels does not suffer from this problem, and so this patch does not touch it. This fixes several GL45-CTS.shaders.indexing.* tests. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25633 llvm-svn: 284980	2016-10-24 14:56:02 +00:00
Nico Weber	b38d341106	Revert 284971. It seems to break selfhost on some bots, see e.g. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/21 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/20 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/22 llvm-svn: 284979	2016-10-24 14:52:04 +00:00
Nirav Dave	1a9044b782	[MC] Fix Various End Of Line Comment checkings Fix AsmParser lines to correctly handle end-of-line pre-processor comments parsing when '#' is not the assembly line comment prefix. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25567 llvm-svn: 284978	2016-10-24 14:35:29 +00:00
Pavel Labath	676a875b06	[Chrono] Fix !HAVE_FUTIMENS build If we don't have futimens(), we fall back to futimes(), which only supports microsecond timestamps. In that case, we need to explicitly cast away the extra precision in setLastModificationAndAccessTime(). llvm-svn: 284977	2016-10-24 14:19:28 +00:00
Pavel Labath	51c454c1a9	Remove unused #includes of TimeValue.h. NFC. llvm-svn: 284975	2016-10-24 14:00:26 +00:00
Pavel Labath	bff47b51b6	[Object] Replace TimeValue with std::chrono Summary: Most of the changes are very straight-forward. The only choice I had to make was to use second-precision time points in the Archive classes. I did this because the archive files use that precision in the on-disk representation anyway. Reviewers: rafael, zturner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25773 llvm-svn: 284974	2016-10-24 13:38:27 +00:00
Joel Jones	504bf334b0	AArch64 ILP32 relocations for assembly and ELF Summary: Add relocations for AArch64 ILP32. Includes: - Addition of definitions for R_AARCH32_* - Definition of new -target-abi: ilp32 - Definition of data layout string - Tests for added relocations. Not comprehensive, but matches existing tests for 64-bit. Renames "CHECK-OBJ" to "CHECK-OBJ-LP64". - Tests for llvm-readobj Reviewers: zatrazz, peter.smith, echristo, t.p.northover Subscribers: aemerson, rengolin, mehdi_amini Differential Revision: https://reviews.llvm.org/D25159 llvm-svn: 284973	2016-10-24 13:37:13 +00:00
Krzysztof Parzyszek	f74683f930	[RDF] Add default move constructors/assignment operators llvm-svn: 284972	2016-10-24 13:15:20 +00:00
Pablo Barrio	f9e0d0b7d0	[JumpThreading] Unfold selects that depend on the same condition Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: reames, bkramer, mcrosier, gberry, haicheng, jmolloy, sebpop Subscribers: jojo, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D25477 llvm-svn: 284971	2016-10-24 13:04:45 +00:00
Simon Pilgrim	6d2de6aa9e	Fix windows builds by swapping windows.h and wincrypt.h ordering. We need to include windows.h first even though it breaks default include ordering rules llvm-svn: 284968	2016-10-24 12:39:23 +00:00
Pavel Labath	757ca886cd	Remove TimeValue usage from llvm/Support Summary: This is a follow-up to D25416. It removes all usages of TimeValue from llvm/Support library (except for the actual TimeValue declaration), and replaces them with appropriate usages of std::chrono. To facilitate this, I have added small utility functions for converting time points and durations into appropriate OS-specific types (FILETIME, struct timespec, ...). Reviewers: zturner, mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25730 llvm-svn: 284966	2016-10-24 10:59:17 +00:00
Simon Dardis	9c34854833	[mips] synci microMIPS instruction definition. Add synci to the microMIPS instruction definitions, mark the MIPS sync & synci as not being part of microMIPS. This does not cover the sync instruction alias, as that will be handled with a different patch. Add sync to the valid tests for microMIPS. Reviewers: vkalintiris Differential Revision: https://reviews.llvm.org/D25795 llvm-svn: 284962	2016-10-24 10:23:59 +00:00
Craig Topper	8ec5c7326d	[AVX-512] Remove masked pmin/pmax intrinsics and autoupgrade to native IR. Clang patch to replace 512-bit vector and 64-bit element versions with native IR will follow. llvm-svn: 284955	2016-10-24 04:04:16 +00:00
Sanjay Patel	9ca028c2d6	[DAG] enhance computeKnownBits to handle SRL/SRA with vector splat constant llvm-svn: 284953	2016-10-23 23:13:31 +00:00
Simon Pilgrim	d06641d3dc	Use SDValue::getConstantOperandVal() helper. NFCI. llvm-svn: 284949	2016-10-23 20:17:21 +00:00
Justin Lebar	a45e6fde58	Remove LLVM_CONSTEXPR. Summary: With MSVC 2013 and GCC < 4.8 gone, we can use the "constexpr" keyword. Reviewers: bkramer, mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25901 llvm-svn: 284947	2016-10-23 19:39:16 +00:00
Simon Pilgrim	6ac1e98b09	[X86][SSE] Add SSE41/AVX1 costs for vector shifts. We were defaulting to SSE2 costs which weren't taking into account the availability of PBLENDW/PBLENDVB to improve merging of per-element shift results. llvm-svn: 284939	2016-10-23 16:49:04 +00:00
Simon Pilgrim	96ef0c1103	Use APInt::isAllOnesValue instead of popcnt. NFCI. More obvious implementation and faster too. llvm-svn: 284937	2016-10-23 15:09:44 +00:00
Dylan McKay	479a13c0aa	[AVR] Add the machine code disassembler This adds a super basic implementation of a machine code disassembler. It doesn't support any operands with custom encoding. llvm-svn: 284930	2016-10-22 23:57:59 +00:00
Simon Pilgrim	d3829c89bc	[X86][AVX512VL] Added support for combining target 256-bit shuffles to AVX512VL VPERMV3 llvm-svn: 284922	2016-10-22 20:15:39 +00:00
Simon Pilgrim	56c0524f0f	[X86][AVX512] Added support for combining target shuffles to AVX512 VPERMV3 llvm-svn: 284921	2016-10-22 19:53:59 +00:00
James Molloy	2bae8640d7	[ARM] Fix crash in ConstantIslands tPCRelJT may not be the first instruction in a block. Check that instead of dereferencing a broken iterator. llvm-svn: 284917	2016-10-22 09:58:37 +00:00
Craig Topper	b084c90a18	[X86] Add support for printing shuffle comments for VALIGN instructions. llvm-svn: 284915	2016-10-22 06:51:56 +00:00
Craig Topper	7b2b8db438	[X86] Add support for lowering v4i64 and v8i64 shuffles directly to PALIGNR. I think shuffle combine can figure it out later, but we should try to get it right up front. llvm-svn: 284914	2016-10-22 06:51:52 +00:00
Craig Topper	9f374533e3	[X86] Remove unnecessary AVX2 check that was already covered by an assertion earlier in the function. NFC llvm-svn: 284913	2016-10-22 06:51:49 +00:00
Craig Topper	bea5cb5491	[X86] Remove 128-bit lane handling from the main loop of matchVectorShuffleAsByteRotate. Instead check for is128LaneRepeatedSuffleMask before the loop and just loop over the repeated mask. I plan to use the loop to support VALIGND/Q shuffles so this makes it easier to reuse. llvm-svn: 284912	2016-10-22 06:51:44 +00:00
Simon Pilgrim	0d376bcbf0	[X86][SSE] Use getConstVector helper for VPERMV mask generation. NFCI. llvm-svn: 284911	2016-10-22 06:18:36 +00:00
Daniel Berlin	f5361139bb	Now that VS2013 is gone, make a memoryssa structure an anonymous union again llvm-svn: 284910	2016-10-22 04:15:41 +00:00
Kostya Serebryany	65f102d4d2	[libFuzzer] mutation: insert the size of the input in bytes as one of the ways to mutate a binary integer llvm-svn: 284909	2016-10-22 03:48:53 +00:00
Gerolf Hoflehner	9e2afa8bd7	[BasicAA] Fix - missed alias in GEP expressions In BasicAA GEP operand values get adjusted ("wrap-around") based on the pointersize. Otherwise, in non-64b modes, AA could report false negatives. However, a wrap-around is valid only for a fully evaluated expression. It had been introduced to fix an alias problem in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html. This commit restricts the wrap-around to constant gep operands only where the value is known at compile-time. llvm-svn: 284908	2016-10-22 02:41:39 +00:00
Davide Italiano	738837eed9	[CtorUtils] Modernize. No functional changes intended. llvm-svn: 284904	2016-10-22 01:21:24 +00:00
Kostya Serebryany	10ae9e23a3	[libFuzzer] typo in a test llvm-svn: 284903	2016-10-22 01:07:38 +00:00
Kostya Serebryany	2bfff021ad	[libFuzzer] add a test for asan's strict_string_checks=1 llvm-svn: 284902	2016-10-22 00:05:44 +00:00
Konstantin Zhuravlyov	fda33eaf0c	[AMDGPU] Perform uchar to float combine for ISD::SINT_TO_FP This will prevent following regression when enabling i16 support (D18049): test/CodeGen/AMDGPU/cvt_f32_ubyte.ll Differential Revision: https://reviews.llvm.org/D25805 llvm-svn: 284891	2016-10-21 22:10:03 +00:00
Tom Stellard	6c7dd980e4	AMDGPU/SI: Fix crash caused by r284267 Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25782 llvm-svn: 284875	2016-10-21 20:25:11 +00:00
Sanjay Patel	ca92c36e01	[DAG] enhance computeKnownBits to handle SHL with vector splat constant Also, use APInt to avoid crashing on types larger than vNi64. llvm-svn: 284874	2016-10-21 20:16:27 +00:00
Peter Collingbourne	ecdd58f1d6	Analysis: Move llvm::getConstantRangeFromMetadata to IR library. We're about to start using it there. Differential Revision: https://reviews.llvm.org/D25877 llvm-svn: 284865	2016-10-21 19:59:26 +00:00
Peter Collingbourne	e9bd49824d	X86: Improve BT instruction selection for 64-bit values. If a 64-bit value is tested against a bit which is known to be in the range [0..31) (modulo 64), we can use the 32-bit BT instruction, which has a slightly shorter encoding. Differential Revision: https://reviews.llvm.org/D25862 llvm-svn: 284864	2016-10-21 19:57:55 +00:00
Simon Pilgrim	ab48872313	[X86][AVX512BWVL] Added support for lowering v16i16 shuffles to AVX512BWVL vpermw llvm-svn: 284863	2016-10-21 19:54:38 +00:00
Bob Haarman	653baa2aaa	[pdb] added support for dumping globals stream Summary: This adds support for dumping the globals stream from PDB files using llvm-pdbdump, similar to the support we have for the publics stream. Reviewers: ruiu, zturner Subscribers: beanz, mgorny, modocache Differential Revision: https://reviews.llvm.org/D25801 llvm-svn: 284861	2016-10-21 19:43:19 +00:00
Simon Pilgrim	da814cba0d	[X86][AVX512BWVL] Added support for combining target v16i16 shuffles to AVX512BWVL vpermw llvm-svn: 284860	2016-10-21 19:40:29 +00:00
Simon Pilgrim	0109bf116f	[X86][AVX512] Added support for combining target shuffles to AVX512 vpermpd/vpermq/vpermps/vpermd/vpermw llvm-svn: 284858	2016-10-21 19:18:09 +00:00
Krzysztof Parzyszek	6e7fa99d3a	[RDF] Use RegisterId typedef more consistently, NFC llvm-svn: 284857	2016-10-21 19:12:13 +00:00
Anna Thomas	0860259434	[StripGCRelocates] New pass to remove gc.relocates added by RS4GC Summary: Utility pass to remove gc.relocates created by rewrite statepoints for GC. With respect to safepoint verification, the IR generated would be incorrect, and cannot run as such. This would be a single transformation on the final optimized IR. The benefit of the pass is for easy analysis when the IRs are 'polluted' by too many gc.relocates. Added tests. test run: All RS4GC tests with -verify option. Local downstream tests on large IR files. This also works when the pointer being gc.relocated is another gc.relocate. Reviewers: sanjoy, reames Subscribers: beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D25096 llvm-svn: 284855	2016-10-21 18:43:16 +00:00
Sanjay Patel	81029f6a76	[DAG] fold negation of sign-bit 0 - X --> 0, if the sub is NUW 0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW 0 - X --> X, if X is 0 or the minimum signed value This is the DAG equivalent of: https://reviews.llvm.org/rL284649 plus the fold for the NUW case which already existed in InstSimplify. Note that we miss a vector fold because of a deficiency in the DAG version of computeKnownBits(). llvm-svn: 284844	2016-10-21 17:24:26 +00:00
Krzysztof Parzyszek	b71085b547	[Hexagon] Handle spills of partially defined double vector registers After register allocation it is possible to have a spill of a register that is only partially defined. That in itself it fine, but creates a problem for double vector registers. Stores of such registers are pseudo instructions that are expanded into pairs of individual vector stores, and in case of a partially defined source, one of the stores may use an entirely undefined register. To avoid this, track the defined parts and only generate actual stores for those. llvm-svn: 284841	2016-10-21 16:38:29 +00:00
Derek Schuff	6f69783f1f	[WebAssembly] Fix for 0xc call_indirect changes Summary: Need to reorder the operands to have the callee as the last argument. Adds a pseudo-instruction, and a pass to lower it into a real call_indirect. This is the first of two options for how to fix the problem. Reviewers: dschuff, sunfish Subscribers: jfb, beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D25708 llvm-svn: 284840	2016-10-21 16:38:07 +00:00
Abderrazek Zaafrani	9daf8110c8	Set the vectorizer MaxInterleaveFactor for Exynos. llvm-svn: 284839	2016-10-21 16:28:27 +00:00
Reid Kleckner	ac2a2a86e4	Fix -Wunused-variable warning in libFuzzer llvm-svn: 284838	2016-10-21 16:26:27 +00:00
Simon Pilgrim	2d96daa885	[X86] Use DAG::getBuildVector helper wrapper where possible. NFCI. llvm-svn: 284835	2016-10-21 16:07:51 +00:00
Abderrazek Zaafrani	9f382f53d1	Test commit llvm-svn: 284832	2016-10-21 15:24:08 +00:00
Artur Pilipenko	47dc098c06	[LVI] Fix a bug with a guard being the very first instruction in a BB not taken into account While looking for guards use reverse iterator and scan up to rend() not to begin() llvm-svn: 284827	2016-10-21 15:02:21 +00:00
Sanjay Patel	501be9b3d7	fix variable names; NFCI Because we're just 'or-ing' these 2 variables later in the code, I don't think there's a logical bug here, but of course the string with "no size" is the one that should have the size suffix stripped off. llvm-svn: 284826	2016-10-21 14:58:30 +00:00
Artem Tamazov	751985a757	[AMDGPU][mc] Fix ds_min/max[_rtn]_f32 - extra source operand removed. Fixes Bug 28215. Lit tests updated. Differential Revision: https://reviews.llvm.org/D25837 llvm-svn: 284825	2016-10-21 14:49:22 +00:00
Sanjay Patel	cbaba93ce8	[DAG] use SDNode flags 'nsz' to enable fadd/fsub with zero folds As discussed in D24815, let's start the process of killing off the broken fast-math global state housed in TargetOptions and eliminate the need for function-level fast-math attributes. Here we enable two similar folds that are possible when we don't care about signed-zero: fadd nsz x, 0 --> x fsub nsz 0, x --> -x Note that although the test cases include a 'sin' function call, I'm side-stepping the FMF-on-calls question (and lack of support in the DAG) for now. It's not needed for these tests - isNegatibleForFree/GetNegatedExpression just look through a ISD::FSIN node. Also, when we create an FNEG node and propagate the Flags of the FSUB to it, this doesn't actually do anything today because Flags are silently dropped for any node that is not a binary operator. Differential Revision: https://reviews.llvm.org/D25297 llvm-svn: 284824	2016-10-21 14:36:58 +00:00
Simon Pilgrim	c98d99a600	[X86][AVX2] Begun generalizing lowering to VPERMD/VPERMPS in preparation for AVX512 support. llvm-svn: 284823	2016-10-21 13:00:47 +00:00
Simon Pilgrim	32b06235da	[X86][AVX512] Add mask/maskz writemask support to subvector broadcast shuffle decode comments llvm-svn: 284821	2016-10-21 12:14:24 +00:00
John Brawn	84b21835f1	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that either the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818	2016-10-21 11:08:48 +00:00
Bjorn Pettersson	9fcd605d1e	[AArch64] Corrected spill size for DDD register class. NFCI Summary: The spill size was incorrectly set to 196 bits, which isn't a multiple of 8. This problem was detected when experimenting with asserts that the spill size should be a multiple of the byte size. New corrected value for the spill size is set to 192 bits. Note that tablegen (RegisterInfoEmitter) will divide the size set in the RegisterClass definition by 8. So this change should not have any impact on the tablegen output (trunc(192/8) == trunc(196/8) == 24 bytes). Reviewers: t.p.northover Subscribers: llvm-commits, aemerson, rengolin Differential Revision: https://reviews.llvm.org/D25818 llvm-svn: 284814	2016-10-21 09:53:42 +00:00
Davide Italiano	d15477b09d	Revert "[GVN/PRE] Hoist global values outside of loops." There's no agreement about this patch. I personally find the PRE machinery of the current GVN hard enough to reason about that I'm not sure I'll try to land this again, instead of working on the rewrite). llvm-svn: 284796	2016-10-21 01:37:02 +00:00
Keno Fischer	b04df8eaa2	Fix cross-endianness RuntimeDyld relocation for ARM rL284780 fixed the PREL31 relocation and added a test for it. Being the first such test for ARM relocations, it exposed incorrect endianness assumptions (causing buildbot failures on big-endian hosts). Fix that by using the same helpers used for the x86 case. llvm-svn: 284789	2016-10-20 22:15:56 +00:00
Li Huang	fcfe8cd3ae	[SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV This is to avoid inlining too many multiplication operands into a SCEV, which could take exponential time in the worst case. Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin Differential Revision: https://reviews.llvm.org/D25794 llvm-svn: 284784	2016-10-20 21:38:39 +00:00
Keno Fischer	c32ffe3916	Fix PREL31 relocation on ARM Summary: This is a 31bits relative relocation instead of a 32bits absolute relocation. Reviewers: t.p.northover, peter.smith, rengolin Subscribers: aemerson, llvm-commits, samparker Differential Revision: https://reviews.llvm.org/D25069 llvm-svn: 284780	2016-10-20 21:15:29 +00:00
Michael Kuperstein	b2443ed62b	[X86] Enable interleaved memory access by default This lets the loop vectorizer generate interleaved memory accesses on x86. Differential Revision: https://reviews.llvm.org/D25350 llvm-svn: 284779	2016-10-20 21:04:31 +00:00
Daniel Berlin	cd2deacac6	[MSSA] Avoid unnecessary use walks when calling getClobberingMemoryAccess Summary: This allows us to mark when uses have been optimized. This lets us avoid rewalking (IE when people call getClobberingAccess on everything), and also enables us to later relax the requirement of use optimization during updates with less cost. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25172 llvm-svn: 284771	2016-10-20 20:13:45 +00:00
Kevin Enderby	c8bb42283e	Another additional error check for invalid Mach-O files for the load commands that use the MachO::twolevel_hints_command type which includes only the LC_TWOLEVEL_HINTS load command. This is not used in llvm libObject code or in llvm tool code. But does appear in one of the binary test files. While this load command is obsolete it is easier to add code for it in libObject than edit or change the binary test case. llvm-svn: 284769	2016-10-20 20:10:30 +00:00
Zachary Turner	4d49eb9fa0	[CodeView] Refactor serialization to use StreamInterface. This was all using ArrayRef<>s before which presents a problem when you want to serialize to or deserialize from an actual PDB stream. An ArrayRef<> is really just a special case of what can be handled with StreamInterface though (e.g. by using a ByteStream), so changing this to use StreamInterface allows us to plug in a PDB stream and get all the record serialization and deserialization for free on a MappedBlockStream. Subsequent patches will try to remove TypeTableBuilder and TypeRecordBuilder in favor of class that operate on Streams as well, which should allow us to completely merge the reading and writing codepaths for both types and symbols. Differential Revision: https://reviews.llvm.org/D25831 llvm-svn: 284762	2016-10-20 18:31:19 +00:00
Konstantin Zhuravlyov	521e5ef4ce	[AMDGPU] Make note record name a static const member of target streamer Differential Revision: https://reviews.llvm.org/D25746 llvm-svn: 284760	2016-10-20 18:22:36 +00:00
Konstantin Zhuravlyov	08326b6256	[AMDGPU] Emit constant address space data in .rodata section and use relocations instead of fixups (amdhsa only) Differential Revision: https://reviews.llvm.org/D25693 llvm-svn: 284759	2016-10-20 18:12:38 +00:00
Dehao Chen	f03f51555a	Using branch probability to guide critical edge splitting. Summary: The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass. The performance impact on speccpu2006 on Intel sandybridge machines: spec/2006/fp/C++/444.namd 25.3 +0.26% spec/2006/fp/C++/447.dealII 45.96 -0.10% spec/2006/fp/C++/450.soplex 41.97 +1.49% spec/2006/fp/C++/453.povray 36.83 -0.96% spec/2006/fp/C/433.milc 23.81 +0.32% spec/2006/fp/C/470.lbm 41.17 +0.34% spec/2006/fp/C/482.sphinx3 48.13 +0.69% spec/2006/int/C++/471.omnetpp 22.45 +3.25% spec/2006/int/C++/473.astar 21.35 -2.06% spec/2006/int/C++/483.xalancbmk 36.02 -2.39% spec/2006/int/C/400.perlbench 33.7 -0.17% spec/2006/int/C/401.bzip2 22.9 +0.52% spec/2006/int/C/403.gcc 32.42 -0.54% spec/2006/int/C/429.mcf 39.59 +0.19% spec/2006/int/C/445.gobmk 26.98 -0.00% spec/2006/int/C/456.hmmer 24.52 -0.18% spec/2006/int/C/458.sjeng 28.26 +0.02% spec/2006/int/C/462.libquantum 55.44 +3.74% spec/2006/int/C/464.h264ref 46.67 -0.39% geometric mean +0.20% Manually checked 473 and 471 to verify the diff is in the noise range. Reviewers: rengolin, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24818 llvm-svn: 284757	2016-10-20 18:06:52 +00:00
Simon Pilgrim	365be4f95c	[CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv uniformconst costs for 256/512 bit integer vectors We weren't checking for uniform const costs before the general cost, resulting in very high estimates. llvm-svn: 284755	2016-10-20 18:00:35 +00:00
Pirama Arumuga Nainar	05b0f93ad3	Fix _EXTEND_VECTOR_INREG legalization Summary: While promoting _EXTEND_VECTOR_INREG nodes whose inputs are already promoted, perform the appropriate sign extension for the promoted node before doing the *_EXTEND_VECTOR_INREG operation. If not, the undefined high-order bits of the promoted operand may (a) be garbage inc ase of zext) or (b) contribute the wrong sign-bit (in case of sext) Updated the promote-vec3.ll test after this change. The diff shows explicit zeroing in case of zext and intermediate sign extension in case of sext. Reviewers: RKSimon Subscribers: llvm-commits, srhines Differential Revision: https://reviews.llvm.org/D25790 llvm-svn: 284752	2016-10-20 17:56:36 +00:00
Sanjay Patel	0051efcf97	[Target] remove TargetRecip class; 2nd try This is a retry of r284495 which was reverted at r284513 due to use-after-scope bugs caused by faulty usage of StringRef. This version also renames a pair of functions: getRecipEstimateDivEnabled() getRecipEstimateSqrtEnabled() as suggested by Eric Christopher. original commit msg: [Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering This is a follow-up to https://reviews.llvm.org/D24816 - where we changed reciprocal estimates to be function attributes rather than TargetOptions. This patch is intended to be a structural, but not functional change. By moving all of the TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate state, shield the callers from the string format implementation, and simplify/localize the logic needed for a target to enable this. If a function has a "reciprocal-estimates" attribute, those settings may override the target's default reciprocal preferences for whatever operation and data type we're trying to optimize. If there's no attribute string or specific setting for the op/type pair, just use the target default settings. As noted earlier, a better solution would be to move the reciprocal estimate settings to IR instructions and SDNodes rather than function attributes, but that's a multi-step job that requires infrastructure improvements. I intend to work on that, but it's not clear how long it will take to get all the pieces in place. Differential Revision: https://reviews.llvm.org/D25440 llvm-svn: 284746	2016-10-20 16:55:45 +00:00
Simon Pilgrim	025e26dd32	[CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv general costs for 256/512 bit integer vectors We weren't accounting for legal types on every subtarget, meaning that many of the costs were using defaults. We still don't correctly cost (or test) the 512-bit sdiv/udiv by uniform const cases, nor the power-of-2 cases. llvm-svn: 284744	2016-10-20 16:39:11 +00:00
Valery Pykhtin	e55fd41f73	[AMDGPU] add fcopysign(f64, f32) pattern Differential revision: https://reviews.llvm.org/D25827 llvm-svn: 284743	2016-10-20 16:17:54 +00:00
Benjamin Kramer	b2505005c7	Retire llvm::alignOf in favor of C++11 alignof. No functionality change intended. llvm-svn: 284733	2016-10-20 15:02:18 +00:00
Benjamin Kramer	26b2593b24	[GVN] Use defaulted members. No functional change. llvm-svn: 284726	2016-10-20 13:09:12 +00:00
Simon Dardis	226752c15d	[mips][mcjit] Add the majority of N32 support. The missing piece is relocation composition for %hi(%neg(%gp_rel(x))) and similar. Patch by: Daniel Sanders llvm-svn: 284724	2016-10-20 13:02:23 +00:00
Benjamin Kramer	2a8bef8769	Do a sweep over move ctors and remove those that are identical to the default. All of these existed because MSVC 2013 was unable to synthesize default move ctors. We recently dropped support for it so all that error-prone boilerplate can go. No functionality change intended. llvm-svn: 284721	2016-10-20 12:20:28 +00:00
Pavel Labath	59838f7ea6	Reapply "Add Chrono.h - std::chrono support header" This is a resubmission of r284590. The mingw build should be fixed now. The problem was we were matching time_t with _localtime_64s, which was incorrect on _USE_32BIT_TIME_T systems. Instead I use localtime_s, which should always evaluate to the correct function. llvm-svn: 284720	2016-10-20 12:05:50 +00:00
Simon Pilgrim	618d3aedaf	[DAGCombiner] Add general constant vector support to (srl (shl x, c), c) -> (and x, cst2) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284717	2016-10-20 11:10:21 +00:00
Simon Pilgrim	25059360d5	Fix spelling mistake in comment. llvm-svn: 284714	2016-10-20 10:42:14 +00:00
Simon Pilgrim	071da46a35	Fix MSVC bool -> uint64_t promotion warning llvm-svn: 284713	2016-10-20 10:37:58 +00:00
Jonas Paulsson	8010b631d5	[SystemZ] Post-RA scheduler implementation Post-RA sched strategy and scheduling instruction annotations for z196, zEC12 and z13. This scheduler optimizes decoder grouping and balances processor resources (including side steering the FPd unit instructions). The SystemZHazardRecognizer keeps track of the scheduling state, which can be dumped with -debug-only=misched. Reviers: Ulrich Weigand, Andrew Trick. https://reviews.llvm.org/D17260 llvm-svn: 284704	2016-10-20 08:27:16 +00:00
Peter Collingbourne	c7766778a0	X86: Allow expressions to appear as u8imm operands. llvm-svn: 284688	2016-10-20 01:58:34 +00:00
Peter Collingbourne	de1f039360	X86: Deduplicate some lowering code. NFCI. llvm-svn: 284686	2016-10-20 01:21:26 +00:00
Reid Kleckner	40d7230f2f	Use __func__ directly now that all supported compilers support it Remove the portability macro now that it is unused. llvm-svn: 284681	2016-10-20 00:22:23 +00:00
Victor Leschuk	2ede126b1b	DebugInfo: preparation to implement DW_AT_alignment - Add alignment attribute to DIVariable family - Modify bitcode format to match new DIVariable representation - Update tests to match these changes (also add bitcode upgrade test) - Expect that frontend passes non-zero align value only when it is not default (was forcibly aligned by alignas()/_Alignas()/__atribute__(aligned()) Differential Revision: https://reviews.llvm.org/D25073 llvm-svn: 284678	2016-10-20 00:13:12 +00:00
Reid Kleckner	990504e625	Remove LLVM_NOEXCEPT and replace it with noexcept Now that we have dropped MSVC 2013, all supported compilers support noexcept and we can drop this portability macro. llvm-svn: 284672	2016-10-19 23:52:38 +00:00
Kevin Enderby	210030ba95	Next set of additional error checks for invalid Mach-O files for the load commands that use the MachO::thread_command type but are not used in llvm libObject code but used in llvm tool code. This includes the LC_UNIXTHREAD and LC_THREAD load commands. A quick note about the philosophy of the error checking in libObject for Mach-O files, the idea behind the checking is that we never will return a Mach-O file out of libObject that contains unknown things in the load commands. To do this the 32-bit ARM and PPC general tread states needed to be defined as two test case binaries contained them. If other thread states for other CPUs need to be added we will do that as needed. Going forward the LC_MAIN load command is used to set the entry point in Mach-O executables these days instead of an LC_UNIXTHREAD as was done in the past. So today only in core files are LC_THREAD load commands and thread states usually found. Other thread states have not yet been defined in include/Support/MachO.h at this time. But that can be added as needed with their corresponding checking also added. llvm-svn: 284668	2016-10-19 23:44:34 +00:00
Rong Xu	2c684cfd94	[PGO] Fix bogus warning for merging empty llvm profile file Profile runtime can generate an empty raw profile (when there is no function in the shared library). This empty profile is treated as a text format profile. A test format profile without the flag of "#IR" is thought to be a clang generated profile. So in llvm profile merging, we will get a bogus warning of "Merge IR generated profile with Clang generated profile." The fix here is to skip the empty profile (when the buffer size is 0) for profile merge. Reviewers: vsk, davidxl Differential Revision: http://reviews.llvm.org/D25687 llvm-svn: 284659	2016-10-19 22:51:17 +00:00
Mehdi Amini	db46b7d217	Add computeHostNumPhysicalCores() implementation for Darwin Differential Revision: https://reviews.llvm.org/D25800 llvm-svn: 284656	2016-10-19 22:36:07 +00:00
Wei Ding	3cb2a1e8d1	AMDGPU : Add a function to enable and disable IEEEBit for SC and shader respectively. Differential Revision: http://reviews.llvm.org/D25789 llvm-svn: 284655	2016-10-19 22:34:49 +00:00
Sanjay Patel	efd8885772	[InstSimplify] fold negation of sign-bit 0 - X --> X, if X is 0 or the minimum signed value 0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW I noticed this pattern might be created in the backend after the change from D25485, so we'll want to add a similar fold for the DAG. The use of computeKnownBits in InstSimplify may be something to investigate if the compile time of InstSimplify is noticeable. We could replace computeKnownBits with specific pattern matchers or limit the recursion. Differential Revision: https://reviews.llvm.org/D25785 llvm-svn: 284649	2016-10-19 21:23:45 +00:00
Hans Wennborg	2d55d67c62	Typo: nomed struct -> named struct llvm-svn: 284635	2016-10-19 20:10:03 +00:00
Reid Kleckner	f8d1d12fef	[GlobalMerge] Handle non-landingpad EH pads This code crashed on funclet-style EH instructions such as catchpad, catchswitch, and cleanuppad. Just treat all EH pad instructions equivalently and avoid merging the globals they reference through any use. llvm-svn: 284633	2016-10-19 19:56:22 +00:00
Artur Pilipenko	5c6ef75485	[IndVarSimplify] Teach calculatePostIncRange to take guards into account Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D25739 llvm-svn: 284632	2016-10-19 19:43:54 +00:00
Matthew Simpson	41fa838f07	[LV] Avoid emitting trivially dead instructions Some instructions from the original loop, when vectorized, can become trivially dead. This happens because of the way we structure the new loop. For example, we create new induction variables and induction variable "steps" in the new loop. Thus, when we go to vectorize the original induction variable update, it may no longer be needed due to the instructions we've already created. This patch prevents us from creating these redundant instructions. This reduces code size before simplification and allows greater flexibility in code generation since we have fewer unnecessary instruction uses. Differential Revision: https://reviews.llvm.org/D25631 llvm-svn: 284631	2016-10-19 19:22:02 +00:00
Chad Rosier	6e3a92ec88	[AliasSetTracker] Add support for memcpy and memmove. Differential Revision: https://reviews.llvm.org/D25776 llvm-svn: 284630	2016-10-19 19:09:03 +00:00
Artur Pilipenko	f2d5dc5dc6	[IndVarSimplify] Use control-dependent range information to prove non-negativity This change is motivated by the case when IndVarSimplify doesn't widen a comparison of IV increment because it can't prove IV increment being non-negative. We end up with a redundant trunc of the widened increment on this example. for.body: %i = phi i32 [ %start, %for.body.lr.ph ], [ %i.inc, %for.inc ] %within_limits = icmp ult i32 %i, 64 br i1 %within_limits, label %continue, label %for.end continue: %i.i64 = zext i32 %i to i64 %arrayidx = getelementptr inbounds i32, i32* %base, i64 %i.i64 %val = load i32, i32* %arrayidx, align 4 br label %for.inc for.inc: %i.inc = add nsw nuw i32 %i, 1 %cmp = icmp slt i32 %i.inc, %limit br i1 %cmp, label %for.body, label %for.end There is a range check inside of the loop which guarantees the IV to be non-negative. NSW on the increment guarantees that the increment is also non-negative. Teach IndVarSimplify to use the range check to prove non-negativity of loop increments. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D25738 llvm-svn: 284629	2016-10-19 18:59:03 +00:00
Chad Rosier	16970a847c	[AliasSetTracker] Return void for add() functions. NFC. Differential Revision: https://reviews.llvm.org/D25748 llvm-svn: 284628	2016-10-19 18:50:32 +00:00
Krzysztof Parzyszek	c87155037b	[AMDGPU] Stop using MCRegisterClass::getSize() Differential Review: https://reviews.llvm.org/D24675 llvm-svn: 284619	2016-10-19 17:40:36 +00:00
Teresa Johnson	ec544c552e	[ThinLTO] Default backend threads to heavyweight_hardware_concurrency Summary: Changes default backend parallelism from thread::hardware_concurrency to the new llvm::heavyweight_hardware_concurrency, which for X86 Linux defaults to the number of physical cores (and will fall back to thread::hardware_concurrency otherwise). This avoid oversubscribing the physical cores using hyperthreading. Reviewers: mehdi_amini, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25775 llvm-svn: 284618	2016-10-19 17:35:01 +00:00
Simon Pilgrim	e32d0f8413	Merged nested ifs. NFCI. llvm-svn: 284616	2016-10-19 17:30:24 +00:00
Pavel Labath	504f3844ae	Revert "Add Chrono.h - std::chrono support header" This reverts commit r284590 as it fails on the mingw buildbot. I think I know the fix, but I cannot test it right now. Will reapply when I verify it works ok. This reverts r284590. llvm-svn: 284615	2016-10-19 17:17:53 +00:00
Simon Pilgrim	a20aeea998	[DAGCombiner] Add general constant vector support to (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284613	2016-10-19 17:12:22 +00:00
Reid Kleckner	f7ad5341d0	[WinEH] Allow catchpads to reuse the same catch object This code used a regular when it should have used a multimap. llvm-svn: 284612	2016-10-19 17:08:23 +00:00
Sanjay Patel	3a3aaf67e0	[DAG] optimize negation of bool Use mask and negate for legalization of i1 source type with SIGN_EXTEND_INREG. With the mask, this should be no worse than 2 shifts. The mask can be eliminated in some cases, so that should be better than 2 shifts. This change exposed some missing folds related to negation: https://reviews.llvm.org/rL284239 https://reviews.llvm.org/rL284395 There may be others, so please let me know if you see any regressions. Differential Revision: https://reviews.llvm.org/D25485 llvm-svn: 284611	2016-10-19 16:58:59 +00:00
Zachary Turner	383803230b	[pdb] Improve error messages when DIA is not found. llvm-svn: 284610	2016-10-19 16:42:20 +00:00
Krzysztof Parzyszek	7bb63ac029	[RDF] Switch RefMap in liveness calculation to use lane masks This required reengineering of some of the part of liveness calculation, including fixing some issues caused by the limitations of the previous approach. The current code is not necessarily the fastest, but it should be functionally correct (at least more so than before). The compile-time performance will be addressed in the future. llvm-svn: 284609	2016-10-19 16:30:56 +00:00
Simon Pilgrim	4554e161be	[DAGCombiner] Add general constant vector support to (shl (sra x, c1), c1) -> (and x, (shl -1, c1)) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284608	2016-10-19 16:15:30 +00:00
Simon Pilgrim	c2e9724909	[DAGCombiner] Add general constant vector support to (shl (mul x, c1), c2) -> (mul x, c1 << c2) We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector llvm-svn: 284607	2016-10-19 15:59:28 +00:00
Tim Northover	e699929507	Revert r284604. A.K.A. "TMP" Committed by mistake. llvm-svn: 284606	2016-10-19 15:56:12 +00:00
Tim Northover	3d4cd3d930	TMP llvm-svn: 284604	2016-10-19 15:55:09 +00:00
Tim Northover	7152dcaf77	GlobalISel: support translating volatile loads and stores. llvm-svn: 284603	2016-10-19 15:55:06 +00:00
Artur Pilipenko	ed84103a1a	Introduce ConstantRange.addWithNoSignedWrap To be used by upcoming change to IndVarSimplify Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D25732 llvm-svn: 284597	2016-10-19 14:44:23 +00:00
Chris Dewhurst	2c3cdd66d2	[Sparc][LEON] Detects an erratum on UT699 LEON 3 processors involving rounding mode changes and issues an appropriate user error message. Differential Revision: https://reviews.llvm.org/D24665 llvm-svn: 284591	2016-10-19 14:01:06 +00:00
Pavel Labath	13b6a10e7b	Add Chrono.h - std::chrono support header Summary: std::chrono mostly covers the functionality of llvm::sys::TimeValue and lldb_private::TimeValue. This header adds a bit of utility functions and typedefs, which make the usage of the library and porting code from TimeValues easier. Rationale: - TimePoint typedef - precision of system_clock is implementation defined - using a well-defined precision helps maintain consistency between platforms, makes it interact better with existing TimeValue classes, and avoids cases there a time point is implicitly convertible to a specific precision on some platforms but not on others. - system_clock::to_time_t only accepts time_points with the default system precision (even though time_t has only second precision on all platforms we support). To avoid the need for explicit casts, I have added a toTimeT() wrapper function. toTimePoint(time_t) was not strictly necessary, but I have added it for symmetry. Reviewers: zturner, mehdi_amini Subscribers: beanz, mgorny, llvm-commits, modocache Differential Revision: https://reviews.llvm.org/D25416 llvm-svn: 284590	2016-10-19 13:58:55 +00:00
Sjoerd Meijer	2fc4cb6f72	Reapply r284571 (with the new tests fixed). llvm-svn: 284588	2016-10-19 13:43:02 +00:00
Ulrich Weigand	6e31ab388a	[SystemZ] Add missing vector instructions for the assembler Most z13 vector instructions have a base form where the data type of the operation (whether to consider the vector to be 16 bytes, 8 halfwords, 4 words, or 2 doublewords) is encoded into a mask field, and then a set of extended mnemonics where the mask field is not present but the data type is encoded into the mnemonic name. Currently, LLVM only supports the type-specific forms (since those are really the ones needed for code generation), but not the base type-generic forms. To complete the assembler support and make it fully compatible with the GNU assembler, this commit adds assembler aliases for all the base forms of the various vector instructions. It also adds two more alias forms that are documented in the PoP: VFPSO/VFPSODB/WFPSODB -- generic form of VFLCDB etc. VNOT -- special variant of VNO llvm-svn: 284586	2016-10-19 13:03:18 +00:00
Ulrich Weigand	556a90c00c	[SystemZ] Add optional argument to some vector string instructions The vfee[bhf], vfene[bhf], and vistr[bhf] assembler mnemonics are documented in the Principles of Operation to have an optional last operand to encode arbitrary values in a mask field. This commit adds support for those optional operands, and cleans up the patterns to generate vector string instruction as bit. No change to code generation intended. llvm-svn: 284585	2016-10-19 12:57:46 +00:00
Michal Gorny	e22b7f5aea	[cmake] Declare LLVM_CMAKE_PATH for use in subprojects Declare the LLVM_CMAKE_PATH to the source directory location of CMake files, in order to make it possible to easily use them in subprojects. Such a variable is already declared in most of LLVM projects (and inconsistently mixed with direct source tree references), including Clang, LLDB, compiler-rt, libcxx... Declaring it inside main LLVM tree makes it possible to avoid having to declare fallback values or use conditionals in those projects. It should be noted that in some of the subprojects LLVM_CMAKE_PATH is used to reference generated LLVMConfig.cmake file. However, these references are conditional to stand-alone builds and explicitly including this file is unnecessary in combined builds. Differential Revision: https://reviews.llvm.org/D25724 llvm-svn: 284581	2016-10-19 12:18:34 +00:00
James Molloy	fbfd173447	[Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions. It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size. TBB example: Before: lsls r0, r0, #2 After: add r0, pc adr r1, .LJTI0_0 ldrb r0, [r0, #6] ldr r0, [r0, r1] lsls r0, r0, #1 mov pc, r0 add pc, r0 => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4. The only case that can increase dynamic instruction count is the TBH case: Before: lsls r0, r4, #2 After: lsls r4, r4, #1 adr r1, .LJTI0_0 add r4, pc ldr r0, [r0, r1] ldrh r4, [r4, #6] mov pc, r0 lsls r4, r4, #1 add pc, r4 => 1 more instruction in prologue. Jump table shrunk by a factor of 2. So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!) llvm-svn: 284580	2016-10-19 12:06:49 +00:00
Simon Pilgrim	7dcb6e572e	[DAGCombiner] Just call isConstOrConstSplat directly. NFCI. This will get the same ConstantSDNode scalar or vector splat value as the current separate dyn_cast<ConstantSDNode> / isVector() approach. llvm-svn: 284578	2016-10-19 11:28:15 +00:00
Simon Pilgrim	b2ca2505cc	[DAGCombine] Generalize distributeTruncateThroughAnd to work with any non-opaque constant or constant vector llvm-svn: 284574	2016-10-19 08:57:37 +00:00
Sjoerd Meijer	3f5111d363	Revert of r284571 because of failing tests. llvm-svn: 284572	2016-10-19 07:45:48 +00:00
Sjoerd Meijer	a318779263	Checking FP function attribute values and adding more build attribute tests. This renames the function for checking FP function attribute values and also adds more build attribute tests (which are in separate files because build attributes are set per file). Differential Revision: https://reviews.llvm.org/D25625 llvm-svn: 284571	2016-10-19 07:25:06 +00:00
Craig Topper	a4dc340cf2	[AVX-512] Teach isel lowering that a subvector broadcast being inserted into both halves of a 512-bit vector can be combined into a larger subvector broadcast. Summary: This allows us to create broadcasts of 128-bit vector loads into 512-bit vectors. New patterns added to support 8-bit and 16-bit vector types and v2f64/v2i64->v8f64/v8i64 without DQI instructions. There also fallback patterns when the load can't be folded. These patterns are a little complex as we first need to insert the lower 128-bits into the second 128-bits using a zmm subvector insert instruction. We need to use a zmm insert in case VLX isn't available. Then use another zmm sub vector insert to take those 256-bits and insert them into the upper bits. Since we used a zmm insert to create the 256-bits we also need to do a extract_subreg to get just the lower 256-bits to pass to the second insert. The outer insert for the fallback patterns should have its type correct because eventually we should also supported masked operations here too. So we need a DQI and a NoDQI version of the v16f32/v16i32 patterns. Reviewers: RKSimon, delena, igorb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25651 llvm-svn: 284567	2016-10-19 04:44:17 +00:00
Dehao Chen	95fc43143d	Revert r284545 again as the regression in ppc still exists. There is bug in MBPI exposed by th patch. Also update the section.ll to fix non-x86 failure. llvm-svn: 284563	2016-10-19 01:18:25 +00:00
Vitaly Buka	490fda3366	[asan] Replace std::to_string with llvm::to_string llvm-svn: 284557	2016-10-19 00:16:56 +00:00
Kostya Serebryany	95b1a434d2	[libFuzzer] extend -print_coverage to also print uncovered lines, functions, and files. Example of output: COVERAGE: COVERED: in DSO2(int) /pathto/DSO2.cpp:6 COVERED: in DSO2(int) /pathto/DSO2.cpp:8 COVERED: in DSO1(int) /pathto/DSO1.cpp:6 COVERED: in DSO1(int) /pathto/DSO1.cpp:8 COVERED: in LLVMFuzzerTestOneInput /pathto/DSOTestMain.cpp:16 COVERED: in LLVMFuzzerTestOneInput /pathto/DSOTestMain.cpp:19 COVERED: in LLVMFuzzerTestOneInput /pathto/DSOTestMain.cpp:25 COVERED: in LLVMFuzzerTestOneInput /pathto/DSOTestMain.cpp:26 MODULE_WITH_COVERAGE: /pathto/libLLVMFuzzer-DSO1.so UNCOVERED_LINE: in DSO1(int) /pathto/DSO1.cpp:9 UNCOVERED_FUNC: in Uncovered1() MODULE_WITH_COVERAGE: /pathto/libLLVMFuzzer-DSO2.so UNCOVERED_LINE: in DSO2(int) /pathto/DSO2.cpp:9 UNCOVERED_FUNC: in Uncovered2() MODULE_WITH_COVERAGE: /pathto/LLVMFuzzer-DSOTest UNCOVERED_LINE: in LLVMFuzzerTestOneInput /pathto/DSOTestMain.cpp:21 UNCOVERED_LINE: in LLVMFuzzerTestOneInput /pathto/DSOTestMain.cpp:27 UNCOVERED_FILE: /pathto/DSOTestExtra.cpp Several things are not perfect here: * we are using objdump+awk instead of sancov because sancov does not support DSOs yet. * this breaks in the presence of ASAN_OPTIONS=strip_path_prefix=... (need to implement another API to get the module name by PC) llvm-svn: 284554	2016-10-19 00:12:03 +00:00
Vitaly Buka	5910a92560	[asan] Simplify calculation of stack frame layout extraction calculation of stack description into separate function. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25754 llvm-svn: 284547	2016-10-18 23:29:52 +00:00
Vitaly Buka	d88e52012b	[asan] Append line number to variable name if line is available and in the same file as the function. PR30498 Reviewers: eugenis Differential Revision: https://reviews.llvm.org/D25715 llvm-svn: 284546	2016-10-18 23:29:41 +00:00
Dehao Chen	f8ac3d26d5	Using branch probability to guide critical edge splitting. Summary: The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass. The performance impact on speccpu2006 on Intel sandybridge machines: spec/2006/fp/C++/444.namd 25.3 +0.26% spec/2006/fp/C++/447.dealII 45.96 -0.10% spec/2006/fp/C++/450.soplex 41.97 +1.49% spec/2006/fp/C++/453.povray 36.83 -0.96% spec/2006/fp/C/433.milc 23.81 +0.32% spec/2006/fp/C/470.lbm 41.17 +0.34% spec/2006/fp/C/482.sphinx3 48.13 +0.69% spec/2006/int/C++/471.omnetpp 22.45 +3.25% spec/2006/int/C++/473.astar 21.35 -2.06% spec/2006/int/C++/483.xalancbmk 36.02 -2.39% spec/2006/int/C/400.perlbench 33.7 -0.17% spec/2006/int/C/401.bzip2 22.9 +0.52% spec/2006/int/C/403.gcc 32.42 -0.54% spec/2006/int/C/429.mcf 39.59 +0.19% spec/2006/int/C/445.gobmk 26.98 -0.00% spec/2006/int/C/456.hmmer 24.52 -0.18% spec/2006/int/C/458.sjeng 28.26 +0.02% spec/2006/int/C/462.libquantum 55.44 +3.74% spec/2006/int/C/464.h264ref 46.67 -0.39% geometric mean +0.20% Manually checked 473 and 471 to verify the diff is in the noise range. Reviewers: rengolin, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24818 llvm-svn: 284545	2016-10-18 23:24:02 +00:00
Dehao Chen	62d0e64e9e	revert r284541. llvm-svn: 284544	2016-10-18 23:11:20 +00:00
Rong Xu	1c0e9b97d2	Conditionally eliminate library calls where the result value is not used Summary: This pass shrink-wraps a condition to some library calls where the call result is not used. For example: sqrt(val); is transformed to if (val < 0) sqrt(val); Even if the result of library call is not being used, the compiler cannot safely delete the call because the function can set errno on error conditions. Note in many functions, the error condition solely depends on the incoming parameter. In this optimization, we can generate the condition can lead to the errno to shrink-wrap the call. Since the chances of hitting the error condition is low, the runtime call is effectively eliminated. These partially dead calls are usually results of C++ abstraction penalty exposed by inlining. This optimization hits 108 times in 19 C/C++ programs in SPEC2006. Reviewers: hfinkel, mehdi_amini, davidxl Subscribers: modocache, mgorny, mehdi_amini, xur, llvm-commits, beanz Differential Revision: https://reviews.llvm.org/D24414 llvm-svn: 284542	2016-10-18 21:36:27 +00:00
Dehao Chen	ea62ae9844	Using branch probability to guide critical edge splitting. Summary: The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass. The performance impact on speccpu2006 on Intel sandybridge machines: spec/2006/fp/C++/444.namd 25.3 +0.26% spec/2006/fp/C++/447.dealII 45.96 -0.10% spec/2006/fp/C++/450.soplex 41.97 +1.49% spec/2006/fp/C++/453.povray 36.83 -0.96% spec/2006/fp/C/433.milc 23.81 +0.32% spec/2006/fp/C/470.lbm 41.17 +0.34% spec/2006/fp/C/482.sphinx3 48.13 +0.69% spec/2006/int/C++/471.omnetpp 22.45 +3.25% spec/2006/int/C++/473.astar 21.35 -2.06% spec/2006/int/C++/483.xalancbmk 36.02 -2.39% spec/2006/int/C/400.perlbench 33.7 -0.17% spec/2006/int/C/401.bzip2 22.9 +0.52% spec/2006/int/C/403.gcc 32.42 -0.54% spec/2006/int/C/429.mcf 39.59 +0.19% spec/2006/int/C/445.gobmk 26.98 -0.00% spec/2006/int/C/456.hmmer 24.52 -0.18% spec/2006/int/C/458.sjeng 28.26 +0.02% spec/2006/int/C/462.libquantum 55.44 +3.74% spec/2006/int/C/464.h264ref 46.67 -0.39% geometric mean +0.20% Manually checked 473 and 471 to verify the diff is in the noise range. Reviewers: rengolin, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24818 llvm-svn: 284541	2016-10-18 21:36:11 +00:00
David Blaikie	69494a9805	dwarfdump: add space missing from the type unit header description llvm-svn: 284540	2016-10-18 21:18:43 +00:00
David Blaikie	e4c3915a5a	dwarfdump: Include the name in the unit description, even in non-summarized mode (accidentally removed this from my previous change when I was rejecting some clang-format formatting... ) llvm-svn: 284539	2016-10-18 21:16:45 +00:00
David Blaikie	50cc27ecb9	dwarfdump: -summarize-types: print a short summary (unqualified type name, hash, length) of type units rather than dumping contents This is just a quick utility handy for getting rough summaries of types in a given object or dwo file. I've been using it to investigate the amount of type info redundancy across a project build, for example. llvm-svn: 284537	2016-10-18 21:09:48 +00:00
Eli Friedman	c0a717ba5b	Improve ARM lowering for "icmp <2 x i64> eq". The custom lowering is pretty straightforward: basically, just AND together the two halves of a <4 x i32> compare. Differential Revision: https://reviews.llvm.org/D25713 llvm-svn: 284536	2016-10-18 21:03:40 +00:00
Davide Italiano	36efa68463	[GVN] Consistently use division instead of shift. NFCI. This is in line with other places of GVN (e.g. load coercion logic). llvm-svn: 284535	2016-10-18 21:02:27 +00:00
Davide Italiano	64cd985e44	[GVN] Remove dead code. NFC. llvm-svn: 284534	2016-10-18 21:00:26 +00:00
Dehao Chen	302b69c940	Use profile info to set function section prefix to group hot/cold functions. Summary: The original implementation is in r261607, which was reverted in r269726 to accomendate the ProfileSummaryInfo analysis pass. The new implementation: 1. add a new metadata for function section prefix 2. query against ProfileSummaryInfo in CGP to set the correct section prefix for each function 3. output the section prefix set by CGP Reviewers: davidxl, eraman Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D24989 llvm-svn: 284533	2016-10-18 20:42:47 +00:00
Evandro Menezes	ce8d60156c	[AArch64] Avoid materializing 0.0 when generating FP SELECT Transform `a == 0.0 ? 0.0 : x` to `a == 0.0 ? a : x` and `a != 0.0 ? x : 0.0` to `a != 0.0 ? x : a` to avoid materializing 0.0 for FCSEL, since it does not have to be materialized beforehand for FCMP, as it has a form that has 0.0 as an implicit operand. Differential Revision: https://reviews.llvm.org/D24808 llvm-svn: 284531	2016-10-18 20:37:35 +00:00
Kevin Enderby	89baf99c92	One more additional error check for invalid Mach-O files for a load command that use the MachO:: linkedit_data_command type but is not used in llvm libObject code but used in llvm tool code. This is for the LC_CODE_SIGNATURE load command. llvm-svn: 284529	2016-10-18 20:24:12 +00:00
Tim Northover	6e9043009e	GlobalISel: translate the @llvm.objectsize intrinsic. llvm-svn: 284527	2016-10-18 20:03:51 +00:00
Tim Northover	55782222c0	GlobalISel: select small binary operations on AArch64. AArch64 actually supports many 8-bit operations under the definition used by GlobalISel: the designated information-carrying bits of a GPR32 get the right value if you just use the normal 32-bit instruction. llvm-svn: 284526	2016-10-18 20:03:48 +00:00
Tim Northover	3f18603c52	GlobalISel: translate memcpy intrinsics. llvm-svn: 284525	2016-10-18 20:03:45 +00:00
Tim Northover	4494d69862	GlobalISel: support floating-point constants on AArch64. Patch from Ahmed Bougacha. llvm-svn: 284523	2016-10-18 19:47:57 +00:00
Krzysztof Parzyszek	5bb417bed2	[Hexagon] Handle block live-ins with lane masks in HexagonBlockRanges llvm-svn: 284522	2016-10-18 19:47:20 +00:00
Benjamin Kramer	4c2582ad78	Reduce global namespace pollution. NFC. llvm-svn: 284521	2016-10-18 19:39:31 +00:00
Benjamin Kramer	ee042234ae	[esan] Remove global variable. It's not thread safe and completely unnecessary. llvm-svn: 284520	2016-10-18 19:39:23 +00:00
Benjamin Kramer	1e425c9f24	[InterleavedAccessPass] Remove global variable. This is a threading hazard and rightfully complained about by tsan. No functionality change. llvm-svn: 284515	2016-10-18 18:59:58 +00:00
Kostya Serebryany	bb59ef77ca	[libFuzzer] detect leaks after every run when executing fixed inputs (./fuzzer -runs=1000000 my-file) llvm-svn: 284514	2016-10-18 18:38:08 +00:00
Sanjay Patel	19601fa587	revert r284495: [Target] remove TargetRecip class There's something wrong with the StringRef usage while parsing the attribute string. llvm-svn: 284513	2016-10-18 18:36:49 +00:00
Kostya Serebryany	8dfed45cd4	[libFuzzer] reshuffle the code for -exit_on_src_pos and -exit_on_item llvm-svn: 284508	2016-10-18 18:06:05 +00:00
Vitaly Buka	8e1906ea7e	[asan] Make -asan-experimental-poisoning the only behavior Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25735 llvm-svn: 284505	2016-10-18 18:04:59 +00:00
Kevin Enderby	6f69582e9b	Next set of additional error checks for invalid Mach-O files for the load commands that use the MachO::routines_command and and MachO::routines_command_64 types but are not used in llvm libObject code but used in llvm tool code. This includes the LC_ROUTINES and LC_ROUTINES_64 load commands. llvm-svn: 284504	2016-10-18 17:54:17 +00:00
Sanjoy Das	507dd40a4a	[SCEV] Make CompareValueComplexity a little bit smarter This helps canonicalization in some cases. Thanks to Pankaj Chawla for the investigation and the test case! llvm-svn: 284501	2016-10-18 17:45:16 +00:00
Sanjoy Das	9cd877a25a	[SCEV] Extract out a helper function; NFC llvm-svn: 284500	2016-10-18 17:45:13 +00:00
Sanjay Patel	08fff9ca81	[Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering This is a follow-up to D24816 - where we changed reciprocal estimates to be function attributes rather than TargetOptions. This patch is intended to be a structural, but not functional change. By moving all of the TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate state, shield the callers from the string format implementation, and simplify/localize the logic needed for a target to enable this. If a function has a "reciprocal-estimates" attribute, those settings may override the target's default reciprocal preferences for whatever operation and data type we're trying to optimize. If there's no attribute string or specific setting for the op/type pair, just use the target default settings. As noted earlier, a better solution would be to move the reciprocal estimate settings to IR instructions and SDNodes rather than function attributes, but that's a multi-step job that requires infrastructure improvements. I intend to work on that, but it's not clear how long it will take to get all the pieces in place. Differential Revision: https://reviews.llvm.org/D25440 llvm-svn: 284495	2016-10-18 17:05:05 +00:00
Simon Pilgrim	25e9628978	[DAGCombiner] Add splatted vector support to (udiv x, (shl pow2, y)) -> x >>u (log2(pow2)+y) llvm-svn: 284491	2016-10-18 16:36:00 +00:00
Simon Pilgrim	ca3072ac58	[X86][AVX512] Add mask/maskz writemask support to constant pool shuffle decode commentx llvm-svn: 284488	2016-10-18 15:45:37 +00:00
Simon Dardis	858915f054	[mips][ias] Handle more complicated expressions for memory operands This patch teaches ias for mips to handle expressions such as (84)+(831)($sp). Such expression typically occur from the expansion of multiple macro definitions. This partially resolves PR/30383. Thanks to Sean Bruno for reporting the issue! Reviewers: zoran.jovanovic, vkalintiris Differential Revision: https://reviews.llvm.org/D24667 llvm-svn: 284485	2016-10-18 15:17:17 +00:00
Simon Dardis	c4463c942c	[mips] Fix sync instruction definition The 'sync' instruction for MIPS was defined in MIPS-II as taking no operands. MIPS32 extended the define of 'sync' as taking an optional unsigned 5 bit immediate. This patch correct the definition of sync so that it is accepted with an operand of 0 or no operand for MIPS-II to MIPS-V, and a 5 bit unsigned immediate for MIPS32 and later revisions. Additionally a clear error is given when the MIPS32 version of sync is used when targeting pre MIPS32. This partially resolves PR/30714. Thanks to Daniel Sanders for reporting this issue! Reveiwers: vkalintiris Differential Revision: https://reviews.llvm.org/D25672 llvm-svn: 284483	2016-10-18 14:42:13 +00:00
Victor Leschuk	197aa3192d	DebugInfo: change alignment type from uint64_t to uint32_t to save space. In futher patches we shall have alignment field added to DIVariable family and switching from uint64_t to uint32_t will save 4 bytes per variable. Differential Revision: https://reviews.llvm.org/D25620 llvm-svn: 284482	2016-10-18 14:31:22 +00:00
Simon Dardis	aff4d141b9	[mips] Macro expansion for ld, sd for O32 ld and sd when assembled for the O32 ABI expand to a pair of 32 bit word loads or stores using the specified source or destination register and the next register. This patch does not add support for the cases where the offset is greater than a 16 bit signed immediate as that would lead to a wrong/misleading error message as the assembler would report "instruction requires a CPU feature not currently enabled" for ld & sd for MIPS64 when their offset is not a signed 16 bit number. This fixes PR/29159. Thanks to Sean Bruno for reporting this issue! Reviewers: vkalintiris, seanbruno, zoran.jovanovic Differential Review: https://reviews.llvm.org/D24556 llvm-svn: 284481	2016-10-18 14:28:00 +00:00
Michael Zuckerman	1bee6340ef	[x86][inline-asm][avx512] allow swapping of '{k<num>}' & '{z}' marks Committing on behalf of Coby Tayree: After check-all and LGTM Desc: AVX512 allows dest operand to be followed by an op-mask register specifier ('{k<num>}', which in turn may be followed by a merging/zeroing specifier ('{z}') Currently, the following forms are allowed: {k<num>} {k<num>}{z} This patch allows the following forms: {z}{k<num>} and ignores the next form: {z} Justification would be quite simple - GCC Differential Revision: http://reviews.llvm.org/D25013 llvm-svn: 284479	2016-10-18 13:52:39 +00:00
Simon Pilgrim	65e0c73875	Strip trailing whitespace (NFCI) llvm-svn: 284478	2016-10-18 13:44:00 +00:00
Vasileios Kalintiris	3955b75ba9	[mips][FastISel] Instantiate the MipsFastISel class only for targets that support FastISel. Summary: Instead of instantiating the MipsFastISel class and checking if the target is supported in the overriden methods, we should perform that check before creating the class. This allows us to enable FastISel only for targets that truly support it, ie. MIPS32 to MIPS32R5. Reviewers: sdardis Subscribers: ehostunreach, llvm-commits Differential Revision: https://reviews.llvm.org/D24824 llvm-svn: 284475	2016-10-18 13:05:42 +00:00
John Brawn	ecf79300dd	[SCEV] More accurate calculation of max backedge count of some less-than loops In loops that look something like i = n; do { ... } while(i++ < n+k); where k is a constant, the maximum backedge count is k (in fact the backedge count will be either 0 or k, depending on whether n+k wraps). More generally for LHS < RHS if RHS-(LHS of first comparison) is a constant then the loop will iterate either 0 or that constant number of times. This allows for more loop unrolling with the recent upper bound loop unrolling changes, and I'm working on a patch that will let loop unrolling additionally make use of the loop being executed either 0 or k times (we need to retain the loop comparison only on the first unrolled iteration). Differential Revision: https://reviews.llvm.org/D25607 llvm-svn: 284465	2016-10-18 10:10:53 +00:00
Renato Golin	9ce5074d29	Revert "Resubmit "Add support for advanced number formatting."" This reverts commits 284436 and 284437 because they still break AArch64 bots: Value of: format_number(-10, IntegerStyle::Integer, 1) Actual: "-0" Expected: "-10" llvm-svn: 284462	2016-10-18 09:30:18 +00:00
Javed Absar	e7c338081a	[ARM] Assign cost of scaling for Cortex-R52 This patch assigns cost of the scaling used in addressing for Cortex-R52. On Cortex-R52 a negated register offset takes longer than a non-negated register offset, in a register-offset addressing mode. Differential Revision: http://reviews.llvm.org/D25670 Reviewer: jmolloy llvm-svn: 284460	2016-10-18 09:08:54 +00:00
Simon Pilgrim	4ddc92b6cd	[X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for sitofp v2f64/2f32 to 2i32 As discussed on PR28461 we currently miss the chance to lower "fptosi <2 x double> %arg to <2 x i32>" to cvttpd2dq due to its use of illegal types. This patch adds support for fptosi to 2i32 from both 2f64 and 2f32. It also recognises that cvttpd2dq zeroes the upper 64-bits of the xmm result (similar to D23797) - we still don't do this for the cvttpd2dq/cvttps2dq intrinsics - this can be done in a future patch. Differential Revision: https://reviews.llvm.org/D23808 llvm-svn: 284459	2016-10-18 07:42:15 +00:00
Dean Michael Berris	156f6cafc2	[XRay] Support for for tail calls for ARM no-Thumb This patch adds simplified support for tail calls on ARM with XRay instrumentation. Known issue: compiled with generic flags: `-O3 -g -fxray-instrument -Wall -std=c++14 -ffunction-sections -fdata-sections` (this list doesn't include my specific flags like --target=armv7-linux-gnueabihf etc.), the following program #include <cstdio> #include <cassert> #include <xray/xray_interface.h> [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fC() { std::printf("In fC()\n"); } [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fB() { std::printf("In fB()\n"); fC(); } [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fA() { std::printf("In fA()\n"); fB(); } // Avoid infinite recursion in case the logging function is instrumented (so calls logging // function again). [[clang::xray_never_instrument]] void simplyPrint(int32_t functionId, XRayEntryType xret) { printf("XRay: functionId=%d type=%d.\n", int(functionId), int(xret)); } int main(int argc, char* argv[]) { __xray_set_handler(simplyPrint); printf("Patching...\n"); __xray_patch(); fA(); printf("Unpatching...\n"); __xray_unpatch(); fA(); return 0; } gives the following output: Patching... XRay: functionId=3 type=0. In fA() XRay: functionId=3 type=1. XRay: functionId=2 type=0. In fB() XRay: functionId=2 type=1. XRay: functionId=1 type=0. XRay: functionId=1 type=1. In fC() Unpatching... In fA() In fB() In fC() So for function fC() the exit sled seems to be called too much before function exit: before printing In fC(). Debugging shows that the above happens because printf from fC is also called as a tail call. So first the exit sled of fC is executed, and only then printf is jumped into. So it seems we can't do anything about this with the current approach (i.e. within the simplification described in https://reviews.llvm.org/D23988 ). Differential Revision: https://reviews.llvm.org/D25030 llvm-svn: 284456	2016-10-18 05:54:15 +00:00
Justin Bogner	b7c9deb587	Object: Add a missing return in ObjectFile::createObjectFile When Error was threaded through these APIs back in r265606 the "return" was missed here, which triggers a warning if/when I add LLVM_NODISCARD to the Error type. llvm-svn: 284454	2016-10-18 05:17:23 +00:00
Craig Topper	448358b5f1	[X86] Fix DecodeVPERMVMask to handle cases where the constant pool entry has a different type than the shuffle itself. This is especially important for 32-bit targets with 64-bit shuffle elements. llvm-svn: 284453	2016-10-18 04:48:33 +00:00
Craig Topper	7268bf99ab	[AVX-512] Fix DecodeVPERMV3Mask to handle cases where the constant pool entry has a different type than the shuffle itself. Summary: This is especially important for 32-bit targets with 64-bit shuffle elements.This is similar to how PSHUFB and VPERMIL handle the same problem. Reviewers: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25666 llvm-svn: 284451	2016-10-18 04:00:32 +00:00
Craig Topper	175a415e78	[AVX-512] Add support for decoding shuffle mask from constant pool for masked VPERMILPS/PD. llvm-svn: 284450	2016-10-18 03:36:52 +00:00
Mandeep Singh Grang	e82678a657	Fix differences in codegen between Linux and Windows toolchains Summary: There are differences in codegen between Linux and Windows due to: 1. Using std::sort which uses quicksort which is a non-stable sort. 2. Iterating over Set data structure where the iteration order is non deterministic. Reviewers: arsenm, grosbach, junbuml, zinob, MatzeB Subscribers: MatzeB, wdng Differential Revision: https://reviews.llvm.org/D25695 llvm-svn: 284441	2016-10-18 00:11:19 +00:00
Zachary Turner	0d31d9c012	Rename HexStyle -> HexFormatStyle, and remove a constexpr. This should fix the remaining broken builds. llvm-svn: 284437	2016-10-17 23:08:47 +00:00
Zachary Turner	7cd0745c95	Resubmit "Add support for advanced number formatting." This resubmits commits 284425 and r284428, which were reverted in r284429 due to some infinite recursion caused by an incorrect selection of function overloads. Reproduced the failure on Linux using GCC 4.8.4, and confirmed that with the new patch the tests path on GCC as well as MSVC. So hopefully this fixes everything. llvm-svn: 284436	2016-10-17 22:49:24 +00:00
Konstantin Zhuravlyov	98a3ac7106	[AMDGPU] Mark .note section SHF_ALLOC so lld creates a segment for it Differential Revision: https://reviews.llvm.org/D25694 llvm-svn: 284435	2016-10-17 22:40:15 +00:00
Justin Lebar	ee34a7343d	[ADT] Move CachedHashString to its own header in ADT, and rename to CachedHashStringRef. Summary: Reclaiming the name 'CachedHashString' will let us add a type with that name that owns its value. Reviewers: timshen Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25644 llvm-svn: 284434	2016-10-17 22:24:36 +00:00
Kevin Enderby	2490de06f7	Next set of additional error checks for invalid Mach-O files for the load commands that use the MachO::sub_framework_command, MachO::sub_umbrella_command, MachO::sub_library_command and MachO::sub_client_command types but are not used in llvm libObject code but used in llvm tool code. This includes the LC_SUB_FRAMEWORK, LC_SUB_UMBRELLA, LC_SUB_LIBRARY and LC_SUB_CLIENT load commands. llvm-svn: 284431	2016-10-17 22:09:25 +00:00
Zachary Turner	9d58e362d2	Revert formatting changes. This reverts r288425 and r284428 as they are causing test crashes on some systems. llvm-svn: 284429	2016-10-17 21:25:41 +00:00
Zachary Turner	47e2c0a9cb	Try to fix build after invalid pointer conversion. llvm-svn: 284428	2016-10-17 21:14:27 +00:00
Zachary Turner	99eef2d736	[Support] Add support for "advanced" number formatting. raw_ostream has not afforded a lot of flexibility in terms of how to format numbers when outputting. Wrap this all up into a set of low level helper functions that can be used to output numbers with arbitrary precision, alignment, format, etc and then update raw_ostream to use these functions. This will be useful for upcoming improvements to llvm's string formatting libraries, but are still useful independently. Differential Revision: https://reviews.llvm.org/D25497 llvm-svn: 284425	2016-10-17 20:57:45 +00:00
Sanjay Patel	523cd8290a	[DAG] use isConstOrConstSplat in ComputeNumSignBits to optimize SRA The scalar version of this pattern was noted in: https://reviews.llvm.org/D25485 and fixed with: https://reviews.llvm.org/rL284395 More refactoring of the constant/splat helpers is needed and will happen in follow-up patches. Differential Revision: https://reviews.llvm.org/D25685 llvm-svn: 284424	2016-10-17 20:41:39 +00:00
Sanjay Patel	a7cab58055	[DAG] make isConstOrConstSplat and isConstOrConstSplatFP more accessible; NFC As noted in: https://reviews.llvm.org/D25685 This is the next-to-smallest step needed to enable the ComputeNumSignBits fix in that patch. In a minor attempt to keep some structure, we're pulling the FP helper over along with its integer sibling, but clearly we can and should do more refactoring of the similar helper functions in DAGCombiner and SelectionDAG to simplify and not duplicate functionality. llvm-svn: 284421	2016-10-17 20:26:46 +00:00
Davide Italiano	84bd58e915	[opt] Strip coverage if debug info is not present. If -coverage is passed, but -g is not, clang populates the PassManager pipeline with StripSymbols(debugOnly = true). The stripSymbol pass therefore scans the list of named metadata, drops !llvm.dbg.cu, but leaves !llvm.gcov and !0 (the compileUnit MD) around. The verifier runs, and finds out that there's a CU not listed in !llvm.dbg.cu (as it was previously dropped) -> crash. When we strip debug info, so, check if there's coverage data, and strip it as well, in order to avoid pending metadata left around. Differential Revision: https://reviews.llvm.org/D25689 llvm-svn: 284418	2016-10-17 20:05:35 +00:00
Dehao Chen	018a3afa99	Ignore debug info when making optimization decisions in SimplifyCFG. Summary: Debug info should not affect code generation. This patch properly handles debug info to make sure the generated code are the same with or without debug info. Reviewers: davidxl, mzolotukhin, jmolloy Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D25286 llvm-svn: 284415	2016-10-17 19:28:44 +00:00
Michael LeMay	14153552fd	Test commit. llvm-svn: 284411	2016-10-17 19:09:19 +00:00
Walter Erquinigo	b58d6a5655	Handle relocations to thumb functions when dynamic linking COFF modules Summary: This adds the necessary logic to support relocations to thumb functions in the COFF dynamic linker. The jumps to function addresses are mostly blx, which requires the ISA selection bit when jumping to a thumb function. Note: I'm determining if the relocation requires the ISA bit when creating the relocation entries and not when resolving the relocation. I have to do that because I need the ObjectFile and the actual Symbol, which are available only when creating the entries. It would require a gross refactor if I do it otherwise, but I'm okay with doing it if you think it's better. Reviewers: peter.smith, compnerd Subscribers: rengolin, sas Differential Revision: https://reviews.llvm.org/D25151 llvm-svn: 284410	2016-10-17 18:56:18 +00:00
Tim Northover	020d104496	GlobalISel: support wider range of load/store sizes in AArch64. llvm-svn: 284406	2016-10-17 18:36:53 +00:00
Tom Stellard	083f1626f5	AMDGPU/SI: LowerParameter() should be computing align based on memory type Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25203 llvm-svn: 284398	2016-10-17 16:56:19 +00:00
Tom Stellard	bc6c523cce	AMDGPU/SI: Fix LowerParameter() for i16 arguments Summary: If we are loading an i16 value from a 32-bit memory location, then we need to be able to truncate the loaded value to i16. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25198 llvm-svn: 284397	2016-10-17 16:21:45 +00:00
Sanjay Patel	2cf6bfaf73	[DAG] optimize away an arithmetic-right-shift of a 0 or -1 value This came up as part of: https://reviews.llvm.org/D25485 Note that the vector case is missed because ComputeNumSignBits() is deficient for vectors. llvm-svn: 284395	2016-10-17 15:58:28 +00:00
Teresa Johnson	c0ef9e4316	Rename interface for querying physical hardware concurrency Based on post-commit review for D25585/r284180, rename hardware_physical_concurrency to heavyweight_hardware_concurrency, to better reflect what type of tasks it should be used for and to enable other systems to map this to something other than the number of physical cores. llvm-svn: 284390	2016-10-17 14:56:53 +00:00
Benjamin Kramer	937dd7af2d	[Support] remove_dots: Remove .. from absolute paths. /../foo is still a proper path after removing the dotdot. This should now finally match https://9p.io/sys/doc/lexnames.html [Cleaning names]. llvm-svn: 284384	2016-10-17 13:28:21 +00:00
James Molloy	aa79b19a3e	[SDAG] Use ABI type alignment for constant pools when optimizing for size SelectionDAG::getConstantPool will automatically determine an appropriate alignment if one is not specified. It does this by querying the type's preferred alignment. This can end up creating quite a lot of padding when the preferred alignment for vectors is 128. In optimize-for-size mode, it makes sense to instead query the ABI type alignment which is often smaller and causes less padding. llvm-svn: 284381	2016-10-17 12:54:07 +00:00
Oliver Stannard	fe4432b105	[SimplifyCFG] Don't lower complex ConstantExprs to lookup tables Not all ConstantExprs can be represented by a global variable, for example most pointer arithmetic other than addition of a constant, so we can't convert these values from switch statements to lookup tables. Differential Revision: https://reviews.llvm.org/D25550 llvm-svn: 284379	2016-10-17 12:00:24 +00:00
Tobias Grosser	2bbec0ee7f	[SCEV] Consider delinearization pattern with extension with identity factor Summary: The delinearization algorithm did not consider terms which had an extension without a multiply factor, i.e. a identify factor. We lose cases where size is char type where there will no multiply factor. Reviewers: sanjoy, grosser Subscribers: mzolotukhin, Eugene.Zelenko, llvm-commits, mssimpso, sanjoy, grosser Differential Revision: https://reviews.llvm.org/D16492 llvm-svn: 284378	2016-10-17 11:56:26 +00:00
Andrea Di Biagio	fa90c692db	[CodeGenPrepare] When moving a zext near to its associated load, do not retain the original debug location. CodeGenPrepare knows how to move a zext of a load into the same basic block where the load lives. The goal is to help ISel match a zero-extending load instead of two separated instructions. CGP attempts to move a zext computation even if it lives in a basic block that does not post-dominate the load's basic block. That means, the hoisted zext may be speculated. Preserving the zext location would hurt the debugging experience and the quality of sample pgo. With this patch, when moving a zext near to its associated load, CGP no longer propagates the zext's debug location. Instead, CGP conservatively reuses the same debug location for the load and the zext. An alternative approach would be to assign an artificial line-0 location to the zext. However we don't want to over-use the 'line-0' for this particular case because it would have a size cost in the line-table section for no additional benefit. Differential Revision: https://reviews.llvm.org/D25611 llvm-svn: 284377	2016-10-17 11:32:26 +00:00
Craig Topper	1f5178ff9f	[X86] Fix shuffle decoding assertions to print the right number of required operands. Update the checks themselves to be >= to the same number instead of > one less than the required number. llvm-svn: 284365	2016-10-17 06:41:18 +00:00
Craig Topper	5b24cd31f5	[AVX-512] Add shuffle combining support for vpermi2var shuffles derived from existing support for vpermt2var. llvm-svn: 284357	2016-10-17 04:26:47 +00:00
Craig Topper	715ad7fef5	[AVX-512] Add support for turning a 256-bit load that goes to both halfs of an insert_subvector into a subvector broadcast. Differential Revision: https://reviews.llvm.org/D25650 llvm-svn: 284353	2016-10-16 23:29:51 +00:00
Justin Bogner	1674261886	Support: Return void from Scanner::scan_ns_uri_char, no one uses the result Simplify this a little bit since the result is never used. It can be added back easily enough if that changes. llvm-svn: 284348	2016-10-16 22:01:22 +00:00
Richard Smith	e3a9a676f9	PR30711: Fix incorrect profiling of 'long long' in FoldingSet, then use it to fix TBAA violation in profiling of pointers. llvm-svn: 284336	2016-10-16 17:49:09 +00:00
Craig Topper	aa1370ac57	[AVX-512] Fix the operand order for vpermi2var_qi intrinsics to match the other vpermi2var intrinsics. llvm-svn: 284329	2016-10-16 04:54:35 +00:00
Craig Topper	4729fe8bb6	[AVX-512] Correct execution domain for VPERMT2PS and VPERMI2PS. llvm-svn: 284328	2016-10-16 04:54:31 +00:00
Craig Topper	f18b9201f5	[AVX-512] Move (v4i64 (X86SubVBroadcast (v2i64))) alternate patterns under a HasVLX predicate. Similar for floating point. llvm-svn: 284327	2016-10-16 04:54:26 +00:00
Davide Italiano	e9cdb24f67	[ArmFastISel] Kill dead code. NFCI. llvm-svn: 284320	2016-10-16 01:09:39 +00:00
Konstantin Zhuravlyov	8ea0246e93	[MachineMemOperand] Move synchronization scope and atomic orderings from SDNode to MachineMemOperand, and remove redundant getAtomic* member functions from SelectionDAG. Differential Revision: https://reviews.llvm.org/D24577 llvm-svn: 284312	2016-10-15 22:01:18 +00:00
Davide Italiano	590ad7037e	[GVN/PRE] Hoist global values outside of loops. In theory this could be generalized to move anything where we prove the operands are available, but that would require rewriting PRE. As NewGVN will hopefully come soon, and we're trying to rewrite PRE in terms of NewGVN+MemorySSA, it's probably not worth spending too much time on it. Fix provided by Daniel Berlin! llvm-svn: 284311	2016-10-15 21:35:23 +00:00
Li Huang	755f75f765	Test commit. (NFC) llvm-svn: 284307	2016-10-15 19:00:04 +00:00
Craig Topper	dde865afb5	[AVX-512] Add shuffle comments for vbroadcast instructions. llvm-svn: 284305	2016-10-15 16:26:07 +00:00
Craig Topper	51e052f741	[AVX-512] Rename VPBROADCASTI32X2 and VPBROADCASTF32X2 instruction classes to match the mnemonic which does not include a 'P'. llvm-svn: 284304	2016-10-15 16:26:02 +00:00
Benjamin Kramer	d8b079708d	[SimplifyCFG] Use the error checking provided by getPrevNode. BasicBlock::size is O(insts), making this loop O(blocks*insts), which can be really slow on generated code. getPrevNode already checks if we're at the beginning of the block and returns nullptr if so, just use that instead. No functionality change intended. llvm-svn: 284303	2016-10-15 13:15:05 +00:00
Kostya Serebryany	9a4b10a56f	[libFuzzer] swap bytes in integers when handling CMP traces llvm-svn: 284301	2016-10-15 04:00:07 +00:00
Kostya Serebryany	f9b8e8b117	[libFuzzer] better algorithm for -minimize_crash llvm-svn: 284299	2016-10-15 01:00:24 +00:00
Tom Stellard	961811c906	AMDGPU/SI: Handle s_getreg hazard in GCNHazardRecognizer Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25526 llvm-svn: 284298	2016-10-15 00:58:14 +00:00
Evgeny Astigeevich	48fd87e4aa	[NFC] Loop Versioning for LICM code clean up - Removed unused class members. - Made class internal data private. - Made class scoped data function scoped where it's possible. - Replace naked new/delete with unique_ptr. - Made resources guaranteed to be freed. Differential Revision: https://reviews.llvm.org/D25464 llvm-svn: 284290	2016-10-14 23:00:36 +00:00
Tim Northover	69fa84a6e9	GlobalISel: rename legalizer components to match others. The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine") prefix. This should make everything sensible again. The only functional change is the name of a couple of command-line options. llvm-svn: 284287	2016-10-14 22:18:18 +00:00
Mehdi Amini	f8fd20402a	hardware_physical_concurrency() should return 1 when LLVM is built with LLVM_ENABLE_THREADS=OFF llvm-svn: 284283	2016-10-14 21:32:35 +00:00
Guozhi Wei	0cd65429be	[PPC] Shorter sequence to load 64bit constant with same hi/lo words This is a patch to implement pr30640. When a 64bit constant has the same hi/lo words, we can use rldimi to copy the low word into high word of the same register. This optimization caused failure of test case bperm.ll because of not optimal heuristic in function SelectAndParts64. It chooses AND or ROTATE to extract bit groups from a register, and OR them together. This optimization lowers the cost of loading 64bit constant mask used in AND method, and causes different code sequence. But actually ROTATE method is better in this test case. The reason is in ROTATE method the final OR operation can be avoided since rldimi can insert the rotated bits into target register directly. So this patch also enhances SelectAndParts64 to prefer ROTATE method when the two methods have same cost and there are multiple bit groups need to be ORed together. Differential Revision: https://reviews.llvm.org/D25521 llvm-svn: 284276	2016-10-14 20:41:50 +00:00
Kostya Serebryany	e450e40741	[libFuzzer] remove subdir fuzzer-test-suite as it is now superseded with https://github.com/google/fuzzer-test-suite llvm-svn: 284275	2016-10-14 20:26:40 +00:00
Kostya Serebryany	a5f94fb6c9	[libFuzzer] add -trace_cmp=1 (guiding mutations based on the observed CMP instructions). This is a reincarnation of the previously deleted -use_traces, but using a different approach for collecting traces. Still a toy, but at least it scales well. Also fix -merge in trace-pc-guard mode llvm-svn: 284273	2016-10-14 20:20:33 +00:00
Sanjay Patel	72b5ff646d	[DAG] avoid creating illegal node when transforming negated shifted sign bit Eli noted this potential bug in the post-commit thread for: https://reviews.llvm.org/rL284239 ...but I'm not sure how to trigger it, so there's no test case yet. llvm-svn: 284268	2016-10-14 19:46:31 +00:00
Tom Stellard	09c2bd6bd4	AMDGPU/SI: Use new SimplifyDemandedBits helper for multi-use operations Summary: We are using this helper for our 24-bit arithmetic combines, so we are now able to eliminate multi-use operations that mask the high-bits of 24-bit inputs (e.g. and x, 0xffffff) Reviewers: arsenm, nhaehnle Subscribers: tony-tye, arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D24672 llvm-svn: 284267	2016-10-14 19:14:29 +00:00
Tom Stellard	ab61007914	TargetLowering: Add SimplifyDemandedBits() helper to TargetLoweringOpt Summary: The main purpose of this new helper is to enable simplifying operations that have multiple uses. SimplifyDemandedBits does not handle multiple uses currently, and this new function makes it possible to optimize: and v1, v0, 0xffffff mul24 v2, v1, v1 ; Multiply ignoring high 8-bits. To: mul24 v2, v0, v0 Where before this would not be optimized, because v1 has multiple uses. Reviewers: bogner, arsenm Subscribers: nhaehnle, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24964 llvm-svn: 284266	2016-10-14 19:14:26 +00:00
Krzysztof Parzyszek	775a20913d	The real fix for post-r284255 failures llvm-svn: 284264	2016-10-14 19:06:25 +00:00
Krzysztof Parzyszek	a22daa0fa6	Workaround to eliminate check-llvm failures after r284255 llvm-svn: 284262	2016-10-14 18:36:42 +00:00
David L Kreitzer	01a057a0c4	Add a pass to optimize patterns of vectorized interleaved memory accesses for X86. The pass optimizes as a unit the entire wide load + shuffles pattern produced by interleaved vectorization. This initial patch optimizes one pattern (64-bit elements interleaved by a factor of 4). Future patches will generalize to additional patterns. Patch by Farhana Aleen Differential revision: http://reviews.llvm.org/D24681 llvm-svn: 284260	2016-10-14 18:20:41 +00:00
Tom Stellard	64a9d0876c	AMDGPU/SI: Don't allow unaligned scratch access Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 llvm-svn: 284257	2016-10-14 18:10:39 +00:00
Krzysztof Parzyszek	445bd12621	[RDF] Switch RegisterRef to be a pair (Register, LaneMask) Use PackedRegisterRef to store the register information in the graph nodes. This commit also removes support for virtual registers. It has never been tested or used. It will be possible to add it back if there is a need. llvm-svn: 284255	2016-10-14 17:57:55 +00:00
David L Kreitzer	d5c6755d83	[safestack] Use non-thread-local unsafe stack pointer for Contiki OS Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D19852 llvm-svn: 284254	2016-10-14 17:56:00 +00:00
Eric Christopher	c39f8b0a3a	Revert "In preparation for removing getNameWithPrefix off of TargetMachine," as it's causing sanitizer/memory issues until I can track down this set. This reverts commit r284203 llvm-svn: 284252	2016-10-14 17:28:23 +00:00
Vedant Kumar	743574b8e3	[Coverage] Support loading multiple binaries into a CoverageMapping Add support for loading multiple coverage readers into a single CoverageMapping instance. This should make it easier to prepare a unified coverage report for multiple binaries. Differential Revision: https://reviews.llvm.org/D25535 llvm-svn: 284251	2016-10-14 17:16:53 +00:00
Rafael Espindola	4a2ef91143	Move alignTo computation inside the if. This is an improvement when compiling with llvm. llvm doesn't inline the call to insert, so the align is always executed and shows up in the profile. With gcc the call to insert is inlined and the align computation moved and done only if needed. With this patch we explicitly only compute it if it is needed. In the two tests with debug info, the speedup was scylla master 3.008959365 patch 2.932080942 1.02621974786x faster firefox master 6.709823604 patch 6.592387227 1.01781393795x faster In all others the difference was in the noise. llvm-svn: 284249	2016-10-14 17:01:39 +00:00
Pierre Gousseau	b6d652adb5	[X86] Take advantage of the lzcnt instruction on btver2 architectures when ORing comparisons to zero. This change adds transformations such as: zext(or(setcc(eq, (cmp x, 0)), setcc(eq, (cmp y, 0)))) To: srl(or(ctlz(x), ctlz(y)), log2(bitsize(x)) This optimisation is beneficial on Jaguar architecture only, where lzcnt has a good reciprocal throughput. Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it. For this reason the change also adds a "HasFastLZCNT" feature which gets enabled for Jaguar. Differential Revision: https://reviews.llvm.org/D23446 llvm-svn: 284248	2016-10-14 16:41:38 +00:00
Sanjay Patel	6d6eca5cdc	[InstCombine] use m_APInt to allow sub with constant folds for splat vectors llvm-svn: 284247	2016-10-14 16:31:54 +00:00
Sanjay Patel	c6c5965a42	[InstCombine] sub X, sext(bool Y) -> add X, zext(bool Y) Prefer add/zext because they are better supported in terms of value-tracking. Note that the backend should be prepared for this IR canonicalization (including vector types) after: https://reviews.llvm.org/rL284015 Differential Revision: https://reviews.llvm.org/D25135 llvm-svn: 284241	2016-10-14 15:24:31 +00:00
David L Kreitzer	3155abfb57	Define "contiki" OS specifier. Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D24897 llvm-svn: 284240	2016-10-14 14:41:46 +00:00
Sanjay Patel	00fc7a6159	[DAG] add folds for negated shifted sign bit The same folds exist in InstCombine already. This came up as part of: https://reviews.llvm.org/D25485 llvm-svn: 284239	2016-10-14 14:26:47 +00:00
Nicolai Haehnle	67624af0cc	AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes Summary: This will be used for 64-bit MULHU, which is in turn used for the 64-bit divide-by-constant optimization (see D24822). Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25289 llvm-svn: 284224	2016-10-14 10:30:00 +00:00
Nicolai Haehnle	86e72d98dd	Fix use-after-frees Extracted from D25313, as suggested by Justin Bogner. llvm-svn: 284220	2016-10-14 09:49:51 +00:00
Simon Dardis	b3fd189cb5	[mips] Fix aui/daui/dahi/dati for MIPSR6 For compatiblity with binutils, define these instructions to take two registers with a 16bit unsigned immediate. Both of the registers have to be same for dahi and dati. Reviewers: dsanders, zoran.jovanovic Differential Review: https://reviews.llvm.org/D21473 llvm-svn: 284218	2016-10-14 09:31:42 +00:00
Nicolai Haehnle	bd15c3267f	AMDGPU: Fix use-after-frees Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25312 llvm-svn: 284215	2016-10-14 09:03:04 +00:00
Michael Zuckerman	174d2e784b	[x86][ms-inline-asm] use of "jmp short" in asm is not supported Committing in the name of Ziv Izhar: After check-all and LGTM . The following patch is for compatability with Microsoft. Microsoft ignores the keyword "short" when used after a jmp, for example: __asm { jmp short label label: } A test for that patch will be added in another patch, since it's located in clang's codegen tests. Link will be added shortly. link to test: https://reviews.llvm.org/D24958 Differential Revision: https://reviews.llvm.org/D24957 llvm-svn: 284211	2016-10-14 08:09:40 +00:00
Craig Topper	40feb7f157	[DAGCombiner] Teach createBuildVecShuffle to handle cases where input vectors are less than half of the output vector size. This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64 which requires a sign/zero_extend_vector_inreg to be created which requires v8i8 to be concatenated upto v64i8 and goes through this code. llvm-svn: 284204	2016-10-14 06:00:42 +00:00
Eric Christopher	2bd52b5d91	In preparation for removing getNameWithPrefix off of TargetMachine, sink the current behavior into the callers and sink TargetMachine::getNameWithPrefix into TargetMachine::getSymbol. llvm-svn: 284203	2016-10-14 05:47:41 +00:00
Eric Christopher	445c952bd0	Tidy the calls to getCurrentSection().first -> getCurrentSectionOnly to help readability a bit. llvm-svn: 284202	2016-10-14 05:47:37 +00:00
Konstantin Zhuravlyov	c96b5d7073	[AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables Differential Revision: https://reviews.llvm.org/D25562 llvm-svn: 284196	2016-10-14 04:37:34 +00:00
Konstantin Zhuravlyov	2a2ac37c2c	[AMDGPU] Add 32-bit lo/hi got and pc relative variant kinds and emit appropriate relocations Differential Revision: https://reviews.llvm.org/D25548 llvm-svn: 284195	2016-10-14 04:21:32 +00:00
Matthias Braun	ac28703111	Timer: Fix doxygen comments, use member initializer; NFC llvm-svn: 284181	2016-10-14 00:17:19 +00:00
Teresa Johnson	2bd812c5dc	Add interface for querying physical hardware concurrency Summary: This will be used by ThinLTO to set the amount of backend parallelism, which performs better when restricted to the number of physical cores (on X86 at least, where getHostNumPhysicalCores is currently defined). If not available this falls back to thread::hardware_concurrency. Note I didn't add to the thread class since that is a typedef to std::thread where available. Reviewers: mehdi_amini Subscribers: beanz, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D25585 llvm-svn: 284180	2016-10-14 00:13:59 +00:00
Saleem Abdulrasool	7705c4f1be	CodeGen: use MSVC division on windows itanium Windows itanium is identical to MSVC when dealing with everything but C++. Lower the math routines into msvcrt rather than compiler-rt. llvm-svn: 284175	2016-10-13 23:00:11 +00:00
Saleem Abdulrasool	06383dd272	CodeGen: adjust floating point operations in Windows itanium Windows itanium is equivalent to MSVC except in C++ mode. Ensure that the promote the 32-bit floating point operations to their 64-bit equivalences. llvm-svn: 284173	2016-10-13 22:38:15 +00:00
Sanjay Patel	98d0ea64ca	[DAG] hoist DL(N) and fix formatting; NFC llvm-svn: 284170	2016-10-13 22:27:10 +00:00
Kostya Serebryany	0381374f96	[libFuzzer] more detailed message for disabled leak detection llvm-svn: 284169	2016-10-13 22:24:10 +00:00
Tom Stellard	f80c1875a3	LegalizeDAG: Implement PROMOTE for ISD::BITREVERSE Summary: This operation is promoted the same way was ISD::BSWAP. This will prevent a regression in test/Target/AMDGOU/bitreverse.ll when i16 support is implemented. Reviewers: bogner, hfinkel Subscribers: hfinkel, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D25202 llvm-svn: 284163	2016-10-13 21:03:49 +00:00
David L Kreitzer	9015316df4	[safestack] Reapply r283248 after moving X86-targeted SafeStack tests into the X86 subdirectory. Original commit message: Requires a valid TargetMachine to be passed to the SafeStack pass. Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D24896 llvm-svn: 284161	2016-10-13 20:57:51 +00:00
Sriraman Tallam	f29fa586e1	New llc option pie-copy-relocations to optimize access to extern globals. This option indicates copy relocations support is available from the linker when building as PIE and allows accesses to extern globals to avoid the GOT. Differential Revision: https://reviews.llvm.org/D24849 llvm-svn: 284160	2016-10-13 20:54:39 +00:00
Nirav Dave	a81682aad4	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r284151 which appears to be triggering a LTO failures on Hexagon llvm-svn: 284157	2016-10-13 20:23:25 +00:00
Quentin Colombet	52ffa6711b	[RAGreedy] Empty live-ranges always succeed in last chance recoloring. Relax the constraint for empty live-ranges while doing last chance recoloring. Indeed, those live-ranges do not need an actual color to be fond for the recoloring to work. Empty live-range may happen as a result of splitting/spilling. Unfortunately no test case for in-tree targets. llvm-svn: 284152	2016-10-13 19:27:48 +00:00
Nirav Dave	4b36957243	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after upstream changes. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 284151	2016-10-13 19:20:16 +00:00
Kostya Serebryany	a17d23eaa7	[libFuzzer] add -trace_malloc= flag llvm-svn: 284149	2016-10-13 19:06:46 +00:00
Quentin Colombet	b3f5a8c644	[AArch64][RegisterBankInfo] Switch to fully static opds mapping for G_BITCAST. NFC. llvm-svn: 284146	2016-10-13 18:46:38 +00:00
Teresa Johnson	7943fecee8	Add interface to compute number of physical cores on host system Summary: For now I have only added support for x86_64 Linux, but other systems can be added incrementally. This is to be used for setting the default parallelism for ThinLTO backends (instead of thread::hardware_concurrency which includes hyperthreading and is too aggressive). I'll send this as a follow-on patch, and it will fall back to hardware_concurrency when the new getHostNumPhysicalCores returns -1 (when not supported for a given host system). I also added an interface to MemoryBuffer to force reading a file as a stream - this is required for /proc/cpuinfo which is a special file that looks like a normal file but appears to have 0 size. The existing readers of this file in Host.cpp are reading the first 1024 or so bytes from it, because the necessary info is near the top. But for the new functionality we need to be able to read the entire file. I can go back and change the other readers to use the new getFileAsStream as a follow-on patch since it seems much more robust. Added a unittest. Reviewers: mehdi_amini Subscribers: beanz, mgorny, llvm-commits, modocache Differential Revision: https://reviews.llvm.org/D25564 llvm-svn: 284138	2016-10-13 17:43:20 +00:00
Reid Kleckner	edfc9dcf42	Truncate long names in type records In the MS ABI, the frontend is supposed to MD5 such pathologically long names. LLVM should still defend itself from long names, though. Fixes part of PR29098. llvm-svn: 284136	2016-10-13 17:33:22 +00:00
Igor Breger	8409c356ad	[X86][AVX512] Fix sext v32i1 -> v32i8 lowering. Fix PR30600. Differential Revision: https://reviews.llvm.org/D25554 llvm-svn: 284134	2016-10-13 17:20:38 +00:00
Kostya Serebryany	17d176e16b	[libFuzzer] reapply r283946: refactoring to speed things up, NFC. Now with a fix for gcc build llvm-svn: 284132	2016-10-13 16:19:09 +00:00
Reid Kleckner	468e793fea	Fix for PR30687. Avoid dereferencing MBB.end(). We don't need to return a MachineInstr* from these stack probe insertion calls anyway. If we ever need to add it back, we can return an iterator instead. Based on a patch by David Kreitzer This bug is a consequence of r279314 \| dexonsmith \| 2016-08-19 13:40:12 -0700 (Fri, 19 Aug 2016) \| 110 lines We hit the "Assertion `!NodePtr->isKnownSentinel()' failed" assertion, but only when inserting a stack probe call at the end of an MBB, which isn't necessarily a common situation. Differential Revision: https://reviews.llvm.org/D25566 llvm-svn: 284130	2016-10-13 15:48:48 +00:00
Eric Liu	af5a28fe87	Do not delete leading ../ in remove_dots. Reviewers: bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25561 llvm-svn: 284129	2016-10-13 15:07:14 +00:00
Javed Absar	85874a9360	[ARM]: Assign cost of scaling used in addressing mode for ARM cores This patch assigns cost of the scaling used in addressing. On many ARM cores, a negated register offset takes longer than a non-negated register offset, in a register-offset addressing mode. For instance: LDR R0, [R1, R2 LSL #2] LDR R0, [R1, -R2 LSL #2] Above, (1) takes less cycles than (2). By assigning appropriate scaling factor cost, we enable the LLVM to make the right trade-offs in the optimization and code-selection phase. Differential Revision: http://reviews.llvm.org/D24857 Reviewers: jmolloy, rengolin llvm-svn: 284127	2016-10-13 14:57:43 +00:00
Matthew Simpson	1d4b163fc0	[LV] Account for predicated stores in instruction costs This patch ensures that we scale the estimated cost of predicated stores by block probability. This is a follow-on patch for r284123. llvm-svn: 284126	2016-10-13 14:54:31 +00:00
Matthew Simpson	6cdb5a6f96	[LV] Avoid rounding errors for predicated instruction costs This patch modifies the cost calculation of predicated instructions (div and rem) to avoid the accumulation of rounding errors due to multiple truncating integer divisions. The calculation for predicated stores will be addressed in a follow-on patch since we currently don't scale the cost of predicated stores by block probability. Differential Revision: https://reviews.llvm.org/D25333 llvm-svn: 284123	2016-10-13 14:19:48 +00:00
Simon Pilgrim	cb59b5257c	[DAGCombiner] Add vector support to (mul (shl X, Y), Z) -> (shl (mul X, Z), Y) style combines llvm-svn: 284122	2016-10-13 14:04:35 +00:00
Matt Arsenault	253640e18d	AMDGPU: Assume spilling will occur at -O0 Because everything live is spilled at the end of a block by fast regalloc, assume this will happen and avoid the copies of the resource descriptor. llvm-svn: 284119	2016-10-13 13:10:00 +00:00
Simon Pilgrim	fa8fadc0e5	[DAGCombiner] Add vector support to C2-(A+C1) -> (C2-C1)-A folding llvm-svn: 284117	2016-10-13 12:49:31 +00:00
Matt Arsenault	dac31db12f	AMDGPU: Fix truncate to bool warnings llvm-svn: 284116	2016-10-13 12:45:16 +00:00
Simon Dardis	515e8699f4	[mips] Add IAS support for dvp, evp These instructions were only defined for microMIPSR6 previously. Add definitions for MIPSR6, correct definitions for microMIPSR6, flag these instructions as having unmodelled side effects (they disable/enable virtual processors) and add missing disassember tests for microMIPSR6. Reviewers: vkalintiris Differential Review: https://reviews.llvm.org/D24291 llvm-svn: 284115	2016-10-13 12:12:56 +00:00
Simon Pilgrim	833b8a2071	[DAGCombiner] Add vector support to (sub -1, x) -> (xor x, -1) canonicalization Improves commutation potential llvm-svn: 284113	2016-10-13 12:05:20 +00:00
Oren Ben Simhon	92ccbf20ff	[X86] Basic additions to support RegCall Calling Convention. The Register Calling Convention (RegCall) was introduced by Intel to optimize parameter transfer on function call. This calling convention ensures that as many values as possible are passed or returned in registers. This commit presents the basic additions to LLVM CodeGen in order to support RegCall in X86. Differential Revision: http://reviews.llvm.org/D25022 llvm-svn: 284108	2016-10-13 07:53:43 +00:00
Daniel Jasper	bee9dea306	Silence unused warning in non-assert builds. llvm-svn: 284107	2016-10-13 06:39:44 +00:00
Craig Topper	ff23af4299	[AVX-512] Teach shuffle lowering to recognize 512-bit zero extends. llvm-svn: 284105	2016-10-13 05:29:41 +00:00
Craig Topper	8cb2efa58a	[X86] Simplify the lowering code for extracting and inserting subvectors. We don't need to check if AVX is enabled. It's implied by the operation action being set to Custom. We don't need to check both the input and output type widths. We only need to check the type that's being inserted or extracted. The other type is known to be a legal type and we can assume its a different width. llvm-svn: 284102	2016-10-13 04:14:47 +00:00
Sebastian Pop	5068d7a338	Memory-SSA: strengthen defClobbersUseOrDef interface As Danny pointed out, defClobbersUseOrDef should use MemoryLocOrCall to make sure fences are properly handled. llvm-svn: 284099	2016-10-13 03:23:33 +00:00
Sebastian Pop	5ba9f24ed7	commit back "GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)" This is with an extra change to avoid calling MemoryLocation::get() on a call instruction. Differential Revision: https://reviews.llvm.org/D25542 llvm-svn: 284098	2016-10-13 01:39:10 +00:00
Quentin Colombet	6b87a3109c	[AArch64][RegisterBankInfo] Provide alternative mappings for 64-bit load This allows RegBankSelect in greedy mode to get rid some of the cross register bank copies when loads are involved in the chain of computation. llvm-svn: 284097	2016-10-13 01:01:23 +00:00
Reid Kleckner	741d8a21d3	Correct PrivateLinkage for COFF - Use storage class C_STAT for 'PrivateLinkage' The storage class for PrivateLinkage should equal to the Internal Linkage. - Set 'PrivateGlobalPrefix' from "L" to ".L" for MM_WinCOFF (includes x86_64) MM_WinCOFF has empty GlobalPrefix '\0' so PrivateGlobalPrefix "L" may conflict to the normal symbol name starting with 'L'. Based on a patch by Han Sangjin! Manually updated test cases. llvm-svn: 284096	2016-10-13 00:55:24 +00:00
Quentin Colombet	cd80e97e88	[AArch64][RegisterBankInfo] Provide alternative mappings for G_BITCASTs. Thanks to this patch, RegBankSelect is able to get rid of some register bank copies as demonstrated in the test case. llvm-svn: 284094	2016-10-13 00:34:48 +00:00
Reid Kleckner	8958f6a529	Revert "GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)" This CL didn't actually address the test case in PR30499, and clang still crashes. Also revert dependent change "Memory-SSA cleanup of clobbers interface, NFC" Reverts r283965 and r283967. llvm-svn: 284093	2016-10-13 00:18:26 +00:00
Quentin Colombet	45c9c1432f	[AArch64][RegisterBankInfo] Describe cross regbank copies statically. NFC. llvm-svn: 284091	2016-10-13 00:12:06 +00:00
Quentin Colombet	9e64919b7c	[AArch64][RegisterBankInfo] Use static mapping for same bank G_BITCAST. NFC. llvm-svn: 284090	2016-10-13 00:12:04 +00:00
Quentin Colombet	db643d9091	[AArch64][MachineLegalizer] Mark more G_BITCAST as legal. Basically any vector types that fits in a 32-bit register is also valid as far as copies are concerned. llvm-svn: 284089	2016-10-13 00:12:01 +00:00
Quentin Colombet	f760799c40	[AArch64][RegisterBankInfo] Bump the cost of vector loads. This does not change anything yet, because we do not offer any alternative mapping. llvm-svn: 284088	2016-10-13 00:11:59 +00:00
Quentin Colombet	f35a8c5bdc	[AArch64][RegisterBankInfo] Use a proper cost for cross regbank G_BITCASTs. This does not change anything yet, because we do not offer any alternative mapping. llvm-svn: 284087	2016-10-13 00:11:57 +00:00
Quentin Colombet	27b40356f7	[AArch64][RegisterBankInfo] Provide more realistic copy costs. llvm-svn: 284086	2016-10-13 00:11:55 +00:00
Krzysztof Parzyszek	abc0662f04	Handle lane masks in LivePhysRegs when adding live-ins Differential Revision: https://reviews.llvm.org/D25533 llvm-svn: 284076	2016-10-12 22:53:41 +00:00
Tim Northover	fb8d989818	GlobalISel: support G_TRUNC selection on AArch64. Ahmed's patch again. llvm-svn: 284075	2016-10-12 22:49:15 +00:00
Tim Northover	69271c64d5	GlobalISel: support int <-> float conversions on AArch64. More of Ahmed's work. llvm-svn: 284074	2016-10-12 22:49:11 +00:00
Tim Northover	7dd378dd08	GlobalISel: select G_FCMP instructions on AArch64. Another of Ahmed's patches. llvm-svn: 284073	2016-10-12 22:49:07 +00:00
Tim Northover	6c02ad5e4f	GlobalISel: support selection of G_ICMP on AArch64. Patch from Ahmed Bougaca again. llvm-svn: 284072	2016-10-12 22:49:04 +00:00
Tim Northover	5e3dbf326c	GlobalISel: select G_BRCOND instructions on AArch64. llvm-svn: 284071	2016-10-12 22:49:01 +00:00
Tim Northover	6aacd27cd7	GlobalISel: mark G_BRCOND on s1 as legal. It's going to be a TBNZ (at -O0) anyway, so the high bits don't matter. llvm-svn: 284070	2016-10-12 22:48:36 +00:00
Vedant Kumar	68216d7bd3	[Coverage] Factor out logic to create FunctionRecords (NFC) llvm-svn: 284063	2016-10-12 22:27:45 +00:00
Albert Gutowski	795d7d6381	Create llvm.addressofreturnaddress intrinsic Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25293 llvm-svn: 284061	2016-10-12 22:13:19 +00:00
Reid Kleckner	fb58be862c	Update _MSC_VER equality checks for msdiaNNN.dll Use inequality instead of equality to defend against minor version increases in _MSC_VER. An _MSC_VER value of 1901 should still use msdia140.dll, as described in this blog post: https://blogs.msdn.microsoft.com/vcblog/2016/10/05/visual-c-compiler-version/ llvm-svn: 284058	2016-10-12 21:51:14 +00:00
Haicheng Wu	1ef17e90b2	Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop" Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049. The original summary: This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. llvm-svn: 284053	2016-10-12 21:29:38 +00:00
Krzysztof Parzyszek	d62669d791	[MIRParser] Parse lane masks for register live-ins Differential Revision: https://reviews.llvm.org/D25530 llvm-svn: 284052	2016-10-12 21:06:45 +00:00
Haicheng Wu	45e4ef737d	Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop" This reverts commit r284044. llvm-svn: 284051	2016-10-12 21:02:22 +00:00
Haicheng Wu	6cac34fd41	[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. Differential Revision: https://reviews.llvm.org/D24790 llvm-svn: 284044	2016-10-12 20:24:32 +00:00
Peter Collingbourne	46aafc16e8	LTO: Use the correct mangler function in LTOCodeGenerator::applyScopeRestrictions(). We need to use the overload of Mangler::getNameWithPrefix that takes a GlobalValue in order to mangle in the stdcall stack byte count for Windows targets. Differential Revision: https://reviews.llvm.org/D25529 llvm-svn: 284040	2016-10-12 20:12:19 +00:00
Krzysztof Parzyszek	8271be9a1d	Do not remove implicit defs in BranchFolder Branch folder removes implicit defs if they are the only non-branching instructions in a block, and the branches do not use the defined registers. The problem is that in some cases these implicit defs are required for the liveness information to be correct. Differential Revision: https://reviews.llvm.org/D25478 llvm-svn: 284036	2016-10-12 19:50:57 +00:00
Matt Arsenault	d486d3f8d1	AMDGPU: Initial implementation of VGPR indexing mode This is the most basic handling of the indirect access pseudos using GPR indexing mode. This currently only enables the mode for a single v_mov_b32 and then disables it. This is much more complicated to use than the movrel instructions, so a new optimization pass is probably needed to fold the access into the uses and keep the mode enabled for them. llvm-svn: 284031	2016-10-12 18:49:05 +00:00
Teresa Johnson	4b9b379172	[ThinLTO] Don't link module level assembly when importing Module inline asm was always being linked/concatenated when running the IRLinker. This is correct for full LTO but not when we are importing for ThinLTO, as it can result in multiply defined symbols when the module asm defines a global symbol. In order to test with llvm-lto2, I had to work around PR30396, where a symbol that is defined in module assembly but defined in the LLVM IR appears twice. Added workaround to llvm-lto2 with a FIXME. Fixes PR30610. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25359 llvm-svn: 284030	2016-10-12 18:39:29 +00:00
Sanjoy Das	bc357e8fa3	[SimplifyCFG] Don't create PHI nodes for constant bundle operands Summary: Constant bundle operands may need to retain their constant-ness for correctness. I'll admit that this is slightly odd, but it looks like SimplifyCFG already does this for things like @llvm.frameaddress and @llvm.stackmap, so I suppose adding one more case is not a big deal. It is possible to add a mechanism to denote bundle operands that need to remain constants, but that's probably too complicated for the time being. Reviewers: jmolloy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D25502 llvm-svn: 284028	2016-10-12 18:15:33 +00:00
Matt Arsenault	cc88ce36ed	AMDGPU: Add instruction definitions for VGPR indexing VI added a second method of indexing into VGPRs besides using v_movrel* llvm-svn: 284027	2016-10-12 18:00:51 +00:00
Tom Stellard	fac248cb5f	AMDGPU/SI: Change mimg intrinsic signatures This makes more fields overridable and removes redundant bits. Patch by: Changpeng Fang llvm-svn: 284024	2016-10-12 16:35:29 +00:00
Artur Pilipenko	c6eb6bd9cb	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers Since this change is known to cause performance degradations in some cases it's commited under a temporary flag which is turned off by default. Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 llvm-svn: 284022	2016-10-12 16:18:43 +00:00
Matt Arsenault	691efe03bd	BranchRelaxation: Unique live ins when creating block llvm-svn: 284018	2016-10-12 15:32:04 +00:00
Nirav Dave	d046332679	[MC] Fix Error Location for ParseIdentifier Prevent partial parsing of '$' or '@' of invalid identifiers and fixup workaround points. NFC Intended. llvm-svn: 284017	2016-10-12 13:58:07 +00:00
Simon Pilgrim	08190943cb	[DAGCombiner] Update most ADD combines to support general vector combines Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors. Differential Revision: https://reviews.llvm.org/D25374 llvm-svn: 284015	2016-10-12 13:48:10 +00:00
Konstantin Zhuravlyov	081385a74e	[DAGCombiner] Do not remove the load of stored values when optimizations are disabled This combiner breaks debug experience and should not be run when optimizations are disabled. For example: int main() { int j = 0; j += 2; if (j == 2) return 0; return 5; } When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function. Differential Revision: https://reviews.llvm.org/D19268 llvm-svn: 284014	2016-10-12 13:44:24 +00:00
Chad Rosier	c215c3fd14	[CVP] Convert an AShr to a LShr if 1st operand is known to be nonnegative. An arithmetic shift can be safely changed to a logical shift if the first operand is known positive. This allows ComputeKnownBits (and similar analysis) to determine the sign bit of the shifted value in some cases. In turn, this allows InstCombine to canonicalize a signed comparison (a > 0) into an equality check (a != 0). PR30577 Differential Revision: https://reviews.llvm.org/D25119 llvm-svn: 284013	2016-10-12 13:41:38 +00:00
Alexey Bataev	b271a58e37	NFC: The Cost Model specialization, by Andrey Tischenko The current Cost Model implementation is very inaccurate and has to be updated, improved, re-implemented to be able to take into account the concrete CPU models and the concrete targets where this Cost Model is being used. For example, the Latency Cost Model should be differ from Code Size Cost Model, etc. This patch is the first step to launch the developing and implementation of a new Cost Model generation. Differential Revision: https://reviews.llvm.org/D25186 llvm-svn: 284012	2016-10-12 13:24:13 +00:00
Simon Pilgrim	fd0d7b21e0	[InstCombine] Fix constexpr issue in select combining As discussed by Andrea on PR30486, we have an unsafe cast to an Instruction type in the select combine which doesn't take into account that it could be a ConstantExpr instead. Differential Revision: https://reviews.llvm.org/D25466 llvm-svn: 284000	2016-10-12 10:20:15 +00:00
Alex Lorenz	d680287898	[Support][CommandLine] Display subcommands in help when there are less than 3 subcommands This commit fixes a bug where the help output doesn't display subcommands when a tool has less than 3 subcommands. This change doesn't include a corresponding unittest as there is no viable way to provide a unittest for it. Differential Revision: https://reviews.llvm.org/D25463 llvm-svn: 283998	2016-10-12 10:04:35 +00:00
Chandler Carruth	5dbc164d15	[LCG] Add the necessary functionality to the LazyCallGraph to support inlining. The basic inlining operation makes the following changes to the call graph: 1) Add edges that were previously transitive edges. This is always trivial and this patch gives the LCG helper methods to make this more convenient. 2) Remove the inlined edge. We had existing support for this, but it contained bugs that needed to be fixed. Testing in the same pattern as the inliner exposes these bugs very nicely. 3) Delete a function when it becomes dead because it is internal and all calls have been inlined. The LCG had no support at all for this operation, so this adds that support. Two unittests have been added that exercise this specific mutation pattern to the call graph. They were extremely effective in uncovering bugs. Sadly, a large fraction of the code here is just to implement those unit tests, but I think they're paying for themselves. =] This was split out of a patch that actually uses the routines to implement inlining in the new pass manager in order to isolate (with unit tests) the logic that was entirely within the LCG. Many thanks for the careful review from folks! There will be a few minor follow-up patches based on the comments in the review as well. Differential Revision: https://reviews.llvm.org/D24225 llvm-svn: 283982	2016-10-12 07:59:56 +00:00
Daniel Jasper	90d990e034	Revert "[libFuzzer] refactoring to speed things up, NFC" This reverts commit r283946. This breaks when build with GCC: lib/Fuzzer/FuzzerTracePC.cpp:169:6: error: always_inline function might not be inlinable [-Werror=attributes] lib/Fuzzer/FuzzerTracePC.cpp:169:6: error: inlining failed in call to always_inline 'void fuzzer::TracePC::HandleCmp(void*, T, T) [with T = long unsigned int]': target specific option mismatch lib/Fuzzer/FuzzerTracePC.cpp:198:65: error: called from here llvm-svn: 283979	2016-10-12 07:26:46 +00:00
Quentin Colombet	9de30faeac	[AArch64][InstrustionSelector] Teach the selector about G_BITCAST. llvm-svn: 283973	2016-10-12 03:57:52 +00:00
Quentin Colombet	cb629a897c	[AArch64][InstructionSelector] Refactor the handling of copies. Although Copies are not specific to preISel, we still have to assign them a proper register class. However, given they are not constrained to anything we do not have to handle the source register at the copy. It will be properly mapped when reaching the related definition. In the process, the handlong of G_ANYEXT is slightly modified as those end up being selected as copy. The difference is that when register size do not match on both sides, we need to insert SUBREG_TO_REG operation, otherwise the post RA copy expansion will not be happy! llvm-svn: 283972	2016-10-12 03:57:49 +00:00
Quentin Colombet	404e4350dc	[AArch64][MachineLegalizer] Mark more bitcasts as legal. Those are copies, we do not have to do any legalization action for them. llvm-svn: 283970	2016-10-12 03:57:43 +00:00
Sebastian Pop	d57d93c9de	Memory-SSA cleanup of clobbers interface, NFC This implements the cleanup that Danny asked to commit separately from the previous fix to GVN-hoist in https://reviews.llvm.org/D25476#inline-219818 Tested with ninja check on x86_64-linux. llvm-svn: 283967	2016-10-12 03:08:40 +00:00
Sebastian Pop	ab12fb62ee	GVN-hoist: fix store past load dependence analysis (PR30216, PR30499) This is a refreshed version of a patch that was reverted: it fixes the problems reported in both PR30216 and PR30499, and contains all the test-cases from both bugs. To hoist stores past loads, we used to search for potential conflicting loads on the hoisting path by following a MemorySSA def-def link from the store to be hoisted to the previous defining memory access, and from there we followed the def-use chains to all the uses that occur on the hoisting path. The problem is that the def-def link may point to a store that does not alias with the store to be hoisted, and so the loads that are walked may not alias with the store to be hoisted, and even as in the testcase of PR30216, the loads that may alias with the store to be hoisted are not visited. The current patch visits all loads on the path from the store to be hoisted to the hoisting position and uses the alias analysis to ask whether the store may alias the load. I was not able to use the MemorySSA functionality to ask for whether load and store are clobbered: I'm not sure which function to call, so I used a call to AA->isNoAlias(). Store past store is still working as before using a MemorySSA query: I added an extra test to pr30216.ll to make sure store past store does not regress. Tested on x86_64-linux with check and a test-suite run. Differential Revision: https://reviews.llvm.org/D25476 llvm-svn: 283965	2016-10-12 02:23:39 +00:00
Tim Shen	4ff62b187e	[PPCMIPeephole] Fix splat elimination Summary: In PPCMIPeephole, when we see two splat instructions, we can't simply do the following transformation: B = Splat A C = Splat B => C = Splat A because B may still be used between these two instructions. Instead, we should make the second Splat a PPC::COPY and let later passes decide whether to remove it or not: B = Splat A C = Splat B => B = Splat A C = COPY B Fixes PR30663. Reviewers: echristo, iteratee, kbarton, nemanjai Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25493 llvm-svn: 283961	2016-10-12 00:48:25 +00:00
Michael Kuperstein	7adbf6b042	[DAG] Fix crash in build_vector -> vector_shuffle combine Fixes a crash in the build_vector -> vector_shuffle combine when the first vector input is twice as wide as the output, and the second input vector is even wider. llvm-svn: 283953	2016-10-11 22:44:31 +00:00
Tim Northover	c1d8c2bf8c	GlobalISel: support same-size casts on AArch64. Mostly Ahmed's work again, I'm just sprucing things up slightly before committing. llvm-svn: 283952	2016-10-11 22:29:23 +00:00
Kostya Serebryany	a09d11e108	[libFuzzer] refactoring to speed things up, NFC llvm-svn: 283946	2016-10-11 21:27:37 +00:00
Reid Kleckner	bdfc05ff93	Re-land "[Thumb] Save/restore high registers in Thumb1 pro/epilogues" Reverts r283938 to reinstate r283867 with a fix. The original change had an ArrayRef referring to a destroyed temporary initializer list. Use plain C arrays instead. llvm-svn: 283942	2016-10-11 21:14:03 +00:00
Kevin Enderby	68fffa8a62	Next set of additional error checks for invalid Mach-O files for the load commands that uses the MachO::linker_option_command type but not used in llvm libObject code but used in llvm tool code. This includes just LC_LINKER_OPTION load command. llvm-svn: 283939	2016-10-11 21:04:39 +00:00
Reid Kleckner	f4876beb2b	Revert "[Thumb] Save/restore high registers in Thumb1 pro/epilogues" This reverts r283867. This appears to be an infinite loop: while (HiRegToSave != AllHighRegs.end() && CopyReg != AllCopyRegs.end()) { if (HiRegsToSave.count(*HiRegToSave)) { ... CopyReg = findNextOrderedReg(++CopyReg, CopyRegs, AllCopyRegs.end()); HiRegToSave = findNextOrderedReg(++HiRegToSave, HiRegsToSave, AllHighRegs.end()); } } llvm-svn: 283938	2016-10-11 20:54:41 +00:00
Tim Northover	3d38b3a4d1	GlobalISel: support selection of extend operations. Patch mostly by Ahmed Bougaca. llvm-svn: 283937	2016-10-11 20:50:21 +00:00
Tim Northover	60cf6fc58f	MIRParser: allow types on registers with a RegBank. This fixes some GlobalISel regression tests. llvm-svn: 283936	2016-10-11 20:50:04 +00:00
Kyle Butt	0846e56e63	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283934	2016-10-11 20:36:43 +00:00
Reid Kleckner	5d0bc63d91	Avoid braced initialization for default member initializers for MSVC 2013 llvm-svn: 283928	2016-10-11 20:02:57 +00:00
Arnold Schwaighofer	9103e268cf	Silence -Wunused-but-set-variable warning llvm-svn: 283927	2016-10-11 19:49:29 +00:00
Rui Ueyama	f9904043ca	Re-submit r283823: Define DbiStreamBuilder::addDbgStream to add stream. The previous commit was failing because we filled empty slots of the debug stream index with kInvalidStreamIndex. It should've been 0. llvm-svn: 283925	2016-10-11 19:43:12 +00:00
Kostya Serebryany	4d25ad93f3	[sanitizer-coverage] use private linkage for coverage guards, delete old commented-out code. llvm-svn: 283924	2016-10-11 19:36:50 +00:00
Rui Ueyama	888de9b031	Fix build error on LP64 platforms. llvm-svn: 283922	2016-10-11 19:28:56 +00:00
Zachary Turner	733be51dcd	[raw_ostream] Raise some helper functions out of raw_ostream. Low level functionality to format numbers were embedded in the implementation of raw_ostream. I have need to use these through an interface other than the overloaded stream operators, so they need to be raised to a level that they can be used from either raw_ostream operators or other code. llvm-svn: 283921	2016-10-11 19:24:45 +00:00
Konstantin Zhuravlyov	cdd4547607	[AMDGPU] Refactor waitcnt encoding - Refactor bit packing/unpacking - Calculate bit mask given bit shift and bit width - Introduce function for decoding bits of waitcnt - Introduce function for encoding bits of waitcnt - Introduce function for getting waitcnt mask (instead of using bare numbers) - Introduce function fot getting max waitcnt(s) (instead of using bare numbers) Differential Revision: https://reviews.llvm.org/D25298 llvm-svn: 283919	2016-10-11 18:58:22 +00:00
Dehao Chen	e9d075233a	Allow Switch instruction to have extractProfTotalWeight called as it can terminate a basic block. (NFC) llvm-svn: 283918	2016-10-11 18:53:00 +00:00
Mehdi Amini	caec5559cb	Fix "static initialization order fiasco" for the XCore Target. I fixed all the other Targets in r283702, and interestingly the sanitizers are only now "sometimes" catching this bug on the only one I missed. llvm-svn: 283914	2016-10-11 18:22:41 +00:00
Zachary Turner	76e007e7e3	[Support] Fix undefined behavior in RandomNumberGenerator. This has existed pretty much forever AFAICT, but the code was never being exercised because nobody was using the class. A user of this class surfaced, and now we're breaking with UB. The code was obviously wrong, so it's fixed here. llvm-svn: 283912	2016-10-11 18:17:26 +00:00
NAKAMURA Takumi	e9587bd771	ARMMachineFunctionInfo.cpp: Add an initializer of ARMFunctionInfo::ReturnRegsCount in the explicit ctor. It caused crash since r283867. llvm-svn: 283909	2016-10-11 17:38:30 +00:00
NAKAMURA Takumi	e61de02016	Reformat. llvm-svn: 283908	2016-10-11 17:38:25 +00:00
Sanjay Patel	8253e15ef3	[DAG] add fold for masked negated sign-extended bool This enhances the fold added with: https://reviews.llvm.org/rL283900 llvm-svn: 283905	2016-10-11 17:05:52 +00:00
Sanjay Patel	8384703d9b	[DAG] add fold for masked negated extended bool The non-obvious motivation for adding this fold (which already happens in InstCombine) is that we want to canonicalize IR towards select instructions and canonicalize DAG nodes towards boolean math. So we need to recreate some folds in the DAG to handle that change in direction. An interesting implementation difference for cases like this is that InstCombine generally works top-down while the DAG goes bottom-up. That means we need to detect different patterns. In this case, the SimplifyDemandedBits fold prevents us from performing a zext to sext fold that would then be recognized as a negation of a sext. llvm-svn: 283900	2016-10-11 16:26:36 +00:00
Daniel Jasper	b617286089	Silence unused warning in non-assert builds. llvm-svn: 283899	2016-10-11 16:22:36 +00:00
Changpeng Fang	98317d20f4	AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11. Differential Revision: http://reviews.llvm.org/D25454 Reviewers: tstellarAMD llvm-svn: 283893	2016-10-11 16:00:47 +00:00
Zachary Turner	79f3333d3f	[cl] Don't print subcommand help when no subcommands present. Previously we would print USAGE: <exe> [subcommand] [options] Even if no subcommands were present. This changes the output format to only print "[subcommand]" if there is at least one subcommand. Fixes llvm.org/pr30598 Patch by Serge Guelton llvm-svn: 283892	2016-10-11 15:58:48 +00:00
Sanjay Patel	38a42e4bfa	[DAG] simplify logic; NFC llvm-svn: 283885	2016-10-11 14:14:30 +00:00
Sanjay Patel	907ae69125	[DAG] hoist DL(N) and fix formatting; NFC llvm-svn: 283884	2016-10-11 14:04:24 +00:00
Sanjay Patel	9609f3d6c7	[DAG] fix formatting; NFC llvm-svn: 283878	2016-10-11 13:47:43 +00:00
Igor Laevsky	04423cf785	[LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm For each block check that it doesn't have any uses outside of it's innermost loop. Differential Revision: https://reviews.llvm.org/D25364 llvm-svn: 283877	2016-10-11 13:37:22 +00:00
Oliver Stannard	d2083fb356	[Thumb] Save/restore high registers in Thumb1 pro/epilogues The high registers are not allocatable in Thumb1 functions, but they could still be used by inline assembly, so we need to save and restore the callee-saved high registers (r8-r11) in the prologue and epilogue. This is complicated by the fact that the Thumb1 push and pop instructions cannot access these registers. Therefore, we have to move them down into low registers before pushing, and move them back after popping into low registers. In most functions, we will have low registers that are also being pushed/popped, which we can use as the temporary registers for saving/restoring the high registers. However, this is not guaranteed, so we may need to push some extra low registers to ensure that the high registers can be saved/restored. For correctness, it would be sufficient to use just one low register, but if we have enough low registers available then we only need one push/pop instruction, rather than one per high register. We can also use the argument/return registers when they are not live, and the link register when saving (but not restoring), reducing the number of extra registers we need to push. There are still a few extreme edge cases where we need two push/pop instructions, because not enough low registers can be made live in the prologue or epilogue. In addition to the regression tests included here, I've also tested this using a script to generate functions which clobber different combinations of registers, have different numbers of argument and return registers (including variadic arguments), allocate different fixed sized objects on the stack, and do or don't use variable sized allocas and the __builtin_return_address intrinsic (all of which affect the available registers in the prologue and epilogue). I ran these functions in a test harness which verifies that all of the callee-saved registers are correctly preserved. Differential Revision: https://reviews.llvm.org/D24228 llvm-svn: 283867	2016-10-11 10:12:25 +00:00
Oliver Stannard	50a74393c2	[ARM] Fix registers clobbered by SjLj EH on soft-float targets Currently, the Int_eh_sjlj_dispatchsetup intrinsic is marked as clobbering all registers, including floating-point registers that may not be present on the target. This is technically true, as we could get linked against code that does use the FP registers, but that will not actually work, as the soft-float code cannot save and restore the FP registers. SjLj exception handling can only work correctly if either all or none of the code is built for a target with FP registers. Therefore, we can assume that, when Int_eh_sjlj_dispatchsetup is compiled for a soft-float target, it is only going to be linked against other soft-float code, and so only clobbers the general-purpose registers. This allows us to check that no non-savable registers are clobbered when generating the prologue/epilogue. Differential Revision: https://reviews.llvm.org/D25180 llvm-svn: 283866	2016-10-11 10:06:59 +00:00
Diana Picus	c93518db8c	[AArch64] Allow label arithmetic with add/sub/cmp Allow instructions such as 'cmp w0, #(end - start)' by folding the expression into a constant. For ELF, we fold only if the symbols are in the same section. For MachO, we fold if the expression contains only symbols that are not linker visible. Fixes https://llvm.org/bugs/show_bug.cgi?id=18920 Differential Revision: https://reviews.llvm.org/D23834 llvm-svn: 283862	2016-10-11 09:17:47 +00:00
Fraser Cormack	48d9fdc1e1	Fix formatting in findRegisterUseOperandIdx. NFC. llvm-svn: 283860	2016-10-11 09:09:21 +00:00
Daniel Jasper	0c42dc4784	Revert "Codegen: Tail-duplicate during placement." This reverts commit r283842. test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our internal testing. I'll share a link with you. llvm-svn: 283857	2016-10-11 07:36:11 +00:00
Mehdi Amini	ea8e9795a5	Make RandomNumberGenerator compatible with <random> LLVM's RandomNumberGenerator wasn't compatible with the random distribution from <random>. Fixes PR25105 Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu> Differential Revision: https://reviews.llvm.org/D25443 llvm-svn: 283854	2016-10-11 07:13:01 +00:00
Dehao Chen	c87bc79d90	Tune isHotFunction/isColdFunction Summary: This patch sets function as hot if function's entry count is hot/cold. Reviewers: eraman, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25048 llvm-svn: 283852	2016-10-11 05:19:00 +00:00
Matthias Braun	e67fc4a328	Fix warning; NFC llvm-svn: 283851	2016-10-11 04:32:03 +00:00
Matthias Braun	3d85ebe5b1	MIRParser: generic register operands with types This should fix the fallout of r283848. llvm-svn: 283850	2016-10-11 04:22:29 +00:00
Matthias Braun	74ad41c7cd	MIRParser: Rewrite register info initialization; mostly NFC This changes MachineRegisterInfo to be initializes after parsing all instructions. This is in preparation for upcoming commits that allow the register class specification on the operand or deduce them from the MCInstrDesc. This commit removes the unused feature of having nonsequential register numbers. This was confusing anyway as the vreg numbers would be different after parsing when you had "holes" in your numbering. This patch also introduces the concept of an incomplete virtual register. An incomplete virtual register may be used during .mir parsing to construct MachineOperands without knowing the exact register class (or register bank) yet. NFC except for some error messages. Differential Revision: https://reviews.llvm.org/D22397 llvm-svn: 283848	2016-10-11 03:13:01 +00:00
Kyle Butt	ae068a320c	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283842	2016-10-11 01:20:33 +00:00
Kostya Serebryany	d19919a80e	[libFuzzer] implement value profile for switch, increase the size of the PCs array, make sure we don't overflow it llvm-svn: 283841	2016-10-11 01:14:41 +00:00
Kostya Serebryany	3e0e901a18	[libFuzzer] add switch tests llvm-svn: 283840	2016-10-11 01:13:32 +00:00
Dylan McKay	c328fe5af4	[RegAllocGreedy] Attempt to split unspillable live intervals Summary: Previously, when allocating unspillable live ranges, we would never attempt to split. We would always bail out and try last ditch graph recoloring. This patch changes this by attempting to split all live intervals before performing recoloring. This fixes LLVM bug PR14879. I can't add test cases for any backends other than AVR because none of them have small enough register classes to trigger the bug. Reviewers: qcolombet Subscribers: MatzeB Differential Revision: https://reviews.llvm.org/D25070 llvm-svn: 283838	2016-10-11 01:04:36 +00:00
David Majnemer	80dca0c78f	[InstCombine] Transform !range metadata to !nonnull when combining loads When combining an integer load with !range metadata that does not include 0 to a pointer load, make sure emit !nonnull metadata on the newly-created pointer load. This prevents the !nonnull metadata from being dropped during a ptrtoint/inttoptr pair. This fixes PR30597. Patch by Ariel Ben-Yehuda! Differential Revision: https://reviews.llvm.org/D25215 llvm-svn: 283836	2016-10-11 01:00:45 +00:00
Quentin Colombet	d2623f8e38	[AArch64][InstructionSelector] Teach how to select FP load/store. This patch allows to select 32 and 64-bit FP load and store. llvm-svn: 283832	2016-10-11 00:21:14 +00:00
Quentin Colombet	0e5312787e	[AArch64][InstructionSelector] Teach the selector how to handle vector OR. This only adds the support for 64-bit vector OR. Adding more sizes is not difficult, but it requires a bigger refactoring because ORs work on any size, not necessarly the ones that match the width of the register width. Right now, this is not expressed in the legalization, so don't bother pushing the refactoring yet. llvm-svn: 283831	2016-10-11 00:21:11 +00:00
Quentin Colombet	d3126d5fb4	[AArch64][MachineLegalizer] Mark v2s32 G_LOAD as legal. Actually every 64-bit loads are legal, but right now the API does not offer a simple way to express that. llvm-svn: 283829	2016-10-11 00:21:08 +00:00
Rui Ueyama	8af4988f35	Revert r283824 and r283823: Define DbiStreamBuilder::addDbgStream to add stream. This reverts commit r283824 and r283823 to fix buildbots. llvm-svn: 283828	2016-10-11 00:15:50 +00:00
Rui Ueyama	914eef6a64	Fix a bug in DbiStreamBuilder::addDbgStream. This feature will be tested in LLD unit tests. llvm-svn: 283824	2016-10-10 23:44:04 +00:00
Rui Ueyama	70edd9e41d	Define DbiStreamBuilder::addDbgStream to add stream. Previously, there is no way to create a stream other than pre-defined special stream such as DBI or IPI. This patch adds a new method, addDbgStream, to add a debug stream to a PDB file. Differential Revision: https://reviews.llvm.org/D25356 llvm-svn: 283823	2016-10-10 23:35:36 +00:00
Peter Collingbourne	0da86301ad	Revert r283690, "MC: Remove unused entities." llvm-svn: 283814	2016-10-10 22:49:37 +00:00
Tim Northover	bdf1624367	GlobalISel: select G_GLOBAL_VALUE uses on AArch64. llvm-svn: 283809	2016-10-10 21:50:00 +00:00
Tim Northover	ad0acca544	GlobalISel: allow G_GLOBAL_VALUEs in AArch64 legalization. llvm-svn: 283808	2016-10-10 21:49:53 +00:00
Tim Northover	2fda4b08ae	GlobalISel: support selecting G_GEP instructions. They're basically just an alias for G_ADD on AArch64. llvm-svn: 283807	2016-10-10 21:49:49 +00:00
Tim Northover	4edc60d785	GlobalISel: support selecting constants on AArch64. llvm-svn: 283806	2016-10-10 21:49:42 +00:00
Dehao Chen	84287abf43	Rename isHotFunction/isColdFunction to isFunctionEntryHot/isFunctionEntryCold. (NFC) This is in preparation for https://reviews.llvm.org/D25048 llvm-svn: 283805	2016-10-10 21:47:28 +00:00
Hal Finkel	fcd2421667	[SelectionDAGBuilder] Support llvm.flt.rounds on targets where i32 is not legal Add integer expansion for FLT_ROUNDS_ for targets where i32 is not a legal type. Patch by Edward Jones, thanks! Differential Revision: https://reviews.llvm.org/D24459 llvm-svn: 283797	2016-10-10 20:45:15 +00:00
Adrian Prantl	3bfe1093df	Teach llvm::StripDebugInfo() about global variable !dbg attachments. This is a regression introduced by the global variable ownership reversal performed in r281284. rdar://problem/28448075 llvm-svn: 283784	2016-10-10 17:53:33 +00:00
Justin Lebar	611c5c225a	Use unique_ptr in LLVMContextImpl's constant maps. Reviewers: timshen Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D25419 llvm-svn: 283767	2016-10-10 16:26:13 +00:00
Alexandros Lamprineas	20e9ddba73	[ARM] Fix invalid VLDM/VSTM access when targeting Big Endian with NEON The instructions VLDM/VSTM can only access word-aligned memory locations and produce alignment fault if the condition is not met. The compiler currently generates VLDM/VSTM for v2f64 load/store regardless the alignment of the memory access. Instead, if a v2f64 load/store is not word-aligned, the compiler should generate VLD1/VST1. For each non double-word-aligned VLD1/VST1, a VREV instruction should be generated when targeting Big Endian. Differential Revision: https://reviews.llvm.org/D25281 llvm-svn: 283763	2016-10-10 16:01:54 +00:00
Nirav Dave	f43cc9f8b5	Add return type for checkForValidSection parsing function. NFC Intended. llvm-svn: 283761	2016-10-10 15:24:54 +00:00
Zvi Rackover	2a21f125bd	[X86] Prefer rotate by 1 over rotate by imm Summary: Rotate by 1 is translated to 1 micro-op, while rotate with imm8 is translated to 2 micro-ops. Fixes pr30644. Reviewers: delena, igorb, craig.topper, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D25399 llvm-svn: 283758	2016-10-10 14:43:55 +00:00
Chris Dewhurst	850131213f	This pass, fixing an erratum in some LEON 2 processors ensures that the SDIV instruction is not issued, but replaced by SDIVcc instead, which does not exhibit the error. Unit test included. Differential Review: https://reviews.llvm.org/D24660 llvm-svn: 283727	2016-10-10 08:53:06 +00:00
Daniel Jasper	0dea246b4f	Fix WebAssembly build after r283702. llvm-svn: 283723	2016-10-10 06:49:55 +00:00
Craig Topper	9ece2f7529	[AVX-512] Add missing pattern sext or zext from bytes to quad words with a 128-bit load as input. llvm-svn: 283720	2016-10-10 06:25:48 +00:00
Michael Zuckerman	3eeac2d56b	[x86][inline-asm][llvm] accept 'v' constraint Commit in the name of:Coby Tayree 1.'v' constraint for (x86) non-avx arch imitates the already implemented 'x' constraint, i.e. allows XMM{0-15} & YMM{0-15} depending on the apparent arch & mode (32/64). 2.for the avx512 arch it allows [X,Y,Z]MM{0-31} (mode dependent) This patch applies the needed changes to clang clang patch: https://reviews.llvm.org/D25004 Differential Revision: D25005 llvm-svn: 283717	2016-10-10 05:48:56 +00:00
Dylan McKay	1a523767dc	[AVR] Enable generation of the TableGen assembly writer tables This also changes the order of the statements in CMakeLists.txt to be alphabetical. llvm-svn: 283711	2016-10-10 01:28:45 +00:00
Craig Topper	64378f4378	[AVX-512] Port 128 and 256-bit memory->register sign/zero extend patterns from SSE file. Also add a minimal set for 512-bit. llvm-svn: 283704	2016-10-09 23:08:39 +00:00
Craig Topper	29558b8284	[X86] Remove redundant patterns. The same pattern appears a few lines up. llvm-svn: 283703	2016-10-09 23:08:33 +00:00
Mehdi Amini	f42454b94b	Move the global variables representing each Target behind accessor function This avoids "static initialization order fiasco" Differential Revision: https://reviews.llvm.org/D25412 llvm-svn: 283702	2016-10-09 23:00:34 +00:00
Elena Demikhovsky	5b10aa1f1e	DAG: Setting Masked-Expand-Load as a variant of Masked-Load node Masked-expand-load node represents load operation that loads a variable amount of elements from memory according to amount of "true" bits in the mask and expands the loaded elements according to their position in the mask vector. Right now, the node is used in intrinsics for VEXPAND* instructions. The work is done towards implementation of masked.expandload and masked.compressstore intrinsics. Differential Revision: https://reviews.llvm.org/D25322 llvm-svn: 283694	2016-10-09 10:48:52 +00:00
Craig Topper	43973154dd	[AVX-512] Fix execution domain for EVEX encoded VINSERTPS. llvm-svn: 283692	2016-10-09 06:41:47 +00:00
Peter Collingbourne	cc723cccab	MC: Remove unused entities. llvm-svn: 283691	2016-10-09 04:39:13 +00:00
Peter Collingbourne	5c924d7117	Target: Remove unused entities. llvm-svn: 283690	2016-10-09 04:38:57 +00:00
Craig Topper	e30cb00dc0	[AVX-512] Add subvector insert and extract to load/store folding tables. llvm-svn: 283689	2016-10-09 03:54:13 +00:00
Craig Topper	4262d53024	[AVX-512] Add the vector down convert instructions to the store folding tables. llvm-svn: 283687	2016-10-09 03:54:05 +00:00
Kostya Serebryany	7abb95d3b3	[libFuzzer] make a test less flaky llvm-svn: 283686	2016-10-09 03:45:38 +00:00
Kostya Serebryany	c5325ed29d	[libFuzzer] when shrinking the corpus, delete evicted files previously created by the current process llvm-svn: 283682	2016-10-08 23:24:45 +00:00
Kostya Serebryany	9adc7c8b4a	[libFuzzer] control the reload interval by a flag, make it 10 seconds by default llvm-svn: 283676	2016-10-08 22:12:14 +00:00
Kostya Serebryany	cd04ec25dd	[libFuzzer] fix use-after-free in libFuzzer found by ... fuzzing. llvm-svn: 283675	2016-10-08 21:57:48 +00:00
Mehdi Amini	732afdd09a	Turn cl::values() (for enum) from a vararg function to using C++ variadic template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671	2016-10-08 19:41:06 +00:00
Craig Topper	086f0c1401	[AVX-512] Fix a bug in getLargestLegalSuperClass where we inflated to VR128X/VR256X even when VLX isn't supported. This seems to have been responsible for the XMM16-31 spills observed in PR29112. With this fixed the test case has been modified to no longer have a spill of XMM16. llvm-svn: 283668	2016-10-08 18:49:57 +00:00
Colin LeMahieu	c69f7ff6c0	[Hexagon] Adding change of flow max 1 (cofMax1) TS flag for marking this restriction rather than implying it from TypeJR. llvm-svn: 283665	2016-10-08 17:18:51 +00:00
Teresa Johnson	897bab9b35	[ThinLTO] Record calls to aliases Summary: When there is a call to an alias in the same module, we were not adding a call edge. So we could incorrectly think that the alias was dead if it was inlined in that function, despite having a reference imported elsewhere. This resulted in unsats at link time. Add a call edge when the call is to an alias. Reviewers: davide, mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25384 llvm-svn: 283664	2016-10-08 16:11:42 +00:00
Sebastian Pop	eb65d72d9c	[AArch64] Avoid generating indexed vector instructions for Exynos Avoid generating indexed vector instructions for Exynos. This is needed for fmla/fmls/fmul/fmulx. For example, the instruction fmla v0.4s, v1.4s, v2.s[1] is less efficient than the instructions dup v2.4s, v2.s[1] fmla v0.4s, v1.4s, v2.4s Patch written by Abderrazek Zaafrani. Differential Revision: https://reviews.llvm.org/D21571 llvm-svn: 283663	2016-10-08 12:30:07 +00:00
Adam Nemet	ee5cf031ce	[OptRemarks] Remove non-printable chars from function name Value names may be prefixed with a binary '1' to indicate that the backend should not modify the symbols due to any platform naming convention. This should not show up in the YAML opt record file because it breaks the YAML parser. llvm-svn: 283656	2016-10-08 04:47:20 +00:00
Mehdi Amini	f82bda0a7a	ThinLTO: don't perform incremental LTO on module without a hash Clang always emit a hash for ThinLTO, but as other frontend are starting to use ThinLTO, this could be a serious bug. Differential Revision: https://reviews.llvm.org/D25379 llvm-svn: 283655	2016-10-08 04:44:23 +00:00
Mehdi Amini	00fa1409ec	ThinLTO: handles modules with empty summaries We need to add an entry in the combined-index for modules that have a hash but otherwise empty summary, this is needed so that we can get the hash for the module. Also, if no entry is present in the combined index for a module, we need to skip it when trying to compute a cache entry. Differential Revision: https://reviews.llvm.org/D25300 llvm-svn: 283654	2016-10-08 04:44:18 +00:00
Kyle Butt	2facd194a2	Revert "Codegen: Tail-duplicate during placement." This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4. llvm-svn: 283647	2016-10-08 01:47:05 +00:00
Dylan McKay	f96ffe1ebf	[AVR] Add backend dependencies to MCTargetDesc/LLVMBuild.txt llvm-svn: 283642	2016-10-08 01:14:23 +00:00
Zachary Turner	3b14764ce5	[pdb] Dump Module Symbols to Yaml. This is the first step towards round-tripping symbol information, and thusly being able to write symbol information to a PDB. This patch writes the symbol information for each compiland to the Yaml when running in pdb2yaml mode. There's still some loose ends, such as what to do about relocations (necessary in order to print linkage names), how to print enums with friendly names, and how to give the dumper access to the StringTable, but this is a good first start. llvm-svn: 283641	2016-10-08 01:12:01 +00:00
Dylan McKay	552b7856d3	Fix incorrect assertion in AVRFrameLowering.cpp This wasn't looking at the right instruction, and would always fail. llvm-svn: 283640	2016-10-08 01:10:36 +00:00
Dylan McKay	b16b6d5739	[AVR] Don't worry about call frame size when initializing frame pointer We previously only used the frame pointer if the frame pointer was too big. This was to work around a bug (described in this old commit) https://sourceforge.net/p/avr-llvm/code/204/tree//llvm/trunk/AVR/AVRFrameLowering.cpp?diff=50d64d912718465cb887d17a:203 I mistakenly invered the condition assuming it was a typo. I am now removing it because it doesn't seem to be a problem anymore (plus it's a dirty hack). llvm-svn: 283639	2016-10-08 01:10:31 +00:00
Dylan McKay	7c2d41aa9f	[AVR] Don't shadow container while iterating in range-based loop This works on clang, but fails on GCC 4.6 llvm-svn: 283638	2016-10-08 01:09:06 +00:00
Dylan McKay	a1a944e3cb	[AVR] Use references rather than pointers in AVRISelLowering llvm-svn: 283636	2016-10-08 01:06:21 +00:00
Dylan McKay	12109e7314	Allow a maximum of 64 bits to be returned in registers The rest spills to the stack Authored by Jake Goulding llvm-svn: 283635	2016-10-08 01:05:09 +00:00
Dylan McKay	c1ff65cf62	[AVR] Expand MULHS for all types Once MULHS was expanded, this exposed an issue where the condition register was thought to be 16-bit. This caused an attempt to copy a 16-bit register to an 8-bit register. Authored by Jake Goulding llvm-svn: 283634	2016-10-08 01:01:49 +00:00
Dylan McKay	ddb7a59fe9	[AVR] Add the 'SoftFail' field to all instruction formats This will be used in the future for disassembly. llvm-svn: 283630	2016-10-08 00:55:46 +00:00
Dylan McKay	24d02ee141	[AVR] Set up the instruction printer and the assembly backend llvm-svn: 283629	2016-10-08 00:50:11 +00:00
Dylan McKay	2b0936d41d	[AVR] Add dependencies to AVR libraries in AVRCodeGen llvm-svn: 283628	2016-10-08 00:45:24 +00:00
Dylan McKay	07897f5492	[AVR] Add missing subdirectories to LLVMBuild llvm-svn: 283627	2016-10-08 00:42:58 +00:00
Gor Nishanov	1b6aec8e25	[coroutines] Store an address of destroy OR cleanup part in the coroutine frame. Summary: If heap allocation of a coroutine is elided, we need to make sure that we will update an address stored in the coroutine frame from f.destroy to f.cleanup. Before this change, CoroSplit synthesized these stores after coro.begin: ``` store void (%f.Frame) @f.resume, void (%f.Frame)* %resume.addr store void (%f.Frame) @f.destroy, void (%f.Frame)* %destroy.addr ``` In those cases where we did heap elision, but were not able to devirtualize all indirect calls, destroy call will attempt to "free" the coroutine frame stored on the stack. Oops. Now we use select to put an appropriate coroutine subfunction in the destroy slot. As bellow: ``` store void (%f.Frame) @f.resume, void (%f.Frame)* %resume.addr %0 = select i1 %need.alloc, void (%f.Frame) @f.destroy, void (%f.Frame) @f.cleanup store void (%f.Frame) %0, void (%f.Frame)* %destroy.addr ``` Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25377 llvm-svn: 283625	2016-10-08 00:22:50 +00:00
Dylan McKay	4d82df32b9	[AVR] Add the assembly printer Summary: This adds the AVRAsmPrinter class. Reviewers: arsenm, kparzysz Subscribers: llvm-commits, wdng, beanz, japaric, mgorny Differential Revision: https://reviews.llvm.org/D25271 llvm-svn: 283623	2016-10-08 00:02:36 +00:00
Tom Stellard	5ab6154dc3	AMDGPU/SI: Handle div_fmas hazard in GCNHazardRecognizer Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25250 llvm-svn: 283622	2016-10-07 23:42:48 +00:00
Kyle Butt	37e676d857	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283619	2016-10-07 22:33:20 +00:00
Arnold Schwaighofer	3f25658143	swifterror: Don't compute swifterror vregs during instruction selection The code used llvm basic block predecessors to decided where to insert phi nodes. Instruction selection can and will liberally insert new machine basic block predecessors. There is not a guaranteed one-to-one mapping from pred. llvm basic blocks and machine basic blocks. Therefore the current approach does not work as it assumes we can mark predecessor machine basic block as needing a copy, and needs to know the set of all predecessor machine basic blocks to decide when to insert phis. Instead of computing the swifterror vregs as we select instructions, propagate them at the end of instruction selection when the MBB CFG is complete. When an instruction needs a swifterror vreg and we don't know the value yet, generate a new vreg and remember this "upward exposed" use, and reconcile this at the end of instruction selection. This will only happen if the target supports promoting swifterror parameters to registers and the swifterror attribute is used. rdar://28300923 llvm-svn: 283617	2016-10-07 22:06:55 +00:00
Sanjay Patel	14c02052d6	[DAG] clean up foldSelectOfConstants(); NFCI Rename variables, simplify logic. Not clear yet why we don't handle a target with ZeroOrNegativeOneBooleanContent too. llvm-svn: 283613	2016-10-07 21:55:42 +00:00
Davide Italiano	f6988d2980	[InstCombine] Don't unpack arrays that are too large (part 2). This is similar to r283599, but for store instructions. Thanks to David for pointing out! llvm-svn: 283612	2016-10-07 21:53:09 +00:00
Zachary Turner	0d8407447d	Refactor Symbol visitor code. Type visitor code had already been refactored previously to decouple the visitor and the visitor callback interface. This was necessary for having the flexibility to visit in different ways (for example, dumping to yaml, reading from yaml, dumping to ScopedPrinter, etc). This patch merely implements the same visitation pattern for symbol records that has already been implemented for type records. llvm-svn: 283609	2016-10-07 21:34:46 +00:00
Davide Italiano	da11412243	[InstCombine] Don't unpack arrays that are too large Differential Revision: https://reviews.llvm.org/D25376 llvm-svn: 283599	2016-10-07 20:57:42 +00:00
Sanjay Patel	ecaf343fe7	[DAG] move fold (select C, 0, 1 -> xor C, 1) to a helper function; NFC We're missing at least 3 other similar folds based on what we have in InstCombine. llvm-svn: 283596	2016-10-07 20:47:51 +00:00
Tom Stellard	6982bb8f25	AMDGPU/SI: Add support for 8-byte relocations Reviewers: arsenm, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25375 llvm-svn: 283593	2016-10-07 20:36:58 +00:00
Colin LeMahieu	9694825d32	[Hexagon][NFC] Using documented instruction type name V4LDST instead of MEMOP. llvm-svn: 283582	2016-10-07 19:11:28 +00:00
Mehdi Amini	dc5a507c92	Recommit "Use StringRef in LTOModule implementation (NFC)"" This reverts commit r283456 and reapply r282997, with explicitly zeroing the struct member to workaround a bug in MSVC2013 with zero-initialization: https://connect.microsoft.com/VisualStudio/feedback/details/802160 llvm-svn: 283581	2016-10-07 19:05:14 +00:00
Davide Italiano	c0169fa94f	[LoopIdiomRecognize] Merge two if conditions into one. NFCI. llvm-svn: 283579	2016-10-07 18:39:43 +00:00
Sanjay Patel	4326c4ac8f	[InstCombine] fold select X, (ext X), C If we're going to canonicalize IR towards select of constants, try harder to create those. Also, don't lose the metadata. This is actually 4 related transforms in one patch: // select X, (sext X), C --> select X, -1, C // select X, (zext X), C --> select X, 1, C // select X, C, (sext X) --> select X, C, 0 // select X, C, (zext X) --> select X, C, 0 Differential Revision: https://reviews.llvm.org/D25126 llvm-svn: 283575	2016-10-07 17:53:07 +00:00
Tom Stellard	ef33c4b3f2	AMDGPU/SI: Emit fixups for long branches Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25366 llvm-svn: 283570	2016-10-07 16:01:18 +00:00
Artem Tamazov	73f1ab28cd	[AMDGPU][mc] Add support for buffer_load_dwordx3, buffer_store_dwordx3. Partially fixes Bug 28232. Lit tests added. Differential Revision: https://reviews.llvm.org/D25367 llvm-svn: 283567	2016-10-07 15:53:16 +00:00
Dehao Chen	6e0c8446db	Invoke add-discriminator at -g0 -fsample-profile Summary: -fsample-profile needs discriminator, which will not be added if built with -g0. This patch makes sure the discriminator is added for sample-profile at -g0. A followup patch will be send out to update clang tests. Reviewers: davidxl, dblaikie, echristo, dnovillo Subscribers: mehdi_amini, probinson, llvm-commits Differential Revision: https://reviews.llvm.org/D25132 llvm-svn: 283565	2016-10-07 15:21:31 +00:00
Matthew Simpson	a371c14ffe	[LV] Don't mark multi-use branch conditions uniform Previously, we marked the branch conditions of latch blocks uniform after vectorization if they were instructions contained in the loop. However, if a condition instruction has users other than the branch, it may not remain uniform. This patch ensures the conditions we mark uniform are only used by the branch. This should fix PR30627. Reference: https://llvm.org/bugs/show_bug.cgi?id=30627 llvm-svn: 283563	2016-10-07 15:20:13 +00:00
Krzysztof Parzyszek	e513e17b23	Only track physical registers in LivePhysRegs llvm-svn: 283561	2016-10-07 14:50:49 +00:00
Sam Kolton	a3ec5c10e2	[AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx to AMDGPUBaseInfo.h Reviewers: artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25084 llvm-svn: 283560	2016-10-07 14:46:06 +00:00
Konstantin Zhuravlyov	c09e2d7e46	[AMDGPU] AMDGPUCodeGenPrepare: remove extra ';' llvm-svn: 283558	2016-10-07 14:39:53 +00:00
Tom Stellard	17eb3413cd	[ValueTracking] Fix crash in GetPointerBaseWithConstantOffset() Summary: While walking defs of pointer operands we were assuming that the pointer size would remain constant. This is not true, because addresspacecast instructions may cast the pointer to an address space with a different pointer width. This partial reverts r282612, which was a more conservative solution to this problem. Reviewers: reames, sanjoy, apilipenko Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24772 llvm-svn: 283557	2016-10-07 14:23:29 +00:00
Konstantin Zhuravlyov	f74fc60a7d	[AMDGPU] Promote uniform (i1, i16] operations to i32 Differential Revision: https://reviews.llvm.org/D25302 llvm-svn: 283555	2016-10-07 14:22:58 +00:00
Javed Absar	9797989ca7	[ARM]: add missing switch case for cortex-r52 Adds a missing switch case for handling cortex-r52 in init-subtarget-features. llvm-svn: 283551	2016-10-07 13:41:55 +00:00
Martin Storsjo	04864f45b2	[ARM] Reapply: Use __rt_div functions for divrem on Windows Reapplying r283383 after revert in r283442. The additional fix is a getting rid of a stray space in a function name, in the refactoring part of the commit. This avoids falling back to calling out to the GCC rem functions (__moddi3, __umoddi3) when targeting Windows. The __rt_div functions have flipped the two arguments compared to the __aeabi_divmod functions. To match MSVC, we emit a check for division by zero before actually calling the library function (even if the library function itself also might do the same check). Not all calls to __rt_div functions for division are currently merged with calls to the same function with the same parameters for the remainder. This is more wasteful than a div + mls as before, but avoids calls to __moddi3. Differential Revision: https://reviews.llvm.org/D25332 llvm-svn: 283550	2016-10-07 13:28:53 +00:00
Javed Absar	fb4b6e8db9	[ARM]: Add Cortex-R52 target to LLVM This patch adds Cortex-R52, the new ARM real-time processor, to LLVM. Cortex-R52 implements the ARMv8-R architecture. llvm-svn: 283542	2016-10-07 12:06:40 +00:00
Simon Pilgrim	a5d019ee95	[X86][SSE] Update register class during MOVSD/MOVSS - BLENDPD/BLENDPS commutation MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors. This patch inserts a vreg copy mi to ensure that the registers are correct. Fix for PR30607 Differential Revision: https://reviews.llvm.org/D25280 llvm-svn: 283539	2016-10-07 11:18:38 +00:00
Alexey Bataev	6ad5da7c81	[SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling. The next code is not vectorized by the SLPVectorizer: ``` int test(unsigned int *p) { int sum = 0; for (int i = 0; i < 8; i++) sum += p[i]; return sum; } ``` During optimization this loop is fully unrolled and SLPVectorizer is unable to vectorize it. Patch tries to fix this problem. Differential Revision: https://reviews.llvm.org/D24796 llvm-svn: 283535	2016-10-07 09:39:22 +00:00
Oliver Stannard	4df1cc0b00	[ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI With the ROPI and RWPI relocation models we can't always have pointers to global data or functions in constant data, so don't try to convert switches into lookup tables if any value in the lookup table would require a relocation. We can still safely emit lookup tables of other values, such as simple constants. Differential Revision: https://reviews.llvm.org/D24462 llvm-svn: 283530	2016-10-07 08:48:24 +00:00
Mehdi Amini	68c6c8cd78	Use StringRef in ARMELFStreamer (NFC) llvm-svn: 283529	2016-10-07 08:48:07 +00:00
Nicolai Haehnle	87bc4c218b	AMDGPU: Fix use-after-free in SIOptimizeExecMasking Summary: There was a bug with sequences like s_mov_b64 s[0:1], exec s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill> ... s_mov_b64_term exec, s[2:3] because s[2:3] was defined and used in the same instruction, ending up with SaveExecInst inside OtherUseInsts. Note that the test case also exposes an unrelated bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028 Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25306 llvm-svn: 283528	2016-10-07 08:40:14 +00:00
Mehdi Amini	a0016ec95f	Use StringReg in TargetParser APIs (NFC) llvm-svn: 283527	2016-10-07 08:37:29 +00:00
Mehdi Amini	9ff8e87ca4	Revert "Revert "Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe"" This reverts commit r283510 and reapply r283509, with updates to clang-tools-extra as well. llvm-svn: 283525	2016-10-07 08:25:42 +00:00
Craig Topper	948625633f	[X86] Fix patterns for VPMULLD and VPCMPEQQ to not require aligned loads. llvm-svn: 283524	2016-10-07 06:54:43 +00:00
Craig Topper	871da8ebea	[X86] Remove unused PatFrags. NFC llvm-svn: 283523	2016-10-07 06:54:39 +00:00
Dylan McKay	e5d89e8001	[AVR] Add the AVRMCInstLower class Summary: This class deals with the lowering of CodeGen `MachineInstr` objects to MC `MCInst` objects. Reviewers: kparzysz, arsenm Subscribers: wdng, beanz, japaric, mgorny Differential Revision: https://reviews.llvm.org/D25269 llvm-svn: 283522	2016-10-07 06:13:09 +00:00
David Majnemer	8c03c1bade	[SimplifyCFG] Correctly test for unconditional branches in GetCaseResults GetCaseResults assumed that a terminator with one successor was an unconditional branch. This is not necessarily the case, it could be a cleanupret. Strengthen the check by querying whether or not the terminator is exceptional. llvm-svn: 283517	2016-10-07 01:38:35 +00:00
Peter Collingbourne	2261d78cd2	Target: Remove unused patterns and transforms. NFC. llvm-svn: 283515	2016-10-07 00:30:49 +00:00
Colin LeMahieu	8ed1aee9dd	[Hexagon] NFC Removing 'V4_' prefix from duplex instruction names. llvm-svn: 283514	2016-10-07 00:15:07 +00:00
Mehdi Amini	292f376934	Revert "Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe" This reverts commit r283509, clang is hitting the assert. llvm-svn: 283510	2016-10-06 23:41:49 +00:00
Mehdi Amini	a7e893f638	Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe Summary: I had for the second time today a bug where llvm::format("%s", Str) was called with Str being a StringRef. The Linux and MacOS bots were fine, but windows having different calling convention, it printed garbage. Instead we can catch this at compile-time: it is never expected to call a C vararg printf-like function with non scalar type I believe. Reviewers: bogner, Bigcheese, dexonsmith Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25266 llvm-svn: 283509	2016-10-06 23:26:29 +00:00
Colin LeMahieu	9675de5ba8	[Hexagon] NFC. Canonicalizing absolute address instruction names. llvm-svn: 283507	2016-10-06 23:02:11 +00:00
Vedant Kumar	7beb423765	Delete some dead code in SelectionDAG (NFC) Differential Revision: https://reviews.llvm.org/D24435 llvm-svn: 283505	2016-10-06 22:53:43 +00:00
Dan Gohman	2726b88c03	[WebAssemby] Implement block signatures. Per spec changes, this implements block signatures, and adds just enough logic to produce correct block signatures at the ends of functions. Differential Revision: https://reviews.llvm.org/D25144 llvm-svn: 283503	2016-10-06 22:29:32 +00:00
Dan Gohman	3a643e8d46	[WebAssembly] Remove loop's bottom label. Per spec changes, loop constructs no longer have a bottom label. https://reviews.llvm.org/D25118 llvm-svn: 283502	2016-10-06 22:10:23 +00:00
Dan Gohman	7f1bdb2e02	[WebAssembly] Remove the output operand from stores. Per spec changes, store instructions in WebAssembly no longer have a return value. Update the instruction descriptions. Differential Revision: https://reviews.llvm.org/D25122 llvm-svn: 283501	2016-10-06 22:08:28 +00:00
Wolfgang Pieb	e51bede1d8	Preserve the debug location when CodeGenPrepare sinks a compare instruction into the basic block of a user. Patch by Andrea DiBiagio. Differential Revision: https://reviews.llvm.org/D24632 llvm-svn: 283500	2016-10-06 21:43:45 +00:00
Pirama Arumuga Nainar	cc152ac794	Handle *_EXTEND_VECTOR_INREG during Integer Legalization Summary: These nodes need legalization for 3-element vectors. This commit handles the legalization and adds tests for zext and sext. This fixes PR30614. Reviewers: RKSimon, srhines Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25268 llvm-svn: 283496	2016-10-06 21:27:05 +00:00
Rong Xu	0e79f7d11d	[PGO] Create weak alias for the renamed Comdat function Add a weak alias to the renamed Comdat function in IR level instrumentation, using it's original name. This ensures the same behavior w/ and w/o IR instrumentation, even for non standard conforming code. Differential Revision: http://reviews.llvm.org/D25339 llvm-svn: 283490	2016-10-06 20:38:13 +00:00
Michael Kuperstein	e524e22846	[X86] Preserve BasePtr for LEA64_32r When replacing FrameIndex with BasePtr, we must preserve BasePtr for LEA64_32r since BasePtr is used later for stack adjustment if it is the same as StackPtr. Patch by H.J Lu <hjl.tools@gmail.com> Differential Revision: https://reviews.llvm.org/D23575 llvm-svn: 283486	2016-10-06 19:31:27 +00:00
Michael Kuperstein	7cc2123847	[DAG] Generalize build_vector -> vector_shuffle combine for more than 2 inputs This generalizes the build_vector -> vector_shuffle combine to support any number of inputs. The idea is to create a binary tree of shuffles, where the first layer performs pairwise shuffles of the input vectors placing each input element into the correct lane, and the rest of the tree blends these shuffles together. This doesn't try to be smart and create any sort of "optimal" shuffles. The assumption is that even a "poor" shuffle sequence is better than extracting and inserting the elements one by one. Differential Revision: https://reviews.llvm.org/D24683 llvm-svn: 283480	2016-10-06 18:58:24 +00:00
Michael Ilseman	6d6b4d87a3	Revert "Add -strip-nonlinetable-debuginfo capability" This reverts commit r283473. Reverted until review is completed. llvm-svn: 283478	2016-10-06 18:30:26 +00:00
Matt Arsenault	5e63a04e46	AMDGPU: Don't fold undef uses or copies with implicit uses llvm-svn: 283476	2016-10-06 18:12:13 +00:00
Matt Arsenault	c59a92387e	AMDGPU: Remove scheduling info from si_mask_branch llvm-svn: 283475	2016-10-06 18:12:07 +00:00
Michael Ilseman	d0a4db7632	Add -strip-nonlinetable-debuginfo capability This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. llvm-svn: 283473	2016-10-06 17:58:38 +00:00
Matt Arsenault	c2ee42cd16	AMDGPU: Remove leftover implicit operands when folding immediates When constant folding an operation to a copy or an immediate mov, the implicit uses/defs of the old instruction were left behind, e.g. replacing v_or_b32 left the implicit exec use on the new copy. llvm-svn: 283471	2016-10-06 17:54:30 +00:00
Matt Arsenault	11f7402075	Reapply "AMDGPU: Support using tablegened MC pseudo expansions" Fix bad merge llvm-svn: 283470	2016-10-06 17:19:11 +00:00
Matt Arsenault	cbc879ee2f	Revert "AMDGPU: Support using tablegened MC pseudo expansions" llvm-svn: 283469	2016-10-06 17:08:01 +00:00
Matt Arsenault	d20a2dd7ac	AMDGPU: Support using tablegened MC pseudo expansions Make the necessary refactorings to make use of PseudoInstExpansion llvm-svn: 283467	2016-10-06 16:56:41 +00:00
Matt Arsenault	6bc43d8627	BranchRelaxation: Support expanding unconditional branches AMDGPU needs to expand unconditional branches in a new block with an indirect branch. llvm-svn: 283464	2016-10-06 16:20:41 +00:00
Krzysztof Parzyszek	d391d6f1c3	[Hexagon] Avoid replacing full regs with subregisters in tied operands Doing so will result in the two-address pass generating incorrect code. llvm-svn: 283463	2016-10-06 16:18:04 +00:00
Matt Arsenault	ef5bba0136	BranchRelaxation: Account for function alignment llvm-svn: 283462	2016-10-06 16:00:58 +00:00
Matt Arsenault	36919a4f7c	Move AArch64BranchRelaxation to generic code llvm-svn: 283459	2016-10-06 15:38:53 +00:00
Matt Arsenault	0a3ea89e85	AArch64: Move remaining target specific BranchRelaxation bits to TII llvm-svn: 283458	2016-10-06 15:38:09 +00:00
Nirav Dave	ee554e6155	[X86] Fix intel syntax push parsing bug Change erroneous parsing of push immediate instructions in intel syntax to default to pointer size by rewriting into the ATT style for matching. This fixes PR22028. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25288 llvm-svn: 283457	2016-10-06 15:28:08 +00:00
Mehdi Amini	a5ee89863c	Revert "Use StringRef in LTOModule implementation (NFC)" This reverts commit r282997, a windows bot is asserting in one test apparently. llvm-svn: 283456	2016-10-06 15:12:22 +00:00
Sam Kolton	3381d7a216	[AMDGPU] Disassembler: print label names in branch instructions Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table. Initialize MCObjectFileInfo with some default values. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D24802 llvm-svn: 283450	2016-10-06 13:46:08 +00:00
Anna Thomas	488c05763c	[RS4GC] Fix comment to show TODO. NFC llvm-svn: 283449	2016-10-06 13:24:20 +00:00
Krzysztof Parzyszek	459a1c9f2b	[RDF] Replace some expensive copies with references in range-based loops llvm-svn: 283446	2016-10-06 13:05:46 +00:00
Krzysztof Parzyszek	61d9032bf3	[RDF] Replace potentially unclear autos with real types llvm-svn: 283445	2016-10-06 13:05:13 +00:00
Diana Picus	6341e46cd1	Revert "[ARM] Use __rt_div functions for divrem on Windows" This reverts commit r283383 because it broke some of the bots: undefined reference to ` __aeabi_uldivmod' It affected (at least) clang-cmake-armv7-a15-selfhost, clang-cmake-armv7-a15-selfhost and clang-native-arm-lnt. llvm-svn: 283442	2016-10-06 11:24:29 +00:00
Henric Karlsson	54a53bd303	Test commit access (NFC) llvm-svn: 283439	2016-10-06 10:58:41 +00:00
Matt Arsenault	10c17ca6c6	AMDGPU: Partially fix reported code size for some instructions These ones need to have the size on the pseudo instruction set for getInstSizeInBytes to work correctly. These also have a statically known size. llvm-svn: 283437	2016-10-06 10:13:23 +00:00
Bjorn Pettersson	3961603921	[ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement. Summary: The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24955 llvm-svn: 283434	2016-10-06 09:56:21 +00:00
Sagar Thakur	f9292220dc	[EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address. Adding 40-bit shadow memory parameters because MIPS64 uses 40-bit virtual memory addresses. Reviewed by rengolin. Differential: https://reviews.llvm.org/D23801 llvm-svn: 283433	2016-10-06 09:52:06 +00:00
Nuno Lopes	d3f5af0fe4	fix build on cygwin Cygwin has dlfcn.h, but no Dl_info llvm-svn: 283427	2016-10-06 09:32:16 +00:00
James Molloy	6215fad0e9	[ARM] Constant pool promotion - fix alignment calculation Global variables are GlobalValues, so they have explicit alignment. Querying DataLayout for the alignment was incorrect. Testcase added. llvm-svn: 283423	2016-10-06 07:56:00 +00:00
Petr Hosek	e023d62e76	[Triple] Add triple for Fuchsia Fuchsia is a new operating system. Differential Revision: https://reviews.llvm.org/D25116 llvm-svn: 283419	2016-10-06 05:17:26 +00:00
Kostya Serebryany	936b1e774f	[libFuzzer] be more careful with memory usage, print peak rss in status lines llvm-svn: 283418	2016-10-06 05:14:00 +00:00
Konstantin Zhuravlyov	b4eb5d5049	[AMDGPU] Promote uniform i16 bitreverse intrinsic to i32 Differential Revision: https://reviews.llvm.org/D25121 llvm-svn: 283415	2016-10-06 02:20:46 +00:00
Kostya Serebryany	3b564e9765	[libFuzzer] when re-running for lsan, don't look at the coverage llvm-svn: 283411	2016-10-05 23:31:01 +00:00
Kostya Serebryany	1c73f1bf27	[libFuzzer] refactoring to make -shrink=1 work for value profile, added a test. llvm-svn: 283409	2016-10-05 22:56:21 +00:00
Reid Kleckner	bb96df602e	[codeview] Truncate records to maximum record size near 64KB If we don't truncate, LLVM asserts when the label difference doesn't fit in a 16 bit field. This patch truncates two kinds of data: trailing null terminated names in symbol records, and inline line tables. The inline line table test that I have is too large (many MB), so I'm not checking it in. Hopefully fixes PR28264. llvm-svn: 283403	2016-10-05 22:36:07 +00:00
Adrian Prantl	b3510afcd1	Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace. This came out of a discussion in https://reviews.llvm.org/D25285. There used to be various other llvm.dbg.* nodes, but we don't support upgrading them and we want to reserve the namespace for future uses. This also removes an entirely obsolete and bitrotted testcase for PR7662. Reapplies 283390 with a forgotten testcase. llvm-svn: 283400	2016-10-05 22:15:37 +00:00
Adrian Prantl	497f085475	Revert "Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace." Forgot to add a testcase in r283390. llvm-svn: 283399	2016-10-05 22:15:34 +00:00
David Callahan	c1051ab26e	Modify df_iterator to support post-order actions Summary: This makes a change to the state used to maintain visited information for depth first iterator. We know assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases, this method does nothing so this patch has no functional changes. It will however allow a client to distinguish back from cross edges in a DFS tree. Reviewers: nadav, mehdi_amini, dberlin Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits Differential Revision: https://reviews.llvm.org/D25191 llvm-svn: 283391	2016-10-05 21:36:16 +00:00
Adrian Prantl	71bba7253e	Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace. This came out of a discussion in https://reviews.llvm.org/D25285. There used to be various other llvm.dbg.* nodes, but we don't support upgrading them and we want to reserve the namespace for future uses. This also removes an entirely obsolete and bitrotted testcase for PR7662. llvm-svn: 283390	2016-10-05 21:31:19 +00:00
Dan Gohman	5a68ec7f09	[WebAssembly] Add binary-encoding opcode values to instruction descriptions. llvm-svn: 283389	2016-10-05 21:24:08 +00:00
Reid Kleckner	2b3e6428e5	[codeview] Translate bitpiece metadata to DEFRANGE_SUBFIELD* records This allows LLVM to describe locations of aggregate variables that have been split by SROA. Fixes PR29141 Reviewers: amccarth, majnemer Differential Revision: https://reviews.llvm.org/D25253 llvm-svn: 283388	2016-10-05 21:21:33 +00:00
Lang Hames	a5e873e2a1	[Object] Fix a crash in Archive::child_iterator's default constructor. To be default constructible, Archive::child_iterator needs to be able to construct an Archive::Child with a null parent, however Archive::Child's constructor always dereferenced its Parent argument to compute the remaining archive size. This commit fixes Archive::Child's constructor to only do the size calculation when the parent is non-null. llvm-svn: 283387	2016-10-05 21:20:00 +00:00
Martin Storsjo	f997759aef	[ARM] Use __rt_div functions for divrem on Windows This avoids falling back to calling out to the GCC rem functions (__moddi3, __umoddi3) when targeting Windows. The __rt_div functions have flipped the two arguments compared to the __aeabi_divmod functions. To match MSVC, we emit a check for division by zero before actually calling the library function (even if the library function itself also might do the same check). Not all calls to __rt_div functions for division are currently merged with calls to the same function with the same parameters for the remainder. This is more wasteful than a div + mls as before, but avoids calls to __moddi3. Differential Revision: https://reviews.llvm.org/D24076 llvm-svn: 283383	2016-10-05 21:08:02 +00:00
James Y Knight	b0a473aaf8	[Sparc] Implement UMUL_LOHI and SMUL_LOHI instead of MULHS/MULHU/MUL. This is what the instruction-set actually provides, and the default expansions of the others into the lohi opcodes are good. llvm-svn: 283381	2016-10-05 20:54:17 +00:00
Anna Zaks	9a6a6eff0e	[asan] Reapply: Switch to using dynamic shadow offset on iOS The VM layout is not stable between iOS version releases, so switch to dynamic shadow offset. This is the LLVM counterpart of https://reviews.llvm.org/D25218 Differential Revision: https://reviews.llvm.org/D25219 llvm-svn: 283376	2016-10-05 20:34:13 +00:00
Matthew Simpson	a58c50dff0	[LV] Pass profitability analysis in vectorizer constructor (NFC) The vectorizer already holds a pointer to one cost model artifact in a member variable (i.e., MinBWs). As we add more, it will be easier to communicate these artifacts to the vectorizer if we simply pass a pointer to the cost model instead. llvm-svn: 283373	2016-10-05 20:23:46 +00:00
Krzysztof Parzyszek	3b6cbd55f7	[RDF] Fix live def propagation through basic block llvm-svn: 283371	2016-10-05 20:08:09 +00:00
Matthias Braun	0a6916f303	AMDGPU: Do not re-use tmpreg in spill/restore lowering The register scavenging code does not support multiple definitions of the same vreg. Differential Revision: https://reviews.llvm.org/D25220 llvm-svn: 283369	2016-10-05 20:02:51 +00:00
Matthew Simpson	386546124f	[LV] Pass legality analysis in vectorizer constructor (NFC) The vectorizer already holds a pointer to the legality analysis in a member variable, so it makes sense that we would pass it in the constructor. llvm-svn: 283368	2016-10-05 19:53:20 +00:00
Peter Collingbourne	d799d28540	FastISel: Remove unused/un-overridden entry points. NFCI. llvm-svn: 283366	2016-10-05 19:25:20 +00:00
Matthew Simpson	6a8e0bcf3d	[LV] Remove obsolete comment (NFC) llvm-svn: 283365	2016-10-05 19:19:49 +00:00
Matthew Simpson	ee3fdc7e26	[LV] Use getScalarizationOverhead in memory instruction costs (NFC) This patch refactors the cost estimation of scalarized loads and stores to reuse getScalarizationOverhead for the cost of the extractelement and insertelement instructions we might create. The existing code accounted for this cost, but it was functionally equivalent to the helper function. llvm-svn: 283364	2016-10-05 19:11:54 +00:00
Sanjay Patel	a40c479fe9	fix documentation comments; NFC llvm-svn: 283361	2016-10-05 18:51:12 +00:00
Rafael Espindola	37fc0183d7	Allow the caller to pass in the hash. If the caller already has the hash we don't have to compute it. This will be used in lld. llvm-svn: 283359	2016-10-05 18:46:21 +00:00
Reid Kleckner	f9dddec21c	Improve DEBUG_VALUE assembly comments for spilled bitpieces Previously we would give up when we saw the bitpiece DWARF expression and print "[complex expression]" when actually we handled bitpiece expressions outside the loop. llvm-svn: 283355	2016-10-05 18:36:02 +00:00
Matthew Simpson	1755d81b29	[LV] Add helper function for predicated block probability (NFC) The cost model has to estimate the probability of executing predicated blocks. However, we currently always assume predicated blocks have a 50% chance of executing (this value is hardcoded in several places throughout the code). Since we always use the same value, this patch adds a helper function for getting this uniform probability. The function simplifies some comments and makes our assumptions more clear. In the future, we may want to extend this with actual block probability information if it's available. llvm-svn: 283354	2016-10-05 18:30:36 +00:00
Simon Dardis	299dbd6cd1	[mips][ias] fix li macro when values are negated with ~ The integrated assembler evaluates the expressions such as ~0x80000000 to 0xffffffff7fffffff early in the parsing process. This patch adds compatibility with gas so that li loads the expected value (0x7fffffff) in those cases. This only occurs iff all the upper 32bits are set and maintains existing checks by not truncating the result down to 32 bits if any of the the upper bits are not set. Reviewers: dsanders, zoran.jovanovic Differential Review: https://reviews.llvm.org/D23399 llvm-svn: 283353	2016-10-05 18:26:19 +00:00
Matthew Simpson	c631167609	[LV] Add isScalarWithPredication helper function (NFC) This patch adds a single helper function for checking if an instruction will be scalarized with predication. Such instructions include conditional stores and instructions that may divide by zero. Existing checks have been updated to use the new function. llvm-svn: 283350	2016-10-05 17:52:34 +00:00
Anna Zaks	e732ce4dff	Revert "[asan] LLVM: Switch to using dynamic shadow offset on iOS" This reverts commit abe77a118615cd90b0d7f127e4797096afa2b394. Revert as these changes broke a Chromium buildbot. llvm-svn: 283348	2016-10-05 17:42:02 +00:00
Bjorn Pettersson	12559441bd	[DAG] Teach computeKnownBits and ComputeNumSignBits in SelectionDAG to look through EXTRACT_VECTOR_ELT. Summary: Both computeKnownBits and ComputeNumSignBits can now do a simple look-through of EXTRACT_VECTOR_ELT. It will compute the result based on the known bits (or known sign bits) for the vector that the element is extracted from. Reviewers: bogner, tstellarAMD, mkuper Subscribers: wdng, RKSimon, jyknight, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D25007 llvm-svn: 283347	2016-10-05 17:40:27 +00:00
Rafael Espindola	24db10d8e1	Don't pass null to memcpy. Should fix the asan bots. llvm-svn: 283336	2016-10-05 16:33:03 +00:00
Simon Dardis	f45a59f80b	Recommit: "[mips] Add rsqrt, recip for MIPS" Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for architecture support and register usage. Reviewers: vkalintiris, zoran.jovanoic Differential Review: https://reviews.llvm.org/D24499 llvm-svn: 283334	2016-10-05 16:11:01 +00:00
Hans Wennborg	c26c03d911	Revert r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)" This is suspected to cause a miscompile in Chromium. Reverting while investigating. llvm-svn: 283329	2016-10-05 15:39:27 +00:00
Simon Dardis	bbfd528748	Revert "[mips] Add rsqrt, recip for MIPS" This reverts commit r282485 which contain two patches instead of one. llvm-svn: 283327	2016-10-05 15:28:33 +00:00
Douglas Katzman	0411e8669b	[X86] Don't randomly encode %rip where illegal Differential Revision: https://reviews.llvm.org/D25112 llvm-svn: 283326	2016-10-05 15:23:35 +00:00
James Molloy	b7de497cb9	[Thumb] Don't try and emit LDRH/LDRB from the constant pool This is not a valid encoding - these instructions cannot do PC-relative addressing. The underlying problem here is of whitelist in ARMISelDAGToDAG that unwraps ARMISD::Wrappers during addressing-mode selection. This didn't realise TargetConstantPool was actually possible, so didn't handle it. llvm-svn: 283323	2016-10-05 14:52:13 +00:00
Oren Ben Simhon	a2010755fa	Test commit permission llvm-svn: 283318	2016-10-05 13:48:33 +00:00
Dylan McKay	afff169f17	[AVR] Don't select 'MOVW' instructions when they are not supported We have a subtarget feature which we were ignoring, which was causing us to generate unsupported instructions for some older chips. llvm-svn: 283317	2016-10-05 13:38:29 +00:00
Dylan McKay	82ef77091c	[AVR] Add AVRRegisterInfo::splitReg function No tests are included just yet - this is used from the pseudo instruction expander pass, which hasn't been pulled in-tree yet. llvm-svn: 283316	2016-10-05 13:27:30 +00:00
Krzysztof Parzyszek	e7c72cdbb0	Fix machine operand traversal in ScheduleDAGInstrs::fixupKills llvm-svn: 283315	2016-10-05 13:15:06 +00:00
Dylan McKay	ea55554803	[AVR] Update return type of dynamic alloca pass It was recently changed from 'const char*' to StringRef llvm-svn: 283312	2016-10-05 12:32:24 +00:00
Dylan McKay	192405a31a	[AVR] Add the AVR frame lowering code Summary: This allows AVR to lower frames into assembly code. Reviewers: arsenm, kparzysz Subscribers: japaric, wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25032 llvm-svn: 283311	2016-10-05 11:48:56 +00:00
Dylan McKay	c1760424de	[AVR] Split all of the AVR device definitions into a separate file We have ~500 lines of subtarget feature definitions, they don't belong in our main TableGen file. llvm-svn: 283310	2016-10-05 10:28:45 +00:00
Dylan McKay	5af1248230	[AVR] Enable the instruction printer in the target definition llvm-svn: 283309	2016-10-05 10:23:38 +00:00
Dylan McKay	f66e120b3b	[AVR] Add definitions for the ATTiny102 and ATtiny104 chips llvm-svn: 283308	2016-10-05 10:20:33 +00:00
Mehdi Amini	149f6eaed9	Re-commit "Use StringRef in Support/Darf APIs (NFC)" This reverts commit r283285 and re-commit r283275 with a fix for format("%s", Str); where Str is a StringRef. llvm-svn: 283298	2016-10-05 05:59:29 +00:00
Dylan McKay	efe40389c0	[AVR] Add the machine code backend Summary: This adds the AVR machine code backend (`AVRAsmBackend.cpp`). This will allow us to generate machine code from assembled AVR instructions. Reviewers: arsenm, kparzysz Subscribers: modocache, japaric, wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25029 llvm-svn: 283297	2016-10-05 05:30:19 +00:00
Dean Michael Berris	27358cff88	[Support][CommandLine] Add cl::getRegisteredSubcommands() This should allow users of the library to get a range to iterate through all the subcommands that are registered to the global parser. This allows users to define subcommands in libraries that self-register to have dispatch done at a different stage (like main). It allows for writing code like the following: for (auto S : cl::getRegisteredSubcommands()) { if (S) { // Dispatch on S->getName(). } } This change also contains tests that show this usage pattern. Reviewers: zturner, dblaikie, echristo Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24489 llvm-svn: 283296	2016-10-05 05:20:08 +00:00
Mehdi Amini	e4f0b75e3d	Blind attempt to fix windows build after r283290 - Use StringRef in StringSaver API (NFC) llvm-svn: 283294	2016-10-05 01:41:11 +00:00
Mehdi Amini	5b00770c35	Use StringRef in ARMConstantPool APIs (NFC) llvm-svn: 283293	2016-10-05 01:41:06 +00:00
Kyle Butt	25ac35d822	Revert "Codegen: Tail-duplicate during placement." This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc. Issue with loop info and block removal revealed by polly. I have a fix for this issue already in another patch, I'll re-roll this together with that fix, and a test case. llvm-svn: 283292	2016-10-05 01:39:29 +00:00
Mehdi Amini	3e021be3b6	Use StringRef in FastISel API (NFC) llvm-svn: 283291	2016-10-05 01:37:29 +00:00
Mehdi Amini	ec4fb5ba97	Use StringRef in StringSaver API (NFC) llvm-svn: 283290	2016-10-05 01:32:41 +00:00
Mehdi Amini	a6f81ca8ea	Use StringRef in ARCRuntimeEntryPoints APIs (NFC) llvm-svn: 283288	2016-10-05 01:15:04 +00:00
Kostya Serebryany	379359c53a	[libFuzzer] add ShrinkValueProfileTest, move code around, NFC llvm-svn: 283286	2016-10-05 01:09:40 +00:00
Mehdi Amini	2bcac0fac4	Revert "Re-commit "Use StringRef in Support/Darf APIs (NFC)"" One test seems randomly broken: DebugInfo/X86/gnu-public-names.ll llvm-svn: 283285	2016-10-05 01:04:02 +00:00
Mehdi Amini	a28bb09f28	Use StringRef in MCSectionMachO (NFC) llvm-svn: 283284	2016-10-05 01:02:34 +00:00
Mehdi Amini	215ff8df74	Use StringRef in DarwinAsmParser (NFC) llvm-svn: 283283	2016-10-05 01:02:22 +00:00
Michael Zolotukhin	5cda89ad36	[LoopDistribute] Fix a typo in the pass name. llvm-svn: 283282	2016-10-05 00:44:52 +00:00
Mehdi Amini	32b297a42f	Re-commit "Use StringRef in Support/Darf APIs (NFC)" This reverts commit r283278 and re-commit r283275 with the update to fix the build on the LLDB side. llvm-svn: 283281	2016-10-05 00:37:18 +00:00
Kostya Serebryany	2455f0d013	[libFuzzer] clear the corpus elements if they are evicted (i.e. smaller elements with proper coverage are found). Make sure we never try to mutate empty element. Print the corpus size in bytes in the status lines llvm-svn: 283279	2016-10-05 00:25:17 +00:00
Mehdi Amini	78b04ae7ac	Revert "Use StringRef in Support/Darf APIs (NFC)" This reverts commit r283275, it broke LLDB Android debug server. llvm-svn: 283278	2016-10-05 00:21:14 +00:00
Mehdi Amini	c6caed8fa1	Use StringRef instead of raw pointers in ARMBuildAttrs (NFC) llvm-svn: 283277	2016-10-05 00:15:18 +00:00
Mehdi Amini	e0327be584	Use StringRef in Support/Darf APIs (NFC) llvm-svn: 283275	2016-10-04 23:55:40 +00:00
Kyle Butt	adabac2d57	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283274	2016-10-04 23:54:18 +00:00
Manuel Jacob	49fafb1109	[C API] Add LLVMConstExactUDiv and LLVMBuildExactUDiv functions. Summary: These are analog to the existing LLVMConstExactSDiv and LLVMBuildExactSDiv functions. Reviewers: deadalnix, majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D25259 llvm-svn: 283269	2016-10-04 23:32:42 +00:00
Rafael Espindola	39751afc4e	Misc improvements to StringTableBuilder. This patch adds write methods to StringTableBuilder so that it is easier to change the underlying implementation. Using the write methods, avoid creating a temporary buffer when using mmaped output. It also uses a more compact key in the DenseMap. Overall this produces a slightly faster lld: firefox master 6.853419709 patch 6.841968912 1.00167361138x faster chromium master 4.297280174 patch 4.298712163 1.00033323147x slower chromium fast master 1.802335952 patch 1.806872459 1.00251701521x slower the gold plugin master 0.3247149 patch 0.321971644 1.00852017888x faster clang master 0.551279945 patch 0.543733194 1.01387951128x faster llvm-as master 0.032743458 patch 0.032143478 1.01866568391x faster the gold plugin fsds master 0.350814247 patch 0.348571741 1.00643341309x faster clang fsds master 0.6281672 patch 0.621130222 1.01132931187x faster llvm-as fsds master 0.030168899 patch 0.029797155 1.01247582194x faster scylla master 3.104222518 patch 3.059590248 1.01458766252x faster llvm-svn: 283266	2016-10-04 22:43:25 +00:00
Alina Sbirlea	9a78ebd6d8	[cpu-detection] Copy simplified version of get_cpuid_max to remove dependency to clang's implementation Summary: Attempting to fix PR30384. Take the same approach as in compiler_rt and add a simplified version of __get_cpuid_max. Including cpuid.h is no longer needed. Reviewers: echristo, joerg Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24597 llvm-svn: 283265	2016-10-04 22:39:53 +00:00
David L Kreitzer	7c7ee89b01	Revert r283248. It caused failures in the hexagon buildbots. llvm-svn: 283254	2016-10-04 20:57:19 +00:00
Sanjay Patel	bfdbea6481	[Target] move reciprocal estimate settings from TargetOptions to TargetLowering The motivation for the change is that we can't have pseudo-global settings for codegen living in TargetOptions because that doesn't work with LTO. Ideally, these reciprocal attributes will be moved to the instruction-level via FMF, metadata, or something else. But making them function attributes is at least an improvement over the current state. The ingredients of this patch are: Remove the reciprocal estimate command-line debug option. Add TargetRecip to TargetLowering. Remove TargetRecip from TargetOptions. Clean up the TargetRecip implementation to work with this new scheme. Set the default reciprocal settings in TargetLoweringBase (everything is off). Update the PowerPC defaults, users, and tests. Update the x86 defaults, users, and tests. Note that if this patch needs to be reverted, the related clang patch checked in at r283251 should be reverted too. Differential Revision: https://reviews.llvm.org/D24816 llvm-svn: 283252	2016-10-04 20:46:43 +00:00
Kevin Enderby	f993d6e72c	Next set of additional error checks for invalid Mach-O files for the load commands that uses the MachO::encryption_info_command and MachO::encryption_info_command types but not used in llvm libObject code but used in llvm tool code. This includes just LC_ENCRYPTION_INFO and LC_ENCRYPTION_INFO_64 load commands. llvm-svn: 283250	2016-10-04 20:37:43 +00:00
David L Kreitzer	fedb9b67ca	[safestack] Requires a valid TargetMachine to be passed to the SafeStack pass. Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D24896 llvm-svn: 283248	2016-10-04 20:31:32 +00:00
Matthias Braun	46a5238682	AArch64: Macrofusion: Split features, add missing combinations. AArch64InstrInfo::shouldScheduleAdjacent() determines whether two instruction can benefit from macroop fusion on apple CPUs. The list turned out to be incomplete: - the "rr" variants of the instructions were missing - even the "rs" variants can have shift value == 0 and behave like the "rr" variants This also splits the MacropFusion target feature into ArithmeticBccFusion and ArithmeticCbzFusion. Differential Revision: https://reviews.llvm.org/D25142 llvm-svn: 283243	2016-10-04 19:28:21 +00:00
Anna Zaks	ef97d2c589	[asan] LLVM: Switch to using dynamic shadow offset on iOS The VM layout is not stable between iOS version releases, so switch to dynamic shadow offset. This is the LLVM counterpart of https://reviews.llvm.org/D25218 Differential Revision: https://reviews.llvm.org/D25219 llvm-svn: 283239	2016-10-04 19:02:29 +00:00
Hal Finkel	bdd6735a9e	Don't filter diagnostics written as YAML to the output file The purpose of the YAML diagnostic output file is to collect information on optimizations performed, or not performed, for later processing by tools that help users (and compiler developers) understand how code was optimized. As such, the diagnostics that appear in the file should not be coupled to what a user might want to see summarized for them as the compiler runs, and in fact, because the user likely does not know what optimization diagnostics their tools might want to use, the user cannot provide a useful filter regardless. As such, we shouldn't filter the diagnostics going to the output file. Differential Revision: https://reviews.llvm.org/D25224 llvm-svn: 283236	2016-10-04 18:13:45 +00:00
Adam Nemet	0428e93217	Serialize remark argument as a mapping to get proper quotation for the value. llvm-svn: 283231	2016-10-04 17:05:04 +00:00
Adam Nemet	2780ee0dc1	Allow derived classes of OptimizationRemarkAnalysis in YAML llvm-svn: 283230	2016-10-04 17:05:01 +00:00
Anna Thomas	479cbb9405	[RS4GC] Handle ShuffleVector instruction in findBasePointer Summary: This patch modifies the findBasePointer to handle the shufflevector instruction. Tests run: RS4GC tests, local downstream tests. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25197 llvm-svn: 283219	2016-10-04 13:48:37 +00:00
Rafael Espindola	fda3dc9266	Remove duplicated typedef. NFC. llvm-svn: 283216	2016-10-04 13:09:59 +00:00
Nemanja Ivanovic	6354d23555	[Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set This patch corresponds to review: The newly added VSX D-Form (register + offset) memory ops target the upper half of the VSX register set. The existing ones target the lower half. In order to unify these and have the ability to target all the VSX registers using D-Form operations, this patch defines Pseudo-ops for the loads/stores which are expanded post-RA. The expansion then choses the correct opcode based on the register that was allocated for the operation. llvm-svn: 283212	2016-10-04 11:25:52 +00:00
Simon Dardis	86b3a1e79b	[mips][fastisel] Consider soft-float an unsupported floating point mode Treat soft-float as unsupported for fast-isel. Additionally, ensure we check that lowering f32 arguments also considers the case of soft-float mode. Reviewers: ehostunreach, vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D24505 llvm-svn: 283209	2016-10-04 10:35:07 +00:00
whitequark	7c4fe0e9a3	[SelectionDAG] Fix calling convention in expansion of ?MULO. The SMULO/UMULO DAG nodes, when not directly supported by the target, expand to a multiplication twice as wide. In case that the resulting type is not legal, an __mul?i3 intrinsic is used. Since the type is not legal, the legalizer cannot directly call the intrinsic with the wide arguments; instead, it "pre-lowers" them by splitting them in halves. The "pre-lowering" code in essence made assumptions about the calling convention, specifically that i(N*2) values will be split into two iN values and passed in consecutive registers in little-endian order. This, naturally, breaks on a big-endian system, such as our OR1K out-of-tree backend. Thanks to James Miller <james@aatch.net> for help in debugging. Differential Revision: https://reviews.llvm.org/D25223 llvm-svn: 283203	2016-10-04 09:07:49 +00:00
Sjoerd Meijer	535529b41c	Consistent fp denormal mode names. NFC. This fixes the inconsistency of the fp denormal option names: in LLVM this was DenormalType, but in Clang this is DenormalMode which seems better. Differential Revision: https://reviews.llvm.org/D24906 llvm-svn: 283192	2016-10-04 08:03:36 +00:00
Nemanja Ivanovic	11049f8f07	[Power9] Part-word VSX integer scalar loads/stores and sign extend instructions This patch corresponds to review: https://reviews.llvm.org/D23155 This patch removes the VSHRC register class (based on D20310) and adds exploitation of the Power9 sub-word integer loads into VSX registers as well as vector sign extensions. The new instructions are useful for a few purposes: Int to Fp conversions of 1 or 2-byte values loaded from memory Building vectors of 1 or 2-byte integers with values loaded from memory Storing individual 1 or 2-byte elements from integer vectors This patch implements all of those uses. llvm-svn: 283190	2016-10-04 06:59:23 +00:00
Kostya Serebryany	4820cc988f	[libFuzzer] remove dfsan support and some related stale code. This is not being used and as is is pretty weak anyway llvm-svn: 283187	2016-10-04 06:08:46 +00:00
Craig Topper	ee2d995661	[X86] Add MOV8rm_NOREX to switch in isReallyTriviallyReMaterializable to match MOV8rm. llvm-svn: 283184	2016-10-04 03:11:44 +00:00
Kostya Serebryany	5a52a11ce4	[libFuzzer] change the probabilities so that we choose only the inputs that are known to be minimal inputs for at least one coverage feature (works only with -shrink=1 for now) llvm-svn: 283178	2016-10-04 01:51:44 +00:00
Matt Arsenault	dcf0cfca4c	AMDGPU: Refactor indirect vector lowering Allow inserting multiple instructions in the expanded loop. llvm-svn: 283177	2016-10-04 01:41:05 +00:00
Matt Arsenault	283fbc24f6	AMDGPU: Factor SGPR spilling into separate functions llvm-svn: 283175	2016-10-04 01:14:56 +00:00
Kyle Butt	3ffb8529bc	Revert "Codegen: Tail-duplicate during placement." This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2. Causing crashes on aarch64 build. llvm-svn: 283172	2016-10-04 00:38:23 +00:00
Eli Friedman	74bed9d757	Make GlobalsAA ignore dead constant expressions. Slightly improves the precision of GlobalsAA in certain situations, and makes the behavior of optimization passes more predictable. Differential Revision: https://reviews.llvm.org/D24104 llvm-svn: 283165	2016-10-04 00:03:55 +00:00
Kyle Butt	396bfdd707	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. llvm-svn: 283164	2016-10-04 00:00:09 +00:00
Dan Gohman	e040533ece	[WebAssembly] Update to more stack-machine-oriented terminology. WebAssembly has officially switched from being an AST to being a stack machine. Update various bits of terminology and README.md entries accordingly. llvm-svn: 283154	2016-10-03 22:43:53 +00:00
Dan Gohman	ffc184bb1d	[WebAssemby] Clean up an obsolete comment. The comment is present inside the body of GetVRegDef. llvm-svn: 283153	2016-10-03 22:32:21 +00:00
Matthias Braun	9baa3e80a9	TargetMachine: Make the win32-macho workaround more specific. This is to avoid problems with win32 + ELF which surprisingly happens a lot in practice: If a user just specifies -march on the commandline the object format changes along with the architecture to ELF in many instances while the OS stays with the default/host OS. llvm-svn: 283151	2016-10-03 22:12:37 +00:00
Dan Gohman	16fa0d8159	[WebAssembly] Delete an unused function. NFC. llvm-svn: 283150	2016-10-03 22:06:28 +00:00
Dan Gohman	9850e87784	[WebAssembly] Fix indentation. NFC. llvm-svn: 283147	2016-10-03 21:33:09 +00:00
Dan Gohman	4b8e8becf6	[WebAssembly] Rename OPERAND_FP32IMM to OPERAND_F32IMM. WebAssembly documentation consistently says "f32" rather than "fp32" to describe 32-bit floating-point. llvm-svn: 283146	2016-10-03 21:31:31 +00:00
Quentin Colombet	3a06701913	[AArch64][RegisterBankInfo] Add getSameKindofOperandsMapping. Refactor the code so that the same function can be used for all instructions with all the same operands for up to 3 operands. This is going to be useful for cast instructions. NFC. llvm-svn: 283144	2016-10-03 20:20:13 +00:00
Krzysztof Parzyszek	c8b6ecabd8	[RDF] Fix liveness propagation through shadows Each shadow only represents data flow that is restricted to its reaching def. Propagating more than that could lead to spurious register liveness, resulting in extra (incorrectly) block live-ins. llvm-svn: 283143	2016-10-03 20:17:20 +00:00
Matthias Braun	a827ed8891	AArch64Subtarget: Remove unused CPUString field llvm-svn: 283142	2016-10-03 20:17:02 +00:00
Matthias Braun	eccdee9196	X86: Do not produce GOT relocations on windows Windows has no GOT relocations the way elf/darwin has. Some people use x86_64-pc-win32-macho to build EFI firmware; Do not produce GOT relocations for this target. Differential Revision: https://reviews.llvm.org/D24627 llvm-svn: 283140	2016-10-03 20:11:24 +00:00
Sanjoy Das	0359a193a7	[PruneEH] Be correct in the face IPO This fixes one spot I had missed in r265762. Credit goes to Philip Reames for spotting this one! llvm-svn: 283137	2016-10-03 19:35:30 +00:00
Dehao Chen	92abc7e9f2	Refactor LICM pass in preparation for LoopSink pass. Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778). Reviewers: davidxl, danielcdh, hfinkel, chandlerc Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D24168 llvm-svn: 283134	2016-10-03 18:52:08 +00:00
Konstantin Zhuravlyov	60a8373778	[AMDGPU] Pass optimization level to SelectionDAGISel llvm-svn: 283133	2016-10-03 18:47:26 +00:00
Konstantin Zhuravlyov	691e2e020b	[AMDGPU] Sign extend AShr when promoting (instead of zero extending) llvm-svn: 283130	2016-10-03 18:29:01 +00:00
Hans Wennborg	b4d2678c6f	Jump threading: avoid trying to split edge into landingpad block (PR27840) Splitting the edge is nontrivial because of the landing pad, and we would currently assert trying to do it. Differential Revision: https://reviews.llvm.org/D24680 llvm-svn: 283129	2016-10-03 18:18:04 +00:00
Rafael Espindola	9ddfb04d03	Revert "Use getSize instead of data().size(). NFC." This reverts commit r283125. lld needs to be updated. llvm-svn: 283127	2016-10-03 18:01:10 +00:00
Krzysztof Parzyszek	ab26e2dd3b	[RDF] Further improve readability of the graph Print target basic block for a branch. llvm-svn: 283126	2016-10-03 17:54:33 +00:00
Rafael Espindola	4bb425e848	Use getSize instead of data().size(). NFC. Also assert isFinalized in getSize(). This just reduces the noise from another patch. llvm-svn: 283125	2016-10-03 17:49:19 +00:00
Krzysztof Parzyszek	a77fe4eef3	[RDF] Replace RegisterAliasInfo with target-independent code using lane masks llvm-svn: 283122	2016-10-03 17:14:48 +00:00
Sanjay Patel	d27a21874b	[x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433) This should fix: https://llvm.org/bugs/show_bug.cgi?id=30433 There are a couple of open questions about the codegen: 1. Should we let scalar ops be scalars and avoid vector constant loads/splats? 2. Should we have a pass to combine constants such as the inverted pair that we have here? Differential Revision: https://reviews.llvm.org/D25165 llvm-svn: 283119	2016-10-03 16:38:27 +00:00
Rafael Espindola	d7325ee702	Don't drop the llvm. prefix when renaming. If the llvm. prefix is dropped other parts of llvm don't see this as an intrinsic. This means that the number of regular symbols depends on the context the module is loaded into, which causes LTO to abort. Fixes PR30509. llvm-svn: 283117	2016-10-03 15:51:42 +00:00
Sanjay Patel	f7df85af87	fix formatting; NFC llvm-svn: 283115	2016-10-03 15:18:36 +00:00
Nirav Dave	157891c57f	Prevent out of order HashDirective lexing in AsmLexer. Retrying after buildbot reset. To lex hash directives we peek ahead to find component tokens, create a unified token, and unlex the peeked tokens so the parser does not need to parse the tokens then. Make sure we do not to lex another hash directive during peek operation. This fixes PR28921. Reviewers: rnk, loladiro Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24839 llvm-svn: 283111	2016-10-03 13:48:27 +00:00
Matt Arsenault	4a048a44e5	AMDGPU: Fix typo llvm-svn: 283108	2016-10-03 13:06:58 +00:00
Volkan Keles	1c38681ae6	Add new target hooks for LoadStoreVectorizer Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D24727 llvm-svn: 283099	2016-10-03 10:31:34 +00:00
Sjoerd Meijer	4dbe73c1ed	[ARM] Code size optimisation to lower udiv+urem to udiv+mls instead of a library call to __aeabi_uidivmod. This is an improved implementation of r280808, see also D24133, that got reverted because isel was stuck in a loop. That was caused by the optimisation incorrectly triggering on i64 ints, which shouldn't happen because there is no 64bit hwdiv support; that put isel's type legalization and this optimisation in a loop. A native ARM compiler and testing now shows that this is fixed. Patch mostly by Pablo Barrio. Differential Revision: https://reviews.llvm.org/D25077 llvm-svn: 283098	2016-10-03 10:12:32 +00:00
Konstantin Zhuravlyov	c9786f0d8a	[AMDGPU] Remove unused variables from SIOptimizeExecMasking Differential Revision: https://reviews.llvm.org/D25110 llvm-svn: 283087	2016-10-03 04:43:22 +00:00
Hal Finkel	530fa5fcc9	[PowerPC] Account for the ELFv2 function prologue during branch selection The PPC branch-selection pass, which performs branch relaxation, needs to account for the padding that might be introduced to satisfy block alignment requirements. We were assuming that the first block was at offset zero (i.e. had the alignment of the function itself), but under the ELFv2 ABI, a global entry function prologue is added to the first block, and it is a two-instruction sequence (i.e. eight-bytes long). If the function has 16-byte alignment, the fact that the first block is eight bytes offset from the start of the function is relevant to calculating where padding will be added in between later blocks. Unfortunately, I don't have a small test case. llvm-svn: 283086	2016-10-03 04:06:44 +00:00
Craig Topper	eab23d3bc4	[AVX-512] Remove isCheapAsAMove flag from VMOVAPSZ128rm_NOVLX and friends. This was accidentally copy and pasted from other Pseudos in the file. llvm-svn: 283084	2016-10-03 02:22:33 +00:00
Craig Topper	4e7b888ea4	[X86] Mark all sizes of (V)MOVUPD as trivially rematerializable. I don't know for sure that we truly needs this, but its the only vector load that isn't rematerializable. Making it consistent allows it to not be a special case in the td files. llvm-svn: 283083	2016-10-03 02:00:29 +00:00
Simon Pilgrim	a8d2168cb0	[X86][AVX2] Add support for combining target shuffles to VPERMD/VPERMPS llvm-svn: 283080	2016-10-02 21:07:58 +00:00
Sanjoy Das	4aeb0f2c7f	[SCEV] Rely on ConstantRange instead of custom logic; NFCI This was first landed in rL283058 and subsequenlty reverted since a change this depends on (rL283057) was buggy and had to be reverted. llvm-svn: 283079	2016-10-02 20:59:10 +00:00
Sanjoy Das	c7d3291b68	[ConstantRange] Make getEquivalentICmp smarter This change teaches getEquivalentICmp to be smarter about generating ICMP_NE and ICMP_EQ predicates. An earlier version of this change was landed as rL283057 which had a use-after-free bug. This new version has a fix for that bug, and a (C++ unittests/) test case that would have triggered it rL283057. llvm-svn: 283078	2016-10-02 20:59:05 +00:00
Yaron Keren	bd8731946c	Rangify for loops. llvm-svn: 283074	2016-10-02 19:21:41 +00:00
Simon Pilgrim	03afbe783d	[X86][AVX] Ensure broadcast loads respect dependencies To allow broadcast loads of a non-zero'th vector element, lowerVectorShuffleAsBroadcast can replace a load with a new load with an adjusted address, but unfortunately we weren't ensuring that the new load respected the same dependencies. This patch adds a TokenFactor and updates all dependencies of the old load to reference the new load instead. Bug found during internal testing. Differential Revision: https://reviews.llvm.org/D25039 llvm-svn: 283070	2016-10-02 15:59:15 +00:00
Craig Topper	46413af7f7	[X86] Don't set i64 ADDC/ADDE/SUBC/SUBE as Custom if the target isn't 64-bit. This way we don't have to catch them and do nothing with them in ReplaceNodeResults. llvm-svn: 283066	2016-10-02 06:13:43 +00:00
Craig Topper	68c08931fc	[X86] Fix indentation. NFC llvm-svn: 283065	2016-10-02 06:13:40 +00:00
Sanjoy Das	f230b0aa43	Revert r283057 and r283058 They've broken the sanitizer-bootstrap bots. Reverting while I investigate. Original commit messages: r283057: "[ConstantRange] Make getEquivalentICmp smarter" r283058: "[SCEV] Rely on ConstantRange instead of custom logic; NFCI" llvm-svn: 283062	2016-10-02 02:40:27 +00:00
Hal Finkel	a9321059b9	[PowerPC] Refactor soft-float support, and enable PPC64 soft float This change enables soft-float for PowerPC64, and also makes soft-float disable all vector instruction sets for both 32-bit and 64-bit modes. This latter part is necessary because the PPC backend canonicalizes many Altivec vector types to floating-point types, and so soft-float breaks scalarization support for many operations. Both for embedded targets and for operating-system kernels desiring soft-float support, it seems reasonable that disabling hardware floating-point also disables vector instructions (embedded targets without hardware floating point support are unlikely to have Altivec, etc. and operating system kernels desiring not to use floating-point registers to lower syscall cost are unlikely to want to use vector registers either). If someone needs this to work, we'll need to change the fact that we promote many Altivec operations to act on v4f32. To make it possible to disable Altivec when soft-float is enabled, hardware floating-point support needs to be expressed as a positive feature, like the others, and not a negative feature, because target features cannot have dependencies on the disabling of some other feature. So +soft-float has now become -hard-float. Fixes PR26970. llvm-svn: 283060	2016-10-02 02:10:20 +00:00
Sanjoy Das	1f7b813e2b	Remove duplicated code; NFC ICmpInst::makeConstantRange does exactly the same thing as ConstantRange::makeExactICmpRegion. llvm-svn: 283059	2016-10-02 00:09:57 +00:00
Sanjoy Das	1b9cefcf03	[SCEV] Rely on ConstantRange instead of custom logic; NFCI llvm-svn: 283058	2016-10-02 00:09:52 +00:00
Sanjoy Das	6ef69d97f5	[ConstantRange] Make getEquivalentICmp smarter This change teaches getEquivalentICmp to be smarter about generating ICMP_NE and ICMP_EQ predicates. llvm-svn: 283057	2016-10-02 00:09:49 +00:00
Sanjoy Das	54e6a21dca	[SCEV] Remove commented out code; NFC llvm-svn: 283056	2016-10-02 00:09:45 +00:00
Simon Pilgrim	630dd6ff02	[X86][SSE] Cleaned up shuffle decode assertion messages llvm-svn: 283050	2016-10-01 20:12:56 +00:00
Mehdi Amini	99d1b29503	Use StringRef for MemoryBuffer identifier API (NFC) llvm-svn: 283043	2016-10-01 16:38:28 +00:00
Simon Pilgrim	5b0c15ddf7	Fix signed/unsigned warning llvm-svn: 283041	2016-10-01 16:14:57 +00:00
Simon Pilgrim	1638d49f20	[X86][SSE] Add support for combining target shuffles to binary BLEND We already had support for 1-input BLEND with zero - this adds support for 2-input BLEND as well. llvm-svn: 283040	2016-10-01 16:04:28 +00:00
Mehdi Amini	7410717a62	Use StringRef in Registry API (NFC) llvm-svn: 283039	2016-10-01 15:44:54 +00:00
Simon Pilgrim	ae17cf20ce	[X86][SSE] Always combine target shuffles to MOVSD/MOVSS Now we can commute to BLENDPD/BLENDPS on SSE41+ targets if necessary, so simplify the combine matching where we can. This required me to add a couple of scalar math movsd/moss fold patterns that hadn't been needed in the past. llvm-svn: 283038	2016-10-01 15:33:01 +00:00
Simon Pilgrim	ccdd1ff49b	[X86][SSE] Enable commutation from MOVSD/MOVSS to BLENDPD/BLENDPS on SSE41+ targets Instead of selecting between MOVSD/MOVSS and BLENDPD/BLENDPS at shuffle lowering by subtarget this will help us select the instruction based on actual commutation requirements. We could possibly add BLENDPD/BLENDPS -> MOVSD/MOVSS commutation and MOVSD/MOVSS memory folding using a similar approach if it proves useful I avoided adding AVX512 handling as I'm not sure when we should be making use of VBLENDPD/VBLENDPS on EVEX targets llvm-svn: 283037	2016-10-01 14:26:11 +00:00
Nirav Dave	e4c6153cf1	Revert "[MC] Prevent out of order HashDirective lexing in AsmLexer." This reverts commit r282992 which appears to be causing an LTO test failure. llvm-svn: 283034	2016-10-01 10:57:55 +00:00
Kostya Serebryany	a5f1adab56	[libFuzzer] add fuzzer test for libxml2, finds https://bugzilla.gnome.org/show_bug.cgi?id=751631 llvm-svn: 283024	2016-10-01 07:37:40 +00:00
Kostya Serebryany	d1f31d0a49	[libFuzzer] fix a recent bugs (buffer overflow) llvm-svn: 283021	2016-10-01 07:13:25 +00:00
Craig Topper	5eb5ade894	[X86] Cleanup patterns for using VMOVDDUP for broadcasts. -Remove OptForSize. Not all of the backend follows the same rules for creating broadcasts and there is no conflicting pattern. -Don't stop selecting VEX VMOVDDUP when AVX512 is supported. We need VLX for EVEX VMOVDDUP. -Only use VMOVDDUP for v2i64 broadcasts if AVX2 is not supported. llvm-svn: 283020	2016-10-01 07:11:24 +00:00
Mehdi Amini	9af9a9d5f9	Revert "Use StringRef instead of raw pointer in TargetRegistry API (NFC)" This reverts commit r283017. Creates an infinite loop somehow. llvm-svn: 283019	2016-10-01 07:08:23 +00:00
Mehdi Amini	36d33fc109	Use StringRef instead of raw pointers in MCAsmInfo/MCInstrInfo APIs (NFC) llvm-svn: 283018	2016-10-01 06:46:33 +00:00
Mehdi Amini	cd354a659b	Use StringRef instead of raw pointer in TargetRegistry API (NFC) llvm-svn: 283017	2016-10-01 06:25:30 +00:00
Mehdi Amini	7419e940d2	Use StringRef instead of raw pointer in ExecutionEngine llvm-svn: 283016	2016-10-01 06:22:04 +00:00
Craig Topper	be351eea0c	[AVX-512] Add EVEX versions of VPBROADCASTW patterns with truncated i32 loads. llvm-svn: 283015	2016-10-01 06:01:23 +00:00
Mehdi Amini	48878ae579	Use StringRef in Datalayout API (NFC) llvm-svn: 283013	2016-10-01 05:57:55 +00:00
Mehdi Amini	f42ec7903f	DIFlags: use StringRef instead of raw pointer (NFC) llvm-svn: 283012	2016-10-01 05:57:50 +00:00
Mehdi Amini	217b246484	Revert "Use StringRef in Datalayout API (NFC)" This reverts commit r283009. Bots are broken. llvm-svn: 283011	2016-10-01 05:12:48 +00:00
Mehdi Amini	29baf9c0e1	Use StringRef in Datalayout API (NFC) llvm-svn: 283009	2016-10-01 04:17:59 +00:00
Mehdi Amini	f20abe53f4	Use StringRef in Pass Info/Support API (NFC) llvm-svn: 283008	2016-10-01 04:03:30 +00:00
Mehdi Amini	e11b745b66	Use StringRef in CommandLine Options handling (NFC) llvm-svn: 283007	2016-10-01 03:43:20 +00:00
Mehdi Amini	9a72cd7b21	Use StringRef in TLI instead of raw pointer (NFC) llvm-svn: 283005	2016-10-01 03:10:48 +00:00
Mehdi Amini	117296c0a0	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004	2016-10-01 02:56:57 +00:00
Mehdi Amini	86eeda8e20	Revert "AMDGPU: Don't use offen if it is 0" This reverts commit r282999. Tests are not passing: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/20038 llvm-svn: 283003	2016-10-01 02:35:24 +00:00
Eric Christopher	e8d141c675	Remove getTargetTriple and update all uses to use the Triple off of the TargetMachine. NFC. llvm-svn: 283002	2016-10-01 01:50:33 +00:00
Eric Christopher	364dbe06d3	Stop calling getTargetTriple off of the AsmPrinter and constructing a TargetTriple, just grab it off of the TargetMachine. NFC. llvm-svn: 283001	2016-10-01 01:50:29 +00:00
Eric Christopher	98983d0aff	Remove TargetTriple from AArch64MCInstLower as it's used in few places and can be pulled from the TargetMachine. NFC. llvm-svn: 283000	2016-10-01 01:50:25 +00:00
Matt Arsenault	3070fdf798	AMDGPU: Don't use offen if it is 0 This removes many re-initializations of a base register to 0. llvm-svn: 282999	2016-10-01 01:37:15 +00:00
Mehdi Amini	4cc259a469	Use StringRef in LTOCodegenerator (NFC) llvm-svn: 282998	2016-10-01 01:18:23 +00:00
Mehdi Amini	05cfdd0800	Use StringRef in LTOModule implementation (NFC) llvm-svn: 282997	2016-10-01 01:18:16 +00:00
Mehdi Amini	b7fb124512	Use StringRef in Triple API (NFC) llvm-svn: 282996	2016-10-01 01:16:22 +00:00
Kostya Serebryany	d216922a80	[libFuzzer] implement the -shrink=1 option that tires to make elements of the corpus smaller, off by default llvm-svn: 282995	2016-10-01 01:04:29 +00:00
Nirav Dave	9f2bd4e7ea	[MC] Prevent out of order HashDirective lexing in AsmLexer. To lex hash directives we peek ahead to find component tokens, create a unified token, and unlex the peeked tokens so the parser does not need to parse the tokens then. Make sure we do not to lex another hash directive during peek operation. This fixes PR28921. Reviewers: rnk, loladiro Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24839 llvm-svn: 282992	2016-10-01 00:42:32 +00:00
Mehdi Amini	6610b01a27	[ASAN] Add the binder globals on Darwin to llvm.compiler.used to avoid LTO dead-stripping The binder is in a specific section that "reverse" the edges in a regular dead-stripping: the binder is live as long as a global it references is live. This is a big hammer that prevents LLVM from dead-stripping these, while still allowing linker dead-stripping (with special knowledge of the section). Differential Revision: https://reviews.llvm.org/D24673 llvm-svn: 282988	2016-10-01 00:05:34 +00:00
Kostya Serebryany	90f8f36bca	[libFuzzer] remove some experimental code llvm-svn: 282983	2016-09-30 23:29:27 +00:00
Matthias Braun	298e007e99	ScheduleDAGInstrs: Cleanup, use range based for; NFC llvm-svn: 282979	2016-09-30 23:08:07 +00:00
Kostya Serebryany	7022b94687	[libFuzzer] fix openssl fuzzer tests when running on a machine w/o openssl installed llvm-svn: 282972	2016-09-30 22:35:08 +00:00
Kostya Serebryany	e7e790bad6	[libFuzzer] remove unused option llvm-svn: 282971	2016-09-30 22:29:57 +00:00
Reid Kleckner	9cb915b7be	[SEH] Emit the parent frame offset label even if there are no funclets This avoids errors about references to undefined local labels from unreferenced filter functions. Fixes (sort of) PR30431 llvm-svn: 282967	2016-09-30 22:10:12 +00:00
Quentin Colombet	a6119958ff	[AArch64][RegisterBankInfo] Use the helper functions for the checks This makes sure the helper functions work as expected. NFC. llvm-svn: 282961	2016-09-30 21:46:21 +00:00
Quentin Colombet	7c3fa8e361	[AArch64][RegisterBankInfo] Rename getValueMappingIdx to getValueMapping We don't return index, we return the actual ValueMapping. NFC. llvm-svn: 282960	2016-09-30 21:46:19 +00:00
Quentin Colombet	b4afac7b32	[AArch64][RegisterBankInfo] Compress the ValueMapping table a bit. We don't need to have singleton ValueMapping on their own, we can just reuse one of the elements of the 3-ops mapping. This allows even more code sharing. NFC. llvm-svn: 282959	2016-09-30 21:46:17 +00:00
Quentin Colombet	7fc5fe41c5	[AArch64][RegisterBankInfo] Refactor the code to access AArch64::ValMapping Use a helper function to access ValMapping. This should make the code easier to understand and maintain. NFC. llvm-svn: 282958	2016-09-30 21:46:15 +00:00
Quentin Colombet	15dc25bb3d	[AArch64][RegisterBankInfo] Rename getRegBankIdx to getRegBankIdxOffset The function name did not make it clear that the returned value was an offset to apply to a register bank index. NFC. llvm-svn: 282957	2016-09-30 21:46:12 +00:00
Quentin Colombet	b2308987ab	[AArch64][RegisterBankInfo] Use the static opds mapping for alt mappings Avoid to rely on the dynamically allocated operands mapping for the alternative mapping. NFC. llvm-svn: 282956	2016-09-30 21:45:56 +00:00
Kostya Serebryany	b7e7a5473d	[libFuzzer] move common parts of shell scripts into a separate file llvm-svn: 282954	2016-09-30 21:12:30 +00:00
Piotr Padlewski	1beced8b75	NFC Add const llvm-svn: 282952	2016-09-30 21:05:55 +00:00
Piotr Padlewski	f3d122cd02	NFC fix doxygen comments llvm-svn: 282950	2016-09-30 21:05:49 +00:00
Rui Ueyama	5d6714e593	Do not pass a superblock to PDBFileBuilder. When we create a PDB file using PDBFileBuilder, the information in the superblock, such as the size of the resulting file, is not available. Previously, PDBFileBuilder::initialize took a superblock assuming that all the members of the struct are correct. That is useful when you want to restore the exact information from a YAML file, but that's probably the only use case in which that is useful. When we are creating a PDB file on the fly, we have to backfill the members. This patch redefines PDBFileBuilder::initialize to take only a block size. Now all the other members are left as default values, so that they'll be updated when commit() is called. Differential Revision: https://reviews.llvm.org/D25108 llvm-svn: 282944	2016-09-30 20:52:12 +00:00
Rui Ueyama	fc22cef98e	Pass a filename instead of a msf::WritableStream to PDBFileBuilder::commit. WritableStream needs the exact file size to open a file, but until we fix the final layout of a PDB file, we don't know the size of the file. This patch changes the parameter type of PDBFileBuilder::commit to solve that chiecken-and-egg problem. Now the function opens a file after fixing the layout, so it can create a file with the exact size. Differential Revision: https://reviews.llvm.org/D25107 llvm-svn: 282940	2016-09-30 20:34:44 +00:00
Joerg Sonnenberger	10c45e226b	Deal with the (historic) MAP_ANONYMOUS vs MAP_ANON directly by using CPP to check for the former, don't depend on (dangling) HAVE_MMAP_ANONYMOUS. llvm-svn: 282925	2016-09-30 20:17:23 +00:00
Joerg Sonnenberger	18a2fb2d28	Retire NEED_DEV_ZERO_FOR_MMAP. It should be needed only on outdated systems. It wasn't even hooked up in cmake, so problems on such systems would be visible with 3.9 release already. llvm-svn: 282924	2016-09-30 20:16:01 +00:00
Hans Wennborg	b5643b47b6	X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302) We can't use Jcc to leave a Win64 function in general, because that confuses the unwinder. However, for "leaf" functions, that is, functions where the return address is always on top of the stack and which don't have unwind info, it's OK. Differential Revision: https://reviews.llvm.org/D24836 llvm-svn: 282920	2016-09-30 20:07:35 +00:00
Joerg Sonnenberger	2cd87a0cf2	Turn ENABLE_CRASH_OVERRIDES into a 0/1 definition. llvm-svn: 282919	2016-09-30 20:06:19 +00:00
Joerg Sonnenberger	0e3cc3c67c	Convert ENABLE_BACKTRACES into a 0/1 definition. llvm-svn: 282918	2016-09-30 20:04:24 +00:00
Sanjay Patel	f7b851fe84	[InstCombine] allow non-splat folds of select cond (ext X), C llvm-svn: 282906	2016-09-30 19:49:22 +00:00
Gor Nishanov	a263a60ad5	[Coroutines] Part15c: Fix coro-split to correctly handle definitions between coro.save and coro.suspend Summary: In the case below, %Result.i19 is defined between coro.save and coro.suspend and used after coro.suspend. We need to correctly place such a value into the coroutine frame. ``` %save = call token @llvm.coro.save(i8* null) %Result.i19 = getelementptr inbounds %"struct.lean_future<int>::Awaiter", %"struct.lean_future<int>::Awaiter"* %ref.tmp7, i64 0, i32 0 %suspend = call i8 @llvm.coro.suspend(token %save, i1 false) switch i8 %suspend, label %exit [ i8 0, label %await.ready i8 1, label %exit ] await.ready: %val = load i32, i32* %Result.i19 ``` Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24418 llvm-svn: 282902	2016-09-30 19:24:19 +00:00
Gor Nishanov	c16219486a	[Coroutines] Part15b: Fix dbg information handling in coro-split. Summary: Without the fix, if there was a function inlined into the coroutine with debug information, CloneFunctionInto(NewF, &F, VMap, /ModuleLevelChanges=/true, Returns); would duplicate all of the debug information including the DICompileUnit. We know use VMap to indicate that debug metadata for a File, Unit and FunctionType should not be duplicated when we creating clones that will become f.resume, f.destroy and f.cleanup. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24417 llvm-svn: 282899	2016-09-30 19:05:06 +00:00
Gor Nishanov	768de2c604	[Coroutines] Part 15a: Lower coro.subfn.addr in CoroCleanup Summary: Not all coro.subfn.addr intrinsics can be eliminated in CoroElide through devirtualization. Those that remain need to be lowered in CoroCleanup. Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24412 llvm-svn: 282897	2016-09-30 18:41:35 +00:00
Dehao Chen	977853b7c5	Update loop unroller cost model to make sure debug info does not affect optimization decisions. Summary: Debug info should not affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info. Reviewers: davidxl, mzolotukhin Subscribers: haicheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25098 llvm-svn: 282894	2016-09-30 18:30:04 +00:00
Kostya Serebryany	cfa31b6307	[libFuzzer] add a fuzzer test that finds CVE-2015-3193 llvm-svn: 282892	2016-09-30 18:16:16 +00:00
Derek Schuff	e9e6891b2d	[WebAssembly] Make register stackification more conservative Register stackification currently checks VNInfo for changes. Make that more accurate by testing each intervening instruction for any other defs to the same virtual register. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D24942 llvm-svn: 282886	2016-09-30 18:02:54 +00:00
Rui Ueyama	14a5ca0498	[Object] Define Archive::isEmpty(). llvm-svn: 282884	2016-09-30 17:54:31 +00:00
Etienne Bergeron	0ca0568604	[asan] Support dynamic shadow address instrumentation Summary: This patch is adding the support for a shadow memory with dynamically allocated address range. The compiler-rt needs to export a symbol containing the shadow memory range. This is required to support ASAN on windows 64-bits. Reviewers: kcc, rnk, vitalybuka Subscribers: zaks.anna, kubabrecka, dberris, llvm-commits, chrisha Differential Revision: https://reviews.llvm.org/D23354 llvm-svn: 282881	2016-09-30 17:46:32 +00:00
Konstantin Zhuravlyov	836cbff695	[AMDGPU] Choose VMCNT, EXPCNT, LGKMCNT masks and shifts based on the isa version Differential Revision: https://reviews.llvm.org/D24973 llvm-svn: 282877	2016-09-30 17:01:40 +00:00
Konstantin Zhuravlyov	d7bdf24f32	[AMDGPU] Ask subtarget if waitcnt instruction is needed before barrier instruction Differential Revision: https://reviews.llvm.org/D24985 llvm-svn: 282875	2016-09-30 16:50:36 +00:00
Konstantin Zhuravlyov	4658e5f7b0	[AMDGPU] Do not run scalar optimization passes at "-O0" Differential Revision: https://reviews.llvm.org/D25055 llvm-svn: 282873	2016-09-30 16:39:24 +00:00
Artur Pilipenko	2af93490fb	CVP. Turn marking adds as no wrap on by default (was turned off by 279082) With 282650 in tree extra no wrap on adds doesn't cause regressions anymore. Reenable the optimzation. llvm-svn: 282872	2016-09-30 16:20:08 +00:00
Matthew Simpson	7808833e28	[LV] Build all scalar steps for non-uniform induction variables When building the steps for scalar induction variables, we previously attempted to determine if all the scalar users of the induction variable were uniform. If they were, we would only emit the step corresponding to vector lane zero. This optimization was too aggressive. We generally don't know the entire set of induction variable users that will be scalar. We have isScalarAfterVectorization, but this is only a conservative estimate of the instructions that will be scalarized. Thus, an induction variable may have scalar users that aren't already known to be scalar. To avoid emitting unused steps, we can only check that the induction variable is uniform. This should fix PR30542. Reference: https://llvm.org/bugs/show_bug.cgi?id=30542 llvm-svn: 282863	2016-09-30 15:13:52 +00:00
Dylan McKay	4a25499b13	[AVR] Add the ELF object file writer Summary: This adds the ELF32 writer for AVR. Reviewers: kparzysz Subscribers: beanz, mgorny Differential Revision: https://reviews.llvm.org/D25031 llvm-svn: 282856	2016-09-30 14:09:20 +00:00
Dylan McKay	309eba75b1	Revert "[RegAllocGreedy] Attempt to split unspillable live intervals" It was accidentally committed. llvm-svn: 282855	2016-09-30 14:05:15 +00:00
Dylan McKay	1a7bd84a92	[AVR] Add the assembly instruction printer Summary: This change adds the AVR assembly instruction printer. No tests are included in this patch. I have left them downstream so we can add them once `llc` successfully runs (there's very few components left to upstream until this). Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25028 llvm-svn: 282854	2016-09-30 14:01:50 +00:00
Dylan McKay	2a80cc688a	[RegAllocGreedy] Attempt to split unspillable live intervals Summary: Previously, when allocating unspillable live ranges, we would never attempt to split. We would always bail out and try last ditch graph recoloring. This patch changes this by attempting to split all live intervals before performing recoloring. This fixes LLVM bug PR14879. I can't add test cases for any backends other than AVR because none of them have small enough register classes to trigger the bug. Reviewers: qcolombet Subscribers: MatzeB Differential Revision: https://reviews.llvm.org/D25070 llvm-svn: 282852	2016-09-30 13:59:20 +00:00
Craig Topper	f3e671e020	[AVX-512] Store address operand should be an input operand for the special stack spilling pseudos for XMM16-31 and YMM16-31 without VLX. llvm-svn: 282843	2016-09-30 05:35:47 +00:00
Craig Topper	1c01cbe9ee	[AVX-512] Add the special stack spilling pseudos for XMM16-31 and YMM16-31 without VLX to teh isFrameLoadOpcode and isFrameStoreOpcode. llvm-svn: 282842	2016-09-30 05:35:45 +00:00
Craig Topper	3f37a4180b	Revert r282835 "[AVX-512] Always use the full 32 register vector classes for addRegisterClass regardless of whether AVX512/VLX is enabled or not." Turns out this doesn't pass verify-machineinstrs. llvm-svn: 282841	2016-09-30 05:35:42 +00:00
Kostya Serebryany	cad612a472	[libfuzzer] test for c-ares CVE-2016-5180 llvm-svn: 282839	2016-09-30 05:15:45 +00:00
Adam Nemet	f744ad78e9	[LDist] Port to new streaming API for opt remarks llvm-svn: 282838	2016-09-30 04:56:25 +00:00
Craig Topper	de03ff7063	[X86] Add AVX-512 VTs to findRepresentativeClass as well as v16i16 which was also missing. Change register class to include the extra 16 AVX512 registers. I'm not completely sure what this method does or why all the 256-bit VTs returned VR128RegClass when the comments on the method definiton say it should return the largest super register class. I just figured AVX-512 should be similar. llvm-svn: 282836	2016-09-30 04:31:37 +00:00
Craig Topper	bc6e97b8f4	[AVX-512] Always use the full 32 register vector classes for addRegisterClass regardless of whether AVX512/VLX is enabled or not. If AVX512 is disabled, the registers should already be marked reserved. Pattern predicates and register classes on instructions should take care of most of the rest. Loads/stores and physical register copies for XMM16-31 and YMM16-31 without VLX have already been taken care of. I'm a little unclear why this changed the register allocation of the SSE2 run of the sad.ll test, but the registers selected appear to be valid after this change. llvm-svn: 282835	2016-09-30 04:31:33 +00:00
Adam Nemet	f57cc62abf	[LoopUnroll] Port to the new streaming interface for opt remarks. llvm-svn: 282834	2016-09-30 03:44:16 +00:00
Piotr Padlewski	d28694739c	[thinlto] Don't decay threshold for hot callsites Summary: We don't want to decay hot callsites to import chains of hot callsites. The same mechanism is used in LIPO. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24976 llvm-svn: 282833	2016-09-30 03:01:17 +00:00
Matt Arsenault	5d8eb25e78	AMDGPU: Use unsigned compare for eq/ne For some reason there are both of these available, except for scalar 64-bit compares which only has u64. I'm not sure why there are both (I'm guessing it's for the one bit inputs we don't use), but for consistency always using the unsigned one. llvm-svn: 282832	2016-09-30 01:50:20 +00:00
Kostya Serebryany	b3949ef885	[libFuzzer] remove the code for -print_pcs=1 with the old coverage. It still works with the new one (trace-pc-guard) llvm-svn: 282831	2016-09-30 01:24:57 +00:00
Kostya Serebryany	2c55613a08	[libFuzzer] more the feature set to InputCorpus; on feature update, change the feature counter of the old best input llvm-svn: 282829	2016-09-30 01:19:56 +00:00
Adam Nemet	fce0178847	[LoopDataPrefetch] Port to new streaming API for opt remarks llvm-svn: 282826	2016-09-30 00:42:43 +00:00
Justin Lebar	9091055efa	Move UTF functions into namespace llvm. Summary: This lets people link against LLVM and their own version of the UTF library. I determined this only affects llvm, clang, lld, and lldb by running $ git grep -wl 'UTF[0-9]\+\\|\bConvertUTF\bisLegalUTF\\|getNumBytesFor' \| cut -f 1 -d '/' \| sort \| uniq clang lld lldb llvm Tested with ninja lldb ninja check-clang check-llvm check-lld (ninja check-lldb doesn't complete for me with or without this patch.) Reviewers: rnk Subscribers: klimek, beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D24996 llvm-svn: 282822	2016-09-30 00:38:45 +00:00
Adam Nemet	951c6b1955	[LV] Port the remarks in processLoop to the new streaming API This completes LV. llvm-svn: 282821	2016-09-30 00:29:30 +00:00
Adam Nemet	4fd9c42279	[LV] Port the last opt remark in Hints to the new streaming interface llvm-svn: 282820	2016-09-30 00:29:25 +00:00
Reid Kleckner	147f91c88e	[X86] Don't preserve Win64 SSE CSRs when SSE is disabled Code that doesn't use floating point and doesn't use SSE (kernel code) shouldn't save and restore SSE registers. Fixes PR30503 llvm-svn: 282819	2016-09-30 00:17:49 +00:00
Quentin Colombet	4b36e0c409	[AArch64][RegisterBankInfo] Use static mapping for 3-operands instrs. This uses a TableGen'ed like structure for all 3-operands instrs. The output of the RegBankSelect pass should be identical but the RegisterBankInfo will do less dynamic allocations. llvm-svn: 282817	2016-09-30 00:10:00 +00:00
Quentin Colombet	fdd303afe2	[AArch64][RegisterBankInfo] Add static value mapping for 3-op instrs. This is the kind of input TableGen should generate at some point. NFC. llvm-svn: 282816	2016-09-30 00:09:58 +00:00
Quentin Colombet	eb8d3da9a0	[AArch64][RegisterBankInfo] Check the statically created ValueMapping. Make sure that the ValueMappings contain the value we expect at the indices we expect. NFC. llvm-svn: 282815	2016-09-30 00:09:43 +00:00
Adam Nemet	877ccee8cc	[LAA, LV] Port to new streaming interface for opt remarks. Update LV (Recommit after making sure IsVerbose gets properly initialized in DiagnosticInfoOptimizationBase. See previous commit that takes care of this.) OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor. llvm-svn: 282813	2016-09-30 00:01:30 +00:00
Sanjay Patel	453ceff261	[InstCombine] fix function names; NFC Also, make foldSelectExtConst() a member of InstCombiner, remove unnecessary parameters from its interface, and group visitSelectInst helpers together in the header file. llvm-svn: 282796	2016-09-29 22:18:30 +00:00
Joerg Sonnenberger	8472f847ca	Make HAVE_DECL_ARC4RANDOM always defined. Sort the entry correctly. llvm-svn: 282768	2016-09-29 21:10:38 +00:00
Joerg Sonnenberger	5c50539503	HAVE_UNWIND_BACKTRACE -> HAVE__UNWIND_BACKTRACE Check for existance and not truth value. llvm-svn: 282767	2016-09-29 21:07:57 +00:00
Kevin Enderby	4f229d867b	Next set of additional error checks for invalid Mach-O files for the load command that uses the MachO::entry_point_command type but not used in llvm libObject code but used in llvm tool code. This includes just the LC_MAIN load command. llvm-svn: 282766	2016-09-29 21:07:29 +00:00
Adrian McCarthy	d1185fc081	Clamp version number in S_COMPILE3 to avoid overflowing 16-bit field. llvm-svn: 282761	2016-09-29 20:28:25 +00:00
Adam Nemet	556a06b1ee	Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV" This reverts commit r282758. There are some clang failures I haven't seen. llvm-svn: 282759	2016-09-29 20:17:37 +00:00
Adam Nemet	c1d21817d1	[LAA, LV] Port to new streaming interface for opt remarks. Update LV OptimizationRemarkAnalysis directly takes the role of the report that is generated by LAA. Then we need the magic to be able to turn an LAA remark into an LV remark. This is done via a new OptimizationRemark ctor. llvm-svn: 282758	2016-09-29 20:12:18 +00:00
Quentin Colombet	1b01677e61	[RegisterBankInfo] Change the default mapping for Copy and PHI. Instead of producing a mapping for all the operands, we only generate a mapping for the definition. Indeed, the other operands are not constrained by the instruction and thus, we should leave the choice to the actual definition to do the right thing. In pratice this is almost NFC, but with one advantage. We will have only one instance of OperandsMapping for each copy and phi that map to one register bank instead of one different instance for each different number of operands for each copy and phi. llvm-svn: 282756	2016-09-29 19:51:46 +00:00
Douglas Katzman	93a9a8397a	Generalize ArgList::AddAllArgs more llvm-svn: 282755	2016-09-29 19:47:58 +00:00
Adam Nemet	3628282a77	[LV] Port OptimizationRemarkAnalysisFPCommute and OptimizationRemarkAnalysisAliasing to new streaming API for opt remarks llvm-svn: 282742	2016-09-29 18:04:47 +00:00
Adam Nemet	6e1edd5d1f	[LV] Convert processLoop to new streaming API for opt remarks llvm-svn: 282740	2016-09-29 17:55:13 +00:00
Reid Kleckner	e45b2c7d8e	[codeview] Use character types for all byte-sized integer types The VS debugger doesn't appear to understand the 0x68 or 0x69 type indices, which were probably intended for use on a platform where a C 'int' is 8 bits. So, use the character types instead. Clang was already using the character types because '[u]int8_t' is usually defined in terms of 'char'. See the Rust issue for screenshots of what VS does: https://github.com/rust-lang/rust/issues/36646 Fixes PR30552 llvm-svn: 282739	2016-09-29 17:55:01 +00:00
Sanjay Patel	ccc2927b69	fix formatting; NFC llvm-svn: 282737	2016-09-29 17:48:19 +00:00
Kevin Enderby	245be3ed2a	Next set of additional error checks for invalid Mach-O files for the load command that uses the Mach::source_version_command type but not used in llvm libObject code but used in llvm tool code. This includes just the LC_SOURCE_VERSION load command. llvm-svn: 282736	2016-09-29 17:45:23 +00:00
Kostya Serebryany	a9b0dd0e51	[sanitizer-coverage/libFuzzer] make the guards for trace-pc 32-bit; create one array of guards per function, instead of one guard per BB. reorganize the code so that trace-pc-guard does not create unneeded globals llvm-svn: 282735	2016-09-29 17:43:24 +00:00
Piotr Padlewski	ba72b95f7b	[thinlto] Add cold-callsite import heuristic Summary: Not tunned up heuristic, but with this small heuristic there is about +0.10% improvement on SPEC 2006 Reviewers: tejohnson, mehdi_amini, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24940 llvm-svn: 282733	2016-09-29 17:32:07 +00:00
Douglas Katzman	3ace13adfa	[X86] Avoid "unused" warnings if no asserts llvm-svn: 282732	2016-09-29 17:26:12 +00:00
Adam Nemet	eb0ba8d50f	[LV] Move static createMissedAnalysis from anonymous to global namespace This is an attempt to fix a windows bot. llvm-svn: 282730	2016-09-29 17:25:00 +00:00
Adam Nemet	0bfa441701	[LV] Convert CostModel to use the new streaming opt remark API Here we can already remove the member function emitAnalysis. llvm-svn: 282729	2016-09-29 17:15:48 +00:00
Adam Nemet	70757dd95a	[LV] Split most of createMissedAnalysis into a static function. NFC This will be shared between Legality and CostModel. llvm-svn: 282728	2016-09-29 17:05:35 +00:00
Adam Nemet	9988ca3db3	[LV] Convert all but one opt remark in Legality to new streaming interface The last one remaining after which emitAnalysis can be removed is when we convert the LAA's report to a vectorization report. This requires converting LAA to the new interface first. llvm-svn: 282726	2016-09-29 16:49:42 +00:00
Adam Nemet	9a1a5ef212	[LV] Convert emitRemark to new opt remark streaming interface Also renamed the function to emitRemarkWithHints to better reflect what the function actually does. llvm-svn: 282723	2016-09-29 16:23:12 +00:00
Kostya Serebryany	a9a135b4f5	[libFuzzer] initialize ValueBitMap::NumBits llvm-svn: 282721	2016-09-29 15:51:28 +00:00
Simon Pilgrim	97a4820ccd	[X86][SSE] Added common helper for shuffle mask constant pool decodes. The shuffle mask decodes have a large amount of repeated code extracting/splitting mask values from Constant data. This patch pulls all of this duplicated code into a single helper function to identify undef elements and combine/split constant integer data into the requested shuffle mask elements. Updated PSHUFB/VPERMIL/VPERMIL2/VPPERM decoders to use it (VPERMV/VPERMV3 could be converted as well in the future). llvm-svn: 282720	2016-09-29 15:25:48 +00:00
Volkan Keles	6ec2ac0416	Test commit. NFC. llvm-svn: 282717	2016-09-29 13:04:37 +00:00
Dylan McKay	7e91886a3f	Revert "[AVR] Add instruction selection lowering code" I accidentally comitted it. llvm-svn: 282712	2016-09-29 12:49:18 +00:00
Dylan McKay	b79c01a423	[AVR] Add instruction selection lowering code Summary: This adds AVRISelLowering.cpp Reviewers: kparzysz, arsenm Subscribers: wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25034 llvm-svn: 282711	2016-09-29 12:44:38 +00:00
Craig Topper	d875d6b9b4	[AVX-512] Support spills of XMM16-31 and YMM16-31 when VLX isn't available. This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT(store) or VBROADCAST(load) will be used. Fixes one of the cases from PR29112. llvm-svn: 282690	2016-09-29 06:07:09 +00:00
Craig Topper	7eb0e7ce1f	[AVX-512] Replicate pattern from AVX to select VMOVDDUP for (v2f64 (X86VBroadcast f64:)). Add AVX512VL to command line of existing AVX2 test that hits this condition. llvm-svn: 282688	2016-09-29 05:54:43 +00:00
Craig Topper	e7f2611160	[X86] Add EVEX encoded VBROADCASTSS/SD and VPBROADCASTD/Q to execution domain fixing table. llvm-svn: 282687	2016-09-29 05:54:39 +00:00
Craig Topper	cb3ae5a03d	[X86] Remove AddedComplexity adjustments that don't seem to be needed. llvm-svn: 282686	2016-09-29 05:54:34 +00:00
Craig Topper	816a1d7783	[X86] Add VBROADCASTF128/VBROADCASTI128 to execution domain fixing tables. llvm-svn: 282684	2016-09-29 05:54:28 +00:00
Peter Collingbourne	f75609e015	Add explanatory comment. llvm-svn: 282678	2016-09-29 03:29:28 +00:00
Eric Christopher	20ac943748	Remove an unnecessary duplicate initialization of TLOF from the Mips AsmPrinter. This was reinitializing the Mangler after we moved the Mangler down to TLOF and causing us to have two different unnamed global values accessed with the same name. This should fix the problems on the ubsan tests here: http://lab.llvm.org:8011/builders/clang-cmake-mips/builds/15307 llvm-svn: 282675	2016-09-29 02:03:52 +00:00
Eric Christopher	1c3bb79864	Remove the default constructor and count variable from the Mangler since we can just use the size of the DenseMap as a unique counter. llvm-svn: 282674	2016-09-29 02:03:50 +00:00
Eric Christopher	b4b75a531e	Update comment about initializing TLOF with a pointer at the previous line or the other commented out place. llvm-svn: 282673	2016-09-29 02:03:47 +00:00
Eric Christopher	8e94895555	Tidy spelling and grammar. llvm-svn: 282672	2016-09-29 02:03:44 +00:00
Matthias Braun	aae7fe99d0	MachineFunction: Add missing newline in debug print() Should not be a functional but an aesthetic change. llvm-svn: 282669	2016-09-29 01:47:42 +00:00
Matt Arsenault	e6740754f0	AMDGPU: Partially fix control flow at -O0 Fixes to allow spilling all registers at the end of the block work with exec modifications. Don't emit s_and_saveexec_b64 for if lowering, and instead emit copies. Mark control flow mask instructions as terminators to get correct spill code placement with fast regalloc, and then have a separate optimization pass form the saveexec. This should work if SGPRs are spilled to VGPRs, but will likely fail in the case that an SGPR spills to memory and no workitem takes a divergent branch. llvm-svn: 282667	2016-09-29 01:44:16 +00:00
Peter Collingbourne	0d5636e517	LTO: Fix use-after-scope error. llvm-svn: 282665	2016-09-29 01:28:36 +00:00
Lei Liu	361615cfd0	AArch64: Set shift bit of TLSLE HI12 add instruction Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model. This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000. Reviewers: t.p.northover, peter.smith, rovka Subscribers: salim.nasser, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24702 llvm-svn: 282661	2016-09-29 01:05:48 +00:00
Evgeny Stupachenko	dc8a254663	Wisely choose sext or zext when widening IV. Summary: The patch fixes regression caused by two earlier patches D18777 and D18867. Reviewers: reames, sanjoy Differential Revision: http://reviews.llvm.org/D24280 From: Li Huang llvm-svn: 282650	2016-09-28 23:39:39 +00:00
Kevin Enderby	76966bf066	Next set of additional error checks for invalid Mach-O files for the load command that uses the Mach::rpath_command type but not used in llvm libObject code but used in llvm tool code. This includes just the LC_RPATH load command. llvm-svn: 282649	2016-09-28 23:16:01 +00:00
Quentin Colombet	40cbc27ff3	[RegisterBankInfo] Uniquely generate OperandsMapping. This is a step toward statically allocate InstructionMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. This should already improve compile time by getting rid of a bunch of memmove of SmallVectors. llvm-svn: 282643	2016-09-28 22:20:49 +00:00
Quentin Colombet	97d2d21d65	[RegisterBankInfo] Rework the APIs of ValueMapping. This is a preparatory commit for more TableGen-like structure. NFC llvm-svn: 282642	2016-09-28 22:20:24 +00:00
Adrian Prantl	f8d10ceafe	Remove dead code from LiveDebugVariables.cpp (NFC) LiveDebugVariables doesn't propagate DBG_VALUEs accross basic block boundaries any more; this functionality was split into LiveDebugValues. We can thus drop the now dead references to LexicalScopes from LiveDebugVariables. llvm-svn: 282638	2016-09-28 21:34:23 +00:00
Kevin Enderby	32359dbf6b	Next set of additional error checks for invalid Mach-O files for the other load commands that use the Mach::version_min_command type but not used in llvm libObject code but used in llvm tool code. This includes LC_VERSION_MIN_MACOSX, LC_VERSION_MIN_IPHONEOS, LC_VERSION_MIN_TVOS and LC_VERSION_MIN_WATCHOS load commands. llvm-svn: 282635	2016-09-28 21:20:45 +00:00
Dehao Chen	5461d8bdb5	Refactor the ProfileSummaryInfo to use doInitialization and doFinalization to handle Module update. Summary: This refactors the change in r282616 Reviewers: davidxl, eraman, mehdi_amini Subscribers: mehdi_amini, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D25041 llvm-svn: 282630	2016-09-28 21:00:58 +00:00
Krzysztof Parzyszek	dcb1bcae0b	IfConversion: Add implicit uses for redefined regs with live subregisters Normally, if conversion would add implicit uses for redefined registers, e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of R0 are known to be live but not R0 itself, such implicit uses will not be added, causing prior definitions of such subregisters and R0 itself to become dead. llvm-svn: 282626	2016-09-28 20:07:41 +00:00
Konstantin Zhuravlyov	e14df4b236	[AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit instructions Differential Revision: https://reviews.llvm.org/D24125 llvm-svn: 282624	2016-09-28 20:05:39 +00:00
Dehao Chen	10d0abb515	Fix the bug introduced in r282616. llvm-svn: 282618	2016-09-28 18:54:36 +00:00
Dehao Chen	80c8ebb4d8	Fix the bug when -compile-twice is specified, the PSI will be invalidated. Summary: When using llc with -compile-twice, module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks if the module has changed when calling getPSI, if yes, update the module and invalidate the Summary. The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll Reviewers: eraman, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24993 llvm-svn: 282616	2016-09-28 18:41:14 +00:00
Artur Pilipenko	b6ce6e5dac	Don't look through addrspacecast in GetPointerBaseWithConstantOffset Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value. This is similar to D13008. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D24729 llvm-svn: 282612	2016-09-28 17:57:16 +00:00
Adrian Prantl	7f5866c227	Teach LiveDebugValues about lexical scopes. This addresses PR26055 LiveDebugValues is very slow. Contrary to the old LiveDebugVariables pass LiveDebugValues currently doesn't look at the lexical scopes before inserting a DBG_VALUE intrinsic. This means that we often propagate DBG_VALUEs much further down than necessary. This is especially noticeable in large C++ functions with many inlined method calls that all use the same "this"-pointer. For example, in the following code it makes no sense to propagate the inlined variable a from the first inlined call to f() into any of the subsequent basic blocks, because the variable will always be out of scope: void sink(int a); void __attribute((always_inline)) f(int a) { sink(a); } void foo(int i) { f(i); if (i) f(i); f(i); } This patch reuses the LexicalScopes infrastructure we have for LiveDebugVariables to take this into account. The effect on compile time and memory consumption is quite noticeable: I tested a benchmark that is a large C++ source with an enormous amount of inlined "this"-pointers that would previously eat >24GiB (most of them for DBG_VALUE intrinsics) and whose compile time was dominated by LiveDebugValues. With this patch applied the memory consumption is 1GiB and 1.7% of the time is spent in LiveDebugValues. https://reviews.llvm.org/D24994 Thanks to Daniel Berlin and Keith Walker for reviewing! llvm-svn: 282611	2016-09-28 17:51:14 +00:00
Adrian Prantl	16b2ace0ab	Rewrite loops to use range-based for. (NFC) llvm-svn: 282608	2016-09-28 17:31:17 +00:00
Artem Belevich	3e1211581c	[NVPTX] Added intrinsics for atom.gen.{sys\|cta}.* instructions. These are only available on sm_60+ GPUs. Differential Revision: https://reviews.llvm.org/D24943 llvm-svn: 282607	2016-09-28 17:25:38 +00:00
Sanjoy Das	f0022125e0	[SCEV] Use a SmallPtrSet as a temporary union predicate; NFC Summary: Instead of creating and destroying SCEVUnionPredicate instances (which internally creates and destroys a DenseMap), use temporary SmallPtrSet instances of remember the set of predicates that will get reified into a SCEVUnionPredicate. Reviewers: silviu.baranga, sbaranga Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25000 llvm-svn: 282606	2016-09-28 17:14:58 +00:00
Nirav Dave	e524f50882	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r282600 due to test failues with MCJIT llvm-svn: 282604	2016-09-28 16:37:50 +00:00
Dylan McKay	1f69cdb321	[AVR] Rename the builtin calling convention names 'BUILTIN' is clearer than 'RT' in this context. llvm-svn: 282602	2016-09-28 16:04:40 +00:00
Marina Yatsina	76bfc6670b	[x86] Accept 'retn' as an alias to 'ret[lqw]'\'ret' (At&t\Intel) Implement 'retn' simply by aliasing it to the relevant 'ret' instruction Commit on behalf of coby Differential Revision: https://reviews.llvm.org/D24346 llvm-svn: 282601	2016-09-28 15:52:56 +00:00
Nirav Dave	e17e055b75	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600	2016-09-28 15:50:43 +00:00
Dylan McKay	536239f144	[AVR] Import the LLVM namespace inside AVRMCTargetDesc.cpp llvm-svn: 282598	2016-09-28 15:35:26 +00:00
Dylan McKay	e762094864	[AVR] Add AVRMCTargetDesc.cpp Summary: This adds the AVRMCTargetDesc file in tree. It allows creation of the core classes used in the backend. Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25023 llvm-svn: 282597	2016-09-28 15:31:12 +00:00
Dylan McKay	d6e7fc6d9a	[AVR] Update the signature of createAVRAsmBackend It has been recently changed to also take a MCTargetOptions structure. llvm-svn: 282594	2016-09-28 14:35:07 +00:00
Dylan McKay	f010a2b41a	[AVR] Enable the assembly parser We very recently landed the code. This commit enables the parser. It also adds a missing include to AVRAsmParser.cpp llvm-svn: 282593	2016-09-28 14:34:42 +00:00
Sanjay Patel	220a8730fb	[InstSimplify] allow or-of-icmps folds with vector splat constants llvm-svn: 282592	2016-09-28 14:27:21 +00:00
Sanjay Patel	1b312ad42d	[InstSimplify] allow and-of-icmps folds with vector splat constants llvm-svn: 282590	2016-09-28 13:53:13 +00:00
Dylan McKay	0fe1e63837	[AVR] Merge most recent changes to AVRInstrInfo.td This adds two new things: - Operand types per fixup - Atomic pseudo operations llvm-svn: 282588	2016-09-28 13:44:02 +00:00
Dylan McKay	b967d16c43	[AVR] Update the data layout The previous data layout caused issues when dealing with atomics. Foe example, it is illegal to load a 16-bit value with less than 16-bits of alignment. This changes the data layout so that all types are aligned by at least their own width. Interestingly, this also _slightly_ decreased register pressure in some cases. llvm-svn: 282587	2016-09-28 13:29:10 +00:00
Dylan McKay	35047ed741	[AVR] Handle AVR relocations when handling ELF files llvm-svn: 282586	2016-09-28 13:23:42 +00:00
Dylan McKay	1f877f06b9	[AVR] Add assembly parser Summary: This patch adds the AVRAsmParser library. Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny, kparzysz, simoncook, jtbandes, llvm-commits Differential Revision: https://reviews.llvm.org/D20046 llvm-svn: 282584	2016-09-28 13:02:57 +00:00
Guy Blank	2bdc74a471	[X86][FastISel] Use a COPY from K register to a GPR instead of a K operation The KORTEST was introduced due to a bug where a TEST instruction used a K register. but, turns out that the opposite case of KORTEST using a GPR is now happening The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR. Differential Revision: https://reviews.llvm.org/D24953 llvm-svn: 282580	2016-09-28 11:22:17 +00:00
Simon Pilgrim	55b8eaa505	Strip trailing whitespace llvm-svn: 282579	2016-09-28 11:08:00 +00:00
Jonas Paulsson	58c5a7f55a	[SystemZ] Implementation of getUnrollingPreferences(). This commit enables more unrolling for SystemZ by implementing the SystemZTargetTransformInfo::getUnrollingPreferences() method. It has been found that it is better to only unroll moderately, so the DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order to set this to a lower value for SystemZ (4). Reviewers: Evgeny Stupachenko, Ulrich Weigand. https://reviews.llvm.org/D24451 llvm-svn: 282570	2016-09-28 09:41:38 +00:00
Michael Kuperstein	3e06eafc20	[DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombine This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 llvm-svn: 282567	2016-09-28 06:13:58 +00:00
Kostya Serebryany	3ee6c213d6	[libFuzzer] speedup TracePC::FinalizeTrace llvm-svn: 282562	2016-09-28 01:16:24 +00:00
Adam Nemet	69330e0bc5	[LAA] Rename emitAnalysis to recordAnalys. NFC Ever since LAA was split out into an analysis on its own, this function stopped emitting the report directly. Instead it stores it to be retrieved by the client which can then emit it as its own report (e.g. -Rpass-analysis=loop-vectorize). llvm-svn: 282561	2016-09-28 00:58:36 +00:00
Adam Nemet	c507ac96f5	[Inliner] Port all opt remarks to new streaming API llvm-svn: 282559	2016-09-27 23:47:03 +00:00
Kevin Enderby	3e490ef94e	Next set of additional error checks for invalid Mach-O files for the other load commands that use the MachO::dylinker_command type but not used in llvm libObject code but used in llvm tool code. This includes LC_ID_DYLINKER, LC_LOAD_DYLINKER and LC_DYLD_ENVIRONMENT load commands. llvm-svn: 282553	2016-09-27 23:24:13 +00:00
Quentin Colombet	c0f11a9fb8	[AArch64][RegisterBankInfo] Switch to statically allocated ValueMapping. Another step toward TableGen'ed like structure for the RegisterBankInfo of AArch64. By doing this, we also save a bit of compile time for the exact same output. llvm-svn: 282550	2016-09-27 22:55:04 +00:00
Quentin Colombet	caae9cd246	[AArch64][RegisterBankInfo] Fix copy/paste in comments. NFC. llvm-svn: 282549	2016-09-27 22:54:57 +00:00
Sanjay Patel	764ae8bd72	[x86] add folds for FP logic with vector zeros The 'or' case shows up in copysign. The copysign code also had redundant checking for a scalar zero operand with 'and', so I removed that. I'm not sure how to test vector 'and', 'andn', and 'xor' yet, but it seems better to just include all of the logic ops since we're fixing 'or' anyway. llvm-svn: 282546	2016-09-27 22:28:13 +00:00
Adam Nemet	04758ba385	Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC With the new streaming interface, these class names need to be typed a lot and it's way too looong. llvm-svn: 282544	2016-09-27 22:19:23 +00:00
Geoff Berry	b124331db7	[TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg(). Summary: The current implementation of isConstantPhysReg() checks for defs of physical registers to determine if they are constant. Some architectures (e.g. AArch64 XZR/WZR) have registers that are constant and may be used as destinations to indicate the generated value is discarded, preventing isConstantPhysReg() from returning true. This change adds a TargetRegisterInfo hook that overrides the no defs check for cases such as this. Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy Subscribers: junbuml, aemerson, mcrosier, rengolin Differential Revision: https://reviews.llvm.org/D24570 llvm-svn: 282543	2016-09-27 22:17:27 +00:00
Adam Nemet	1142147e41	[Inliner] Fold the analysis remark into the missed remark There is really no reason for these to be separate. The vectorizer started this pretty bad tradition that the text of the missed remarks is pretty meaningless, i.e. vectorization failed. There, you have to query analysis to get the full picture. I think we should just explain the reason for missing the optimization in the missed remark when possible. Analysis remarks should provide information that the pass gathers regardless whether the optimization is passing or not. llvm-svn: 282542	2016-09-27 21:58:17 +00:00
Michael Zolotukhin	1a554be3b6	[LoopSimplify] When simplifying phis in loop-simplify, do it only if it preserves LCSSA form. llvm-svn: 282541	2016-09-27 21:03:45 +00:00
Adam Nemet	a62b7e1a28	Output optimization remarks in YAML (Re-committed after moving the template specialization under the yaml namespace. GCC was complaining about this.) This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282539	2016-09-27 20:55:07 +00:00
Adam Nemet	37c9e7f301	Sort headers llvm-svn: 282538	2016-09-27 20:55:01 +00:00
Matthias Braun	5391ffb671	Statistic: Bring back printing on exit by default Turns out several external projects relied on llvm printing statistics on exit. Let's go back to this behaviour by default and have an optional parameter to disable it. llvm-svn: 282532	2016-09-27 19:38:55 +00:00
Sanjay Patel	43ef1ad0ba	[x86] use isNullFPConstant(); NFCI Also, put the related FP logic functions together to see the similarities. llvm-svn: 282522	2016-09-27 18:48:02 +00:00
Reid Kleckner	6481822e28	[DebugInfo] Add comments to phi dbg.value tracking code, NFC LLVM developers might be surprised to learn that there are blocks without valid insertion points (catchswitch), so it seems worth calling that out explicitly. Also add a FIXME about what we should really be doing if we ever need to make optimized Windows EH code debuggable. While I'm here, make auto usage more consistent with LLVM standards and avoid an unecessary call to insertBefore. llvm-svn: 282521	2016-09-27 18:45:31 +00:00
Krzysztof Parzyszek	586fc12e32	[RDF] Add "dead" flag to node attributes llvm-svn: 282520	2016-09-27 18:24:33 +00:00
Krzysztof Parzyszek	1d32220721	[RDF] Special treatment of exception handling registers A landing pad can have live-in registers that are defined by the runtime, not the program (exception pointer register and exception selector register). Make sure to recognize that case and not link these registers with any defs in the program. Each landing pad will have phi nodes added at the beginning to provide definitions of these registers, but the uses of those phi nodes will not have any reaching defs. llvm-svn: 282519	2016-09-27 18:18:44 +00:00
Sanjoy Das	237c84540f	[SCEV] Replace a struct with a function; NFC We can do this now thanks to C++11 lambdas. llvm-svn: 282515	2016-09-27 18:01:48 +00:00
Sanjoy Das	a26021414a	[SCEV] Use find instead of find_as; NFC We don't need the extra generality here. llvm-svn: 282514	2016-09-27 18:01:46 +00:00
Sanjoy Das	c220ac79c4	[SCEV] Reduce the scope of a struct; NFC llvm-svn: 282513	2016-09-27 18:01:44 +00:00
Sanjoy Das	c46bceb632	[SCEV] Remove custom RAII wrapper; NFC Instead use the pre-existing `scope_exit` class. llvm-svn: 282512	2016-09-27 18:01:42 +00:00
Sanjoy Das	db93375711	[SCEV] Make PendingLoopPredicates more frugal; NFCI I don't expect `PendingLoopPredicates` to have very many elements (e.g. when -O3'ing the sqlite3 amalgamation, `PendingLoopPredicates` has at most 3 elements). So now we use a `SmallPtrSet` for it instead of the more heavyweight `DenseSet`. llvm-svn: 282511	2016-09-27 18:01:38 +00:00
Keith Walker	83ebef5db3	Propagate DBG_VALUE entries when there are unvisited predecessors Variables are sometimes missing their debug location information in blocks in which the variables should be available. This would occur when one or more predecessor blocks had not yet been visited by the routine which propagated the information from predecessor blocks. This is addressed by only considering predecessor blocks which have already been visited. The solution to this problem was suggested by Daniel Berlin on the LLVM developer mailing list. Differential Revision: https://reviews.llvm.org/D24927 llvm-svn: 282506	2016-09-27 16:46:07 +00:00
Adam Nemet	cc2a3fa8e8	Revert "Output optimization remarks in YAML" This reverts commit r282499. The GCC bots are failing llvm-svn: 282503	2016-09-27 16:39:24 +00:00
Adam Nemet	92e928c10a	Output optimization remarks in YAML This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282499	2016-09-27 16:15:16 +00:00
Rafael Espindola	eaeb6d91a1	Add xxhash to llvm. It will be used for fast fingerprinting in lld at least. llvm-svn: 282493	2016-09-27 15:45:57 +00:00
Konstantin Zhuravlyov	da4687c531	[AMDGPU] Enable changing instprinter's behavior based on the per-function subtarget This is a prerequisite for coming waitcnt changes Differential Revision: https://reviews.llvm.org/D24939 llvm-svn: 282489	2016-09-27 14:42:48 +00:00
Simon Dardis	d2ed8abb15	[mips] Disable tail calls temporarily Disable tail calls while the remaining bugs are fixed. Enable only for tests. Reviewers: vkalintiris Differential Review: https://reviews.llvm.org/D24912 llvm-svn: 282487	2016-09-27 13:15:54 +00:00
Simon Dardis	0486d585c5	[mips] Add rsqrt, recip for MIPS Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for architecture support and register usage. Reviewers: vkalintiris, zoran.jovanoic Differential Review: https://reviews.llvm.org/D24499 llvm-svn: 282485	2016-09-27 12:25:15 +00:00
Nemanja Ivanovic	6f22b41398	[Power9] Builtins for ELF v.2 API conformance - back end portion This patch corresponds to review: https://reviews.llvm.org/D24396 This patch adds support for the "vector count trailing zeroes", "vector compare not equal" and "vector compare not equal or zero instructions" as well as "scalar count trailing zeroes" instructions. It also changes the vector negation to use XXLNOR (when VSX is enabled) so as not to increase register pressure (previously this was done with a splat immediate of all ones followed by an XXLXOR). This was done because the altivec.h builtins (patch to follow) use vector negation and the use of an additional register for the splat immediate is not optimal. llvm-svn: 282478	2016-09-27 08:42:12 +00:00
Craig Topper	789888002a	[X86] Use std::max to calculate alignment instead of assuming RC->getSize() will not return a value greater than 32. I think it theoretically could be 64 for AVX-512. llvm-svn: 282471	2016-09-27 06:44:25 +00:00
Kostya Serebryany	7d6935c184	[libFuzzer] run re2 test in 8 threads by default llvm-svn: 282469	2016-09-27 03:33:57 +00:00
Kostya Serebryany	45c144754b	[sanitizer-coverage] fix a bug in trace-gep llvm-svn: 282467	2016-09-27 01:55:08 +00:00
Kostya Serebryany	186d61801c	[sanitizer-coverage] don't emit the CTOR function if nothing has been instrumented llvm-svn: 282465	2016-09-27 01:08:33 +00:00
Ivan Krasin	4ff4f21e15	Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generation Summary: We don't currently need this facility for CFI. Disabling individual hot methods proved to be a better strategy in Chrome. Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne. Reviewers: pcc Subscribers: kcc Differential Revision: https://reviews.llvm.org/D24948 llvm-svn: 282461	2016-09-27 00:29:53 +00:00
Kostya Serebryany	53543af036	[libFuzzer] add a test based on openssl-1.0.1f (finds heartbleed) llvm-svn: 282460	2016-09-27 00:27:40 +00:00
Kostya Serebryany	5ff481fd9e	[libFuzzer] add -exit_on_src_pos to test libFuzzer itself, add a test script for RE2 that uses this flag llvm-svn: 282458	2016-09-27 00:10:20 +00:00
Peter Collingbourne	53a852b648	LowerTypeTests: Remove unused variable. llvm-svn: 282456	2016-09-26 23:56:17 +00:00
Peter Collingbourne	6ed92e3f53	LowerTypeTests: Create LowerTypeTestsModule class and move implementation there. Related simplifications. llvm-svn: 282455	2016-09-26 23:54:39 +00:00
Davide Italiano	a9f85d68cc	[CodeGen] Add support for emitting .init_array instead of .ctors on FreeBSD. PR: 30494 llvm-svn: 282451	2016-09-26 22:53:15 +00:00
Derek Schuff	92d300eb8f	[WebAssembly] Use the frame pointer instead of the stack pointer When we have dynamic allocas we have a frame pointer, and when we're lowering frame indexes we should make sure we use it. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D24889 llvm-svn: 282442	2016-09-26 21:18:03 +00:00
Kevin Enderby	90986e6c7c	Next set of additional error checks for invalid Mach-O files for the other load commands that use the Mach::linkedit_data_command type but not used in llvm libObject code but used in llvm tool code. This includes LC_FUNCTION_STARTS, LC_SEGMENT_SPLIT_INFO and LC_DYLIB_CODE_SIGN_DRS load commands. llvm-svn: 282441	2016-09-26 21:11:03 +00:00
Aditya Kumar	0a48b37cfd	Move computation past early return Reviewers: rafael spatel Differential Revision: https://reviews.llvm.org/D24843 llvm-svn: 282440	2016-09-26 21:01:13 +00:00
Piotr Padlewski	d9830eb79f	[thinlto] Basic thinlto fdo heuristic Summary: This patch improves thinlto importer by importing 3x larger functions that are called from hot block. I compared performance with the trunk on spec, and there were about 2% on povray and 3.33% on milc. These results seems to be consistant and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones were about +0.02%. I see much better improvement on google branch with Easwaran patch for pgo callsite inlining (the inliner actually inline those big functions) Over all I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see much bigger change when Easwaran patch will land (it depends on new pass manager), but it is still worth putting this to trunk before it. Implementation details changes: - Removed CallsiteCount. - ProfileCount got replaced by Hotness - hot-import-multiplier is set to 3.0 for now, didn't have time to tune it up, but I see that we get most of the interesting functions with 3, so there is no much performance difference with higher, and binary size doesn't grow as much as with 10.0. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437	2016-09-26 20:37:32 +00:00
Nirav Dave	6477ce2697	Add support for Code16GCC [X86] The .code16gcc directive parses X86 assembly input in 32-bit mode and outputs in 16-bit mode. Teach parser to switch modes appropriately. Reviewers: dwmw2, craig.topper Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20109 llvm-svn: 282430	2016-09-26 19:33:36 +00:00
Andrew Kaylor	595307a468	Add optimization bisect support to an optional Mips pass Differential Revision: https://reviews.llvm.org/D19513 llvm-svn: 282428	2016-09-26 19:05:37 +00:00
Matthias Braun	c603551b4e	Statistic: Only print statistics on exit for -stats Previously enabling the statistics with EnableStatistics() would lead to them getting printed to stderr/-info-output-file on exit. However frontends may want a way to enable statistics and do the printing on their own instead of the forced printing on exit. This changes the code so that only the -stats option enables printing on exit, EnableStatistics() only enables the tracking but requires invoking one of the PrintStatistics() variants. Differential Revision: https://reviews.llvm.org/D24819 llvm-svn: 282425	2016-09-26 18:38:07 +00:00
Tom Stellard	1b9748c6a2	AMDGPU/SI: Don't crash on anonymous GlobalValues Summary: We need to call AsmPrinter::getNameWithPrefix() in order to handle anonymous GlobalValues (e.g. @0, @1). Reviewers: arsenm, b-sumner Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D24865 llvm-svn: 282420	2016-09-26 17:29:25 +00:00
Daniel Berlin	1e98c04226	Remove pruning of phi nodes in MemorySSA - it makes updating harder Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24923 llvm-svn: 282419	2016-09-26 17:22:54 +00:00
Matthew Simpson	b764aba2ab	[LV] Scalarize instructions marked scalar after vectorization This patch ensures that we actually scalarize instructions marked scalar after vectorization. Previously, such instructions may have been vectorized instead. Differential Revision: https://reviews.llvm.org/D23889 llvm-svn: 282418	2016-09-26 17:08:37 +00:00
Gor Nishanov	bc0ebb383c	[Coroutines] Part14: Handle coroutines with no suspend points. Summary: If coroutine has no suspend points, remove heap allocation and turn a coroutine into a normal function. Also, if a pattern is detected that coroutine resumes or destroys itself prior to coro.suspend call, turn the suspend point into a simple jump to resume or cleanup label. This pattern occurs when coroutines are used to propagate errors in functions that return expected<T>. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24408 llvm-svn: 282414	2016-09-26 15:49:28 +00:00
Geoff Berry	256fcf975f	[AArch64] Improve add/sub/cmp isel of uxtw forms. Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the 32-bit to 64-bit zero-extend can be done for free by taking advantage of the 32-bit defining instruction zeroing the upper 32-bits of the X register destination. This enables better instruction selection in a few cases, such as: sub x0, xzr, x8 instead of: mov x8, xzr sub x0, x8, w9, uxtw madd x0, x1, x1, x8 instead of: mul x9, x1, x1 add x0, x9, w8, uxtw cmp x2, x8 instead of: sub x8, x2, w8, uxtw cmp x8, #0 add x0, x8, x1, lsl #3 instead of: lsl x9, x1, #3 add x0, x9, w8, uxtw Reviewers: t.p.northover, jmolloy Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24747 llvm-svn: 282413	2016-09-26 15:34:47 +00:00
Evandro Menezes	e45de8a5ec	Add support to optionally limit the size of jump tables. Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branches that such branch predictors can handle. This patch considers a limit that a target may set to the number of destination addresses in a jump table. Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>. Differential revision: https://reviews.llvm.org/D21940 llvm-svn: 282412	2016-09-26 15:32:33 +00:00
Alexey Bataev	793c946ecb	[InstCombine] Fixed bug introduced in r282237 The index of the new insertelement instruction was evaluated in the wrong way, it was considered as the index of the inserted value instead of index of the position, where the value should be inserted. llvm-svn: 282401	2016-09-26 13:18:59 +00:00
Andrea Di Biagio	a82d52d11d	[InstCombine] Teach the udiv folding logic how to handle constant expressions. This patch fixes PR30366. Function foldUDivShl() worked under the assumption that one of the values in input to the function was always an instance of llvm::Instruction. However, function visitUDivOperand() (the only user of foldUDivShl) was clearly violating that precondition; internally, visitUDivOperand() uses pattern matches to check the operands of a udiv. Pattern matchers for binary operators know how to handle both Instruction and ConstantExpr values. This patch fixes the problem in foldUDivShl(). Now we use pattern matchers instead of explicit casts to Instruction. The reduced test case from PR30366 has been added to test file InstCombine/udiv-simplify.ll. Differential Revision: https://reviews.llvm.org/D24565 llvm-svn: 282398	2016-09-26 12:07:23 +00:00
Dylan McKay	c4ec11f451	[AVR] Add AVRMCExpr Summary: This adds the AVRMCExpr headers and implementation. Reviewers: arsenm, ruiu, grosbach, kparzysz Subscribers: wdng, beanz, mgorny, kparzysz, jtbandes, llvm-commits Differential Revision: https://reviews.llvm.org/D20503 llvm-svn: 282397	2016-09-26 11:35:32 +00:00
Sam Kolton	984461062f	Revert "[AMDGPU] Disassembler: print label names in branch instructions" This reverts commit 6c6dbe625263ec9fcf8de0df27263cf147cde550. llvm-svn: 282396	2016-09-26 11:29:03 +00:00
Sam Kolton	1559f76257	[AMDGPU] Disassembler: print label names in branch instructions Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D24802 llvm-svn: 282394	2016-09-26 10:05:50 +00:00
James Molloy	9abb2fa5bb	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy. llvm-svn: 282387	2016-09-26 07:26:24 +00:00
Zvi Rackover	839d15a194	[X86] Optimization for replacing LEA with MOV at frame index elimination time Summary: Replace a LEA instruction of the form 'lea (%esp), %ebx' --> 'mov %esp, %ebx' MOV is preferable over LEA because usually there are more issue-slots available to execute MOVs than LEAs. Latest processors also support zero-latency MOVs. Fixes pr29022. Reviewers: hfinkel, delena, igorb, myatsina, mkuper Differential Revision: https://reviews.llvm.org/D24705 llvm-svn: 282385	2016-09-26 06:42:07 +00:00
Ayman Musa	d7a5ed4141	[X86][avx512] Fix bug in masked compress store. Differential Revision: https://reviews.llvm.org/D23984 llvm-svn: 282381	2016-09-26 06:22:08 +00:00
Chandler Carruth	68abda52c2	[SCEV] Fix the order of members in the initializer list. Noticed due to the warning on this line. Sanjoy is on a less-than-awesome internet connection, so committing on his behalf. llvm-svn: 282380	2016-09-26 04:49:58 +00:00
Sanjoy Das	5cb11b6423	[SCEV] Assign LoopPropertiesCache in the move constructor In a previous change I collapsed two different caches into one. When doing that I noticed that ScalarEvolution's move constructor was not moving those caches. To keep the previous change simple, I've moved that bugfix into this separate change. llvm-svn: 282376	2016-09-26 02:44:10 +00:00
Sanjoy Das	5603fc00a6	[SCEV] Combine two predicates into one; NFC Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking the loop and maintaining similar sorts of caches. This commit changes SCEV to compute both the predicates via a single walk, and maintain a single cache instead of two. llvm-svn: 282375	2016-09-26 02:44:07 +00:00
Sanjoy Das	5c4869b39d	[SCEV] Make it obvious BackedgeTakenInfo's constructor steals storage Specifically, it moves SCEVUnionPredicates from its input into its own storage. Make this obvious at the type level. llvm-svn: 282374	2016-09-26 01:10:27 +00:00
Sanjoy Das	6b76cdf0d5	[SCEV] Further isolate incidental data structure; NFC llvm-svn: 282373	2016-09-26 01:10:25 +00:00
Sanjoy Das	7326861abd	[SCEV] Simplify BackedgeTakenInfo::getMax; NFC llvm-svn: 282372	2016-09-26 01:10:22 +00:00
Sanjoy Das	e935c77e20	[SCEV] Reserve space in SmallVector; NFC llvm-svn: 282368	2016-09-25 23:12:08 +00:00
Sanjoy Das	c9bbf56358	[SCEV] Have ExitNotTakenInfo keep a pointer to its predicate; NFC SCEVUnionPredicate is a "heavyweight" structure, so it is beneficial to store the (optional) data out of line. llvm-svn: 282366	2016-09-25 23:12:04 +00:00
Sanjoy Das	d1eb62ad11	[SCEV] Simplify tracking ExitNotTakenInfo instances; NFC This change simplifies a data structure optimization in the `BackedgeTakenInfo` class for loops with exactly one computable exit. I've sanity checked that this does not regress compile time performance, using sqlite3's amalgamated build. llvm-svn: 282365	2016-09-25 23:12:00 +00:00
Sanjoy Das	89eea6b2ed	[SCEV] Rename a couple of fields; NFC llvm-svn: 282364	2016-09-25 23:11:57 +00:00
Sanjoy Das	bdd9710252	[SCEV] Remove incidental data structure; NFC llvm-svn: 282363	2016-09-25 23:11:55 +00:00
Craig Topper	87155274b8	[X86] Remove what appears to be leftover MMX code involving (v1i64 scalar_to_vector). llvm-svn: 282361	2016-09-25 16:34:11 +00:00
Craig Topper	aab59a48e7	[X86] Remove patterns for scalar_to_vector from FR32/FR64 to 256-bit vectors. Lowering explicitly avoids creating this pattern. llvm-svn: 282360	2016-09-25 16:34:09 +00:00
Craig Topper	0cc188d979	[AVX-512] Replace get512BitSuperRegister with calls to TargetRegisterInfo::getMatchingSuperReg. llvm-svn: 282359	2016-09-25 16:34:06 +00:00
Craig Topper	60d3ef1d72	[AVX-512] Fix some patterns predicates to properly enforce priority for various versions of CVTDQ2PD instruction. llvm-svn: 282358	2016-09-25 16:34:02 +00:00
Craig Topper	3c9faa32c1	[AVX-512] Add rounding versions of instructions to hasUndefRegUpdate. llvm-svn: 282357	2016-09-25 16:33:59 +00:00
Craig Topper	d8b2bd492c	[AVX-512] Add the scalar unsigned integer to fp conversion instructions to hasUndefRegUpdate. llvm-svn: 282356	2016-09-25 16:33:57 +00:00
Craig Topper	ac941b9736	[AVX-512] Remove duplicate instructions for converting integer to scalar floating point. We can use patterns to point to the other instructions instead. llvm-svn: 282355	2016-09-25 16:33:53 +00:00
Craig Topper	8f2e85e669	[AVX-512] Don't use two opcodes for INTR_TYPE_SCALAR_MASK_RM. The handling was such that if the second opcode was present the first was ingored, so we can just have one opcode. llvm-svn: 282344	2016-09-25 01:03:10 +00:00
Craig Topper	1776f4c965	[X86] Teach combineShuffle to avoid creating floating point operations with integer types and integer operations with floating point types. Seems isOperationLegal lies for mismatched types and operations. Fixes PR30511. llvm-svn: 282341	2016-09-24 21:42:49 +00:00
Craig Topper	aeca0460f3	[AVX-512] Split scalar version of X86ISD::SELECT into a separate opcode because isel is not robust with multiple type profiles for the same opcode. llvm-svn: 282340	2016-09-24 21:42:47 +00:00
Craig Topper	7e664dad60	[AVX-512] Remove the patterns for selecting scalar VCOMI/VUCOMI instructions with SAE as there is no way to create the pattern. llvm-svn: 282339	2016-09-24 21:42:43 +00:00
Duncan P. N. Exon Smith	11c06ea55a	ObjCARC: Don't look at users of ConstantData Stop looking at users of UndefValue and ConstantPointerNull in the objective C ARC optimizers. The other users aren't actually interesting, since they're not pointing at a particular object. I imagine these calls could be optimized through -instcombine... maybe they already are? These early returns will be required at some point in the future, with a WIP patch that asserts when someone accesses a use-list on ConstantData. llvm-svn: 282338	2016-09-24 21:01:20 +00:00
Duncan P. N. Exon Smith	b1b208a1f5	Analysis: Return early for UndefValue in computeKnownBits There is no benefit in looking through assumptions on UndefValue to guess known bits. Return early to avoid walking their use-lists, and assert that all instances of ConstantData are handled here for similar reasons (UndefValue was the only integer/pointer holdout). llvm-svn: 282337	2016-09-24 20:42:02 +00:00
Sanjay Patel	752ad8fde7	[x86] don't try to create a vector integer inst for an SSE1 target (PR30512) This bug was introduced with: http://reviews.llvm.org/rL272511 We need to restrict the lowering to v4f32 comparisons because that's all SSE1 can handle. This should fix: https://llvm.org/bugs/show_bug.cgi?id=28044 llvm-svn: 282336	2016-09-24 20:24:06 +00:00
Duncan P. N. Exon Smith	4fd9b7e16f	Scalar: Ignore ConstantData in processAssumption Assumptions on UndefValue and ConstantPointerNull aren't relevant to other users. Ignore them entirely to avoid wasting cycles walking through their (possibly extremely extensive (cross-module)) use-lists. It wasn't clear how to add a specific test for this, and it'll be covered anyway by an eventual patch that asserts when trying to access the use-list of an instance of ConstantData. llvm-svn: 282334	2016-09-24 20:00:38 +00:00
Duncan P. N. Exon Smith	b479873912	Analysis: Return early in isKnownNonNullAt for ConstantData Check and return early for ConstantPointerNull and UndefValue specifically in isKnownNonNullAt, and assert that ConstantData never make it to isKnownNonNullFromDominatingCondition. This confirms that isKnownNonNullFromDominatingCondition never walks through the use-list of an instance of ConstantData. Given that such use-lists cross module boundaries, it never really made sense to do so, and was potentially very expensive. llvm-svn: 282333	2016-09-24 19:39:47 +00:00
Dylan McKay	907cde3cc2	[AVR] Update signature of AVRTargetObjectFile::SelectSectionForGlobal It was changed recently, and was breaking compilation of the backend. llvm-svn: 282329	2016-09-24 11:38:08 +00:00
Quentin Colombet	d816bfb282	[RegisterBankInfo] Constify the member of the XXXMapping maps. This makes it obvious that items in those maps behave like statically created objects. llvm-svn: 282327	2016-09-24 04:54:03 +00:00
Quentin Colombet	a50813f608	[RegisterBankInfo] Add statistics for dynamic value mappings. Like partial mappings, as we move toward TableGen'ed information, the number should reach zero eventually. llvm-svn: 282325	2016-09-24 04:53:55 +00:00
Quentin Colombet	fd8c95adf4	[RegisterBankInfo] Uniquely generate ValueMapping. This is a step toward statically allocate ValueMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. llvm-svn: 282324	2016-09-24 04:53:52 +00:00
Quentin Colombet	8159de4de9	[RegisterBankInfo] Keep valid pointers for PartialMappings. Previously we were using the address of the unique instance of a partial mapping in the related map to access this instance. However, when the map grows, the whole set of instances may be moved elsewhere and the previous addresses are not valid anymore. Instead, keep the address of the unique heap allocated instance of a partial mapping. Note: I did not see any actual bugs for that problem as the number of partial mappings dynamically allocated is small (<= 4). llvm-svn: 282323	2016-09-24 04:53:48 +00:00
Kostya Serebryany	273d767215	[libFuzzer] add a standalone build script llvm-svn: 282321	2016-09-24 04:00:00 +00:00
Duncan P. N. Exon Smith	c82c11428e	GlobalStatus: Don't walk use-lists of ConstantData Return early from llvm::isSafeToDestroyConstant() whenever the value `isa<ConstantData>()`. These constants are shared across the LLVMContext. We never really want to delete them here, and walking their use-lists can be very expensive. (This is motivated by an eventual goal of removing use-lists entirely from ConstantData.) llvm-svn: 282320	2016-09-24 02:30:11 +00:00
Kostya Serebryany	0800b81a21	[libFuzzer] simplify HandleTrace again, start re-running interesting units and collecting their features. llvm-svn: 282316	2016-09-23 23:51:58 +00:00
Peter Collingbourne	a638fe057c	Add qualification to fix MSVC build. llvm-svn: 282313	2016-09-23 23:23:23 +00:00
Sanjay Patel	0b36337d61	[x86] fix FCOPYSIGN lowering to create constants instead of ConstantPool loads This is similar to: https://reviews.llvm.org/rL279958 By not prematurely lowering to loads, we should be able to more easily eliminate the 'or' with zero instructions seen in copysign-constant-magnitude.ll. We should also be able to extend this code to handle vectors. llvm-svn: 282312	2016-09-23 23:17:29 +00:00
Petr Hosek	85b2f67613	[MC] Support .ds directives in assembler parser These directives are already supported by GNU assembler. Differential Revision: https://reviews.llvm.org/D24740 llvm-svn: 282303	2016-09-23 21:53:36 +00:00
Matthias Braun	729c989083	llc: Add -start-before/-stop-before options Differential Revision: https://reviews.llvm.org/D23089 llvm-svn: 282302	2016-09-23 21:46:02 +00:00
Peter Collingbourne	80186a57d6	LTO: Simplify caching interface. The NativeObjectOutput class has a design problem: it mixes up the caching policy with the interface for output streams, which makes the client-side code hard to follow and would for example make it harder to replace the cache implementation in an arbitrary client. This change separates the two aspects by moving the caching policy to a separate field in Config, replacing NativeObjectOutput with a NativeObjectStream class which only deals with streams and does not need to be overridden by most clients and introducing an AddFile callback for adding files (e.g. from the cache) to the link. Differential Revision: https://reviews.llvm.org/D24622 llvm-svn: 282299	2016-09-23 21:33:43 +00:00
Valery Pykhtin	fbf2d93f73	[AMDGPU] Fix for bz30427: wrong MTBUF encoding on VI Differential revision: https://reviews.llvm.org/D24875 llvm-svn: 282296	2016-09-23 21:21:21 +00:00
Kostya Serebryany	2d1d944f7e	[libFuzzer] first steps in adding a proper automated test suite based on real-life code: add a script to build RE2 at a revision that has known bugs llvm-svn: 282292	2016-09-23 20:43:22 +00:00
Kostya Serebryany	0d26de3922	[libFuzzer] reset Counters (trace-pc-guard) before every run llvm-svn: 282284	2016-09-23 20:04:13 +00:00
Petr Hosek	4cb08ce3bc	[MC] Support .dcb directives in assembler parser These directives are already supported by GNU assembler. Differential Revision: https://reviews.llvm.org/D24741 llvm-svn: 282283	2016-09-23 19:25:15 +00:00
Sanjay Patel	04949faf69	[TLI] isdigit / isascii / toascii param type should match return type (PR30484) We crash in LibCallSimplifier if we don't check the validity of the function signature properly. llvm-svn: 282278	2016-09-23 18:44:09 +00:00
Quentin Colombet	380cd88cfd	[ResetMachineFunction] Populate the comments in the header of the file. NFC llvm-svn: 282276	2016-09-23 18:38:15 +00:00
Quentin Colombet	0f4c20a1e7	[ResetMachineFunction] Add statistic on the number of reset functions. As the development of GlobalISel move forward, this statistic should strictly decrease until it reaches zero. At this point, it would mean GlobalISel can replace SDISel (at least on the tested inputs :P). llvm-svn: 282275	2016-09-23 18:38:13 +00:00
Quentin Colombet	d4c02437c1	[RegisterBankInfo] Add statistics for dynamic partial mappings. Collect statistics about the number of partial mappings dynamically allocated and accessed. Ultimately, when the whole TableGen infrastructure is set, those numbers should be zero. llvm-svn: 282274	2016-09-23 18:38:06 +00:00
Matthias Braun	1acb55e67c	ScheduleDAG: Match enum names when printing sdep kinds It is less confusing to have the same names in the debug print as the enum members. llvm-svn: 282273	2016-09-23 18:28:31 +00:00
Peter Collingbourne	6e8607527c	BitcodeReader: Deduplicate code. NFC. Differential Revision: https://reviews.llvm.org/D24852 llvm-svn: 282272	2016-09-23 18:27:42 +00:00
Quentin Colombet	c13ea88a14	[RegBankSelect] Use DEBUG_TYPE instead of repeating the name of the pass NFC llvm-svn: 282267	2016-09-23 17:50:06 +00:00
Quentin Colombet	09a9edcc26	[RegisterBank] Mark the dump method with LLVM_DUMP_METHOD. NFC llvm-svn: 282266	2016-09-23 17:50:03 +00:00
Jun Bum Lim	3822939ba7	Enhance calcColdCallHeuristics for InvokeInst Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst. Reviewers: mcrosier, hfinkel, davidxl Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D24868 llvm-svn: 282262	2016-09-23 17:26:14 +00:00
James Molloy	85124c76fc	Revert "[ARM] Promote small global constants to constant pools" This reverts commit r282241. It caused http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19882. llvm-svn: 282249	2016-09-23 13:35:43 +00:00
Nemanja Ivanovic	d2c3c51a70	[Power9] Exploit move and splat instructions for build_vector improvement This patch corresponds to review: https://reviews.llvm.org/D21135 This patch exploits the following instructions: mtvsrws lxvwsx mtvsrdd mfvsrld In order to improve some build_vector and extractelement patterns. llvm-svn: 282246	2016-09-23 13:25:31 +00:00
James Molloy	1ce54d6be2	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy. llvm-svn: 282241	2016-09-23 12:15:58 +00:00
George Rimar	4f82df52ae	Revert r282238 "Revert r282235 "[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section."" Build bot issues (http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15856/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Adwarfdump-dump-gdbindex.test) should be fixed in that version. Issue was that MSVS does not support "%zu". Though it works fine on MSCS 2015, Bot looks running MSVS 2013 that does not like it. MSDN also says that "z" prefix is not supported: https://msdn.microsoft.com/en-us/library/tcxf1dw6.aspx I had to use PRId64 instead. Original commit message: [llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section. gold linker's --gdb-index option currently is able to create the .gdb_index section that allows GDB to locate and read the .dwo files as it needs them, this helps reduce the total size of the object files processed by the linker. More info about that: https://gcc.gnu.org/wiki/DebugFission https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html Patch teaches dwarfdump tool to dump this section. Differential revision: https://reviews.llvm.org/D21503 llvm-svn: 282239	2016-09-23 11:01:53 +00:00
George Rimar	a348527186	Revert r282235 "[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section." It broke BB: http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15856 llvm-svn: 282238	2016-09-23 10:12:56 +00:00
Alexey Bataev	fee9078dcd	[InstCombine] Fix for PR29124: reduce insertelements to shufflevector If inserting more than one constant into a vector: define <4 x float> @foo(<4 x float> %x) { %ins1 = insertelement <4 x float> %x, float 1.0, i32 1 %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2 ret <4 x float> %ins2 } InstCombine could reduce that to a shufflevector: define <4 x float> @goo(<4 x float> %x) { %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3> ret <4 x float> %shuf } Also, InstCombine tries to convert shuffle instruction to single insertelement, if one of the vectors is a constant vector and only a single element from this constant should be used in shuffle, i.e. shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> -> insertelement <4 x float> %v, float 1.0, 1 Differential Revision: https://reviews.llvm.org/D24182 llvm-svn: 282237	2016-09-23 09:14:08 +00:00
George Rimar	a77bcf5e42	[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section. gold linker's --gdb-index option currently is able to create the .gdb_index section that allows GDB to locate and read the .dwo files as it needs them, this helps reduce the total size of the object files processed by the linker. More info about that: https://gcc.gnu.org/wiki/DebugFission https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html Patch teaches dwarfdump tool to dump this section. Differential revision: https://reviews.llvm.org/D21503 llvm-svn: 282235	2016-09-23 09:09:26 +00:00
Valery Pykhtin	355103f6c0	[AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions Differential revision: https://reviews.llvm.org/D24738 llvm-svn: 282234	2016-09-23 09:08:07 +00:00
Craig Topper	a02e394872	[AVX-512] Split X86ISD::VFPROUND and X86ISD::VFPEXT into separate opcodes for each type constraint. This revealed that scalar intrinsics could create nodes with a rounding mode of FROUND_CUR_DIRECTION, but the patterns didn't check for it. It just worked because isel doesn't check operand count and we had a pattern without the rounding mode argument at all. llvm-svn: 282231	2016-09-23 06:24:43 +00:00
Craig Topper	3174b6e467	[AVX-512] Add separate ISD opcodes for each form of CVT instructions. Don't reuse non-X86 ISD opcodes with extra X86 specific arguments. llvm-svn: 282230	2016-09-23 06:24:39 +00:00
Craig Topper	d70ec9b25e	[AVX-512] Use different ISD opcodes for some of the scalar intrinsic lowering. Isel is not very robust against using the same ISD opcode with different number of operands so its better to separate. llvm-svn: 282229	2016-09-23 06:24:35 +00:00
Kostya Serebryany	ce1cab169f	[libFuzzer] be more precise about what we reset in TracePC llvm-svn: 282225	2016-09-23 02:18:59 +00:00
Kostya Serebryany	16a145fd0f	[libFuzzer] fix merging with trace-pc-guard llvm-svn: 282224	2016-09-23 01:58:51 +00:00
Tom Stellard	e88bbc34c6	AMDGPU/SI: Include implicit arguments in kernarg_segment_byte_size Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D24835 llvm-svn: 282223	2016-09-23 01:33:26 +00:00
Kostya Serebryany	87a598e19f	[libFuzzer] simplify the TracePC logic llvm-svn: 282222	2016-09-23 01:20:07 +00:00
Quentin Colombet	ceb71c2f43	[RegisterBankInfo] Mark the dump methods with LLVM_DUMP_METHOD. NFC llvm-svn: 282221	2016-09-23 00:59:12 +00:00
Quentin Colombet	fd0ab5c660	[AArch64][RegisterBankInfo] Sanity check TableGen'ed like inputs. Make sure the entries written to mimic the behavior of TableGen are sane. llvm-svn: 282220	2016-09-23 00:59:07 +00:00
Kostya Serebryany	ab73c6924f	[libFuzzer] move value profiling logic into TracePC llvm-svn: 282219	2016-09-23 00:46:18 +00:00
Tom Stellard	e190cd2834	Triple: Add opencl environment type Summary: For AMDGPU, we have been using the operating system component of the triple for specifying the low-level runtime that is being used. The rationale for this is that the host operating system (e.g. Linux) is irrelevant for GPU code, since its execution enviroment will be mostly controled by the low-level runtime being used to execute the code. In most cases, higher level languages have their own runtime which is implemented on top of the low-level runtime. The kernel ABIs of each language mostly depend on the low-level runtime, but there may be some slight differences between languages. OpenCL for example, may append additional arguments to the kernel in order to pass values like global offsets or buffers for printf. OpenMP, HCC, or other languages may want to add their own values which differ from OpenCL. The reason for adding a new opencl environment type is to make it possible for the backend to distinguish between the ABIs of the higher-level languages and handle them correctly. It seems cleaner to use the enviroment component for this rather than creating a new OS type for every combination of low-level runtime / high-level language. Reviewers: Anastasia, chandlerc Subscribers: whchung, pekka.jaaskelainen, wdng, yaxunl, llvm-commits Differential Revision: https://reviews.llvm.org/D24735 llvm-svn: 282218	2016-09-23 00:42:56 +00:00
Petr Hosek	2f4ac44dc3	[MC] Support skip and count for .incbin directive These optional arguments are supported by GNU assembler. Differential Revision: https://reviews.llvm.org/D24714 llvm-svn: 282217	2016-09-23 00:41:06 +00:00
Kostya Serebryany	d28099de5d	[libFuzzer] change ValueBitMap to remember the number of bits in it llvm-svn: 282216	2016-09-23 00:22:46 +00:00
Quentin Colombet	5b16d931dc	[AArch64][RegisterBankInfo] Switch to TableGen'ed like PartialMapping. Statically instanciate the most common PartialMappings. This should be closer to what the code would look like when TableGen support is added for GlobalISel. As a side effect, this should improve compile time. llvm-svn: 282215	2016-09-23 00:14:36 +00:00
Quentin Colombet	1946580cf2	[RegisterBankInfo] Check that the mapping covers the interesting bits. In the verify method of the ValueMapping class we used to check that the mapping exactly matches the bits of the input value. This is problematic for statically allocated mappings because we would need a different mapping for each different size of the value that maps on one instruction. For instance, with such scheme, we would need a different mapping for a value of size 1, 5, 23 whereas they all end up on a 32-bit wide instruction. Therefore, change the verifier to check that the meaningful bits are covered by the mapping instead of matching them. llvm-svn: 282214	2016-09-23 00:14:34 +00:00
Quentin Colombet	0afa7d6b82	[RegisterBankInfo] Use array instead of SmallVector for BreakDown. This is another step toward TableGen'ed like structures. The BreakDown of the mapping of the value will be statically computed by TableGen, thus we only have to point to the right entry in the table instead of dynamically allocate the mapping for each instruction. We still support the dynamic allocation through a factory of PartialMapping to ease the bring-up of the targets while the TableGen backend is not available. llvm-svn: 282213	2016-09-23 00:14:30 +00:00
Kostya Serebryany	be0ed59cdc	[libFuzzer] simplify the crash minimizer; split MaxLen into two: MaxInputLen and MaxMutationLen, allow MaxMutationLen to be less than MaxInputLen llvm-svn: 282211	2016-09-22 23:16:36 +00:00
Sanjay Patel	30ef70b090	[InstCombine] fold X urem C -> X < C ? X : X - C when C is big (PR28672) We already have the udiv variant of this transform, so I think this is ok for InstCombine too even though there is an increase in IR instructions. As the tests and TODO comments show, the transform can lead to follow-on combines. This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672 Differential Revision: https://reviews.llvm.org/D24527 llvm-svn: 282209	2016-09-22 22:36:26 +00:00
Davide Italiano	fcee2d8001	[AsmParser] Remove unused partial template specialization. llvm-svn: 282206	2016-09-22 22:02:59 +00:00
Matthias Braun	5f8492e2ce	MachineScheduler: Slightly simplify release node llvm-svn: 282201	2016-09-22 21:39:56 +00:00
Matthias Braun	46533e614b	MachineScheduler: Remove ineffective heuristic; NFC Currently all nodes get added to the NextSU list when they are released, so any candidate must be in that list, making the heuristic ineffective. Remove it for now, we can add it back later in a working fashion if necessary. llvm-svn: 282200	2016-09-22 21:39:52 +00:00
Hans Wennborg	c7957ef86c	Revert r282168 "GVN-hoist: fix store past load dependence analysis (PR30216)" and also the dependent r282175 "GVN-hoist: do not dereference null pointers" It's causing compiler crashes building Harfbuzz (PR30499). llvm-svn: 282199	2016-09-22 21:20:53 +00:00
Krzysztof Parzyszek	29e93f3880	[RDF] Add initial support for lane masks in the DFG Use lane masks for calculating covering and aliasing of register references. llvm-svn: 282194	2016-09-22 21:01:24 +00:00
Krzysztof Parzyszek	de5fcbaf92	[Hexagon] Remove USR_OVF from CtrRegs register class USR_OVF is a subregister of USR, which is a member of CtrRegs. Having both a register and its proper subregister in the same register class has bad consequences for lane mask calculation: based solely on the lane mask info, USR_OVF would not appear to be a subregister of USR. llvm-svn: 282192	2016-09-22 20:59:41 +00:00
Krzysztof Parzyszek	670e0ca24f	[RDF] Print the function name for calls in dumps llvm-svn: 282191	2016-09-22 20:58:19 +00:00
Krzysztof Parzyszek	c51f2394a6	[RDF] Use uint32_t for register numbers instead of unsigned llvm-svn: 282190	2016-09-22 20:56:39 +00:00
Arnold Schwaighofer	0fd32c005b	i386 does not support optimized swifterror handling rdar://28432565 llvm-svn: 282186	2016-09-22 20:06:25 +00:00
Hans Wennborg	c4b1d20ba2	Win64: Don't emit unwind info for "leaf" functions (PR30337) According to MSDN (see the PR), functions which don't touch any callee-saved registers (including %rsp) don't need any unwind info. This patch makes LLVM not emit unwind info for such functions, to save binary size. Differential Revision: https://reviews.llvm.org/D24748 llvm-svn: 282185	2016-09-22 19:50:05 +00:00
Nemanja Ivanovic	8dacca943a	[PowerPC] Sign extend sub-word values for atomic comparisons Atomic comparison instructions use the sub-word load instruction on Power8 and up but the value is not sign extended prior to the signed word compare instruction. This patch adds that sign extension. llvm-svn: 282182	2016-09-22 19:06:38 +00:00
Nirav Dave	9011da3d44	[DAG] Fix incorrect alignment of ext load. Correctly use alignment size from loaded size not output value size. Reviewers: jyknight, tstellarAMD, arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23356 llvm-svn: 282177	2016-09-22 17:28:43 +00:00
Sebastian Pop	1531f30ccc	GVN-hoist: do not dereference null pointers there may be basic blocks without memory accesses, in which case the list of accesses is a null pointer. llvm-svn: 282175	2016-09-22 17:22:58 +00:00
Krzysztof Parzyszek	b66efb855c	[PPC] Set SP after loading data from stack frame, if no red zone is present Follow-up to r280705: Make sure that the SP is only restored after all data is loaded from the stack frame, if there is no red zone. This completes the fix for https://llvm.org/bugs/show_bug.cgi?id=26519. Differential Revision: https://reviews.llvm.org/D24466 llvm-svn: 282174	2016-09-22 17:22:43 +00:00
Zachary Turner	d5d57635ba	Speculative fix for build failures due to consumeInteger. A recent patch added support for consumeInteger() and made getAsInteger delegate to this function. A few buildbots are failing as a result with an assertion failure. On a hunch, I tested what happens if I call getAsInteger() on an empty string, and sure enough it crashes the same way that the buildbots are crashing. I confirmed that getAsInteger() on an empty string did not crash before my patch, so I suspect this to be the cause. I also added a unit test for the empty string. llvm-svn: 282170	2016-09-22 15:55:05 +00:00
Sebastian Pop	8e6e3318c2	GVN-hoist: fix store past load dependence analysis (PR30216) To hoist stores past loads, we used to search for potential conflicting loads on the hoisting path by following a MemorySSA def-def link from the store to be hoisted to the previous defining memory access, and from there we followed the def-use chains to all the uses that occur on the hoisting path. The problem is that the def-def link may point to a store that does not alias with the store to be hoisted, and so the loads that are walked may not alias with the store to be hoisted, and even as in the testcase of PR30216, the loads that may alias with the store to be hoisted are not visited. The current patch visits all loads on the path from the store to be hoisted to the hoisting position and uses the alias analysis to ask whether the store may alias the load. I was not able to use the MemorySSA functionality to ask for whether load and store are clobbered: I'm not sure which function to call, so I used a call to AA->isNoAlias(). Store past store is still working as before using a MemorySSA query: I added an extra test to pr30216.ll to make sure store past store does not regress. Differential Revision: https://reviews.llvm.org/D24517 llvm-svn: 282168	2016-09-22 15:33:51 +00:00
Sebastian Pop	5d68aa7913	GVN-hoist: fix typo llvm-svn: 282165	2016-09-22 15:08:09 +00:00
Zachary Turner	65fd2fc7b4	[Support] Add StringRef::consumeInteger. StringRef::getInteger() exists and treats the entire string as an integer of the specified radix, failing if any invalid characters are encountered or the number overflows. Sometimes you might have something like "123456foo" and you want to get the number 123456 and leave the string "foo" remaining. This is similar to what would be possible by using the standard runtime library functions strtoul et al and specifying an end pointer. This patch adds consumeInteger(), which does exactly that. It consumes as much as possible until an invalid character is found, and modifies the StringRef in place so that upon return only the portion of the StringRef after the number remains. Differential Revision: https://reviews.llvm.org/D24778 llvm-svn: 282164	2016-09-22 15:05:19 +00:00
Etienne Bergeron	7f0e315327	[compiler-rt] fix typo in option description [NFC] llvm-svn: 282163	2016-09-22 14:57:24 +00:00
Sebastian Pop	440f15b7fc	GVN-hoist: only hoist relevant scalar instructions Without this patch, GVN-hoist would think that a branch instruction is a scalar instruction and would try to value number it. The patch filters out all such kind of irrelevant instructions. A bit frustrating is that there is no easy way to discard all those very infrequent instructions, a bit like isa<TerminatorInst> that stands for a large family of instructions. I'm thinking that checking for those very infrequent other instructions would cost us more in compilation time than just letting those instructions getting numbered, so I'm still thinking that a simpler check: if (isa<TerminatorInst>(I)) return false; is better than listing all the other less frequent instructions. Differential Revision: https://reviews.llvm.org/D23929 llvm-svn: 282160	2016-09-22 14:45:40 +00:00
Keith Walker	ba1598975f	Reapplying r281895 (and follow-up r281964) after fixing pr30468. The additional fix is: When adding debug information to a lowered phi node in mem2reg check that we have a valid insertion point after the phi for adding the debug information. This change addresses the issue in pr30468 where a lowered phi was added before a catchswitch and no debug information should be added after the phi in this case. Differential Revision: https://reviews.llvm.org/D24797 llvm-svn: 282155	2016-09-22 14:13:25 +00:00
Tim Northover	a5e38fa00d	GlobalISel: handle stack-based parameters on AArch64. llvm-svn: 282153	2016-09-22 13:49:25 +00:00
Anna Thomas	82c3717f54	[RS4GC] Remat in presence of phi and use live value Summary: Reviewers: Subscribers: llvm-svn: 282150	2016-09-22 13:13:06 +00:00
Artem Tamazov	2146a0a90e	[AMDGPU][mc] Add support for absolute expressions in DPP modifiers. Also added range checking for DPP attributes. Assembler tests added as well. Differential Revision: https://reviews.llvm.org/D24755 llvm-svn: 282145	2016-09-22 11:47:21 +00:00
Nemanja Ivanovic	e78ffede6f	[PowerPC] Remove LE patterns matching generic stores/loads to VSX permuting ops This patch corresponds to: https://reviews.llvm.org/D21409 The LXVD2X, LXVW4X, STXVD2X and STXVW4X instructions permute the two doublewords in the vector register when in little-endian mode. Custom code ensures that the necessary swaps are inserted for these. This patch simply removes the possibilty that a load/store node will match one of these instructions in the SDAG as that would not insert the necessary swaps. llvm-svn: 282144	2016-09-22 10:32:03 +00:00
Nemanja Ivanovic	6e7879c5e6	[Power9] Add exploitation of non-permuting memory ops This patch corresponds to review: https://reviews.llvm.org/D19825 The new lxvx/stxvx instructions do not require the swaps to line the elements up correctly. In order to select them over the lxvd2x/lxvw4x instructions which require swaps, the patterns for the old instruction have a predicate that ensures they won't be selected on Power9 and newer CPUs. llvm-svn: 282143	2016-09-22 09:52:19 +00:00
Sagar Thakur	e74eb4e7b9	[EfficiencySanitizer] Using '$' instead of '#' for struct counter name For MIPS '#' is the start of comment line. Therefore we get assembler errors if # is used in the structure names. Differential: D24334 Reviewed by: zhaoqin llvm-svn: 282141	2016-09-22 08:33:06 +00:00
Dorit Nuzman	d1247a684e	Fix revision 281960 llvm-svn: 282139	2016-09-22 07:56:23 +00:00
Craig Topper	202b453a8a	[AVX-512] Add support for commuting VPTERNLOG instructions. VPTERNLOG is a ternary instruction with an immediate specifying the logical operation to perform. For each bit position in the 3 source vectors the bit from each source is concatenated together and the resulting 3-bit value is used to select a bit in the immediate. This bit value is written to the result vector. We can commute this by swapping operands and modifying the immediate. To modify the immediate we need to swap two pairs of bits. The pairs correspond to the locations in the immediate where the commuted operands bits have opposite values and the uncommuted operand has the same value. Bits 0 and 7 will never be swapped since the relevant bits from all sources are the same value. This refactors and reuses parts of the FMA3 commuting code which is also a three operand instruction. llvm-svn: 282132	2016-09-22 03:00:50 +00:00
Quentin Colombet	6a76323c64	[RegisterBankInfo] Move to statically allocated RegisterBank. This commit is basically the first step toward what will RegisterBankInfo look when it gets TableGen'ed. It introduces a XXXGenRegisterBankInfo.def file that is what TableGen will issue at some point. Moreover, the RegBanks field in RegisterBankInfo changed to reflect the static (compile time) aspect of the information. llvm-svn: 282131	2016-09-22 02:10:37 +00:00
Quentin Colombet	b202eb88aa	[RegisterBankInfo] Take advantage of the extra argument of SmallVector::resize. When initializing an instance of OperandsMapper, instead of using SmallVector::resize followed by std::fill, use the function that directly does that in SmallVector. llvm-svn: 282130	2016-09-22 02:10:32 +00:00
Kostya Serebryany	624f59f4d8	[libFuzzer] add 'features' to the corpus elements, allow mutations with Size > MaxSize, fix sha1 in corpus stats; various refactorings llvm-svn: 282129	2016-09-22 01:34:58 +00:00
Kostya Serebryany	c9e3de35ed	[libFuzzer] one more test llvm-svn: 282127	2016-09-22 00:57:29 +00:00
Kostya Serebryany	29bb664075	[libFuzzer] add stats to the corpus; more refactoring llvm-svn: 282121	2016-09-21 22:42:17 +00:00
Kostya Serebryany	20801e1b8a	[libFuzzer] more refactoring; don't compute sha1sum every time we mutate a unit from the corpus, use the stored one. llvm-svn: 282115	2016-09-21 21:41:48 +00:00
Kostya Serebryany	8658618ea0	[libFuzzer] more refactoring llvm-svn: 282113	2016-09-21 21:17:23 +00:00
Kevin Enderby	e71e13c7d6	Next set of additional error checks for invalid Mach-O files for bad LC_UUID load commands. Added a missing check and made the check for more than one like other other “more than one” checks. And of course added test cases. llvm-svn: 282104	2016-09-21 20:03:09 +00:00
Chad Rosier	00eb8db3a1	[LoopInterchange] Track all dependencies, not just anti dependencies. Currently, we give up on loop interchange if we encounter a flow dependency anywhere in the loop list. Worse yet, we don't even track output dependencies. This patch updates the dependency matrix computation to track flow and output dependencies in the same way we track anti dependencies. This improves an internal workload by 2.2x. Note the loop interchange pass is off by default and it can be enabled with '-mllvm -enable-loopinterchange' Differential Revision: https://reviews.llvm.org/D24564 llvm-svn: 282101	2016-09-21 19:16:47 +00:00
Teresa Johnson	3f212b8908	[ThinLTO] Emit files for distributed builds for all modules With the new LTO API in r278338, we stopped emitting the individual index files and imports files for some modules in the distributed backend case (thinlto-index-only plugin option). Specifically, this is when the linker decides not to include a module in the link, because it was in an archive library and did not have a strong reference to it. Not creating the expected output files makes the distributed build system implementation more difficult, in terms of checking for the expected outputs of the thin link, and scheduling the backend jobs. To address this, the gold-plugin will write dummy empty .thinlto.bc and .imports files for modules not included in the link (which LTO never sees). Augmented a gold v1.12+ test, since that version of gold has the handling for notifying on modules not being included in the link. llvm-svn: 282100	2016-09-21 19:12:05 +00:00
Davide Italiano	0f8b5d65f6	[MIRParser] Delete dead code. NFCI. llvm-svn: 282098	2016-09-21 18:26:08 +00:00
Nico Weber	a489438849	revert 281908 because 281909 got reverted llvm-svn: 282097	2016-09-21 18:25:43 +00:00
Arnold Schwaighofer	de2490d0dc	Disable tail calls if there is an swifterror argument ISel does not handle them correctly yet i.e we crash trying to emit tail call code. radar://28407842 llvm-svn: 282088	2016-09-21 16:53:36 +00:00
Matthew Simpson	15869f86d8	[LV] Don't emit unused scalars for uniform instructions If we identify an instruction as uniform after vectorization, we know that we should only use the value corresponding to the first vector lane of each unroll iteration. However, when scalarizing such instructions, we still produce values for the other vector lanes. This patch prevents us from generating the unused scalars. Differential Revision: https://reviews.llvm.org/D24275 llvm-svn: 282087	2016-09-21 16:50:24 +00:00
Artem Tamazov	2e217b87cb	[AMDGPU][mc] Add support for ds_add_[rtn_]f32. Lit tests added. Resolves https://github.com/RadeonOpenCompute/hcc/issues/122. Differential Revision: https://reviews.llvm.org/D24765 llvm-svn: 282086	2016-09-21 16:35:44 +00:00
Dehao Chen	160fbc3f95	Change the basic block weight calculation algorithm to use max instead of voting. Summary: Now that we have more precise debug info, we should change back to use maximum to get basic block weight. Reviewers: dnovillo Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D24788 llvm-svn: 282084	2016-09-21 16:26:51 +00:00
Matthew Simpson	a95e8bb7ed	[LV] Rename "Width" to "Lane" (NFC) llvm-svn: 282083	2016-09-21 16:09:23 +00:00
Hans Wennborg	1049085c78	Revert r281895 "Add @llvm.dbg.value entries for the phi node created by -mem2reg" (And follow-up r281964.) It caused PR30468. llvm-svn: 282077	2016-09-21 15:55:53 +00:00
Nico Weber	903859c0e4	Revert r281715, it caused PR30475 llvm-svn: 282076	2016-09-21 15:33:24 +00:00
Arnold Schwaighofer	f62ba1031f	DeadArgElim: Don't mark swifterror arguments as unused Replacing swifterror arguments with undef creates invalid IR. rdar://28300490 llvm-svn: 282075	2016-09-21 15:29:08 +00:00
Chad Rosier	f7c76f91e0	[LoopInterchange] Various cleanup. NFC. llvm-svn: 282071	2016-09-21 13:28:41 +00:00
Tim Northover	9a46718378	GlobalISel: produce correct code for signext/zeroext ABI flags. We still don't really have an equivalent of "AssertXExt" in DAG, so we don't exploit the guarantees on the receiving side yet, but this should produce conservatively correct code on iOS ABIs. llvm-svn: 282069	2016-09-21 12:57:45 +00:00
Tim Northover	862758ec14	GlobalISel: pass Function to lowerFormalArguments directly (NFC). The only implementation that exists immediately looks it up anyway, and the information is needed to handle various parameter attributes (stored on the function itself). llvm-svn: 282068	2016-09-21 12:57:35 +00:00
Sam Kolton	12b633beda	[AMDGPU] Assembler: remove unused AMDGPUMCObjectWriter. Summary: It is replaced by AMDGPUELFObjectWriter Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl Differential Revision: https://reviews.llvm.org/D24654 llvm-svn: 282065	2016-09-21 10:33:32 +00:00
Simon Dardis	9a66bbecae	[mips] LLVM PR/30197 - Tail call incorrectly clobbers arguments for mips The postRA scheduler performs alias analysis to determine if stores and loads can moved past each other. When a function has more arguments than argument registers for the calling convention used, excess arguments are spilled onto the stack. LLVM by default assumes that argument slots are immutable, unless the function contains a tail call. Without the knowledge of that a function contains a tail call site, stores and loads to fixed stack slots may be re-ordered causing the out-going arguments to clobber the incoming arguments before the incoming arguments are supposed to be dead. Reviewers: vkalintiris Differential Review: https://reviews.llvm.org/D24077 llvm-svn: 282063	2016-09-21 09:43:40 +00:00
Diana Picus	2a3f066349	Revert "AArch64: Set shift bit of TLSLE HI12 add instruction" This reverts commit r282057 because it broke the buildbots - see e.g. http://lab.llvm.org:8011/builders/clang-cmake-aarch64-42vma/builds/12063 llvm-svn: 282058	2016-09-21 08:24:41 +00:00
Lei Liu	6c87f23526	AArch64: Set shift bit of TLSLE HI12 add instruction Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model. This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000. Reviewers: t.p.northover, peter.smith, rovka Subscribers: salim.nasser, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24702 llvm-svn: 282057	2016-09-21 07:41:41 +00:00
Craig Topper	29f1a1f834	[AVX-512] Split the 3 different usages of the X86ISD::FSETCC opcode into 3 different opcodes. It turns out isel is really not robust against having different type profiles for the same opcode. It turns out that if you put an illegal rounding mode(i.e. not CUR_DIRECTION or NO_EXC) on a comiss intrinsic we would generate the FSETCC form with the rounding mode added, but then pattern match to an instruction with ROUND_CUR_DIRECTION. We can probably get away with just one FSETCCM opcode that always contains the rounding mode and explicitly put ROUND_CUR_DIRECTION in the pattern, but I'll leave that for future work. With this change the clang tests for the comiss intrinsics that used an incorrect rounding mode of 3 properly fail isel instead of silently doing the wrong thing. Those clang tests will be fixed in a follow up commit and I also plan to add rounding mode checking to clang. llvm-svn: 282055	2016-09-21 06:37:54 +00:00
Craig Topper	d868870f17	[AVX-512] Don't add an additional rounding mode operand to the avx512 vcvtps2ph intrinsic lowering. There was no way to control its value so it was always FROUND_CURRENT making it unnecessary. The true rounding mode is encoded in the immediate operand of the instruction. This also removes the pattern from the rb form of the instructions since there is no way to specify the FROUND_NO_EXC rounding mode it required. llvm-svn: 282052	2016-09-21 03:58:44 +00:00
Craig Topper	a27f54b4d9	[AVX-512] Simplify handling of INTR_TYPE_1OP_MASK_RM to remove support for the second opcode since its never used. This makes it consistent with INTR_TYPE_2OP_MASK_RM and INTR_TYPE_3OP_MASK_RM. And even if it was used we were passing the same operands to both so it wouldn't make sense to have two opcodes. llvm-svn: 282051	2016-09-21 03:58:41 +00:00
Kostya Serebryany	225d8e45d4	[libFuzzer] fix libc++ build llvm-svn: 282050	2016-09-21 03:50:37 +00:00
Adam Nemet	e3cef93727	[LV] When reporting about a specific instruction without debug location use loop's This can occur for example if some optimization drops the debug location. llvm-svn: 282048	2016-09-21 03:14:20 +00:00
Kostya Serebryany	556894fb10	[libFuzzer] more refactoring; NFC llvm-svn: 282047	2016-09-21 02:05:39 +00:00
Craig Topper	e18258dc1c	[AVX-512] Don't lower avx512 vcvtps2ph/vcvtph2ps nodes to ISD::FP16_TO_FP/ISD::FP_TO_FP16 with an extra x86 specific rounding mode operand. We should use a target specific ISD opcode. llvm-svn: 282046	2016-09-21 02:05:22 +00:00
Jacques Pienaar	98345fc0a1	[NVPTX] Check if callsite is defined when computing argument allignment Summary: In getArgumentAlignment check if the ImmutableCallSite pointer CS is non-null before dereferencing. If CS is 0x0 fall back to the ABI type alignment else compute the alignment as before. Reviewers: eliben, jpienaar Subscribers: jlebar, vchuravy, cfe-commits, jholewinski Differential Revision: https://reviews.llvm.org/D9168 llvm-svn: 282045	2016-09-21 01:57:57 +00:00
Kostya Serebryany	6f5a804cdb	[libFuzzer] refactoring: split the large header into many; NFC llvm-svn: 282044	2016-09-21 01:50:50 +00:00
Kostya Serebryany	09aa01a6f8	[libFuzzer] refactoring: move the Corpus into a separate class; delete two unused experimental features llvm-svn: 282042	2016-09-21 01:04:43 +00:00
Michael Kuperstein	79dcc274b4	[InferAttributes] Don't access parameters that don't exist. Check for the correct number of parameters before querying their type. This fixes PR30455. llvm-svn: 282038	2016-09-20 23:10:31 +00:00
Teresa Johnson	620c140a9b	[ThinLTO] Always emit a summary when compiling in ThinLTO mode Summary: Emit an empty summary section, instead of no summary section, when there are no global variables in the index. This ensures that LTO will treat these files as ThinLTO inputs, instead of as regular LTO inputs. In addition to not being what the user likely intended when compiling with -flto=thin, the current behavior is problematic for distributed build systems that expect to get ThinLTO index and imports files back for each input compiled with -flto=thin. Combining into a single regular LTO module also reduces the backend parallelism. And in the case where the index was suppressed due to uses in inline assembly, combining into a single LTO module could provoke renaming of duplicates that we were trying to prevent by suppressing the index. This change required a couple of fixes to handle the empty summary section. Reviewers: mehdi_amini Subscribers: mehdi_amini, llvm-commits, pcc Differential Revision: https://reviews.llvm.org/D24779 llvm-svn: 282037	2016-09-20 23:07:17 +00:00
Xinliang David Li	9780fc1451	code cleanup -- commoning IR travsersals llvm-svn: 282034	2016-09-20 22:39:47 +00:00
Eric Christopher	5653e5dffc	Remove the default subtarget from the x86 port as it isn't necessary (or correct) anymore. llvm-svn: 282031	2016-09-20 22:19:33 +00:00
Eric Christopher	c4636b3002	Revert "Remove extra argument used once on TargetMachine::getNameWithPrefix and inline the result into the singular caller." and "Remove more guts of TargetMachine::getNameWithPrefix and migrate one check to the TLOF mach-o version." temporarily until I can get the whole call migrated out of the TargetMachine as we could hit places where TLOF isn't valid. This reverts commits r281981 and r281983. llvm-svn: 282028	2016-09-20 22:03:28 +00:00
Anna Thomas	8cd7de1d18	[RS4GC] Refactor code for Rematerializing in presence of phi. NFC Summary: This is an NFC refactoring change as a precursor to the actual fix for rematerializing in presence of phi. https://reviews.llvm.org/D24399 Pasted from review: findRematerializableChainToBasePointer changed to return the root of the chain. instead of true or false. move the PHI matching logic into the caller by inspecting the root return value. This includes an assertion that the alternate root is in the liveset for the call. Tested with current RS4GC tests. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24780 llvm-svn: 282023	2016-09-20 21:36:02 +00:00
George Burgess IV	b44b6d98b3	[CodeGen] stop short-circuiting the SSP code for sspstrong. This check caused us to skip adding layout information for calls to alloca in sspreq/sspstrong mode. We check properly for sspstrong later on (and add the correct layout info when doing so), so removing this shouldn't hurt. No test is included, since testing this using lit seems to require checking for exact offsets in asm, which is something that the lit tests for this avoid. If someone cares deeply, I'm happy to write a unittest or something to cover this, but that feels like overkill. Patch by Daniel Micay. Differential Revision: https://reviews.llvm.org/D22714 llvm-svn: 282022	2016-09-20 21:30:01 +00:00
Petr Hosek	1290355f5a	Mark ELF sections whose name start with .note as note Previously, such section would be marked as SHT_PROGBITS which makes it impossible to use an initialized C variable declaration to emit an (allocated) ELF note. The new behavior is also consistent with ELF assembly parser. Differential Revision: https://reviews.llvm.org/D24692 llvm-svn: 282010	2016-09-20 20:21:13 +00:00
Xinliang David Li	c7368287b7	[Profile] Do not annotate select insts not covered in profile. Fixed PR/30466 llvm-svn: 282009	2016-09-20 20:20:01 +00:00
Kevin Enderby	fc0929adb8	Next set of additional error checks for invalid Mach-O files for bad load commands that use the Mach::dylib_command type for the load commands that are currently used in the MachOObjectFile constructor. This contains the missing checks for LC_ID_DYLIB, LC_ID_DYLIB, etc. load commands and the fields for the Mach::dylib_command type. Also checks that only an MH_DYLIB or MH_STUB_DYLIB has an LC_ID_DYLIB load command (and others filetype don’t) and there is not more than one of these load commands. llvm-svn: 282008	2016-09-20 20:14:14 +00:00

... 21 22 23 24 25 ...

97308 Commits