llvm-project

Commit Graph

Author	SHA1	Message	Date
Rafael Espindola	d5aa733769	Avoid using alignas and constexpr. This requires removing the custom allocator, since Demangle cannot depend on Support and so cannot use Compiler.h. llvm-svn: 280750	2016-09-06 20:36:24 +00:00
Konstantin Zhuravlyov	1d65026ca6	[AMDGPU] Wave and register controls - Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747	2016-09-06 20:22:28 +00:00
Rafael Espindola	ec73f5dacf	Try to fix a circular dependency in the modules build. llvm-svn: 280746	2016-09-06 20:16:19 +00:00
Tom Stellard	2add8a1140	AMDGPU/SI: Teach SIInstrInfo::FoldImmediate() to fold immediates into copies Summary: I put this code here, because I want to re-use it in a few other places. This supersedes some of the immediate folding code we have in SIFoldOperands. I think the peephole optimizers is probably a better place for folding immediates into copies, since it does some register coalescing in the same time. This will also make it easier to transition SIFoldOperands into a smarter pass, where it looks at all uses of instruction at once to determine the optimal way to fold operands. Right now, the pass just considers one operand at a time. Reviewers: arsenm Subscribers: wdng, nhaehnle, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23402 llvm-svn: 280744	2016-09-06 20:00:26 +00:00
Wei Ding	5e832e866e	AMDGPU : Add XNACK feature to GPUs that support it. Differential Revision: http://reviews.llvm.org/D24276 llvm-svn: 280742	2016-09-06 19:55:17 +00:00
Reid Kleckner	b2881f1fbe	Fix ItaniumDemangle.cpp build with MSVC 2013 llvm-svn: 280740	2016-09-06 19:39:56 +00:00
Evandro Menezes	405c90e6cc	[AArch64] Adjust the scheduling model for Exynos M1. Further refine the model for branches. llvm-svn: 280736	2016-09-06 19:22:29 +00:00
Evandro Menezes	77e6b5d4e0	[AArch64] Adjust the scheduling model for Exynos M1. Further refine the model for stores. llvm-svn: 280735	2016-09-06 19:22:27 +00:00
Evandro Menezes	199cad4f17	[AArch64] Adjust the scheduling model for Exynos M1. Further refine the model for loads. llvm-svn: 280734	2016-09-06 19:22:19 +00:00
Rafael Espindola	b940b66c60	Add an c++ itanium demangler to llvm. This adds a copy of the demangler in libcxxabi. The code also has no dependencies on anything else in LLVM. To enforce that I added it as another library. That way a BUILD_SHARED_LIBS will fail if anyone adds an use of StringRef for example. The no llvm dependency combined with the fact that this has to build on linux, OS X and Windows required a few changes to the code. In particular: No constexpr. No alignas On OS X at least this library has only one global symbol: __ZN4llvm16itanium_demangleEPKcPcPmPi My current plan is: Commit something like this Change lld to use it Change lldb to use it as the fallback Add a few #ifdefs so that exactly the same file can be used in libcxxabi to export abi::__cxa_demangle. Once the fast demangler in lldb can handle any names this implementation can be replaced with it and we will have the one true demangler. llvm-svn: 280732	2016-09-06 19:16:48 +00:00
Sanjay Patel	4e463b4a2c	fix formatting; NFC llvm-svn: 280727	2016-09-06 18:16:31 +00:00
Davide Italiano	5715012b9e	[MCTargetDesc] Delete dead code. Found by GCC7 -Wunused-function. Also unbreak newer gcc build with -Werror. llvm-svn: 280726	2016-09-06 18:02:09 +00:00
Krzysztof Parzyszek	7c9b012629	[RDF] Ignore undef use operands llvm-svn: 280717	2016-09-06 17:03:13 +00:00
Leny Kholodov	40c6235b79	Formatting with clang-format patch r280700 llvm-svn: 280716	2016-09-06 17:03:02 +00:00
Simon Pilgrim	1b4462b7c1	[SelectionDAG] Simplify extract_subvector( insert_subvector ( Vec, In, Idx ), Idx ) -> In If we are extracting a subvector that has just been inserted then we should just use the original inserted subvector. This has come up in certain several x86 shuffle lowering cases where we are crossing 128-bit lanes. Differential Revision: https://reviews.llvm.org/D24254 llvm-svn: 280715	2016-09-06 16:42:05 +00:00
Adam Nemet	c520822dbf	[JumpThreading] Only write back branch-weight MDs for blocks that originally had PGO info Currently the pass updates branch weights in the IR if the function has any PGO info (entry frequency is set). However we could still have regions of the CFG that does not have branch weights collected (e.g. a cold region). In this case we'd use static estimates. Since static estimates for branches are determined independently, they are inconsistent. Updating them can "randomly" inflate block frequencies. I've run into this in a completely cold loop of h264ref from SPEC. -Rpass-with-hotness showed the loop to be completely cold during inlining (before JT) but completely hot during vectorization (after JT). The new testcase demonstrate the problem. We check array elements against 1, 2 and 3 in a loop. The check against 3 is the loop-exiting check. The block names should be self-explanatory. In this example, jump threading incorrectly updates the weight of the loop-exiting branch to 0, drastically inflating the frequency of the loop (in the range of billions). There is no run-time profile info for edges inside the loop, so branch probabilities are estimated. These are the resulting branch and block frequencies for the loop body: check_1 (16) (8) / \| eq_1 \| (8) \ \| check_2 (16) (8) / \| eq_2 \| (8) \ \| check_3 (16) (1) / \| (loop exit) \| (15) \| (back edge) First we thread eq_1 -> check_2 to check_3. Frequencies are updated to remove the frequency of eq_1 from check_2 and then from the false edge leaving check_2. Changed frequencies are highlighted with * : check_1 (16) (8) / \| eq_1~ \| (8) / \| / check_2 (8) / (8) / \| \ eq_2 \| (0) \ \ \| ` --- check_3 (16) (1) / \| (loop exit) \| (15) \| (back edge) Next we thread eq_1 -> check_3 and eq_2 -> check_3 to check_1 as new back edges. Frequencies are updated to remove the frequency of eq_1 and eq_3 from check_3 and then the false edge leaving check_3 (changed frequencies are highlighted with ): check_1 (16) (8) / \| eq_1~ \| (8) / \| / check_2 (8) / (8) / \| /-- eq_2~ \| (0) (back edge) \| check_3 (0) (0) / \| (loop exit) \| (0*) \| (back edge) As a result, the loop exit edge ends up with 0 frequency which in turn makes the loop header to have maximum frequency. There are a few potential problems here: 1. The profile data seems odd. There is a single profile sample of the loop being entered. On the other hand, there are no weights inside the loop. 2. Based on static estimation we shouldn't set edges to "extreme" values, i.e. extremely likely or unlikely. 3. We shouldn't create profile metadata that is calculated from static estimation. I am not sure what policy is but it seems to make sense to treat profile metadata as something that is known to originate from profiling. Estimated probabilities should only be reflected in BPI/BFI. Any one of these would probably fix the immediate problem. I went for 3 because I think it's a good policy to have and added a FIXME about 2. Differential Revision: https://reviews.llvm.org/D24118 llvm-svn: 280713	2016-09-06 16:08:33 +00:00
Chris Dewhurst	92cac9322d	[Sparc][Leon] Corrected supported atomics size for processors supporting Leon CASA instruction back to 32 bits. This was erroneously checked-in for 64 bits while trying to find if there was a way to get 64 bit atomicity in Leon processors. There is not and this change should not have been checked-in. There is no unit test for this as the existing unit tests test for behaviour to 32 bits, which was the original intention of the code. llvm-svn: 280710	2016-09-06 14:41:09 +00:00
Simon Dardis	b432a3ed7e	[mips] Tighten FastISel restrictions LLVM PR/29052 highlighted that FastISel for MIPS attempted to lower arguments assuming that it was using the paired 32bit registers to perform operations for f64. This mode of operation is not supported for MIPSR6. This patch resolves the reported issue by adding additional checks for unsupported floating point unit configuration. Thanks to mike.k for reporting this issue! Reviewers: seanbruno, vkalintiris Differential Review: https://reviews.llvm.org/D23795 llvm-svn: 280706	2016-09-06 12:36:24 +00:00
Krzysztof Parzyszek	020ec299bf	[PPC] Claim stack frame before storing into it, if no red zone is present Unlike PPC64, PPC32/SVRV4 does not have red zone. In the absence of it there is no guarantee that this part of the stack will not be modified by any interrupt. To avoid this, make sure to claim the stack frame first before storing into it. This fixes https://llvm.org/bugs/show_bug.cgi?id=26519. Differential Revision: https://reviews.llvm.org/D24093 llvm-svn: 280705	2016-09-06 12:30:00 +00:00
Leny Kholodov	5fcc4185f5	DebugInfo: use strongly typed enum for debug info flags Use ADT/BitmaskEnum for DINode::DIFlags for the following purposes: Get rid of unsigned int for flags to avoid problems on platforms with sizeof(int) < 4 Flags are now strongly typed Patch by: Victor Leschuk <vleschuk@gmail.com> Differential Revision: https://reviews.llvm.org/D23766 llvm-svn: 280700	2016-09-06 10:46:28 +00:00
Silviu Baranga	0b7c4af359	[RegisterScavenger] Remove aliasing registers of operands from the candidate set Summary: In addition to not including the register operand of the current instruction also don't include any aliasing registers. We can't consider these as candidates because using them will clobber the corresponding register operand of the current instruction. This change doesn't include a test case and it would probably be difficult to produce a stable one since the bug depends on the results of register allocation. Reviewers: MatzeB, qcolombet, hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D24130 llvm-svn: 280698	2016-09-06 10:10:21 +00:00
Craig Topper	4fa3b50fc3	[AVX-512] Fix masked VPERMI2PS isel when the index comes from a bitcast. We need to bitcast the index operand to a floating point type so that it matches the result type. If not then the passthru part of the DAG will be a bitcast from the index's original type to the destination type. This makes it very difficult to match. The other option would be to add 5 sets of patterns for every other possible type. llvm-svn: 280696	2016-09-06 06:56:59 +00:00
Craig Topper	43fbd840dd	[X86] Remove unused encoding from IntrinsicType enum. llvm-svn: 280694	2016-09-06 05:45:24 +00:00
Craig Topper	a0055d315d	[X86] Fix indentation. NFC llvm-svn: 280693	2016-09-06 05:45:21 +00:00
Saleem Abdulrasool	bfa25bd1ac	ARM: workaround bundled operation predication This is a Windows ARM specific issue. If the code path in the if conversion ends up using a relocation which will form a IMAGE_REL_ARM_MOV32T, we end up with a bundle to ensure that the mov.w/mov.t pair is not split up. This is normally fine, however, if the branch is also predicated, then we end up trying to predicate the bundle. For now, report a bundle as being unpredicatable. Although this is false, this would trigger a failure case previously anyways, so this is no worse. That is, there should not be any code which would previously have been if converted and predicated which would not be now. Under certain circumstances, it may be possible to "predicate the bundle". This would require scanning all bundle instructions, and ensure that the bundle contains only predicatable instructions, and converting the bundle into an IT block sequence. If the bundle is larger than the maximal IT block length (4 instructions), it would require materializing multiple IT blocks from the single bundle. llvm-svn: 280689	2016-09-06 04:00:12 +00:00
Mehdi Amini	3821b53b84	Revert "DebugInfo: use strongly typed enum for debug info flags" This reverts commit r280686, bots are broken. llvm-svn: 280688	2016-09-06 03:26:37 +00:00
Mehdi Amini	767e1457d8	[LTO] Constify (NFC) llvm-svn: 280687	2016-09-06 03:23:45 +00:00
Mehdi Amini	356d6b636b	DebugInfo: use strongly typed enum for debug info flags Use ADT/BitmaskEnum for DINode::DIFlags for the following purposes: * Get rid of unsigned int for flags to avoid problems on platforms with sizeof(int) < 4 * Flags are now strongly typed Patch by: Victor Leschuk <vleschuk@gmail.com> Differential Revision: https://reviews.llvm.org/D23766 llvm-svn: 280686	2016-09-06 03:14:06 +00:00
Craig Topper	62d0a5e7d3	[AVX-512] Fix v8i64 shift by immediate lowering on 32-bit targets. llvm-svn: 280684	2016-09-06 00:31:10 +00:00
Saleem Abdulrasool	a6519b1d54	CodeGen: ensure that libcalls are always AAPCS CC All of the builtins are designed to be invoked with ARM AAPCS CC even on ARM AAPCS VFP CC hosts. Tweak the default initialisation to ARM AAPCS CC rather than C CC for ARM/thumb targets. The changes to the tests are necessary to ensure that the calling convention for the lowered library calls are honoured. Furthermore, these adjustments cause certain branch invocations to change to branch-and-link since the returned value needs to be moved across registers (d0 -> r0, r1). llvm-svn: 280683	2016-09-06 00:28:43 +00:00
Craig Topper	dfc4fc9f02	[AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions for 128/256-bit vectors and scalar single/double. Still need to fix the register classes to allow the extended range of registers. llvm-svn: 280682	2016-09-05 23:58:40 +00:00
Gor Nishanov	ccabaca273	[Coroutines] Part12: Handle alloca address-taken Summary: Move early uses of spilled variables after CoroBegin. For example, if a parameter had address taken, we may end up with the code like: define @f(i32 %n) { %n.addr = alloca i32 store %n, %n.addr ... call @coro.begin This patch fixes the problem by moving uses of spilled variables after CoroBegin. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24234 llvm-svn: 280678	2016-09-05 23:45:45 +00:00
Sanjay Patel	eea2ef7862	[InstCombine] don't assert that division-by-constant has been folded (PR30281) This is effectively a revert of: https://reviews.llvm.org/rL280115 And this should fix https://llvm.org/bugs/show_bug.cgi?id=30281: llvm-svn: 280677	2016-09-05 23:38:22 +00:00
Sanjay Patel	46f9df5b71	[InstCombine] revert r280637 because it causes test failures on an ARM bot http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/14952/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Aicmp.ll llvm-svn: 280676	2016-09-05 22:36:32 +00:00
Craig Topper	93f7b5699b	[AVX-512] Integrate mask register copying more completely into X86InstrInfo::copyPhysReg and simplify. No functional change intended. The code is now written in terms of source and dest classes with feature checks inside each type of copy instead of having separate functions for each feature set. llvm-svn: 280673	2016-09-05 20:34:50 +00:00
Benjamin Kramer	ef0a45aaa5	[WebAssembly] Unbreak the build. Not sure why ADL isn't working here. llvm-svn: 280656	2016-09-05 12:06:47 +00:00
Valery Pykhtin	8bc659637c	[AMDGPU] Refactor FLAT TD instructions Differential revision: https://reviews.llvm.org/D24072 llvm-svn: 280655	2016-09-05 11:22:51 +00:00
James Molloy	728cf85950	[Thumb1] Add relocations for fixups fixup_arm_thumb_{br,bcc} These need to be mapped through to R_ARM_THM_JUMP{11,8} respectively. Fixes PR30279. llvm-svn: 280651	2016-09-05 08:29:15 +00:00
Igor Breger	a2f8ca9a34	[AVX512] Fix v8i1 /v16i1 zext + bitcast lowering pattern. Explicitly zero upper bits. Differential Revision: http://reviews.llvm.org/D23983 llvm-svn: 280650	2016-09-05 08:26:51 +00:00
Craig Topper	428169a5d6	[X86] Make some static arrays of opcodes const and shrink to uint16_t. NFC llvm-svn: 280649	2016-09-05 07:14:21 +00:00
Craig Topper	d9ca3d97ef	[AVX-512] Simplify X86InstrInfo::copyPhysReg for 128/256-bit vectors with AVX512, but not VLX. We should use the VEX opcodes and trust the register allocator to not use the extended XMM/YMM register space. Previously we were extending to copying the whole ZMM register. The register allocator shouldn't use XMM16-31 or YMM16-31 in this configuration as the instructions to spill them aren't available. llvm-svn: 280648	2016-09-05 06:43:06 +00:00
Gor Nishanov	0e18f75a92	[Coroutines] Part11: Add final suspend handling. Summary: A frontend may designate a particular suspend to be final, by setting the second argument of the coro.suspend intrinsic to true. Such a suspend point has two properties: * it is possible to check whether a suspended coroutine is at the final suspend point via coro.done intrinsic; * a resumption of a coroutine stopped at the final suspend point leads to undefined behavior. The only possible action for a coroutine at a final suspend point is destroying it via coro.destroy intrinsic. This patch adds final suspend handling logic to CoroEarly and CoroSplit passes. Now, the final suspend point example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex5.ll). Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24068 llvm-svn: 280646	2016-09-05 04:44:30 +00:00
Craig Topper	e3807febd8	[X86] Remove FsVMOVAPSrm/FsVMOVAPDrm/FsMOVAPSrm/FsMOVAPDrm. Due to their placement in the td file they had lower precedence than (V)MOVSS/SD and could almost never be selected. The only way to select them was in AVX512 mode because EVEX VMOVSS/SD was below them and the patterns weren't qualified properly for AVX only. So if you happened to have an aligned FR32/FR64 load in AVX512 you could get a VEX encoded VMOVAPS/VMOVAPD. I tried to search back through history and it seems like these instructions were probably unselectable for at least 5 years, at least to the time the VEX versions were added. But I can't prove they ever were. llvm-svn: 280644	2016-09-05 02:20:49 +00:00
Sanjay Patel	c641e9d6ff	[InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectors The code to calculate 'UsesRemoved' could be simplified. As-is, that code is a victim of PR30273: https://llvm.org/bugs/show_bug.cgi?id=30273 llvm-svn: 280637	2016-09-04 20:58:27 +00:00
Craig Topper	040b10784e	[AVX-512] Add EVEX encoded scalar FMA intrinsic instructions to isNonFoldablePartialRegisterLoad. llvm-svn: 280636	2016-09-04 19:33:47 +00:00
Craig Topper	4177345d7f	[AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div intrinsics and upgrade to native IR. llvm-svn: 280633	2016-09-04 18:13:33 +00:00
Lang Hames	38c7927b6f	[ORC] Clone module flags metadata into the globals module in the CompileOnDemandLayer. Also contains a tweak to the orc-lazy jit in LLI to enable the test case. llvm-svn: 280632	2016-09-04 17:53:30 +00:00
Sanjay Patel	6b4909749b	[InstCombine] recode icmp fold in a vector-friendly way; NFC The transform in question: icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1' ...is still not enabled for vectors, thus no functional change intended. It's not clear to me if this is a good transform for vectors or even scalars in general. Changing that behavior may be a follow-on patch. llvm-svn: 280627	2016-09-04 14:32:15 +00:00
Hal Finkel	f0bc9db96e	[PowerPC] During branch relaxation, recompute padding offsets before each iteration We used to compute the padding contributions to the block sizes during branch relaxation only at the start of the transformation. As we perform branch relaxation, we change the sizes of the blocks, and so the amount of inter-block padding might change. Accordingly, we need to recompute the (alignment-based) padding in between every iteration on our way toward the fixed point. Unfortunately, I don't have a test case (and none was provided in the bug report), and while this obviously seems needed, algorithmically, I don't have any way of generating a small and/or non-fragile regression test. llvm-svn: 280626	2016-09-04 14:18:29 +00:00
Igor Breger	7e2a0dfa0c	revert r279960. https://llvm.org/bugs/show_bug.cgi?id=30249 llvm-svn: 280625	2016-09-04 14:03:52 +00:00
Simon Pilgrim	122b0de1c1	Strip trailing whitespace llvm-svn: 280623	2016-09-04 13:28:46 +00:00
Chandler Carruth	11b3f60cd9	[LCG] Clean up and make NDEBUG verify calls more rigorous with make_scope_exit now that we have that utility. This makes the code much more clear and readable by isolating the check. It also makes it easy to go through and make sure all the interesting update routines have a start and end verify so we don't slowly let the graph drift into an invalid state. llvm-svn: 280619	2016-09-04 08:34:31 +00:00
Chandler Carruth	1f621f0a70	[LCG] A NFC refactoring to extract the logic for doing a postorder-sequence based update after edge insertion into a generic helper function. This separates the SCC-specific logic into two fairly simple lambdas and extracts the rest into a generic helper template function. I think this is a net win on its own merits because it disentangles different pieces of the algorithm. Now there is one place that does the two-step partition to identify a set of newly connected components and at the same time update the postorder sequence. However, I'm also hoping to re-use this an upcoming patch to update a cached post-order sequence of RefSCCs when doing the analogous update to the RefSCC graph, and I don't want to have two copies. The diff is quite messy but this really is just moving things around and making types generic rather than specific. llvm-svn: 280618	2016-09-04 08:34:24 +00:00
Dorit Nuzman	abd15f69b2	[InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing memcpy with ld/st. When InstCombine replaces a memcpy with loads+stores it does not copy over the llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes that. Differential Revision: https://reviews.llvm.org/D23499 llvm-svn: 280617	2016-09-04 07:49:39 +00:00
Lang Hames	3301c7eebd	[ExecutionEngine] Move ObjectCache::anchor from MCJIT to ExecutionEngine. ObjectCache is an ExecutionEngine utility, so its anchor belongs there. The practical impact of this change is that ORC users no longer need to link MCJIT to use ObjectCaches. llvm-svn: 280616	2016-09-04 07:24:11 +00:00
Dorit Nuzman	7673ba7ac2	Test commit. llvm-svn: 280615	2016-09-04 07:06:00 +00:00
Hal Finkel	73390c7acd	[PowerPC] Zero-extend constants in FastISel As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which are illegal types promoted to i32 on PowerPC, is a choice constrained by assumptions within the infrastructure. Specifically, the logic in FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI operands will be zero extended, and so, at least when materializing constants that are PHI operands, we must do the same. The rest of our fast-isel implementation does not appear to depend on the fact that we were sign-extending i8/i16 constants, and all other targets also appear to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we had been doing this only for i1 constants, and sign-extending the others). Fixes PR27721. llvm-svn: 280614	2016-09-04 06:07:19 +00:00
Craig Topper	af0d63d2e7	[AVX-512] Remove masked integer add/sub/mull intrinsics and upgrade to native IR. llvm-svn: 280611	2016-09-04 02:09:53 +00:00
Joseph Tremoulet	e92e0a9042	Fix inliner funclet unwind memoization Summary: The inliner may need to determine where a given funclet unwinds to, and this determination may depend on other funclets throughout the funclet tree. The code that performs this walk in getUnwindDestToken memoizes results to avoid redundant computations. In the case that a funclet's unwind destination is derived from its ancestor, there's code to walk back down the tree from the ancestor updating the memo map of its descendants to record the unwind destination. This change fixes that code to account for the case that some descendant has a different unwind destination, which can happen if that unwind dest is a descendant of the EHPad being queried and thus didn't determine its unwind destination. Also update test inline-funclets.ll, which is supposed to cover such scenarios, to include a case that fails an assertion without this fix but passes with it. Fixes PR29151. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24117 llvm-svn: 280610	2016-09-04 01:23:20 +00:00
Craig Topper	a57d2ca406	[X86] Combine some of the strings in autoupgrade code. llvm-svn: 280603	2016-09-03 23:55:13 +00:00
Xinliang David Li	7a28a7fbd8	Cleanup : Use metadata preserving API for branch creation Use the wrapper API in IRBuilder that does meta data copy to create new branch in LoopUnswitch. llvm-svn: 280602	2016-09-03 22:26:11 +00:00
Xinliang David Li	241e6c7086	[Profile] preserve branch metadata lowering select in CGP CGP currently drops select's MD_prof profile data when generating conditional branch which can lead to bad code layout. The patch fixes the issue. Differential Revision: http://reviews.llvm.org/D24169 llvm-svn: 280600	2016-09-03 21:26:36 +00:00
Mehdi Amini	ebb3434850	Fix ThinLTO crash with debug info Because the recent change about ODR type uniquing in the context, we can reach types defined in another module during IR linking. This triggered some assertions in case we IR link without starting from an empty module. To alleviate that, we can self-map metadata defined in the destination module so that they won't be visited. Differential Revision: https://reviews.llvm.org/D23841 llvm-svn: 280599	2016-09-03 21:12:33 +00:00
Simon Pilgrim	3606d2346c	Strip trailing whitespace llvm-svn: 280598	2016-09-03 20:36:05 +00:00
Matt Arsenault	ac42ba8633	AMDGPU: Set sizes of spill pseudos llvm-svn: 280595	2016-09-03 17:25:44 +00:00
Matt Arsenault	5ffe3e1d93	AMDGPU: Fix adding duplicate implicit exec uses I'm not sure if this should be considered a bug in copyImplicitOps or not, but implicit operands that are part of the static instruction definition should not be copied. llvm-svn: 280594	2016-09-03 17:25:39 +00:00
Craig Topper	907b580d72	[AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test. llvm-svn: 280593	2016-09-03 17:20:07 +00:00
Craig Topper	392cd0300d	[AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE equivalent. llvm-svn: 280592	2016-09-03 16:28:03 +00:00
Nicolai Haehnle	3bba6a8438	AMDGPU: Reduce the duration of whole-quad-mode Summary: This contains two changes that reduce the time spent in WQM, with the intention of reducing bandwidth required by VMEM loads: 1. Sampling instructions by themselves don't need to run in WQM, only their coordinate inputs need it (unless of course there is a dependent sampling instruction). The initial scanInstructions step is modified accordingly. 2. When switching back from WQM to Exact, switch back as soon as possible. This affects the logic in processBlock. This should always be a win or at best neutral. There are also some cleanups (e.g. remove unused ExecExports) and some new debugging output. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22092 llvm-svn: 280590	2016-09-03 12:26:38 +00:00
Nicolai Haehnle	a246dccc26	AMDGPU: Fix an interaction between WQM and polygon stippling Summary: This fixes a rare bug in polygon stippling with non-monolithic pixel shaders. The underlying problem is as follows: the prolog part contains the polygon stippling sequence, i.e. a kill. The main part then enables WQM based on the _reduced_ exec mask, effectively undoing most of the polygon stippling. Since we cannot know whether polygon stippling will be used, the main part of a non-monolithic shader must always return to exact mode to fix this problem. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23131 llvm-svn: 280589	2016-09-03 12:26:32 +00:00
Matt Arsenault	46a0382ab2	AMDGPU: Do basic folding of class intrinsic This allows more of the OCML builtin library to be constant folded. llvm-svn: 280586	2016-09-03 07:06:58 +00:00
Matt Arsenault	2510a31677	AMDGPU: Fix spilling of m0 readlane/writelane do not support using m0 as the output/input. Constrain the register class of spill vregs to try to avoid this, but also handle spilling of the physreg when necessary by inserting an additional copy to a normal SGPR. llvm-svn: 280584	2016-09-03 06:57:55 +00:00
Matt Arsenault	f3d1a1a1b6	Improve debug error message with register name llvm-svn: 280583	2016-09-03 06:57:49 +00:00
Craig Topper	892ce56901	[AVX-512] Add EVEX encoded VPCMPEQ and VPCMPGT to the load folding tables. llvm-svn: 280581	2016-09-03 04:37:50 +00:00
Hal Finkel	522e4d9d66	[PowerPC] Support asm parsing for bc[l][a][+-] mnemonics PowerPC assembly code in the wild, so it seems, has things like this: bc+ 12, 28, .L9 This is a bit odd because the '+' here becomes part of the BO field, and the BO field is otherwise the first operand. Nevertheless, the ISA specification does clearly say that the +- hint syntax applies to all conditional-branch mnemonics (that test either CTR or a condition register, although not the forms which check both), both basic and extended, so this is supposed to be valid. This introduces some asm-parser-only definitions which take only the upper three bits from the specified BO value, and the lower two bits are implied by the +- suffix (via some associated aliases). Fixes PR23646. llvm-svn: 280571	2016-09-03 02:31:44 +00:00
Duncan P. N. Exon Smith	4e229a7c0a	ADT: Do not inherit from std::iterator in ilist_iterator Inheriting from std::iterator uses more boiler-plate than manual typedefs. Avoid that in both ilist_iterator and MachineInstrBundleIterator. This has the side effect of removing ilist_iterator from certain ADL lookups in namespace std; calls to std::next need to be qualified by "std::" that didn't have to before. The one case of this in-tree was operating on a temporary, so I used the more compact operator++. llvm-svn: 280570	2016-09-03 02:27:35 +00:00
Duncan P. N. Exon Smith	8cc24eadd2	ADT: Remove external uses of ilist_iterator, NFC Delete the dead code for Write(ilist_iterator) in the IR Verifier, inline report(ilist_iterator) at its call sites in the MachineVerifier, and use simple_ilist<>::iterator in SymbolTableListTraits. The only remaining reference to ilist_iterator outside of the ilist implementation is from MachineInstrBundleIterator. I'll get rid of that in a follow-up. llvm-svn: 280565	2016-09-03 01:22:56 +00:00
Hal Finkel	28842b96f3	[PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev These few book-III instructions are used by the Linux kernel. Partially fixes PR24796. llvm-svn: 280560	2016-09-02 23:42:01 +00:00
Hal Finkel	277736eee6	[PowerPC] Add support for the extended dcbf form and mnemonics dcbf has an optional hint-like field, add support for the extended form and the associated mnemonics (dcbfl and dcbflp). Partially fixes PR24796. llvm-svn: 280559	2016-09-02 23:41:54 +00:00
Yunzhong Gao	27ea29b3b7	(LLVM part) Implement MASM-flavor intel syntax behavior for inline MS asm block: 1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not. 0xNN and NNh may come with optional U or L suffix. 2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not. NNb may come with optional U or L suffix. Differential Revision: https://reviews.llvm.org/D22112 llvm-svn: 280555	2016-09-02 23:15:29 +00:00
Ron Lieberman	88159e5549	Make sure to maintain register liveness when generating predicated instructions. Author: Krzysztof Parzyszek <kparzysz@codeaurora.org> Differential Revision: https://reviews.llvm.org/D24209 llvm-svn: 280552	2016-09-02 22:56:24 +00:00
Xinliang David Li	40afd5c9e4	[Profile] handle select instruction in 'expect' lowering Builtin expect lowering currently ignores select. This patch fixes the issue Differential Revision: http://reviews.llvm.org/D24166 llvm-svn: 280547	2016-09-02 22:03:40 +00:00
Hal Finkel	7b104d4721	[PowerPC] For larger offsets, when possible, fold offset into addis toc@ha When we have an offset into a global, etc. that is accessed relative to the TOC base pointer, and the offset is larger than the minimum alignment of the global itself and the TOC base pointer (which is 8-byte aligned), we can still fold the @toc@ha into the memory access, but we must update the addis instruction's symbol reference with the offset as the symbol addend. When there is only one use of the addi to be folded and only one use of the addis that would need its symbol's offset adjusted, then we can make the adjustment and fold the @toc@l into the memory access. llvm-svn: 280545	2016-09-02 21:37:07 +00:00
James Y Knight	6ef32bf2af	[Sparc] Mark i128 shift libcalls unavailable in 32-bit mode. Recently, llvm wants to emit calls to these functions, while it didn't seem to be an issue before. Not sure why. Nor do I know why only these three are important to disable, out of all of the i128 libcalls. Nevertheless, many other targets have this snippet of code, so, just copying it to sparc as well, to unbreak things. llvm-svn: 280537	2016-09-02 20:29:11 +00:00
Jan Vesely	ea45746d5a	AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements. Fixes R600 piglit regressions since r280298 Differential Revision: https://reviews.llvm.org/D24174 llvm-svn: 280535	2016-09-02 20:13:19 +00:00
Sjoerd Meijer	6c4140b6c0	Setting fp trapping mode and denormal type: this an improvement of r280246 and calculates compatibility of functions attributes in a better way. Differential Revision: https://reviews.llvm.org/D24070 llvm-svn: 280534	2016-09-02 19:51:34 +00:00
Krzysztof Parzyszek	3bf4aeccd6	Do not consider subreg defs as reads when computing subrange liveness Subregister definitions are considered uses for the purpose of tracking liveness of the whole register. At the same time, when calculating live interval subranges, subregister defs should not be treated as uses. Differential Revision: https://reviews.llvm.org/D24190 llvm-svn: 280532	2016-09-02 19:48:55 +00:00
Chad Rosier	c50dfe38ac	[SLP] Don't pass a global CL option as an argument. NFC. Differential Revision: https://reviews.llvm.org/D24199 llvm-svn: 280527	2016-09-02 19:09:50 +00:00
Jan Vesely	00864886f4	AMDGPU/R600: Expand unaligned writes to local and global AS LOCAL and GLOBAL AS only PRIVATE needs special treatment Differential Revision: https://reviews.llvm.org/D23971 llvm-svn: 280526	2016-09-02 19:07:06 +00:00
Reid Kleckner	fa28396f97	[codeview] Use the correct max CV record length of 0xFF00 Previously we were splitting our records at 0xFFFF bytes, which the Microsoft tools don't like. Should fix failure on the new Windows self-host buildbot. This length appears in microsoft-pdb/PDB/dbi/dbiimpl.h llvm-svn: 280522	2016-09-02 18:43:27 +00:00
Kyle Butt	e31cc84290	IfConversion: Add assertions that both sides of a diamond don't pred-clobber. One side of a diamond may end with a predicate clobbering instruction. That side of the diamond has to be if-converted second. Both sides can't clobber the predicate or the ifconversion is invalid. This is checked elsewhere, but add an assert as a safety check. NFC llvm-svn: 280518	2016-09-02 18:29:28 +00:00
Kyle Butt	8699921c4b	IfConversion: Fix bug introduced by rescanning diamonds. Passing the wrong values for predicate-clobbering. Simple to miss. Added an assert to make this easier to catch in the future. llvm-svn: 280517	2016-09-02 18:29:26 +00:00
Wei Mi	c54d1298f5	Split the store of a wide value merged from an int-fp pair into multiple stores. For the store of a wide value merged from a pair of values, especially int-fp pair, sometimes it is more efficent to split it into separate narrow stores, which can remove the bitwise instructions or sink them to colder places. Now the feature is only enabled on x86 target, and only store of int-fp pair is splitted. It is possible that the application scope gets extended with perf evidence support in the future. Differential Revision: https://reviews.llvm.org/D22840 llvm-svn: 280505	2016-09-02 17:17:04 +00:00
Sanjay Patel	521f19f249	[InsttCombine] fold insertelement of constant into shuffle with constant operand (PR29126) The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards shrinking that to a single shufflevector. Note that the transform is intentionally limited to shuffles that are equivalent to vector selects to avoid creating arbitrary shuffle masks that may not lower well. This should solve PR29126: https://llvm.org/bugs/show_bug.cgi?id=29126 Differential Revision: https://reviews.llvm.org/D23886 llvm-svn: 280504	2016-09-02 17:05:43 +00:00
Davide Italiano	41bfd6bd6c	[lib/LTO] Simplify. No functional change intended. llvm-svn: 280503	2016-09-02 16:37:31 +00:00
Derek Schuff	a66ae923e4	[WebAssembly] Update known test failures Fixed an issue with the experimental C headers llvm-svn: 280498	2016-09-02 16:26:24 +00:00
Matthew Simpson	b65c230eab	[LV] Ensure reverse interleaved group GEPs remain uniform For uniform instructions, we're only required to generate a scalar value for the first vector lane of each unroll iteration. Thus, if we have a reverse interleaved group, computing the member index off the scalar GEP corresponding to the last vector lane of its pointer operand technically makes the GEP non-uniform. We should compute the member index off the first scalar GEP instead. I've added the updated member index computation to the existing reverse interleaved group test. llvm-svn: 280497	2016-09-02 16:19:22 +00:00
Andrea Di Biagio	bff3fd6700	Simplify code a bit. No functional change intended. We don't need to call `GetCompareTy(LHS)' every single time true or false is returned from function SimplifyFCmpInst as suggested by Sanjay in review D24142. llvm-svn: 280491	2016-09-02 15:55:25 +00:00
Sanjay Patel	bd51c164d9	fix documentation comments; NFC llvm-svn: 280489	2016-09-02 15:43:25 +00:00
Andrea Di Biagio	805815f407	[instsimplify] Fix incorrect folding of an ordered fcmp with a vector of all NaN. This patch fixes a crash caused by an incorrect folding of an ordered comparison between a packed floating point vector and a splat vector of NaN. An ordered comparison between a vector and a constant vector of NaN, should always be folded into a constant vector where each element is i1 false. Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar 'false'. Later on, this would cause an assertion failure, since the value type of the folded value doesn't match the expected value type of the uses of the original instruction: "Assertion failed: New->getType() == getType() && "replaceAllUses of value with new value of different type!". This patch fixes the issue and adds a test case to the already existing test InstSimplify/floating-point-compares.ll. Differential Revision: https://reviews.llvm.org/D24143 llvm-svn: 280488	2016-09-02 14:47:43 +00:00
Andrea Di Biagio	fd503e5af3	[DAGcombiner] Fix incorrect sinking of a truncate into the operand of a shift. This fixes a regression introduced by revision 268094. Revision 268094 added the following dag combine rule: // trunc (shl x, K) -> shl (trunc x), K => K < vt.size / 2 That rule converts a truncate of a shift-by-constant into a shift of a truncated value. We do this only if the shift count is less than half the size in bits of the truncated value (K < vt.size / 2). The problem is that the constraint on the shift count is incorrect, so the rule doesn't work well in some cases involving vector types. The combine rule should have been written instead like this: // trunc (shl x, K) -> shl (trunc x), K => K < vt.getScalarSizeInBits() Basically, if K is smaller than the "scalar size in bits" of the truncated value then we know that by "sinking" the truncate into the operand of the shift we would never accidentally make the shift undefined. This patch fixes the check on the shift count, and adds test cases to make sure that we don't regress the behavior. Differential Revision: https://reviews.llvm.org/D24154 llvm-svn: 280482	2016-09-02 11:29:09 +00:00
George Rimar	d8dfeec019	[Support] - Fix possible crash in match() of llvm::Regex. Crash was possible if match() method was called on object that was moved or object created with empty constructor. Testcases updated. DIfferential revision: https://reviews.llvm.org/D24123 llvm-svn: 280473	2016-09-02 08:44:46 +00:00
James Molloy	f3cf2a494b	[SimplifyCFG] Add a workaround to fix PR30188 We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions. The real fix is in SROA, which I'll be looking into. llvm-svn: 280470	2016-09-02 07:29:00 +00:00
Craig Topper	e75c49543c	[AVX-512] Remove floating point logical operation instrinsics and replace them with native IR. llvm-svn: 280466	2016-09-02 05:29:17 +00:00
Craig Topper	45d6503089	[AVX-512] Add more patterns for masked and broadcasted logical operations where the select or broadcast has a floating point type. These are needed in order to remove the masked floating point logical operation intrinsics and use native IR. llvm-svn: 280465	2016-09-02 05:29:13 +00:00
Craig Topper	00aecd97bf	[AVX-512] Add execution domain fixing for logical operations with broadcast loads. This builds on the handling of masked ops since we need to keep element size the same. llvm-svn: 280464	2016-09-02 05:29:09 +00:00
Craig Topper	f8ad647b93	[X86] Strengthen some SDNode type constraints. llvm-svn: 280463	2016-09-02 04:25:33 +00:00
Craig Topper	8b9e671e97	[AVX-512] Add NoVLX Predicates to some patterns so they don't rely on pattern ordering to be lower priority than their equivalent VLX pattern. llvm-svn: 280462	2016-09-02 04:25:30 +00:00
Hal Finkel	5ef4b03106	[PowerPC] hasAndNotCompare should return true As Sanjay suggested when he added the hook, PPC should return true from hasAndNotCompare. We have an efficient negated 'and' on PPC (which can feed a compare). Fixes PR27203. llvm-svn: 280457	2016-09-02 02:58:25 +00:00
Hal Finkel	a39fd4bc53	[PowerPC] Add a pattern for a runtime bit check Following a suggestion by Sanjay, we should lower: %shl = shl i32 1, %y %and = and i32 %x, %shl %cmp = icmp eq i32 %and, %shl ret i1 %cmp into: subfic r4, r4, 32 rlwnm r3, r3, r4, 31, 31 Add this pattern and some associated patterns for the 64-bit case and the not-equal case. Fixes PR27356. llvm-svn: 280454	2016-09-02 02:34:44 +00:00
Dehao Chen	4b5e7f750c	revert r280429 and r280425: r280425 \| dehao \| 2016-09-01 16:15:50 -0700 (Thu, 01 Sep 2016) \| 9 lines Refactor LICM pass in preparation for LoopSink pass. Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778). r280429 \| dehao \| 2016-09-01 16:31:25 -0700 (Thu, 01 Sep 2016) \| 9 lines Refactor LICM to expose canSinkOrHoistInst to LoopSink pass. Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778 llvm-svn: 280453	2016-09-02 01:59:27 +00:00
Dehao Chen	820372c0ed	revert r280432: r280432 \| dehao \| 2016-09-01 16:51:37 -0700 (Thu, 01 Sep 2016) \| 9 lines Explicitly require DominatorTreeAnalysis pass for instsimplify pass. Summary: DominatorTreeAnalysis is always required by instsimplify. llvm-svn: 280452	2016-09-02 01:47:13 +00:00
Kyle Butt	93e94e8a12	IfConversion: Don't count branches in # of duplicates. If the entire blocks match, we would count the branch instructions toward the number of duplicated instructions. This doesn't match what we do elsewhere, and was causing a bug. llvm-svn: 280448	2016-09-02 01:20:06 +00:00
Reid Kleckner	1a4398a198	Fix a real temp file leak in FileOutputBuffer If we failed to commit the buffer but did not die to a signal, the temp file would remain on disk on Windows. Having an open file mapping and file handle prevents the file from being deleted. I am choosing not to add an assertion of success on the temp file removal, since virus scanners and other environmental things can often cause removal to fail in real world tools. Also fix more temp file leaks in unit tests. llvm-svn: 280445	2016-09-02 01:10:53 +00:00
Hal Finkel	b54579fab6	[PowerPC] Don't apply the PPC64 address-formation peephole for offsets greater than 7 When applying our address-formation PPC64 peephole, we are reusing the @ha TOC addis value with the low parts associated with different offsets (i.e. different effective symbol addends). We were assuming this was okay so long as the offsets were less than the alignment of the global variable being accessed. This ignored the fact, however, that the TOC base pointer itself need only be 8-byte aligned. As a result, what we were doing is legal only for offsets less than 8 regardless of the alignment of the object being accessed. Fixes PR28727. llvm-svn: 280441	2016-09-02 00:28:20 +00:00
Hal Finkel	1e8218cc09	[PowerPC] Don't consider fusion in PPC64 address-formation peephole The logic in this function assumes that the P8 supports fusion of addis/addi, but it does not. As a result, there is no advantage to restricting our peephole application, merging addi instructions into dependent memory accesses, even when the addi has multiple users, regardless of whether or not we're optimizing for size. We might need something like this again for the P9; I suspect we'll revisit this code when we work on P9 tuning. llvm-svn: 280440	2016-09-02 00:27:50 +00:00
Dehao Chen	e573b77772	Explicitly require DominatorTreeAnalysis pass for instsimplify pass. Summary: DominatorTreeAnalysis is always required by instsimplify. Reviewers: davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24173 llvm-svn: 280432	2016-09-01 23:51:37 +00:00
Aditya Kumar	356f79d535	[SelectionDAGBuilder] Add const to relevant places Reviewers: hans, evandro, sebpop Differential Revision: https://reviews.llvm.org/D24112 llvm-svn: 280430	2016-09-01 23:35:26 +00:00
Dehao Chen	e81d50b3b9	Refactor LICM to expose canSinkOrHoistInst to LoopSink pass. Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778 Reviewers: chandlerc, davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24171 llvm-svn: 280429	2016-09-01 23:31:25 +00:00
Dehao Chen	ddd0c125e3	Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself. Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking. Reviewers: chandlerc, davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24170 llvm-svn: 280427	2016-09-01 23:26:48 +00:00
Dehao Chen	bc4e5bba0e	Refactor LICM pass in preparation for LoopSink pass. Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778). Reviewers: chandlerc, davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24168 llvm-svn: 280425	2016-09-01 23:15:50 +00:00
Michael Kuperstein	7bc54cebea	[Legalizer] Don't throw away false low half when expanding GT/LT SETCC When expanding a SETCC for which the low half is known to evaluate to false, we can only throw it away for LT/GT comparisons, not LE/GE. This fixes PR29170. Differential Revision: https://reviews.llvm.org/D24151 llvm-svn: 280424	2016-09-01 23:02:32 +00:00
Michael Kuperstein	5f17d08f49	[SelectionDAG] Generate vector_shuffle nodes for undersized result vector sizes Prior to this, we could generate a vector_shuffle from an IR shuffle when the size of the result was exactly the sum of the sizes of the input vectors. If the output vector was narrower - e.g. a <12 x i8> being formed by a shuffle with two <8 x i8> inputs - we would lower the shuffle to a sequence of extracts and inserts. Instead, we can form a larger vector_shuffle, and then extract a subvector of the right size - e.g. shuffle the two <8 x i8> inputs into a <16 x i8> and then extract a <12 x i8>. This also includes a target-specific X86 combine that in the presence of AVX2 combines: (vector_shuffle <mask> (concat_vectors t1, undef) (concat_vectors t2, undef)) into: (vector_shuffle <mask> (concat_vectors t1, t2), undef) in cases where this allows us to form VPERMD/VPERMQ. (This is not a separate commit, as that pattern does not appear without the DAGBuilder change.) llvm-svn: 280418	2016-09-01 21:32:09 +00:00
Heejin Ahn	c0f18172f5	[WebAssembly] Add asm.js-style setjmp/longjmp handling for wasm (reland r280302) Summary: This patch adds asm.js-style setjmp/longjmp handling support for WebAssembly. It also uses JavaScript's try and catch mechanism. Reviewers: jpp, dschuff Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D24121 llvm-svn: 280415	2016-09-01 21:05:15 +00:00
Tim Northover	8d8812c5d7	GlobalISel: add a G_PHI instruction to give phis a type. They're another source of generic vregs, which are going to need a type on the definition when we remove the register width from MachineRegisterInfo. llvm-svn: 280412	2016-09-01 20:45:41 +00:00
Reid Kleckner	d1882f2188	Fix the ASan fuse-lld.cc test after LLD r280012 With that change, images built with 'lld-link /debug' always have a debug directory. If no PDB filename was passed on the command line, then the filename in the executable is empty. PDB information would never work anyway if the PDB file name is empty, so go ahead and try DWARF in that case. llvm-svn: 280410	2016-09-01 20:28:59 +00:00
Matthew Simpson	9e7851ed31	[LV] Use ScalarParts for ad-hoc pointer IV scalarization (NFCI) We can now maintain scalar values in VectorLoopValueMap. Thus, we no longer have to create temporary vectors with insertelement instructions when handling pointer induction variables. This case was mistakenly missed from r279649 when refactoring the other scalarization code. llvm-svn: 280405	2016-09-01 19:40:19 +00:00
Andrey Turetskiy	cde38b6a99	[X86] Loosen memory folding requirements for cvtdq2pd and cvtps2pd instructions. According to spec cvtdq2pd and cvtps2pd instructions don't require memory operand to be aligned to 16 bytes. This patch removes this requirement from the memory folding table. Differential Revision: https://reviews.llvm.org/D23919 llvm-svn: 280402	2016-09-01 18:50:02 +00:00
Yaxun Liu	add05a8d95	AMDGPU: Add runtime metadata for pointee alignment of argument. Add runtime metdata for pointee alignment of pointer type kernel argument. The key is KeyArgPointeeAlign and the value is a 32 bit unsigned integer. Differential Revision: https://reviews.llvm.org/D24145 llvm-svn: 280399	2016-09-01 18:46:49 +00:00
Davide Italiano	22d36c167e	[lib/LTO] Simplify a bit. NFCI. llvm-svn: 280396	2016-09-01 18:34:47 +00:00
Michael Kuperstein	b4743597bd	Rename some variables to have meaningful names. NFC. llvm-svn: 280391	2016-09-01 18:24:42 +00:00
Matthew Simpson	922af076c7	[LV] Move VectorParts allocation and mapping into PHI widening (NFC) This patch moves the allocation of VectorParts for PHI nodes into the actual PHI widening code. Previously, we allocated these VectorParts in vectorizeBlockInLoop, and passed them by reference to widenPHIInstruction. Upon returning, we would then map the VectorParts in VectorLoopValueMap. This behavior is problematic for the cases where we only want to generate a scalar version of a PHI node. For example, if in the future we only generate a scalar version of an induction variable, we would end up inserting an empty vector entry into the map once we return to vectorizeBlockInLoop. We now no longer need to pass VectorParts to the various PHI widening functions, and we can keep VectorParts allocation as close as possible to the point at which they are actually mapped in VectorLoopValueMap. llvm-svn: 280390	2016-09-01 18:14:27 +00:00
Zachary Turner	5c7c2307a8	[codeview] Properly propagate the TypeLeafKind through the pipeline. llvm-svn: 280388	2016-09-01 18:08:19 +00:00
Michael Kuperstein	65bc3c89ff	[DAGCombine] Don't fold a trunc if it feeds an anyext Legalization tends to create anyext(trunc) patterns. This should always be combined - into either a single trunc, a single ext, or nothing if the types match exactly. But if we happen to combine the trunc first, we may pull the trunc away from the anyext or make it implicit (e.g. the truncate(extract) -> extract(bitcast) fold). To prevent this, we can avoid doing the fold, similarly to how we already handle fpround(fpextend). Differential Revision: https://reviews.llvm.org/D23893 llvm-svn: 280386	2016-09-01 17:59:24 +00:00
Changpeng Fang	b28fe0307f	AMDGPU/SI: MIMG TD Refactoring. Summary: Created a new td file MIMGInstructions.td which contains all definitions of MIMG related instructions. Reviewed by: kzhuravl, vpykhtin Differential Revision: http://reviews.llvm.org/D24106 llvm-svn: 280385	2016-09-01 17:54:54 +00:00
Geoff Berry	fcb186ca9d	[EarlyCSE] Change C API pass interface for EarlyCSE w/ MemorySSA Previous change broke the C API for creating an EarlyCSE pass w/ MemorySSA by adding a bool parameter to control whether MemorySSA was used or not. This broke the OCaml bindings. Instead, change the old C API entry point back and add a new one to request an EarlyCSE pass with MemorySSA. llvm-svn: 280379	2016-09-01 15:07:46 +00:00
Simon Dardis	1fa1fb0f8d	[mips] Include missed file from previous commit llvm-svn: 280377	2016-09-01 15:03:13 +00:00
Simon Pilgrim	ce0e9f0b91	[X86][SSE] Dropped (V)CVTPD2PS intrinsic patterns now that its bound to X86vfpround It now uses X86vfpround patterns directly instead. Followup to D23797 llvm-svn: 280376	2016-09-01 14:59:20 +00:00
Simon Dardis	bd27154757	[mips] interAptiv based generic schedule model This scheduler describes a processor which covers all MIPS ISAs based around the interAptiv and P5600 timings. Reviewers: vkalintiris, dsanders Differential Revision: https://reviews.llvm.org/D23551 llvm-svn: 280374	2016-09-01 14:53:53 +00:00
Sanjay Patel	dd861964d1	[InstCombine] remove fold of an icmp pattern that should never happen While removing a scalar shackle from an icmp fold, I noticed that I couldn't find any tests to trigger this code path. The 'and' shrinking transform should be handled by InstCombiner::foldCastedBitwiseLogic() or eliminated with InstSimplify. The icmp narrowing is part of InstCombiner::foldICmpWithCastAndCast(). Differential Revision: https://reviews.llvm.org/D24031 llvm-svn: 280370	2016-09-01 14:20:43 +00:00
Krzysztof Parzyszek	07d9f53b51	[Hexagon] Deal with undefs when extending live intervals Reapply r280275, since MSVC accepts r280358. llvm-svn: 280369	2016-09-01 13:59:35 +00:00
Elena Demikhovsky	4d7738dfde	Optimized FMA intrinsic + FNEG , like -(ab+c) and FNEG + FMA, like ab-c or (-a)*b+c. The bug description is here : https://llvm.org/bugs/show_bug.cgi?id=28892 Differential revision: https://reviews.llvm.org/D23313 llvm-svn: 280368	2016-09-01 13:58:53 +00:00
James Molloy	88cad7e5cf	[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted. As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider: if (a) x(1); else if (b) x(2); This produces the following CFG: [if] / \ [x(1)] [if] \| \| \ \| \| \ \| [x(2)] \| \ \| / [ end ] [end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch. We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects). We are now able to detect this case and split off the unconditional arcs to a common successor: [if] / \ [x(1)] [if] \| \| \ \| \| \ \| [x(2)] \| \ / \| [sink.split] \| \ / [ end ] Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases. llvm-svn: 280364	2016-09-01 12:58:13 +00:00
Krzysztof Parzyszek	4f863d75f4	Add an optional parameter with a list of undefs to extendToIndices Reapply r280268, hopefully in a version that MSVC likes. llvm-svn: 280358	2016-09-01 12:10:36 +00:00
Honggyu Kim	9eb6a10251	[IR] Properly handle escape characters in Attribute::getAsString() If an attribute name has special characters such as '\01', it is not properly printed in LLVM assembly language format. Since the format expects the special characters are printed as it is, it has to contain escape characters to make it printable. Before: attributes #0 = { ... "counting-function"="^A__gnu_mcount_nc" ... After: attributes #0 = { ... "counting-function"="\01__gnu_mcount_nc" ... Reviewers: hfinkel, rengolin, rjmccall, compnerd Subscribers: nemanjai, mcrosier, hans, shenhan, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D23792 llvm-svn: 280357	2016-09-01 11:44:06 +00:00
James Molloy	eec6df3193	[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything. This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink. This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example: %a = load i32* %b %d = load i32* %b %c = gep i32* %a, i32 0 %e = gep i32* %d, i32 1 Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor). This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough. Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm. In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging. In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive. This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans. llvm-svn: 280351	2016-09-01 10:44:35 +00:00
Hal Finkel	5081ac27c7	Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is lowered using: ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET) where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86, FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not work for PowerPC. Because of the way that the stack layout works, the canonical frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset as well), so it is not just a matter of implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics -- We can do that, since it is currently used only for @llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA construct itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips currently does this, but by using a custom lowering for ADD that specifically recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern. This change introduces a ISD::EH_DWARF_CFA node, which by default expands using the existing logic, but can be directly lowered by the target. Mips is updated to use this method (which simplifies its implementation, and I suspect makes it more robust), and updates PowerPC to do the same. Fixes PR26761. Differential Revision: https://reviews.llvm.org/D24038 llvm-svn: 280350	2016-09-01 10:28:47 +00:00
Valery Pykhtin	1b13886b5f	[AMDGPU] Scalar Memory instructions TD refactoring Differential revision: https://reviews.llvm.org/D23996 llvm-svn: 280349	2016-09-01 09:56:47 +00:00
Hal Finkel	40d7f5c277	Add a counter-function insertion pass As discussed in https://reviews.llvm.org/D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is fundamentally broken. We insert these calls in the frontend, which means they get duplicated when inlining, and so the accumulated execution counts for the inlined-into functions are wrong. Because we don't want the presence of these functions to affect optimizaton, they should be inserted in the backend. Here's a pass which would do just that. The knowledge of the name of the counting function lives in the frontend, so we're passing it here as a function attribute. Clang will be updated to use this mechanism. Differential Revision: https://reviews.llvm.org/D22825 llvm-svn: 280347	2016-09-01 09:42:39 +00:00
Chandler Carruth	1ab16f8c2d	[Support] Fix a warning introduced in r280339 due to the member initializers not being in the same order as the members. Specifically, 'preg' is the first member followed by 'error', so they will be initialized in that order and should be written in the member initializer list in that order. For the constructor in question, there is no change in behavior. llvm-svn: 280345	2016-09-01 09:31:02 +00:00
James Molloy	21744689b9	[SimplifyCFG] Fix nondeterministic iteration order We iterate over the result from SafeToMergeTerminators, so make it a SmallSetVector instead of a SmallPtrSet. Should fix stage3 convergence builds. llvm-svn: 280342	2016-09-01 09:01:34 +00:00
George Rimar	a9ff072fe8	[LLVM/Support] - Create no-arguments constructor for llvm::Regex This is useful when need to defer the construction, e.g. using Regex as a member of class. Differential revision: https://reviews.llvm.org/D24101 llvm-svn: 280339	2016-09-01 08:00:28 +00:00
James Molloy	e656642295	[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases A very important case is not handled here: multiple arcs to a single block with a PHI. Consider: a: %1 = icmp %b, 1 br %1, label %c, label %e c: %2 = icmp %b, 2 br %2, label %d, label %e d: br %e e: phi [0, %a], [1, %c], [2, %d] FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs. llvm-svn: 280338	2016-09-01 07:45:25 +00:00
Dean Michael Berris	6d6addbe15	[NFC] Remove unnecessary comment llvm-svn: 280336	2016-09-01 01:58:24 +00:00
Dean Michael Berris	e8ae5baaf7	[XRay] Detect and emit sleds for sibling/tail calls Summary: This change promotes the 'isTailCall(...)' member function to TargetInstrInfo as a query interface for determining on a per-target basis whether a given MachineInstr is a tail call instruction. We build upon this in the XRay instrumentation pass to emit special sleds for tail call optimisations, where we emit the correct kind of sled. The tail call sleds look like a mix between the function entry and function exit sleds. Form-wise, the sled comes before the "jmp" instruction that implements the tail call similar to how we do it for the function entry sled. Functionally, because we know this is a tail call, it behaves much like an exit sled -- i.e. at runtime we may use the exit trampolines instead of a different kind of trampoline. A follow-up change to recognise these sleds will be done in compiler-rt, so that we can start intercepting these initially as exits, but also have the option to have different log entries to more accurately reflect that this is actually a tail call. Reviewers: echristo, rSerge, majnemer Subscribers: mehdi_amini, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D23986 llvm-svn: 280334	2016-09-01 01:29:13 +00:00
Kostya Serebryany	e2d0f63654	[libFuzzer] add -minimize_crash flag (to minimize crashers). also add two tests that I failed to commit last time llvm-svn: 280332	2016-09-01 01:22:27 +00:00
Dean Michael Berris	40e6ba16a1	[XRay][NFC] Promote isTailCall() as virtual in TargetInstrInfo. This change is broken out from D23986, where XRay detects tail call exits. llvm-svn: 280331	2016-09-01 01:03:22 +00:00
Heejin Ahn	10a7086700	Revert "Add asm.js-style setjmp/longjmp handling for wasm" This reverts commit r280302, it broke the integration tests. llvm-svn: 280329	2016-09-01 00:44:37 +00:00
Nick Lewycky	8dd4dad08b	Add cast to appease windows builder. Fixes build break introduced in r280306. llvm-svn: 280311	2016-08-31 23:24:43 +00:00
Zachary Turner	77807637ff	[codeview] Have visitTypeBegin return the record type. Previously we were assuming that any visitation of types would necessarily be against a type we had binary data for. Reasonable assumption when were just reading PDBs and dumping them, but once we start writing PDBs from Yaml this breaks down, because we have no binary data yet, only Yaml, and from that we need to read the record kind and perform the switch based on that. So this patch does that. Instead of having the visitor switch on the kind that is already in the CVType record, we change the visitTypeBegin() method to return the Kind, and switch on the returned value. This way, the default implementation can still return the value from the CVType, but the implementation which visits Yaml records and serializes binary PDB type records can use the field in the Yaml as the source of the switch. llvm-svn: 280307	2016-08-31 23:14:31 +00:00
Nick Lewycky	97e49ac59e	Add -fprofile-dir= to clang. -fprofile-dir=path allows the user to specify where .gcda files should be emitted when the program is run. In particular, this is the first flag that causes the .gcno and .o files to have different paths, LLVM is extended to support this. -fprofile-dir= does not change the file name in the .gcno (and thus where lcov looks for the source) but it does change the name in the .gcda (and thus where the runtime library writes the .gcda file). It's different from a GCOV_PREFIX because a user can observe that the GCOV_PREFIX_STRIP will strip paths off of -fprofile-dir= but not off of a supplied GCOV_PREFIX. To implement this we split -coverage-file into -coverage-data-file and -coverage-notes-file to specify the two different names. The !llvm.gcov metadata node grows from a 2-element form {string coverage-file, node dbg.cu} to 3-elements, {string coverage-notes-file, string coverage-data-file, node dbg.cu}. In the 3-element form, the file name is already "mangled" with .gcno/.gcda suffixes, while the 2-element form left that to the middle end pass. llvm-svn: 280306	2016-08-31 23:04:32 +00:00
Heejin Ahn	23d57103a4	Add asm.js-style setjmp/longjmp handling for wasm Summary: This patch adds asm.js-style setjmp/longjmp handling support for WebAssembly. It also uses JavaScript's try and catch mechanism. Reviewers: jpp, dschuff Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D23928 llvm-svn: 280302	2016-08-31 22:40:34 +00:00
Reid Kleckner	109448ee81	Revert "Add an optional parameter with a list of undefs to extendToIndices" This reverts commit r280268, it causes all MSVC 2013 to ICE. This appears to have been fixed in a later MSVC 2013 update, because I cannot reproduce it locally. That said, all upstream LLVM bots are broken right now, so I am reverting. Also reverts dependent change r280275, "[Hexagon] Deal with undefs when extending live intervals". llvm-svn: 280301	2016-08-31 22:36:02 +00:00
Sanjay Patel	0d70831d73	[InstCombine] allow icmp (shr exact X, C2), C fold for splat constant vectors The enhancement to foldICmpDivConstant ( http://llvm.org/viewvc/llvm-project?view=revision&revision=280299 ) allows us to remove the ConstantInt check; no other changes needed. llvm-svn: 280300	2016-08-31 22:18:43 +00:00
Sanjay Patel	541aef4661	[InstCombine] allow icmp (div X, Y), C folds for splat constant vectors Converting all of the overflow ops to APInt looked risky, so I've left that as a TODO. llvm-svn: 280299	2016-08-31 21:57:21 +00:00
Matt Arsenault	b50eb8dc2b	AMDGPU: Fix introducing stack access on unaligned v16i8 llvm-svn: 280298	2016-08-31 21:52:27 +00:00
Matt Arsenault	1d2151781b	AMDGPU: Use copy instead of mov during frame lowering This occurs before RA pseudos are expanded. It's less code to emit the copy. llvm-svn: 280297	2016-08-31 21:52:25 +00:00
Matt Arsenault	57bc4324f8	AMDGPU: Refactor frame lowering This will make future changes easier. llvm-svn: 280296	2016-08-31 21:52:21 +00:00
Zachary Turner	2f951ce9c9	[codeview] Add TypeVisitorCallbackPipeline. We were kind of hacking this together before by embedding the ability to forward requests into the TypeDeserializer. When we want to start adding more different kinds of visitor callback interfaces though, this doesn't scale well and is very inflexible. So introduce the notion of a pipeline, which itself implements the TypeVisitorCallbacks interface, but which contains an internal list of other callbacks to invoke in sequence. Also update the existing uses of CVTypeVisitor to use this new pipeline class for deserializing records before visiting them with another visitor. llvm-svn: 280293	2016-08-31 21:42:26 +00:00
Tim Northover	11a2354670	GlobalISel: use G_TYPE to annotate physregs with a type. More preparation for dropping source types from MachineInstrs: regsters coming out of already-selected code (i.e. non-generic instructions) don't have a type, but that information is needed so we must add it manually. This is done via a new G_TYPE instruction. llvm-svn: 280292	2016-08-31 21:24:02 +00:00
Derek Schuff	1b258d313c	[WebAssembly] Disable folding of GA+reg into load/store constant offsets Summary: If the register has a negative value then unsigned overflow will occur; this case is sometimes even created intentionally by LSR. For now disable GA+reg folding. Fixes PR29127 Differential Revision: https://reviews.llvm.org/D24053 llvm-svn: 280285	2016-08-31 20:27:20 +00:00
Sanjay Patel	85d79744df	[InstCombine] change insertRangeTest() to use APInt instead of Constant; NFCI This is prep work before changing the callers to also use APInt which will allow folds for splat vectors. Currently, the callers have ConstantInt guards in place, so no functional change intended with this commit. llvm-svn: 280282	2016-08-31 19:49:56 +00:00
Michael Zolotukhin	e0b2d97b52	[LoopInfo] Add verification by recomputation. Summary: Current implementation of LI verifier isn't ideal and fails to detect some cases when LI is incorrect. For instance, it checks that all recorded loops are in a correct form, but it has no way to check if there are no more other (unrecorded in LI) loops in the function. This patch adds a way to detect such bugs. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits, silvas, mzolotukhin Differential Revision: https://reviews.llvm.org/D23437 llvm-svn: 280280	2016-08-31 19:26:19 +00:00
Geoff Berry	8d84605f25	[EarlyCSE] Optionally use MemorySSA. NFC. Summary: Use MemorySSA, if requested, to do less conservative memory dependency checking. This change doesn't enable the MemorySSA enhanced EarlyCSE in the default pipelines, so should be NFC. Reviewers: dberlin, sanjoy, reames, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19821 llvm-svn: 280279	2016-08-31 19:24:10 +00:00
Krzysztof Parzyszek	e21a0b3b9f	[Hexagon] Deal with undefs when extending live intervals llvm-svn: 280275	2016-08-31 18:52:09 +00:00
Tom Stellard	ba5730884b	AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is at least 4-byte aligned Summary: This fixes some OpenCV tests that were broken by libclc commit r276443. Reviewers: arsenm, jvesely Subscribers: arsenm, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24051 llvm-svn: 280274	2016-08-31 18:46:07 +00:00
Quentin Colombet	1c06a73a7c	[TargetPassConfig] Add a hook to tell whether GlobalISel should warm on fallback. Thanks to this patch, we know have a way to easly see if GlobalISel failed. llvm-svn: 280273	2016-08-31 18:43:04 +00:00
Quentin Colombet	612cd1fdd6	[ResetMachineFunction] Emit the diagnostic isel fallback when asked. This pass is now able to report when the function is being reset. llvm-svn: 280272	2016-08-31 18:43:01 +00:00
Quentin Colombet	0a243f47b0	[DiagnosticInfo] Add a diagnostic class for the fallback of ISel. This will be used to warm when we fallback in GlobalISel. llvm-svn: 280271	2016-08-31 18:42:55 +00:00
Chad Rosier	83a120337a	Fix indent. NFC. llvm-svn: 280270	2016-08-31 18:37:52 +00:00
Krzysztof Parzyszek	576225daf5	Add an optional parameter with a list of undefs to extendToIndices llvm-svn: 280268	2016-08-31 18:02:19 +00:00
Kevin Enderby	9d0c945ad6	Next set of additional error checks for invalid Mach-O files for bad load commands that use the Mach::linkedit_data_command type for the load commands that are currently used in the MachOObjectFile constructor. This contains the missing checks for LC_DATA_IN_CODE and LC_LINKER_OPTIMIZATION_HINT load commands and the fields for the Mach::linkedit_data_command type. Checking for other load commands that use this type will be added later. Also fixed a couple of places that was using sizeof(MachOObjectFile::LoadCommandInfo) that should have been using sizeof(MachO::load_command). llvm-svn: 280267	2016-08-31 17:57:46 +00:00
Geoff Berry	64f5ed172a	[EarlyCSE] Allow forwarding a non-invariant load into an invariant load. Reviewers: sanjoy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23935 llvm-svn: 280265	2016-08-31 17:45:31 +00:00
Chad Rosier	0de580aaab	[SLP] Update the debug based on Michael's suggestion. Passing the types/opcode check still doesn't guarantee we'll actually vectorize. Therefore, just make it clear we're attempting to vectorize. llvm-svn: 280263	2016-08-31 17:41:12 +00:00
Chad Rosier	54807a9b9d	[SLP] Sink debug after checking for matching types/opcode. Differential Revision: https://reviews.llvm.org/D24090 llvm-svn: 280260	2016-08-31 17:31:09 +00:00
Davide Italiano	1e9d3d3b40	[lib/LTO] Factor out logic for running passes. This is in preparation for adding an option to run a custom pipeline with the new PM. It's currently used in lld. llvm-svn: 280258	2016-08-31 17:02:44 +00:00
Tim Shen	48f814e8a3	s/static inline/static/ for headers I have changed in r279475. NFC. llvm-svn: 280257	2016-08-31 16:48:13 +00:00
Reid Kleckner	9dac47319d	[codeview] Emit vtable shape information The shape of the vtable is passed down as the size of the __vtbl_ptr_type. This special pointer type appears both as the pointee type of the vptr type, and by itself in every dynamic class. For classes with multiple vtables, only the shape of the primary vftable is included, as the shape of all secondary vftables will be the same as in the base class. Fixes PR28150 llvm-svn: 280254	2016-08-31 15:59:30 +00:00
Philip Reames	2b1084ac93	[statepoints][experimental] Add support for live-in semantics of values in deopt bundles This is a first step towards supporting deopt value lowering and reporting entirely with the register allocator. I hope to build on this in the near future to support live-on-return semantics, but I have a use case which allows me to test and investigate code quality with just the live-in semantics so I've chosen to start there. For those curious, my use cases is our implementation of the "__llvm_deoptimize" function we bind to @llvm.deoptimize. I'm choosing not to hard code that fact in the patch and instead make it configurable via function attributes. The basic approach here is modelled on what is done for the "Live In" values on stackmaps and patchpoints. (A secondary goal here is to remove one of the last barriers to merging the pseudo instructions.) We start by adding the operands directly to the STATEPOINT SDNode. Once we've lowered to MI, we extend the remat logic used by the register allocator to fold virtual register uses into StackMap::Indirect entries as needed. This does rely on the fact that the register allocator rematerializes. If it didn't along some code path, we could end up with more vregs than physical registers and fail to allocate. Today, we only fold in the register allocator. This can create some weird effects when combined with arguments passed on the stack because we don't fold them appropriately. I have an idea how to fix that, but it needs this patch in place to work on that effectively. (There's some weird interaction with the scheduler as well, more investigation needed.) My near term plan is to land this patch off-by-default, experiment in my local tree to identify any correctness issues and then start fixing codegen problems one by one as I find them. Once I have the live-in lowering fully working (both correctness and code quality), I'm hoping to move on to the live-on-return semantics. Note: I don't have any known miscompiles with this patch enabled, but I'm pretty sure I'll find at least a couple. Thus, the "experimental" tag and the fact it's off by default. Differential Revision: https://reviews.llvm.org/D24000 llvm-svn: 280250	2016-08-31 15:12:17 +00:00
Simon Pilgrim	6199b4fd49	[X86][SSE] Improve awareness of (v)cvtpd2ps implicit zeroing of upper 64-bits of xmm result Associate x86_sse2_cvtpd2ps with X86ISD::VFPROUND to avoid inserting unnecessary zeroing shuffles. Differential Revision: https://reviews.llvm.org/D23797 llvm-svn: 280249	2016-08-31 15:09:34 +00:00
Chad Rosier	669130ffdf	[SLP] Arguments should be camel case, and start with an upper case letter. NFC. llvm-svn: 280248	2016-08-31 15:06:58 +00:00
Sjoerd Meijer	46b5b88387	Clang patch r280064 introduced ways to set the FP exceptions and denormal types. This is the LLVM counterpart and it adds options that map onto FP exceptions and denormal build attributes allowing better fp math library selections. Differential Revision: https://reviews.llvm.org/D24070 llvm-svn: 280246	2016-08-31 14:17:38 +00:00
Krzysztof Parzyszek	64488118bf	Fixed spill stack objects are mutable Differential Revision: https://reviews.llvm.org/D24039 llvm-svn: 280244	2016-08-31 13:52:17 +00:00
James Molloy	3c1137c639	Revert "[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases" This reverts commit r280218. This also causes buildbot errors. Sigh. Not a successful day all around! llvm-svn: 280239	2016-08-31 13:32:28 +00:00
James Molloy	cacfc16109	Revert "[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd" This reverts commit r280216 - it caused buildbot failures. llvm-svn: 280234	2016-08-31 13:16:52 +00:00
James Molloy	76c9d423a7	Revert "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches" This reverts commit r280217. r280216 caused buildbot failures - backing out the entire chain. llvm-svn: 280233	2016-08-31 13:16:45 +00:00
James Molloy	06a45483a1	Revert "[SimplifyCFG] Add a workaround to fix PR30188" This reverts commit r280219. r280216 caused buildbot failures - backing out the entire chain. llvm-svn: 280232	2016-08-31 13:16:36 +00:00
James Molloy	8a66a39cbf	Revert "[SimplifyCFG] Fix bootstrap failure after r280220" This reverts commit r280228. r280216 caused buildbot failures - backing out the entire sequence. llvm-svn: 280231	2016-08-31 13:16:30 +00:00
Diana Picus	760c757633	Use abstraction in AArch64AsmPrinter::lowerSTACKMAP. NFCI Use functionality from StackMapOpers instead of hardcoding an operand access. llvm-svn: 280230	2016-08-31 12:43:49 +00:00
Diana Picus	16c818820b	Typo fixes. NFC llvm-svn: 280229	2016-08-31 12:43:44 +00:00
James Molloy	b7efa6c227	[SimplifyCFG] Fix bootstrap failure after r280220 We check that a sinking candidate is used by only one PHI node during our legality checks. However for instructions that are used by other sinking candidates our heuristic is less conservative. This can result in a candidate actually being illegal when we come to sink it because of how we sunk a predecessor. Do the used-by-only-one-PHI checks again during sinking to ensure we don't crash. llvm-svn: 280228	2016-08-31 12:33:48 +00:00
Nikolay Haustov	eba808957e	AMDGPU/SI: Handle aliases in AMDGPUAlwaysInlinePass Summary: Simply replace usage of aliases to functions with aliasee. This came up when bitcode linking to builtin library and calls to aliases not being resolved. Also made minor improvements to existing test. Reviewers: tstellarAMD, alex-t, vpykhtin Subscribers: arsenm, wdng, rampitec Differential Revision: https://reviews.llvm.org/D24023 llvm-svn: 280221	2016-08-31 11:18:33 +00:00
James Molloy	171fdac7ce	[SimplifyCFG] Add a workaround to fix PR30188 We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions. The real fix is in SROA, which I'll be looking into. llvm-svn: 280219	2016-08-31 10:46:45 +00:00
James Molloy	8e69b032e5	[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases A very important case is not handled here: multiple arcs to a single block with a PHI. Consider: a: %1 = icmp %b, 1 br %1, label %c, label %e c: %2 = icmp %b, 2 br %2, label %d, label %e d: br %e e: phi [0, %a], [1, %c], [2, %d] FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs. llvm-svn: 280218	2016-08-31 10:46:39 +00:00
James Molloy	c53b40b509	[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted. As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider: if (a) x(1); else if (b) x(2); This produces the following CFG: [if] / \ [x(1)] [if] \| \| \ \| \| \ \| [x(2)] \| \ \| / [ end ] [end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch. We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects). We are now able to detect this case and split off the unconditional arcs to a common successor: [if] / \ [x(1)] [if] \| \| \ \| \| \ \| [x(2)] \| \ / \| [sink.split] \| \ / [ end ] Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases. llvm-svn: 280217	2016-08-31 10:46:33 +00:00
James Molloy	55bd04cd20	[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything. This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink. This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example: %a = load i32* %b %d = load i32* %b %c = gep i32* %a, i32 0 %e = gep i32* %d, i32 1 Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor). This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough. Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm. In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging. In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive. This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans. llvm-svn: 280216	2016-08-31 10:46:23 +00:00
James Molloy	923e98c232	[SimplifyCFG] Tail-merge calls with sideeffects This was deliberately disabled during my rewrite of SinkIfThenToEnd to keep behaviour at least vaguely consistent with the previous version and keep it as close to NFC as I could. There's no real reason not to merge sideeffect calls though, so let's do it! Small fixup along the way to ensure we don't create indirect calls. Should fix PR28964. llvm-svn: 280215	2016-08-31 10:46:16 +00:00
Simon Pilgrim	7b09af193a	[X86][SSE] Improve awareness of fptrunc implicit zeroing of upper 64-bits of xmm result Add patterns to avoid inserting unnecessary zeroing shuffles when lowering fptrunc to (v)cvtpd2ps Differential Revision: https://reviews.llvm.org/D23797 llvm-svn: 280214	2016-08-31 10:35:13 +00:00
Igor Kudrin	f3c8a9cfbb	[Coverage] Make sorting criteria for CounterMappingRegions local. Move the comparison function into the only place there it is used, i.e. the call to std::stable_sort in CoverageMappingWriter::write(). Add sorting by region kinds as it is required to ensure stable order in our tests and to simplify D23987. Differential Revision: https://reviews.llvm.org/D24034 llvm-svn: 280198	2016-08-31 07:01:17 +00:00
Craig Topper	8f6827c945	[AVX-512] Add patterns to select masked logical operations if the select has a floating point type. This is needed in order to replace the masked floating point logical op intrinsics with native IR. llvm-svn: 280195	2016-08-31 05:37:52 +00:00
Dean Michael Berris	047669f18c	[XRay] Support multiple return instructions in a single basic block Add a .mir test to catch this case, and fix the xray-instrumentation pass to handle it appropriately. llvm-svn: 280192	2016-08-31 05:20:08 +00:00
David Majnemer	a90e51e106	[Loads] Properly populate the visited set in isDereferenceableAndAlignedPointer There were paths where we wouldn't populate the visited set, causing us to recurse forever if an SSA variable was defined in terms of itself. This fixes PR30210. llvm-svn: 280191	2016-08-31 03:22:32 +00:00
Hal Finkel	97a189c716	[PowerPC] Don't spill the frame pointer twice When a function contains something, such as inline asm, which explicitly clobbers the register used as the frame pointer, don't spill it twice. If we need a frame pointer, it will be saved/restored in the prologue/epilogue code. Explicitly spilling it again will reuse the same spill slot used by the prologue/epilogue code, thus clobbering the saved value. The same applies to the base-pointer or PIC-base register. Partially fixes PR26856. Thanks to Ulrich for his analysis and the small inline-asm reproducer. llvm-svn: 280188	2016-08-31 00:52:03 +00:00
Gor Nishanov	50d7fb974f	[Coroutines] Part 10: Add coroutine promise support. Summary: 1) CoroEarly now lowers llvm.coro.promise intrinsic that allows to obtain a coroutine promise pointer from a coroutine frame and vice versa. 2) CoroFrame now interprets Promise argument of llvm.coro.begin to place CoroutinPromise alloca at a deterministic offset from the coroutine frame. Now, the coroutine promise example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex4.ll). Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23993 llvm-svn: 280184	2016-08-31 00:35:41 +00:00
Sanjay Patel	7d9ebaf337	[InstCombine] clean up InsertRangeTest; NFCI It's much less code and easier to read if we don't duplicate everything between the 'Inside' and not 'Inside' cases. As noted with the FIXME, the goal is to make this vector-friendly in a follow-up patch. llvm-svn: 280183	2016-08-31 00:19:35 +00:00
Alina Sbirlea	3f8f7840bf	[LoadStoreVectorizer] Change VectorSet to Vector to match head and tail positions. Resolves PR29148. Summary: LSV was using two vector sets (heads and tails) to track pairs of adjiacent position to vectorize. A recent optimization is trying to obtain the longest chain to vectorize and assumes the positions in heads(H) and tails(T) match, which is not the case is there are multiple tails for the same head. e.g.: i1: store a[0] i2: store a[1] i3: store a[1] Leads to: H: i1 T: i2 i3 Instead of: H: i1 i1 T: i2 i3 So the positions for instructions that follow i3 will have different indexes in H/T. This patch resolves PR29148. This issue also surfaced the fact that if the chain is too long, and TLI returns a "not-fast" answer, the whole chain will be abandoned for vectorization, even though a smaller one would be beneficial. Added a testcase and FIXME for this. Reviewers: tstellarAMD, arsenm, jlebar Subscribers: mzolotukhin, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24057 llvm-svn: 280179	2016-08-30 23:53:59 +00:00
Reid Kleckner	dbaa61cbe3	[codeview] Remove redundant TypeTable lookup As written, the code should assert if this lookup would have ever succeeded. Without looking through composite types, the type graph should be acyclic. llvm-svn: 280168	2016-08-30 21:48:14 +00:00
Kevin Enderby	dcbc504c47	Next set of additional error checks for invalid Mach-O files for bad LC_DYSYMTAB’s. This contains the missing checks for LC_DYSYMTAB load command fields. llvm-svn: 280161	2016-08-30 21:28:30 +00:00
Tim Northover	991b12bf09	GlobalISel: combine extracts & sequences created for legalization Legalization ends up creating many G_SEQUENCE/G_EXTRACT pairs which leads to inefficient codegen (even for -O0), so add a quick pass over the function to remove them again. llvm-svn: 280155	2016-08-30 20:51:25 +00:00
Matt Arsenault	a609e2d5ce	AMDGPU: Relax SGPR asm constraint register class s should be SReg_32 to be as general as possible. This can avoid a copy from m0. llvm-svn: 280154	2016-08-30 20:50:08 +00:00
Mike Aizatsky	b077d3fef2	[libfuzzer] simplified unit truncation; do not write trunc items to disc Differential Revision: https://reviews.llvm.org/D24049 llvm-svn: 280153	2016-08-30 20:49:07 +00:00
Michael Kuperstein	2954d1db77	[LoopVectorizer] Predicate instructions in blocks with several incoming edges We don't need to limit predication to blocks that have a single incoming edge, we just need to use the right mask. This fixes PR30172. Differential Revision: https://reviews.llvm.org/D24009 llvm-svn: 280148	2016-08-30 20:22:21 +00:00
David Majnemer	ac8cfab51f	[COFFObjectFile] Ignore broken symbol table When binaries are compressed by UPX, information about symbol table offset and symbol count remain unchanged (but became invalid due to compression). This causes failure in the constructor and the rest of the binary cannot be processed. Instead, reset symbol related information (symbol/string table pointers, sizes) - this should disable the related iterators and functions while the rest of the binary can still be processed. Patch by Bandzi Michal! llvm-svn: 280147	2016-08-30 20:20:24 +00:00
Duncan P. N. Exon Smith	73b8dbdd94	CodeGen: Fixup for r280128, since GCC isn't as permissive as Clang Fixes the bots, e.g.: http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/10055 llvm-svn: 280135	2016-08-30 19:11:11 +00:00
Tim Northover	e5102de678	GlobalISel: forbid physical registers on generic MIs. We're intending to move to a world where the type of a register is determined by its (unique) def. This is incompatible with physregs, which are untyped. It also means the other passes don't have to worry quite so much about register-class compatibility and inserting COPYs appropriately. llvm-svn: 280132	2016-08-30 18:52:46 +00:00
Duncan P. N. Exon Smith	f947c3afe1	ADT: Split ilist_node_traits into alloc and callback, NFC Many lists want to override only allocation semantics, or callbacks for iplist. Split these up to prevent code duplication. - Specialize ilist_alloc_traits to change the implementations of deleteNode() and createNode(). - One common desire is to do nothing deleteNode() and disable createNode(). Specialize ilist_alloc_traits to inherit from ilist_noalloc_traits for that behaviour. - Specialize ilist_callback_traits to use the addNodeToList(), removeNodeFromList(), and transferNodesFromList() callbacks. As a drive-by, add some coverage to the callback-related unit tests. llvm-svn: 280128	2016-08-30 18:40:47 +00:00
Kyle Butt	61aca6ef79	TailDuplication: Extract Indirect-Branch block limit as option. NFC The existing code hard-coded a limit of 20 instructions for duplication when a block ended with an indirect branch. Extract this as an option. No functional change intended. llvm-svn: 280125	2016-08-30 18:18:54 +00:00
Duncan P. N. Exon Smith	b7668d5164	ADT: Guarantee transferNodesFromList is only called on transfers Guarantee that ilist_traits<T>::transferNodesFromList is only called when nodes are actually changing lists. I also moved all the callbacks to occur first, before the operation. This is the only choice for iplist<T>::merge, so we might as well be consistent. I expect this to have no effect in practice, although it simplifies the logic in both iplist<T>::transfer and iplist<T>::insert. llvm-svn: 280122	2016-08-30 18:00:45 +00:00
Sanjay Patel	b37145712e	[InstCombine] replace divide-by-constant checks with asserts; NFC These folds already have tests for scalar and vector types, except for the vector div-by-0 case, so I'm adding tests for that. llvm-svn: 280115	2016-08-30 17:31:34 +00:00
Sanjay Patel	a7cb477277	[InstCombine] clean up foldICmpDivConstant; NFCI 1. Fix comments to match variable names 2. Remove redundant CmpRHS variable 3. Add FIXME to replace some checks with asserts llvm-svn: 280112	2016-08-30 17:10:49 +00:00
NAKAMURA Takumi	b673b16857	Fixup r279618, instantiate AnalysisManagerProxy<AnalysisManager,LazyCallGraph::SCC>, instead of AnalysisManagerProxy<AnalysisManager,LazyCallGraph::SCC,LazyCallGraph&>, for PassID. Or they were not instantiated as expected; llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID llvm::InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::Function>, llvm::LazyCallGraph::SCC>::PassID llvm-svn: 280105	2016-08-30 15:47:13 +00:00
Valery Pykhtin	a34fb49f8f	[AMDGPU] Refactor SOP instructions TD files. Differential revision: https://reviews.llvm.org/D23617 llvm-svn: 280101	2016-08-30 15:20:31 +00:00
Kostya Serebryany	a016a45d60	[libFuzzer] fix a bug when running a single unit of N bytes with -max_len=M, M<N, caused a buffer overflow llvm-svn: 280098	2016-08-30 14:52:05 +00:00
Kostya Serebryany	248d11519a	[libFuzzer] stop using bits for memcmp's value profile -- seems to blow up the corpus too much llvm-svn: 280096	2016-08-30 14:39:33 +00:00
Nirav Dave	d8858cafa9	[MC] Move parser helper functions from Asmparser to MCAsmParser NFC Intended. llvm-svn: 280092	2016-08-30 14:15:43 +00:00
Chad Rosier	27ac0d8670	[Reassociate] Add additional debug output. NFC. llvm-svn: 280090	2016-08-30 13:58:35 +00:00
NAKAMURA Takumi	9720f57a17	SILoadStoreOptimizer.cpp: Fix a warning in r279991. [-Wunused-variable] llvm-svn: 280075	2016-08-30 11:50:21 +00:00
James Molloy	d13b1239e4	[SimplifyCFG] Properly CSE metadata in SinkThenElseCodeToEnd This was missing, meaning the metadata in sunk instructions was potentially bogus and could cause miscompiles. llvm-svn: 280072	2016-08-30 10:56:08 +00:00
James Y Knight	d7d9e1069b	Replace incorrect "#ifdef DEBUG" with "#ifndef NDEBUG". The former is simply wrong -- the code will either never be used or will always be used, rather than being dependent upon whether it's built with debug assertions enabled. The macro DEBUG isn't ever set by the llvm build system. But, the macro DEBUG(X) is defined (unconditionally) if you happen to include llvm/Support/Debug.h. The code in Value.h which was erroneously protected by the #ifdef DEBUG didn't even compile -- you can't cast<> from an LLVMOpaqueValue directly. Fortunately, it was never invoked, as Core.cpp included Value.h before Debug.h. The conditionalized code in AArch64CollectLOH.cpp was previously always used, as it includes Debug.h. llvm-svn: 280056	2016-08-30 03:16:16 +00:00
Kostya Serebryany	d4492f8101	[libFuzzer] use bits instead of bytes for memcmp/strcmp value profile -- the fuzzer reaches the goal much faster, at least on the simple puzzles llvm-svn: 280054	2016-08-30 03:05:50 +00:00
Anna Thomas	1aea6da564	[RewriteStatepointsForGC] Update comment for same PHI node check. NFC llvm-svn: 280052	2016-08-30 02:36:48 +00:00
Hal Finkel	18d0e3f44c	[PowerPC] Force entry alignment in .got2 Implement Bill's suggested fix for 32-bit targets for PR22711 (for the alignment of each entry). As pointed out in the bug report, we could just force the section alignment, since we only add pointer-sized things currently, but this fix is somewhat more future-proof. llvm-svn: 280049	2016-08-30 01:43:38 +00:00
Sanjoy Das	6d3c9132e3	Fix coding style; NFC Avoid variables starting with lowercase. llvm-svn: 280048	2016-08-30 01:38:59 +00:00
Kostya Serebryany	4d22e4fcb9	[libFuzzer] use trace-div and trace-gep for guided fuzzing, add tests llvm-svn: 280046	2016-08-30 01:30:14 +00:00
Kostya Serebryany	5ac427b8e4	[sanitizer-coverage] add two more modes of instrumentation: trace-div and trace-gep, mostly usaful for value-profile-based fuzzing; llvm part llvm-svn: 280043	2016-08-30 01:12:10 +00:00
Hal Finkel	b074a608ce	[PowerPC] Add support for -mlongcall The "long call" option forces the use of the indirect calling sequence for all calls (even those that don't really need it). GCC provides this option; This is helpful, under certain circumstances, for building very-large binaries, and some other specialized use cases. Fixes PR19098. llvm-svn: 280040	2016-08-30 00:59:23 +00:00
Piotr Padlewski	442d38c0b4	NFC: add early exit in ModuleSummaryAnalysis Summary: Changed this code because it was not very readable. The one question that I got after changing it is, should we count calls to intrinsics? We don't add them to caller summary, so maybe we shouldn't also count them? Reviewers: tejohnson, eraman, mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23949 llvm-svn: 280036	2016-08-30 00:46:26 +00:00
Duncan P. N. Exon Smith	5c001c367f	ADT: Give ilist<T>::reverse_iterator a handle to the current node Reverse iterators to doubly-linked lists can be simpler (and cheaper) than std::reverse_iterator. Make it so. In particular, change ilist<T>::reverse_iterator so that it is never invalidated unless the node it references is deleted. This matches the guarantees of ilist<T>::iterator. (Note: MachineBasicBlock::iterator is not an ilist iterator, but a MachineInstrBundleIterator<MachineInstr>. This commit does not change MachineBasicBlock::reverse_iterator, but it does update MachineBasicBlock::reverse_instr_iterator. See note at end of commit message for details on bundle iterators.) Given the list (with the Sentinel showing twice for simplicity): [Sentinel] <-> A <-> B <-> [Sentinel] the following is now true: 1. begin() represents A. 2. begin() holds the pointer for A. 3. end() represents [Sentinel]. 4. end() holds the poitner for [Sentinel]. 5. rbegin() represents B. 6. rbegin() holds the pointer for B. 7. rend() represents [Sentinel]. 8. rend() holds the pointer for [Sentinel]. The changes are #6 and #8. Here are some properties from the old scheme (which used std::reverse_iterator): - rbegin() held the pointer for [Sentinel] and rend() held the pointer for A; - operator() cost two dereferences instead of one; - converting from a valid iterator to its valid reverse_iterator involved a confusing increment; and - "RI++->erase()" left RI invalid. The unintuitive replacement was "RI->erase(), RE = end()". With vector-like data structures these properties are hard to avoid (since past-the-beginning is not a valid pointer), and don't impose a real cost (since there's still only one dereference, and all iterators are invalidated on erase). But with lists, this was a poor design. Specifically, the following code (which obviously works with normal iterators) now works with ilist::reverse_iterator as well: for (auto RI = L.rbegin(), RE = L.rend(); RI != RE;) fooThatMightRemoveArgFromList(RI++); Converting between iterator and reverse_iterator for the same node uses the getReverse() function. reverse_iterator iterator::getReverse(); iterator reverse_iterator::getReverse(); Why doesn't iterator <=> reverse_iterator conversion use constructors? In order to catch and update old code, reverse_iterator does not even have an explicit conversion from iterator. It wouldn't be safe because there would be no reasonable way to catch all the bugs from the changed semantic (see the changes at call sites that are part of this patch). Old code used this API: std::reverse_iterator::reverse_iterator(iterator); iterator std::reverse_iterator::base(); Here's how to update from old code to new (that incorporates the semantic change), assuming I is an ilist<>::iterator and RI is an ilist<>::reverse_iterator: [Old] ==> [New] reverse_iterator(I) (--I).getReverse() reverse_iterator(I) ++I.getReverse() --reverse_iterator(I) I.getReverse() reverse_iterator(++I) I.getReverse() RI.base() (--RI).getReverse() RI.base() ++RI.getReverse() --RI.base() RI.getReverse() (++RI).base() RI.getReverse() delete &RI, RE = end() delete &RI++ RI->erase(), RE = end() RI++->erase() ======================================= Note: bundle iterators are out of scope ======================================= MachineBasicBlock::iterator, also known as MachineInstrBundleIterator<MachineInstr>, is a wrapper to represent MachineInstr bundles. The idea is that each operator++ takes you to the beginning of the next bundle. Implementing a sane reverse iterator for this is harder than ilist. Here are the options: - Use std::reverse_iterator<MBB::i>. Store a handle to the beginning of the next bundle. A call to operator() runs a loop (usually operator--() will be called 1 time, for unbundled instructions). Increment/decrement just works. This is the status quo. - Store a handle to the final node in the bundle. A call to operator() still runs a loop, but it iterates one time fewer (usually operator--() will be called 0 times, for unbundled instructions). Increment/decrement just works. - Make the ilist_sentinel<MachineInstr> always store that it's the sentinel (instead of just in asserts mode). Then the bundle iterator can sniff the sentinel bit in operator++(). I initially tried implementing the end() option as part of this commit, but updating iterator/reverse_iterator conversion call sites was error-prone. I have a WIP series of patches that implements the final option. llvm-svn: 280032	2016-08-30 00:13:12 +00:00
Jan Vesely	89876673cd	AMDGPU/R600: Cleanup DAGCombine Move SDLoc initialization to comon place. fall back to AMDGPU version in one place Differential Revision: https://reviews.llvm.org/D23900 llvm-svn: 280030	2016-08-29 23:21:46 +00:00
Michael Kuperstein	173b43da35	Fix typo in comment. NFC. llvm-svn: 280025	2016-08-29 22:49:05 +00:00

... 3 4 5 6 7 ...

94831 Commits