llvm-project

Commit Graph

Author	SHA1	Message	Date
Quentin Colombet	904a2c7422	[RegBankSelect] Teach how to repair definitions. Although repairing definitions is not mandatory for correctness (only phis would be impacted because of the RPO traversal), not repairing might go against the cost model. Therefore, just repair when it is possible. llvm-svn: 266025	2016-04-12 00:12:59 +00:00
JF Bastien	4f43cfd2c2	MergeFunctions: test alloca better r237193 fix handling of alloca size / align in MergeFunctions, but only tested one and didn't follow FunctionComparator::cmpOperations's usual comparison pattern. It also didn't update Instruction.cpp:haveSameSpecialState which I'll do separately. llvm-svn: 266022	2016-04-12 00:03:26 +00:00
Derek Schuff	f7b2bce1f1	Replace MachineRegisterInfo::TracksLiveness with a MachineFunctionProperty Use the MachineFunctionProperty mechanism to indicate whether the liveness info is accurate instead of a bool flag on MRI. Keeps the MRI accessor function for convenience. NFC Differential Revision: http://reviews.llvm.org/D18767 llvm-svn: 266020	2016-04-11 23:32:13 +00:00
Mehdi Amini	ae280e54a9	ThinLTO renaming: use module hash instead of position in the summary This is more robust to changes in the link ordering. Differential Revision: http://reviews.llvm.org/D18946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266018	2016-04-11 23:26:46 +00:00
JF Bastien	b3ac75f748	AtomicExpandPass: mark assert variable as used Avoid -Wunused-variable llvm-svn: 266016	2016-04-11 23:03:54 +00:00
James Y Knight	00db547f97	Fix compile with GCC after r266002 (Add __atomic_* lowering to AtomicExpandPass) It doesn't like implicitly calling the ArrayRef constructor with a returned array -- it appears to decay the returned value to a pointer first, before trying to make an ArrayRef out of it. llvm-svn: 266011	2016-04-11 22:52:42 +00:00
Justin Bogner	1faf01578e	CodeGen: Fix a use-after-free in TailDuplication The call to processPHI already erased MI from its parent, so MI isn't even valid here, making the getParent() call a use-after-free in addition to being redundant. Found by ASan with the ArrayRecycler changes in llvm.org/pr26808. llvm-svn: 266008	2016-04-11 22:37:13 +00:00
JF Bastien	3b6eaace62	NFC: keep comment up to date MergeFunctions was refactored a while ago, and Instruction.cpp's comments went out of sync. The content did as well, will fix later. llvm-svn: 266007	2016-04-11 22:30:37 +00:00
Evgeniy Stepanov	f17120a85f	[safestack] Add canary to unsafe stack frames Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004	2016-04-11 22:27:48 +00:00
Tim Northover	a6dea06fe3	ARM: use r7 as the frame-pointer on all MachO targets. This is better for a few reasons: + It matches the other tooling for iOS. + It matches EABI in more cases (i.e. Thumb-mode, and in practice we don't use ARM mode). + It leads to infinitesimally smaller code (0.2%, yay!). rdar://25369506 llvm-svn: 266003	2016-04-11 22:27:40 +00:00
James Y Knight	b91d38c5fe	Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266002	2016-04-11 22:22:33 +00:00
Simon Pilgrim	82e54871d0	[DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) anytime before LegalizeVectorOprs xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well. The main aim with this is to improve the ability to recognise bitmasks that can be converted to shuffles. I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result. Differential Revision: http://reviews.llvm.org/D18944 llvm-svn: 265998	2016-04-11 21:10:33 +00:00
Manman Ren	5751814eda	Swift Calling Convention: swifterror target support. Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997	2016-04-11 21:08:06 +00:00
Tom Stellard	0ffdf65eaa	Revert "AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute" This reverts commit r263720. Just confirmed that s_waitcnt is required after ds_permute/ds_bpermute. llvm-svn: 265992	2016-04-11 20:38:40 +00:00
Hans Wennborg	1f09485c40	Fix broken assert, PR24624 llvm-svn: 265989	2016-04-11 20:35:41 +00:00
Hans Wennborg	e631996350	Remove redundant .c_str(), as suggested by PR25633 llvm-svn: 265988	2016-04-11 20:35:17 +00:00
Hans Wennborg	e9134897f4	Fix a couple of redundant conditional expressions (PR27283, PR28282) llvm-svn: 265987	2016-04-11 20:35:01 +00:00
Sanjay Patel	892f167aa5	use range-loops; NFCI llvm-svn: 265985	2016-04-11 20:13:44 +00:00
Tim Northover	6b3169bb97	MCParser: diagnose missing directional labels more clearly. Before, ELF at least managed a diagnostic but it was a completely untraceable "undefined symbol" error. MachO had a variety of even worse behaviours: crash, emit corrupt file, or an equally bad message. llvm-svn: 265984	2016-04-11 19:50:46 +00:00
Matthew Simpson	53207a99f9	[LoopUtils, LV] Fix PR27246 (first-order recurrences) This patch ensures that when we detect first-order recurrences, we reject a phi node if its previous value is also a phi node. During vectorization the initial and previous values of the recurrence are shuffled together to create the value for the current iteration. However, phi nodes are not widened like other instructions. This fixes PR27246. Differential Revision: http://reviews.llvm.org/D18971 llvm-svn: 265983	2016-04-11 19:48:18 +00:00
Sriraman Tallam	f39e190ad8	Test commit. llvm-svn: 265976	2016-04-11 18:40:50 +00:00
Lang Hames	f9033bbf54	[Object] Make .alt_entry directive parsing MachO specific. ELF and COFF will now treat .alt_entry like any other unrecognized directive. llvm-svn: 265975	2016-04-11 18:33:45 +00:00
Reid Kleckner	b6800b3052	Combine redundant stack realignment booleans in MachineFrameInfo MachineFrameInfo does not need to be able to distinguish between the user asking us not to realign the stack and the target telling us it doesn't support stack realignment. Either way, fixed stack objects have their alignment clamped. llvm-svn: 265971	2016-04-11 17:54:03 +00:00
Sanjay Patel	b91bcd704a	add FIXME comment; NFC llvm-svn: 265970	2016-04-11 17:35:57 +00:00
Sanjay Patel	3a48e9823e	add an assert for safety; NFC llvm-svn: 265969	2016-04-11 17:27:44 +00:00
Sanjay Patel	4b9c682acf	variable names start with a capital letter; NFC llvm-svn: 265968	2016-04-11 17:25:23 +00:00
Xinliang David Li	8dd4ca819b	Add code comment/NFC llvm-svn: 265966	2016-04-11 17:13:08 +00:00
Sanjay Patel	371290790f	[InstCombine] use canEvaluateShiftedShift() to handle the lshr case (NFCI) We need just a couple of logic tweaks to consolidate the shl and lshr cases. This is step 5 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 llvm-svn: 265965	2016-04-11 17:11:55 +00:00
Sanjay Patel	816ec8882a	[InstCombine] don't try to shift an illegal amount (PR26760) This is the straightforward fix for PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 But we still need to make some changes to generalize this helper function and then send the lshr case into here. llvm-svn: 265960	2016-04-11 16:50:32 +00:00
Tom Stellard	52686e4182	TargetRegisterInfo: Add getRegAsmName() Summary: The motivation for this new function is to move an invalid assumption about the relationship between the names of register definitions in tablegen files and their assembly names into TargetRegisterInfo, so that we can begin working on fixing this assumption. The current problem is that if you have a register definition in TableGen like: def MYReg0 : Register<"r0", 0>; The function TargetLowering::getRegForInlineAsmConstraint() derives the assembly name from the tablegen name: "MYReg0" rather than the given assembly name "r0". This is working, because on most targets the tablegen name and the assembly names are case insensitive matches for each other (e.g. def EAX : X86Reg<"eax", ...>). getRegAsmName() will allow targets to override this default assumption and return the correct assembly name. Reviewers: echristo, hfinkel Subscribers: SamWot, echristo, hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D15614 llvm-svn: 265955	2016-04-11 16:21:12 +00:00
Sanjay Patel	bd8b779d16	[InstCombine] rename variables in shifted-shift helper function (NFCI) This is step 3 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 llvm-svn: 265954	2016-04-11 16:11:07 +00:00
Sanjay Patel	6eaff5cec6	[InstCombine] add helper function for shift-shift optimization (NFCI) This is step 2 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 llvm-svn: 265951	2016-04-11 15:43:41 +00:00
Sanjoy Das	f9d88e650b	This reverts commit r265913 and r265912 See PR27315 r265913: "[IndVars] Eliminate op.with.overflow when possible" r265912: "[SCEV] See through op.with.overflow intrinsics" llvm-svn: 265950	2016-04-11 15:26:18 +00:00
Petar Jovanovic	e578e970cb	[mips] Make Static a default relocation model for MIPS codegen This change follows up defaults for GCC and Clang, so LLVM does not differ from them. While a number of the test files are touched with this change, they all keep the old (expected) behaviour with the explicit option: "-relocation-model=pic". The tests that have not been touched are insensitive to relocation model. Differential Revision: http://reviews.llvm.org/D17995 llvm-svn: 265949	2016-04-11 15:24:23 +00:00
Daniel Sanders	a45d3e439f	[mips] Trivial corrections to range checked immediates. Summary: SYNC has a 5-bit unsigned immediate. Move MIPS16-specific pcrel16 operand to Mips16 files. Reviewers: vkalintiris Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18755 llvm-svn: 265947	2016-04-11 15:20:40 +00:00
Teresa Johnson	6f6fa36244	[ThinLTO] BitcodeWriter still requires Analysis library This should fix bot failure: http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/9873 The bitcode writer unfortunately still needs the Analysis library, as it replaces old dependence on BFI etc with dependence on new ModuleSummaryAnalysis pass. llvm-svn: 265945	2016-04-11 14:59:07 +00:00
Ulrich Weigand	aa04768600	[SystemZ] README: remove an implemented idea, add some new ones The note about conditional returns can now be removed, as they are implemented. Let's also add 2 new ones in exchange. Author: koriakin Differential Revision: http://reviews.llvm.org/D18962 llvm-svn: 265944	2016-04-11 14:38:47 +00:00
Ulrich Weigand	1bac911c58	[SystemZ] Add SVC instruction This is going to be useful for inline assembly only. Author: koriakin Differential Revision: http://reviews.llvm.org/D18952 llvm-svn: 265943	2016-04-11 14:35:39 +00:00
Teresa Johnson	2d5487cf44	[ThinLTO] Move summary computation from BitcodeWriter to new pass Summary: This is the first step in also serializing the index out to LLVM assembly. The per-module summary written to bitcode is moved out of the bitcode writer and to a new analysis pass (ModuleSummaryIndexWrapperPass). The pass itself uses a new builder class to compute index, and the builder class is used directly in places where we don't have a pass manager (e.g. llvm-as). Because we are computing summaries outside of the bitcode writer, we no longer can use value ids created by the bitcode writer's ValueEnumerator. This required changing the reference graph edge type to use a new ValueInfo class holding a union between a GUID (combined index) and Value* (permodule index). The Value* are converted to the appropriate value ID during bitcode writing. Also, this enables removal of the BitWriter library's dependence on the Analysis library that was previously required for the summary computation. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18763 llvm-svn: 265941	2016-04-11 13:58:45 +00:00
Oliver Stannard	c869e9158d	[ARM] Avoid switching ARM/Thumb mode on .arch/.cpu directive When we see a .arch or .cpu directive, we should try to avoid switching ARM/Thumb mode if possible. If we do have to switch modes, we also need to emit the correct mapping symbol for the new ISA. We did not do this previously, so could emit ARM code with Thumb mapping symbols (or vice-versa). The GAS behaviour is to always stay in the same mode, and to emit an error on any instructions seen when the current mode is not available on the current target. We can't represent that situation easily (we assume that Thumb mode is available if ModeThumb is set), so we differ from the GAS behaviour when switching to a target that can't support the old mode. I've added a warning for when this implicit mode-switch occurs. Differential Revision: http://reviews.llvm.org/D18955 llvm-svn: 265936	2016-04-11 13:06:28 +00:00
Ulrich Weigand	848a513d0a	[SystemZ] Support conditional indirect sibling calls via BCR This adds a conditional variant of CallBR instruction, CallBCR. Also, it can be fused with integer comparisons, resulting in one of the new C*BCall instructions. In addition to CallBRCL limitations, this has another one: it won't trigger if the function to call isn't already in %r1 - see f22 in the test for an example (it's also why the loads in tests are volatile). Author: koriakin Differential Revision: http://reviews.llvm.org/D18928 llvm-svn: 265933	2016-04-11 12:12:32 +00:00
Ulrich Weigand	fb97c51f6f	[SystemZ] Remove incorrect CC use for C*BReturn instructions These are fused compare-and-branches, so they obviously don't use CC. Author: koriakin Differential Revision: http://reviews.llvm.org/D18927 llvm-svn: 265932	2016-04-11 12:03:30 +00:00
Andrey Turetskiy	9df334c28e	[X86] Restrict max long nop length for Lakemont. Restrict the max length of long nops for Lakemont to 7. Experiments on MCU benchmarks (Dhrystone, Coremark) show that this is the most optimal length. Differential Revision: http://reviews.llvm.org/D18897 llvm-svn: 265924	2016-04-11 10:07:36 +00:00
Sanjoy Das	a07ad647ee	[IndVars] Eliminate op.with.overflow when possible Summary: If we can prove that an op.with.overflow intrinsic does not overflow, we can get rid of the intrinsic, and replace it with non-wrapping arithmetic. Reviewers: atrick, regehr Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18685 llvm-svn: 265913	2016-04-10 22:50:31 +00:00
Sanjoy Das	3c529a40ca	[SCEV] See through op.with.overflow intrinsics Summary: This change teaches SCEV to reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 llvm-svn: 265912	2016-04-10 22:50:26 +00:00
Mehdi Amini	f9e4576e08	Plumb the option to emit the `ModuleHash` in the bitcode through the bitcode writer APIs From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265907	2016-04-10 21:07:19 +00:00
Simon Pilgrim	d263fdc512	[X86][AVX512BW] Add support for v64i8 multiplies Extend the existing lowering of vXi8 multiplies to support v64i8 on avx512bw targets. I added the Lower512IntArith helper function to help with this - not sure how often this could be used in the future, but it seemed better than putting all that logic inside LowerMUL. Differential Revision: http://reviews.llvm.org/D18937 llvm-svn: 265902	2016-04-10 17:02:48 +00:00
Elena Demikhovsky	751ed0a06a	Loop vectorization with uniform load Vectorization cost of uniform load wasn't correctly calculated. As a result, a simple loop that loads a uniform value wasn't vectorized. Differential Revision: http://reviews.llvm.org/D18940 llvm-svn: 265901	2016-04-10 16:53:19 +00:00
Teresa Johnson	3255eec16c	[ThinLTO] Remove unused parameter (NFC) llvm-svn: 265900	2016-04-10 15:17:26 +00:00
Craig Topper	35db8ecb50	[X86] Use for loops over types to reduce code for setting up operation actions. llvm-svn: 265893	2016-04-10 05:39:32 +00:00
Craig Topper	dcc8f49bf0	[X86] Remove unnecessary setOperationAction for SRA v2i64/v4i64 when VLX is supported. This is already done for SSE2/AVX2 which VLX implies. NFC llvm-svn: 265892	2016-04-10 05:39:28 +00:00
Xinliang David Li	284644838f	[PGO] Fix deserialize bug Raw function pointer collected by value profile data may be from external functions that are not instrumented. They won't have mapping data to be used by the deserializer. Force the value to be 0 in this case. llvm-svn: 265890	2016-04-10 03:32:02 +00:00
Charles Davis	2f65f35c27	[CodeGen] Don't assume that fixed stack objects are aligned in a stack-realigned function. Summary: After we make the adjustment, we can assume that for local allocas, but not for stack parameters, the return address, or any other fixed stack object (which has a negative offset and therefore lies prior to the adjusted SP). Fixes PR26662. Reviewers: hfinkel, qcolombet, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D18471 llvm-svn: 265886	2016-04-09 23:34:42 +00:00
Davide Italiano	7aa47094b2	[MC] support TLSDESC and TLSCALL / GNU2 tls dialect Differential Revision: http://reviews.llvm.org/D18885 llvm-svn: 265881	2016-04-09 20:32:33 +00:00
Adrian Prantl	3891e9e859	Drop debug info for DISubprograms that are not referenced by anything This patch drops the debug info for all DISubprograms that are (a) not attached to an llvm::Function and (b) not indirectly reachable via inline scopes from any surviving Function and (c) not reachable from a type (i.e.: member functions). Background: I'm currently working on a patch to reverse the pointers between DICompileUnit and DISubprogram (for more info check Duncan's RFC on lazy-loading of debug info metadata http://lists.llvm.org/pipermail/llvm-dev/2016-March/097419.html). The idea is to remove the list of subprograms from DICompileUnit and instead point to the owning compile unit from each DISubprogram. After doing this all DISubprograms fulfilling the above criteria will be implicitly dropped unless we go through an extra effort to preserve them. http://reviews.llvm.org/D18477 <rdar://problem/25256815> llvm-svn: 265876	2016-04-09 18:10:22 +00:00
Sanjay Patel	4abae4e0fa	[x86] use BMI 'andn' for logic + compare ops With BMI, we can use 'andn' to save an instruction when the result is only used in a compare. This is related to one of the potential sequences to check 'isfinite' in: https://llvm.org/bugs/show_bug.cgi?id=27164 Differential Revision: http://reviews.llvm.org/D18910 llvm-svn: 265875	2016-04-09 16:02:52 +00:00
Simon Pilgrim	1cc5712763	[X86][XOP] Support for VPPERM 2-input shuffle mask decoding This patch adds support for decoding XOP VPPERM instruction when it represents a basic shuffle. The mask decoding required the existing MCInstrLowering code to be updated to support binary shuffles - the implementation now matches what is done in X86InstrComments.cpp. Differential Revision: http://reviews.llvm.org/D18441 llvm-svn: 265874	2016-04-09 14:51:26 +00:00
Craig Topper	f027107094	[X86] Use for loops over types to reduce code for setting up operation actions. NFC llvm-svn: 265871	2016-04-09 06:31:02 +00:00
Craig Topper	e801ed9e15	[X86] Remove calls to setOperationAction that set CTLZ_ZERO_UNDEF for some vector types to Expand. Expand is already set for all operations for all vector types earlier so this is redundant. NFC llvm-svn: 265870	2016-04-09 05:53:48 +00:00
Sanjoy Das	dd77e1e6a5	Maintain calling convention when inlining calls to llvm.deoptimize The behavior here was buggy -- we'd forget the calling convention after inlining a callsite calling llvm.deoptimize. llvm-svn: 265867	2016-04-09 00:22:59 +00:00
Mike Aizatsky	94e29668b0	[libfuzzer] defensive assert llvm-svn: 265866	2016-04-08 23:32:24 +00:00
Adrian Prantl	5992a72b4d	Support the Nodebug emission kind for DICompileUnits. Sample-based profiling and optimization remarks currently remove DICompileUnits from llvm.dbg.cu to suppress the emission of debug info from them. This is somewhat of a hack and only borderline legal IR. This patch uses the recently introduced NoDebug emission kind in DICompileUnit to achieve the same result without breaking the Verifier. A nice side-effect of this change is that it is now possible to combine NoDebug and regular compile units under LTO. http://reviews.llvm.org/D18808 <rdar://problem/25427165> llvm-svn: 265861	2016-04-08 22:43:03 +00:00
Easwaran Raman	9a3fc17ad4	Refactor Threshold computation. NFC. This is part of changes reviewed in http://reviews.llvm.org/D17584. llvm-svn: 265852	2016-04-08 21:28:02 +00:00
Tim Shen	0012756489	[SSP] Remove llvm.stackprotectorcheck. This is a cleanup patch for SSP support in LLVM. There is no functional change. llvm.stackprotectorcheck is not needed, because SelectionDAG isn't actually lowering it in SelectBasicBlock; rather, it adds check code in FinishBasicBlock, ignoring the position where the intrinsic is inserted (See FindSplitPointForStackProtector()). llvm-svn: 265851	2016-04-08 21:26:31 +00:00
Hans Wennborg	e25b65bdb7	Rangeify a loop. NFC. llvm-svn: 265846	2016-04-08 20:46:09 +00:00
Hans Wennborg	74ff770670	Remove some redundant variables from X86TargetLowering::LowerDYNAMIC_STACKALLOC These are already defined, with the same values, a few lines up. NFC. llvm-svn: 265845	2016-04-08 20:46:00 +00:00
Kyle Butt	3232dbbf02	Codegen: Factor tail duplication into a utility class. NFC This is in preparation for tail duplication during block placement. See D18226. This needs to be a utility class for 2 reasons. No passes may run after block placement, and also, tail-duplication affects subsequent layout decisions, so it must be interleaved with placement, and can't be separated out into its own pass. The original pass is still useful, and now runs by delegating to the utility class. llvm-svn: 265842	2016-04-08 20:35:01 +00:00
Evgeny Stupachenko	8788048403	test commit llvm-svn: 265840	2016-04-08 20:20:38 +00:00
Nirav Dave	66f485f4e2	Fix Load Control Dependence in MemCpy Generation In Memcpy lowering we had missed a dependence from the load of the operation to successor operations. This causes us to potentially construct an initial DAG with a memory dependence not fully represented in the chain sub-DAG but rather requiring a look at the entire DAG, breaking alias analysis by allowing incorrect repositioning of memory operations. To work around this, r200033 changed DAGCombiner::GatherAllAliases to be conservative if any such issues could happen. Unfortunately this check forbade many non-problematic situations as well. For example, it's common for incoming argument lowering to add a non-aliasing load hanging off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would find that other (unvisited) use of the EntryNode chain, and just give up entirely. Furthermore, the check was incomplete: it would not actually detect all such potentially problematic DAG constructions, because GatherAllAliases did not guarantee to visit all chain nodes going up to the root EntryNode. This is in general fine -- giving up early will just miss a potential optimization, not generate incorrect results. But, for this non-chain dependency detection code, it's possible that you could have a load attached to a higher-up chain node than any which were visited. If that load aliases your store, but the only dependency is through the value operand of a non-aliasing store, it would've been missed by this code, and potentially reordered. With the dependence added, this check can be removed and Alias Analysis can be much more aggressive. This fixes the code quality regression in the Consecutive Store Merge cleanup (D14834). Test Change: ppc64-align-long-double.ll now may see multiple serializations of its stores Differential Revision: http://reviews.llvm.org/D18062 llvm-svn: 265836	2016-04-08 19:44:40 +00:00
Duncan P. N. Exon Smith	bb2c3e199e	ValueMapper: Extract llvm::RemapFunction from IRMover.cpp, NFC Strip out the remapping parts of IRLinker::linkFunctionBody and put them in ValueMapper.cpp under the name Mapper::remapFunction (with a top-level entry-point llvm::RemapFunction). This is a nice cleanup on its own since it puts the remapping code together and shares a single Mapper context for the entire IRLinker::linkFunctionBody Call. Besides that, this will make it easier to break the co-recursion between IRMover.cpp and ValueMapper.cpp in follow ups. llvm-svn: 265835	2016-04-08 19:26:32 +00:00
Duncan P. N. Exon Smith	adcebdf2d1	ValueMapper: Always use Mapper::mapValue from remapInstruction, NFCI Use Mapper::mapValue instead of llvm::MapValue from Mapper::remapInstruction when mapping an incoming block for a PHINode (follow-up to r265832). This will implicitly pass along the Materializer argument, but when this code was added in r133513 there was no Materializer argument. I suspect this call to MapValue was just missed in r182776 since it's not observable (basic blocks can't be materialized, and they don't reference other values). llvm-svn: 265833	2016-04-08 19:17:13 +00:00
Duncan P. N. Exon Smith	a574e7a7a4	ValueMapper: Roll RemapInstruction into Mapper, NFC Add Mapper::remapInstruction, move the guts of llvm::RemapInstruction into it, and use the same Mapper for most of the calls to MapValue and MapMetadata. There should be no functionality change here. I left off the call to MapValue that wasn't passing in a Materializer argument (for basic blocks of PHINodes). It shouldn't change functionality either, but I'm suspicious enough to commit separately. llvm-svn: 265832	2016-04-08 19:09:34 +00:00
Duncan P. N. Exon Smith	fe6583a0a6	Linker: Always pass RF_IgnoreMissingLocals; NFC This is a cleanup after clarifying the meaning of RF_IgnoreMissingLocals in r265628 and truly limiting it to locals in r265768. This should have no functionality change, since the only context that the flag has an effect is when we could hit function-local Value and Metadata, and we were already passing it in those contexts. llvm-svn: 265831	2016-04-08 19:01:38 +00:00
Kevin B. Smith	e0a6fc3bcc	[X86] Fix PR23155 by turning on X86FixupBWInsts by default. Differential Revision: http://reviews.llvm.org/D18866 llvm-svn: 265830	2016-04-08 18:58:29 +00:00
Duncan P. N. Exon Smith	69341e6abc	ValueMapper: Don't memoize metadata when RF_NoModuleLevelChanges Prevent the Metadata side-table in ValueMap from growing unnecessarily when RF_NoModuleLevelChanges. As a drive-by, make ValueMap::hasMD, which apparently had no users until I used it here for testing, actually compile. llvm-svn: 265828	2016-04-08 18:49:36 +00:00
Duncan P. N. Exon Smith	e05ff7c1a7	ValueMapper: Stop memoizing MDStrings Stop adding MDString to the Metadata section of the ValueMap in MapMetadata. It blows up the size of the map for no benefit, since we can always return quickly anyway. There is a potential follow-up that I don't think I'll push on right away, but maybe someone else is interested: stop checking for a pre-mapped MDString, and move the `isa<MDString>()` checks in Mapper::mapSimpleMetadata and MDNodeMapper::getMappedOp in front of the `VM.getMappedMD()` calls. While this would preclude explicitly remapping MDStrings it would probably be a little faster. llvm-svn: 265827	2016-04-08 18:47:02 +00:00
Sanjoy Das	87b9e1b727	Propagate Undef in llvm.cos Intrinsic Summary: The llvm cos intrinsic currently does not propagate undef's. This change transforms cos(undef) to null value or 0. There are 2 test cases added as well. Patch by Anna Thomas! Reviewers: sanjoy Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18863 llvm-svn: 265825	2016-04-08 18:21:11 +00:00
Lang Hames	6d22d8a44a	[Object] Report an error if .alt_entry is used with ELF or COFF. I'm looking into a better way to do this long-term, but for now at least don't crash. llvm-svn: 265815	2016-04-08 17:38:51 +00:00
Ulrich Weigand	fa2dffbc1a	[SystemZ] Support conditional sibling calls via BRCL This adds a conditional variant of CallJG instruction, CallBRCL. It can be used for conditional sibling calls. Unfortunately, due to IfCvt limitations, it only really works well for functions without arguments. Author: koriakin Differential Revision: http://reviews.llvm.org/D18864 llvm-svn: 265814	2016-04-08 17:22:19 +00:00
Quentin Colombet	ab8c21f72b	[RegBankSelect] Use reverse post order traversal. When assigning the register banks of an instruction, it is best to know all the constraints of the input to have a good idea of how this will impact the cost of the whole function. llvm-svn: 265812	2016-04-08 17:19:10 +00:00
Quentin Colombet	88805c1917	[RegisterBankInfo] Change the implementation for the default mapping. Do not give that much importance to the current register bank of an operand. This is likely just a side effect of the current execution and it is probably wise to prefer a register bank that can be extracted from the information available statically (like encoding constraints and type). llvm-svn: 265810	2016-04-08 16:59:50 +00:00
David Majnemer	56737722e4	[InstCombine] Fix miscompile in FoldSPFofSPF We had a select of a cast of a select but attempted to replace the outer select with the inner select despite their incompatible types. Patch by Anton Korobeynikov! This fixes PR27236. llvm-svn: 265805	2016-04-08 16:51:49 +00:00
Quentin Colombet	6d6d6af226	[RegBankSelect] Improve debug output. Add verbose information when checking if the current and the desired register banks match. Detail what happens when we assign a register bank. llvm-svn: 265804	2016-04-08 16:48:16 +00:00
Mehdi Amini	3ba84ca62d	Fix missing include on OpenBSD From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265803	2016-04-08 16:45:05 +00:00
Quentin Colombet	876ddf8107	[MIR] Teach the parser how to deal with register banks. llvm-svn: 265802	2016-04-08 16:40:43 +00:00
David Majnemer	fcc5811797	[InstCombine] Add a peephole for redundant assumes Two or more identical assumes are occasionally next to each other in a basic block. While our generic machinery will turn a redundant assume into a no-op, it is not super cheap. We can perform a simpler check to achieve the same result for this case. llvm-svn: 265801	2016-04-08 16:37:12 +00:00
David Majnemer	60c6abc3cc	[LoopVectorize] Register cloned assumptions InstCombine cannot effectively remove redundant assumptions without them registered in the assumption cache. The vectorizer can create identical assumptions but doesn't register them with the cache, resulting in slower compile times because InstCombine tries to reason about a lot more assumptions. Fix this by registering the cloned assumptions. llvm-svn: 265800	2016-04-08 16:37:10 +00:00
Quentin Colombet	c1c94bc2ca	[MachineVerifier] Teach how to check some of the properties of generic virtual registers. Generic virtual registers: - May not have a register class - May not have a register bank - If they do not have a register class they must have a size - If they have a register bank, the size of the register bank must be greater or equal to the size of the virtual register (basically check that the virtual register will fit into that register class) llvm-svn: 265798	2016-04-08 16:35:22 +00:00
Quentin Colombet	fab1cfe673	[MIR] Teach the mir printer how to print the register bank. For now, we put the register bank in the Class field since a register may only have one of those at a given time. The downside of that representation is that if a register class and a register bank have the same name, we will not be able to distinguish them. llvm-svn: 265796	2016-04-08 16:26:22 +00:00
Sam Parker	2d5126cdf5	[ARM] Enable SMLAW[B|T] and SMULW[B|T] instruction selection Added ISelDAGToDAG functions to enable selection of the smlawb, smlawt, smulwb and smulwt instructions for the ARM backend. Also updated the smul CodeGen test and removed the smulw one. Differential Revision: http://reviews.llvm.org/D18892 llvm-svn: 265793	2016-04-08 16:02:53 +00:00
Hans Wennborg	5a7723c7a2	Revert r265547 "Recommit r265309 after fixed an invalid memory reference bug happened" It caused PR27275: "ARM: Bad machine code: Using an undefined physical register" Also reverting the following commits that were landed on top: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" llvm-svn: 265790	2016-04-08 15:17:43 +00:00
Silviu Baranga	6f444dfd55	Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV This re-commits r265535 which was reverted in r265541 because it broke the windows bots. The problem was that we had a PointerIntPair which took a pointer to a struct allocated with new. The problem was that new doesn't provide sufficient alignment guarantees. This pattern was already present before r265535 and it just happened to work. To fix this, we now separate the PointerIntPair from the ExitNotTakenInfo struct into a pointer and a bool. Original commit message: Summary: When the backedge taken condition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use this new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265786	2016-04-08 14:29:09 +00:00
Simon Pilgrim	476170384f	[X86] Tidied up shuffle decode function doxygen descriptions As discussed on D18441 - auto brief is used so we don't need /brief, we don't need to include the function name and added some missing descriptions. llvm-svn: 265785	2016-04-08 14:17:07 +00:00
Chuang-Yu Cheng	98c1894755	CXX_FAST_TLS calling convention: performance improvement for PPC64 This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allocator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781	2016-04-08 12:04:32 +00:00
Vasileios Kalintiris	957d849e03	[mips] Use range-based for loops. NFC. llvm-svn: 265780	2016-04-08 10:33:00 +00:00
Jeroen Ketema	ad659c3400	[llvm-c] Expose LLVMContextGetDiagnostic{Handler,Context} Differential Revision: http://reviews.llvm.org/D18820 llvm-svn: 265773	2016-04-08 09:19:02 +00:00
Zlatko Buljan	53a037f5cc	[mips][microMIPS] Add CodeGen support for ADD, ADDIU, ADDU and DADD* instructions Differential Revision: http://reviews.llvm.org/D16454 llvm-svn: 265772	2016-04-08 07:27:26 +00:00
Craig Topper	00230805f2	Use std::fill to simplify some code. NFC llvm-svn: 265771	2016-04-08 07:10:46 +00:00
Duncan P. N. Exon Smith	4ec55f8ab6	Reapply "ValueMapper: Treat LocalAsMetadata more like function-local Values" This reverts commit r265765, reapplying r265759 after changing a call from LocalAsMetadata::get to ValueAsMetadata::get (and adding a unit test). When a local value is mapped to a constant (like "i32 %a" => "i32 7"), the new debug intrinsic operand may no longer be pointing at a local. http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/19020/ The previous commit message follows: -- This is a partial re-commit -- maybe more of a re-implementation -- of r265631 (reverted in r265637). This makes RF_IgnoreMissingLocals behave (almost) consistently between the Value and the Metadata hierarchy. In particular: - MapValue returns nullptr or "metadata !{}" for missing locals in MetadataAsValue/LocalAsMetadata bridging pairs, depending on the RF_IgnoreMissingLocals flag. - MapValue doesn't memoize LocalAsMetadata-related results. - MapMetadata no longer deals with LocalAsMetadata or RF_IgnoreMissingLocals at all. (This wasn't in r265631 at all, but I realized during testing it would make the patch simpler with no loss of generality.) r265631 went too far, making both functions universally ignore RF_IgnoreMissingLocals. This broke building (e.g.) compiler-rt. Reassociate (and possibly other passes) don't currently maintain dominates-use invariants for metadata operands, resulting in IR like this: define void @foo(i32 %arg) { call void @llvm.some.intrinsic(metadata i32 %x) %x = add i32 1, i32 %arg } If the inliner chooses to inline @foo into another function, then RemapInstruction will call `MapValue(metadata i32 %x)` and assert that the return is not nullptr. I've filed PR27273 to add a Verifier check and fix the underlying problem in the optimization passes. As a workaround, return `!{}` instead of nullptr for unmapped LocalAsMetadata when RF_IgnoreMissingLocals is unset. Otherwise, match the behaviour of r265631. 
Original commit message: ValueMapper: Make LocalAsMetadata match function-local Values Start treating LocalAsMetadata similarly to function-local members of the Value hierarchy in MapValue and MapMetadata. - Don't memoize them. - Return nullptr if they are missing. This also cleans up ConstantAsMetadata to stop listening to the RF_IgnoreMissingLocals flag. llvm-svn: 265768	2016-04-08 03:13:22 +00:00
Duncan P. N. Exon Smith	805873148a	Revert "ValueMapper: Treat LocalAsMetadata more like function-local Values" This reverts commit r265759, since even this limited version breaks some bots: http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/3311 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/17696 This also reverts r265761 "ValueMapper: Unduplicate RF_NoModuleLevelChanges check, NFC", since I had trouble separating it from r265759. llvm-svn: 265765	2016-04-08 00:56:21 +00:00
Quentin Colombet	e57546de40	[TargetRegisterInfo] Re-apply r265734. Original commit message: [TargetRegisterInfo] Refactor the code to use BitMaskClassIterator. llvm-svn: 265764	2016-04-08 00:51:00 +00:00
Sanjoy Das	5ce3272833	Don't IPO over functions that can be de-refined Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. 
The optimizer is allowed to refine this function by first CSE'ing the two loads, and then folding the comparison to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762	2016-04-08 00:48:30 +00:00
Duncan P. N. Exon Smith	0fa8aca0e6	ValueMapper: Unduplicate RF_NoModuleLevelChanges check, NFC llvm-svn: 265761	2016-04-08 00:41:10 +00:00
Adrian Prantl	3e9c88753b	DwarfDebug: Support floating point constants in location lists. This patch closes a gap in the DWARF backend that caused LLVM to drop debug info for floating point variables that were constant for part of their scope. Floating point constants are emitted as one or more DW_OP_constu joined via DW_OP_piece. This fixes a regression caught by the LLDB testsuite that I introduced in r262247 when we stopped blindly expanding the range of singular DBG_VALUEs to span the entire scope and started to emit location lists with accurate ranges instead. Also deletes a now-impossible testcase (debug-loc-empty-entries). <rdar://problem/25448338> llvm-svn: 265760	2016-04-08 00:38:37 +00:00
Duncan P. N. Exon Smith	267185ec92	ValueMapper: Treat LocalAsMetadata more like function-local Values This is a partial re-commit -- maybe more of a re-implementation -- of r265631 (reverted in r265637). This makes RF_IgnoreMissingLocals behave (almost) consistently between the Value and the Metadata hierarchy. In particular: - MapValue returns nullptr or "metadata !{}" for missing locals in MetadataAsValue/LocalAsMetadata bridging pairs, depending on the RF_IgnoreMissingLocals flag. - MapValue doesn't memoize LocalAsMetadata-related results. - MapMetadata no longer deals with LocalAsMetadata or RF_IgnoreMissingLocals at all. (This wasn't in r265631 at all, but I realized during testing it would make the patch simpler with no loss of generality.) r265631 went too far, making both functions universally ignore RF_IgnoreMissingLocals. This broke building (e.g.) compiler-rt. Reassociate (and possibly other passes) don't currently maintain dominates-use invariants for metadata operands, resulting in IR like this: define void @foo(i32 %arg) { call void @llvm.some.intrinsic(metadata i32 %x) %x = add i32 1, i32 %arg } If the inliner chooses to inline @foo into another function, then RemapInstruction will call `MapValue(metadata i32 %x)` and assert that the return is not nullptr. I've filed PR27273 to add a Verifier check and fix the underlying problem in the optimization passes. As a workaround, return `!{}` instead of nullptr for unmapped LocalAsMetadata when RF_IgnoreMissingLocals is unset. Otherwise, match the behaviour of r265631. Original commit message: ValueMapper: Make LocalAsMetadata match function-local Values Start treating LocalAsMetadata similarly to function-local members of the Value hierarchy in MapValue and MapMetadata. - Don't memoize them. - Return nullptr if they are missing. This also cleans up ConstantAsMetadata to stop listening to the RF_IgnoreMissingLocals flag. llvm-svn: 265759	2016-04-08 00:33:44 +00:00
Quentin Colombet	e0a7ffa6cb	Revert "[TargetRegisterInfo] Refactor the code to use BitMaskClassIterator." This reverts commit r265734. Looks like ASan is not happy about it. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/11741 Looking. llvm-svn: 265755	2016-04-08 00:03:51 +00:00
Quentin Colombet	dcf5cf6a29	[RegisterBankInfo] Make the debug output more compact. Print the mask of the partial mapping as a hexadecimal instead of a binary value. llvm-svn: 265754	2016-04-08 00:03:49 +00:00
Davide Italiano	08f8f21b91	[IR/Verifier] Fix (yet another) crash. We need to check that if we reference a retainedType from DICompileUnit we're actually referencing a DICompositeType. llvm-svn: 265752	2016-04-08 00:01:32 +00:00
Quentin Colombet	e16f561d91	[RegBankSelect] Add a few debug statements. llvm-svn: 265749	2016-04-07 23:53:55 +00:00
Quentin Colombet	9a2ae85e67	[RegisterBankInfo] Add print and dump method to the InstructionMapping helper class. llvm-svn: 265747	2016-04-07 23:31:58 +00:00
Quentin Colombet	e087c9fc12	[RegisterBankInfo] Add print and dump method to the ValueMapping helper class. llvm-svn: 265746	2016-04-07 23:25:43 +00:00
Quentin Colombet	03c419628e	[MachineInstr] Teach the print method about RegisterBank. Properly print either the register class or the register bank or a virtual register. Get rid of a few ifdefs in the process. llvm-svn: 265745	2016-04-07 23:18:11 +00:00
Quentin Colombet	88f7f6bc4f	[AArch64] Fix a typo in the register class to register bank mapping. For GPR family we want the GPR register bank, not FPR! llvm-svn: 265743	2016-04-07 23:10:14 +00:00
Quentin Colombet	ac40034e06	[RegisterBankInfo] Strengthen getInstrMappingImpl. Teach the target independent code how to take advantage of type information to get the mapping of an instruction. llvm-svn: 265739	2016-04-07 22:52:49 +00:00
Quentin Colombet	e918006a87	[RegisterBankInfo] Add a way to record what register bank covers a specific type. This will be used to find the default mapping of the instruction. Also, this information is recorded, instead of computed, because it is expensive from a type to know which register bank maps it. Indeed, we need to iterate through all the register classes of all the register banks to find the one that maps the given type. llvm-svn: 265736	2016-04-07 22:45:42 +00:00
Quentin Colombet	c8d612f6fd	[RegisterBankInfo] Introduce getRegBankFromConstraints as a helper method. NFC. The refactoring intends to make the code more readable and expose more features to potential derived classes. llvm-svn: 265735	2016-04-07 22:35:03 +00:00
Quentin Colombet	2445dc1916	[TargetRegisterInfo] Refactor the code to use BitMaskClassIterator. llvm-svn: 265734	2016-04-07 22:16:56 +00:00
Quentin Colombet	cf477ffc58	[RegisterBankInfo] Refactor the code to use BitMaskClassIterator. llvm-svn: 265733	2016-04-07 22:08:56 +00:00
Mehdi Amini	a797877a7e	Const correctness for BranchProbabilityInfo (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265731	2016-04-07 21:59:28 +00:00
Mehdi Amini	4a9a1816cb	Rename parameter I to Index for WriteCombinedGlobalValueSummary() (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265729	2016-04-07 21:49:31 +00:00
Quentin Colombet	aac71a4a0e	[RegBankSelect] Reuse RegisterBankInfo logic to get to the register bank from a register. On top of duplicating the logic, it was buggy! It would assert on physical registers, since MachineRegisterInfo does not have any information regarding register classes/banks for them. llvm-svn: 265727	2016-04-07 21:32:23 +00:00
Amaury Sechet	c53ad4f3b2	Do not select EhPad BB in MachineBlockPlacement when there is regular BB to schedule Summary: EHPad BB are not entered the classic way and therefore do not need to be placed after their predecessors. This patch makes sure EHPad BB are not chosen amongst successors to form chains, and are selected as last resort when selecting the best candidate. EHPad are scheduled in reverse probability order in order to have them flow into each other naturally. Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17625 llvm-svn: 265726	2016-04-07 21:29:39 +00:00
Quentin Colombet	846219ae10	[AArch64] Get rid of some GlobalISel ifdefs. llvm-svn: 265725	2016-04-07 21:24:40 +00:00
Quentin Colombet	6cc73ce808	[AArch64] gcc does not like literal without quotes even on preprocessor macros. llvm-svn: 265720	2016-04-07 20:49:15 +00:00
Quentin Colombet	789ad56248	[AArch64][CallLowering] Do not build the API if GlobalISel is not built. This gets rid of some ifdefs and dummy implementations that were here just to fill the blanks. llvm-svn: 265719	2016-04-07 20:47:51 +00:00
Quentin Colombet	d4131814b3	[GlobalISel] Add RegBankSelect hooks into the pass pipeline. Now, RegBankSelect will happen after the IRTranslation and the target may optionally add additional passes in between. llvm-svn: 265716	2016-04-07 20:27:33 +00:00
Jan Vesely	43b7b5b846	AMDGPU/SI: Implement atomic load/store for i32 and i64 Standard load/store instructions with GLC bit set. Reviewers: tstellardAMD, arsenm Differential Revision: http://reviews.llvm.org/D18760 llvm-svn: 265709	2016-04-07 19:23:11 +00:00
Tom Stellard	9112758077	AMDGPU/SI: Add latency for export instructions Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18599 llvm-svn: 265708	2016-04-07 18:30:05 +00:00
Quentin Colombet	40ad573d2c	[RegBankSelect] Initial implementation for non-optimized output. The pass walks through the machine function and assigns the register banks using the default mapping. In other words, there is no attempt to reduce cross register copies. llvm-svn: 265707	2016-04-07 18:19:27 +00:00
Quentin Colombet	fe1ee4f9be	[RegisterBankInfo] Provide a target independent helper function to guess the mapping of an instruction on register bank. For most instructions, it is possible to guess the mapping of the instruction by using the encoding constraints. That leaves instructions without encoding constraints. For copy-like instructions, we try to propagate the information we get from the other operands. Otherwise, the target has to give this information. llvm-svn: 265703	2016-04-07 18:01:19 +00:00
Quentin Colombet	ee366eff44	[RegisterBankInfo] Change the signature of getSizeInBits to factor out the access to MRI and TRI. llvm-svn: 265701	2016-04-07 17:44:54 +00:00
Quentin Colombet	5b7ba5092c	[RegisterBankInfo] Provide a default constructor for InstructionMapping helper class. The default constructor creates invalid (isValid() == false) instances and may be used to communicate that a mapping was not found. llvm-svn: 265699	2016-04-07 17:30:18 +00:00
Quentin Colombet	c33085f2c6	[MachineRegisterInfo] Track register bank for virtual registers. A virtual register may have either a register bank or a register class. This is represented by a PointerUnion between the related classes. Typically, a virtual register went through the following states regarding register class and register bank: 1. Creation: None is set. Virtual registers are fully generic. 2. Register bank assignment: Register bank is set. Virtual registers live into a register bank, but we do not know the constraints they need to fulfil. 3. Instruction selection: Register class is set. Virtual registers are bound by encoding constraints. To map these states to GlobalISel, the IRTranslator implements #1, RegBankSelect #2, and Select #3. llvm-svn: 265696	2016-04-07 17:20:29 +00:00
Quentin Colombet	d21115876c	[RegisterBank] Rename RegisterBank::contains into RegisterBank::covers. llvm-svn: 265695	2016-04-07 17:09:39 +00:00
Ulrich Weigand	79391ee0f2	[SystemZ] Fix build break from r265689 Fix build error seen on some build bots due to: error: default label in switch which covers all enumeration values llvm-svn: 265693	2016-04-07 16:33:25 +00:00
Kevin B. Smith	3802c4af59	[X86]: Fix for PR27251. Differential Revision: http://reviews.llvm.org/D18850 llvm-svn: 265690	2016-04-07 16:15:34 +00:00
Ulrich Weigand	2eb027d21f	[SystemZ] Implement conditional returns Return is now considered a predicable instruction, and is converted to a newly-added CondReturn (which maps to BCR to %r14) instruction by the if conversion pass. Also, fused compare-and-branch transform knows about conditional returns, emitting the proper fused instructions for them. This transform triggers on a lot of tests, hence the huge diffstat. The changes are mostly jX to br %r14 -> bXr %r14. Author: koriakin Differential Revision: http://reviews.llvm.org/D17339 llvm-svn: 265689	2016-04-07 16:11:44 +00:00
Davide Italiano	14e351a553	[IR/Verifier] Merge two ifs into one. NFC. llvm-svn: 265688	2016-04-07 15:55:28 +00:00
Ulrich Weigand	fc23907673	[GVN] Address review comments for D18662 As suggested by Chandler in his review comments for D18662, this follow-on patch renames some variables in GetLoadValueForLoad and CoerceAvailableValueToLoadType to hopefully make it more obvious which variables hold value sizes and which hold load/store sizes. No functional change intended. llvm-svn: 265687	2016-04-07 15:55:11 +00:00
Ulrich Weigand	6e6966460a	[GVN] Fix handling of sub-byte types in big-endian mode When GVN wants to re-interpret an already available value in a smaller type, it needs to right-shift the value on big-endian systems to ensure the correct bytes are accessed. The shift value is the difference of the sizes of the two types. This is correct as long as both types occupy multiples of full bytes. However, when one of them is a sub-byte type like i1, this no longer holds true: we still need to shift, but only to access the correct byte. Accessing bits within the byte requires no shift in either endianness; e.g. an i1 resides in the least-significant bit of its containing byte on both big- and little-endian systems. Therefore, the appropriate shift value to be used is the difference of the storage sizes of the two types. This is already handled correctly in one place where such a shift takes place (GetStoreValueForLoad), but is incorrect in two other places: GetLoadValueForLoad and CoerceAvailableValueToLoadType. This patch changes both places to use the storage size as well. Differential Revision: http://reviews.llvm.org/D18662 llvm-svn: 265684	2016-04-07 15:45:02 +00:00
Ehsan Amiri	4701a91e59	[PPC] Enable transformations in PPCPassConfig::addIRPasses at O2 http://reviews.llvm.org/D18562 A large number of testcases has been modified so they pass after this test. One testcase is deleted, because I realized even after undoing the original change that was committed with this testcase, the testcase still passes. So I removed it. The change to one other testcase (test/CodeGen/PowerPC/pr25802.ll) is an arbitrary change to keep it passing. Given the original intention of the testcase, and the fact that fixing it will require some time to change the testcase, we concluded that this quick change will be enough. llvm-svn: 265683	2016-04-07 15:30:55 +00:00
Tom Stellard	d37630e461	AMDGPU/SI: Add MachineBasicBlock parameter to SIInstrInfo::insertWaitStates Summary: This makes it possible to insert nops at the end of blocks. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18549 llvm-svn: 265678	2016-04-07 14:47:07 +00:00
Valery Pykhtin	e23b6deb01	[AMDGPU] fix readlane/readfirstlane src vgpr operand type. For VGPR_32 operand disassembler expects a VGPR register encoded as 0..255 (enum8 src operand). readfirstlane/readlane actually has enum9 operand and this change fixes VGPR_32 to VS_32 (enum9 encoding). Differential Revision: http://reviews.llvm.org/D18696 llvm-svn: 265670	2016-04-07 13:41:51 +00:00
Dmitry Polukhin	a1feff7024	[GCC] Attribute ifunc support in llvm This patch adds support for GCC attribute((ifunc("resolver"))) for targets that use ELF as object file format. In general ifunc is a special kind of function alias with type @gnu_indirect_function. Patch for Clang http://reviews.llvm.org/D15524 Differential Revision: http://reviews.llvm.org/D15525 llvm-svn: 265667	2016-04-07 12:32:19 +00:00
NAKAMURA Takumi	e546211492	InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation] llvm-svn: 265657	2016-04-07 11:30:06 +00:00
Benjamin Kramer	4fb78518f1	Make helper functions static. NFC. llvm-svn: 265653	2016-04-07 10:10:09 +00:00
Simon Pilgrim	d54bae6525	[X86][SSE] Add support for VZEXT constant folding llvm-svn: 265646	2016-04-07 07:52:45 +00:00
Amaury Sechet	33c161c02f	[BlockPlacement] Remove an unnecessary continue NFC. llvm-svn: 265643	2016-04-07 06:35:00 +00:00
Amaury Sechet	9ee4ddd710	[MBP] Remove an unused function parameter NFC. llvm-svn: 265642	2016-04-07 06:34:47 +00:00
Wei Mi	979e9756ec	Fix the sanitizer bootstrap error in r265547. The iterators of SmallPtrSet SpillsInSubTreeMap[Child].first may be invalidated when SpillsInSubTreeMap grows. Rearrange the code to ensure the grow of SpillsInSubTreeMap only happens before getting the iterators of the SmallPtrSet. llvm-svn: 265639	2016-04-07 05:27:17 +00:00
Amaury Sechet	41474a52e7	Revert "[BlockPlacement] Remove an unnecessary continue" and "[MBP] Remove an unused function parameter" llvm-svn: 265638	2016-04-07 04:28:40 +00:00
Duncan P. N. Exon Smith	45601e867d	Revert "ValueMapper: Make LocalAsMetadata match function-local Values" This reverts commit r265631, since it caused bot failures: http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/3256 http://lab.llvm.org:8011/builders/clang-cmake-aarch64-42vma/builds/7272 Looks like something is depending on the old behaviour. I'll try to track it down and recommit. llvm-svn: 265637	2016-04-07 02:10:50 +00:00
Ahmed Bougacha	1cf67fb9cb	[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC. Re-apply r265450 which caused PR27245 and was reverted in r265559 because of a wrong generalization: the fetch_and_add->add_and_fetch combine only works in specific, but pretty common, cases: (icmp slt x, 0) -> (icmp sle (add x, 1), 0) (icmp sge x, 0) -> (icmp sgt (add x, 1), 0) (icmp sle x, 0) -> (icmp slt (sub x, 1), 0) (icmp sgt x, 0) -> (icmp sge (sub x, 1), 0) Original Message: We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fallback to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265636	2016-04-07 02:07:10 +00:00
Duncan P. N. Exon Smith	fdccad925c	ValueMapper: Allow RF_IgnoreMissingLocals and RF_NullMapMissingGlobalValues Remove the assertion that disallowed the combination, since RF_IgnoreMissingLocals should have no effect on globals. As it happens, RF_NullMapMissingGlobalValues asserted in MapValue(Constant*,...), so I also changed a cast to a cast_or_null to get my test passing. llvm-svn: 265633	2016-04-07 01:22:45 +00:00
Duncan P. N. Exon Smith	c1e4070708	ValueMapper: Make LocalAsMetadata match function-local Values Start treating LocalAsMetadata similarly to function-local members of the Value hierarchy in MapValue and MapMetadata. - Don't memoize them. - Return nullptr if they are missing. This also cleans up ConstantAsMetadata to stop listening to the RF_IgnoreMissingLocals flag. llvm-svn: 265631	2016-04-07 01:08:39 +00:00
Quentin Colombet	b073c12912	[AArch64] Teach RegisterBankInfo about the CC register bank. We need to cover each register class with a register bank. llvm-svn: 265629	2016-04-07 00:39:29 +00:00
Duncan P. N. Exon Smith	da68cbc4ad	IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFC Clarify what this RemapFlag actually means. - Change the flag name to match its intended behaviour. - Clearly document that it's not supposed to affect globals. - Add a host of FIXMEs to indicate how to fix the behaviour to match the intent of the flag. RF_IgnoreMissingLocals should only affect the behaviour of RemapInstruction for function-local operands; namely, for operands of type Argument, Instruction, and BasicBlock. Currently, it is only passed into RemapInstruction calls (and the transitive MapValue calls that it makes). When I split Metadata from Value I didn't understand the flag, and I used it in a bunch of places for "global" metadata. This commit doesn't have any functionality change, but prepares to cleanup MapMetadata and MapValue. llvm-svn: 265628	2016-04-07 00:26:43 +00:00
Quentin Colombet	cbc353a422	[AArch64] Teach RegisterBankInfo about the mapping of register classes on register banks. llvm-svn: 265626	2016-04-07 00:14:30 +00:00
Michael Zolotukhin	56ad4048ae	Follow-up for r265605: don't mutate vector we're iterating. llvm-svn: 265625	2016-04-07 00:09:42 +00:00
Quentin Colombet	4359784c1b	[RegisterBankInfo] Implement a target independent version of getInstrMapping. This implementation requires that the target implemented getRegBankFromRegClass. Indeed, the implementation uses the register classes for the encoding constraints for the instructions to deduce the mapping of a value. llvm-svn: 265624	2016-04-07 00:07:50 +00:00
Hans Wennborg	ab16be799c	Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)" Third time's the charm? The previous attempt (r265345) caused ASan test failures on X86, as broken CFI caused stack traces to not work. This version of the patch makes sure not to merge with stack adjustments that have CFI, and to not add merged instructions' offsets to the CFI about to be generated. This is already covered by the lit tests; I just got the expectations wrong previously. llvm-svn: 265623	2016-04-07 00:05:49 +00:00
Quentin Colombet	8c0d66bc54	[RegisterBankInfo] Add a helper function to get the size of a register. The previous method to get the size was too simple and could fail for physical registers. llvm-svn: 265620	2016-04-06 23:59:53 +00:00
Justin Lebar	9e479e4763	Fix a race condition in support library ThreadPool. By running TSAN on the ThreadPool unit tests it was discovered that the threads in the pool can pop tasks off the queue at the same time the "wait" routine is trying to check if the task queue is empty. This patch fixes this problem by checking for active threads in the waiter before checking whether the queue is empty. Patch by Jason Henline. Differential Revision: http://reviews.llvm.org/D18811 Reviewers: joker.eph, jlebar llvm-svn: 265618	2016-04-06 23:46:40 +00:00
Mike Aizatsky	70ea45306a	[sancov] enabling coverage edge pruning by default. Differential Revision: http://reviews.llvm.org/D18844 llvm-svn: 265615	2016-04-06 23:24:37 +00:00
Wei Mi	284fa0bd71	Fix the compare-clang diff error introduced by r265547. Use MapVector instead of DenseMap for MergeableSpillsMap so it will be iterated in determined order. llvm-svn: 265610	2016-04-06 22:31:17 +00:00
Peter Zotov	3e4561cec5	[llvm-c] Add LLVMGetValueKind. Patch by Nicole Mazzuca <npmazzuca@gmail.com>. Differential Revision: http://reviews.llvm.org/D18729 llvm-svn: 265608	2016-04-06 22:21:29 +00:00
Kevin Enderby	3fcdf6ae2a	Thread Expected<...> up from createMachOObjectFile() to allow llvm-objdump to produce a real error message Produce the first specific error message for a malformed Mach-O file describing the problem instead of the generic message for object_error::parse_failed of "Invalid data was encountered while parsing the file". Many more good error messages will follow after this first one. This is built on Lang Hames' great work of adding the 'Error' class for structured error handling and threading Error through MachOObjectFile construction. And making createMachOObjectFile return Expected<...> . So to get the error to the llvm-objdump tool, I changed the stack of these methods to also return Expected<...> : object::ObjectFile::createObjectFile() object::SymbolicFile::createSymbolicFile() object::createBinary() Then finally in ParseInputMachO() in MachODump.cpp the error can be reported and the specific error message can be printed in llvm-objdump and can be seen in the existing test case for the existing malformed binary but with the updated error message. Converting these interfaces to Expected<> from ErrorOr<> does involve touching a number of places. To contain the changes for now, errorToErrorCode() and errorOrToExpected() are used where the callers are yet to be converted. Also there were some bugs in the existing code that did not deal with the old ErrorOr<> return values. So now with Expected<> since they must be checked and the error handled, I added a TODO and a comment: "// TODO: Actually report errors helpfully" and a call something like consumeError(ObjOrErr.takeError()) so the buggy code will not crash since needed to deal with the Error. Note there is one fix also needed to lld/COFF/InputFiles.cpp that goes along with this that I will commit right after this. So expect lld not to build after this commit and before the next one. llvm-svn: 265606	2016-04-06 22:14:09 +00:00
Michael Zolotukhin	97567e141e	[LoopUnroll] Fix the way we update DT after complete unrolling. Updating dominators for exit-blocks of the unrolled loops is not enough, as shown in PR27157. The proper way is to update dominators for all dominance-children of original loop blocks. llvm-svn: 265605	2016-04-06 21:47:12 +00:00
Quentin Colombet	c916204a81	[RegisterBankInfo] Add methods to get the possible mapping of an instruction on a register bank. This will be used by the register bank select pass to assign register banks for generic virtual registers. This was originally committed as r265573 but broke at least one windows bot. The problem with the windows bot was that it was using a copy constructor for the InstructionMappings class and could not synthesize it. Actually, the fact that this class is not copy constructible is expected and the compiler should use the move assignment constructor. Marking the problematic assignment explicitly as using the move constructor has its own problems. Indeed, with recent clang we get a warning that we may prevent the elision of the copy by the compiler. A proper fix for both compilers would be to change the API of getPossibleInstrMapping to take an InstructionMappings as input/output parameter. This does not feel natural and since GISel is not used on windows yet, I chose to work around the problem by not compiling the problematic code on windows. llvm-svn: 265604	2016-04-06 21:37:22 +00:00
JF Bastien	800f87a871	NFC: make AtomicOrdering an enum class Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602	2016-04-06 21:19:33 +00:00
Haicheng Wu	1951cf24a7	[MBP] Remove an unused function parameter NFC. llvm-svn: 265596	2016-04-06 20:38:20 +00:00
Ehsan Amiri	322eca3849	[PPC] Use VSX/FP Facility integer load when an integer load's only users are conversion to FP http://reviews.llvm.org/D18405 When the integer value loaded is never used directly as integer we should use VSX or Floating Point Facility integer loads and avoid extra direct move llvm-svn: 265593	2016-04-06 20:12:29 +00:00
James Y Knight	037b9894bd	Put quotes around #error string. GCC reports "missing terminating ' character", even when it's being skipped by preprocessing. llvm-svn: 265590	2016-04-06 19:52:32 +00:00
Nicolai Haehnle	df3a20cd80	AMDGPU: Add a shader calling convention This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589	2016-04-06 19:40:20 +00:00
Quentin Colombet	fb000583aa	Revert "[RegisterBankInfo] Add methods to get the possible mapping of an instruction on a register bank. This will be used by the register bank select pass to assign register banks for generic virtual registers." and the follow-on commits while I find out a way to fix the win7 bot: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19882 This reverts commit r265578, r265581, r265584, and r265585. llvm-svn: 265587	2016-04-06 19:04:58 +00:00
Davide Italiano	5f1c87bf07	[IRVerifier] Don't crash on invalid DIFile inside DISubprogram. r265515, this time with the correct fix. file inside DISubprogram is not mandatory. llvm-svn: 265586	2016-04-06 18:46:39 +00:00
Evgeniy Stepanov	268826a287	[gold] Save bitcode for module partitions (save-temps + split codegen). llvm-svn: 265583	2016-04-06 18:32:13 +00:00
Quentin Colombet	df4aee09f8	[RegisterBankInfo] Provide a default constructor for InstructionMapping helper class. The default constructor creates invalid (isValid() == false) instances and may be used to communicate that a mapping was not found. llvm-svn: 265581	2016-04-06 18:24:34 +00:00
Davide Italiano	18c968688e	[IRVerifier] Prefer dyn_cast<> over isa<> + cast<>. Thanks to Rafael for the suggestion! llvm-svn: 265579	2016-04-06 18:13:44 +00:00
Quentin Colombet	bb756dbf39	[RegisterBankInfo] Add a helper function to get the size of a register. The previous method to get the size was too simple and could fail for physical registers. llvm-svn: 265578	2016-04-06 18:04:35 +00:00
Duncan P. N. Exon Smith	ef06d445e0	IR: Use DenseSet instead of DenseMap for ConstantUniqueMap; NFC Use a DenseSet instead of a DenseMap for constants in LLVMContextImpl. Last time I looked at this was some time before r223588, when DenseSet<V> had no advantage over DenseMap<V,char>. After r223588, there's a 50% memory savings. This is all mechanical. There were little bits of missing API from DenseSet so I added the trivial implementations: - iterator::operator++(int) - template <class LookupKeyT> insert_as(ValueTy, LookupKeyT) There should be no functionality change, just reduced memory consumption (this wasn't on a profile or anything; just a cleanup I stumbled on). llvm-svn: 265577	2016-04-06 17:56:08 +00:00
Duncan P. N. Exon Smith	f3d08ef59a	IR: Stop explicitly clearing the MDStringCache The MDStringCache doesn't need to be explicitly cleared before destruction. The destructor handles it at least as efficiently. llvm-svn: 265576	2016-04-06 17:56:05 +00:00
Quentin Colombet	9af77135e5	[RegisterBankInfo] Add methods to get the possible mapping of an instruction on a register bank. This will be used by the register bank select pass to assign register banks for generic virtual registers. llvm-svn: 265573	2016-04-06 17:45:40 +00:00
Quentin Colombet	4f03c0b806	[AArch64] Change the CMake to avoid building GlobalISel related APIs when GISel is not built. The positive side effects are: - We do not have to define dummy implementations - We do not have to do weird gymnastics to avoid link issues (like missing constructor or vtable for the base classes) llvm-svn: 265570	2016-04-06 17:38:12 +00:00
Quentin Colombet	c17f744001	[AArch64] Teach the subtarget how to get to the RegisterBankInfo. Rework the access to GlobalISel APIs to contain how much of the APIs we need to access for the final executable to build when GlobalISel is not built. This prevents massive usage of ifdefs in various places. Now, all the GlobalISel ifdefs will be happening only in AArch64TargetMachine.cpp. llvm-svn: 265567	2016-04-06 17:26:03 +00:00
Quentin Colombet	4812c91f56	[RegisterBankInfo] Implement the verify method of the InstructionMapping helper class. This checks that all the register operands get a proper mapping. llvm-svn: 265563	2016-04-06 17:01:43 +00:00
Fiona Glaser	045afc4f66	Loop Unroll: add options and tweak to make Partial unrolling more useful 1. Add FullUnrollMaxCount option that works like MaxCount, but also limits the unroll count for fully unrolled loops. So if a loop has an iteration count over this, it won't fully unroll. 2. Add CLI options for MaxCount and the new option, so they can be tested (plus a test). 3. Make partial unrolling obey MaxCount. An example use-case (the out of tree one this is originally designed for) is a target’s TTI can analyze a loop and decide on a max unroll count separate from the size threshold, e.g. based on register pressure, then constrain LoopUnroll to not exceed that, regardless of the size of the unrolled loop. llvm-svn: 265562	2016-04-06 16:57:25 +00:00
Hans Wennborg	6849f8f15f	Revert r265450 "[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC." It caused ASan 32-bit tests to hang (PR27245). llvm-svn: 265559	2016-04-06 16:44:38 +00:00
Fiona Glaser	16332ba861	LoopUnroll: only allow non-modulo Partial unrolling when Runtime=true Patch by Evgeny Stupachenko <evstupac@gmail.com>. llvm-svn: 265558	2016-04-06 16:43:45 +00:00
Quentin Colombet	3768f7005d	[RegisterBankInfo] Implement the verify method for the ValueMapping helper class. The method checks that the value is fully defined across the different partial mappings and that the partial mappings are compatible between each other. llvm-svn: 265556	2016-04-06 16:40:23 +00:00
Quentin Colombet	2423fc419c	[RegisterBankInfo] Add a verify method for the PartialMapping helper class. This verifies that the PartialMapping can be accommodated into the related register bank. llvm-svn: 265555	2016-04-06 16:33:26 +00:00
Quentin Colombet	89c33caee3	[RegisterBankInfo] Add a couple of helper classes for the future cost model. llvm-svn: 265553	2016-04-06 16:27:01 +00:00
Hans Wennborg	a7e396b5ef	Revert "Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)"" It seems to be causing ASan tests to crash, probably due to miscompiling the run-time somehow. llvm-svn: 265551	2016-04-06 16:10:20 +00:00
Quentin Colombet	d1d324b2ae	[AArch64] Use the default constructor of RegisterBankInfo when GlobalISel is not built. This will avoid a link-time error as the default constructor of RegisterBankInfo is the only one available when GlobalISel is not built. llvm-svn: 265549	2016-04-06 15:53:13 +00:00
Quentin Colombet	911181882e	[RegisterBankInfo] Inline the destructor to avoid link-time error when GlobalISel is not built. llvm-svn: 265548	2016-04-06 15:47:17 +00:00
Wei Mi	18293bef4e	Recommit r265309 after fixing an invalid memory reference bug that happened when DenseMap grew and moved memory. I verified it fixed the bootstrap problem on x86_64-linux-gnu but I cannot verify whether it fixes the bootstrap error on clang-ppc64be-linux. I will watch the build-bot result closely. Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes significant compile time issue when N is large, it is also important for performance since it removes redundant spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerialization and keeps it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265547	2016-04-06 15:41:07 +00:00
Silviu Baranga	a393baf1fd	Revert r265535 until we know how we can fix the bots llvm-svn: 265541	2016-04-06 14:06:32 +00:00
Sam Kolton	ff90c60a78	[AMDGPU] AsmParser: disable DPP for unsupported instructions. New dpp tests. Fix v_nop_dpp. Summary: 1. Disable DPP encoding for instructions that do not support it: - VOP1: - v_readfirstlane_b32 - v_clrexcp - v_movreld_b32 - v_movrels_b32 - v_movrelsd_b32 - VOP2: - v_madmk_f16/32 - v_madak_f16/32 - VOPC, VINTRP, VOP3 2. Fix DPP for v_nop 3. New DPP tests for VOP1 and VOP2 instructions Reviewers: nhaustov, tstellarAMD, vpykhtin Subscribers: tstellarAMD, arsenm Differential Revision: http://reviews.llvm.org/D18552 llvm-svn: 265538	2016-04-06 13:29:59 +00:00
Chad Rosier	074ce836f0	Simplify logic. NFC. llvm-svn: 265537	2016-04-06 13:27:13 +00:00
Silviu Baranga	72b4a4a330	[SCEV] Introduce a guarded backedge taken count and use it in LAA and LV Summary: When the backedge taken condition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use this new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265535	2016-04-06 13:18:26 +00:00
Evgeny Astigeevich	9c24ebfa6d	[AArch64][CodeGen] NFC refactor AArch64InstrInfo::optimizeCompareInstr to prepare it for fixing a bug in it AArch64InstrInfo::optimizeCompareInstr has a bug which causes generation of incorrect code (PR#27158). The patch refactors the function to simplify reviewing the fix of the bug. 1. Function name ‘modifiesConditionCode’ is changed to ‘areCFlagsAccessedBetweenInstrs’ to reflect that the function can check modifying accesses, reading accesses or both. 2. Function ‘AArch64InstrInfo::optimizeCompareInstr’ - Documented the function - Cmp_NZCV is DeadNZCVIdx to reflect that it is an operand index of dead NZCV - The code for the case of substituting CmpInstr is put into separate functions the main of them is ‘substituteCmpInstr’. Differential Revision: http://reviews.llvm.org/D18609 llvm-svn: 265531	2016-04-06 11:39:00 +00:00
Chuang-Yu Cheng	6e1408a891	[ppc64] Temporarily disable sibling call optimization on ppc64 due to breaking test case r265506 breaks print-stack-trace.cc test case of compiler-rt in bootstrap test. http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/1708 llvm-svn: 265528	2016-04-06 10:48:36 +00:00
David Majnemer	12fd50410d	[SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnan To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt." This means that it's unsafe to replace sqrt with llvm.sqrt unless the call is annotated with nnan. Thanks to Hal Finkel for pointing this out! llvm-svn: 265521	2016-04-06 07:04:53 +00:00
Duncan P. N. Exon Smith	3e0430e0a8	IR: Move MDStrings to a BumpPtrAllocator We never delete any MDString until the context is destroyed. Might as well throw them onto a BumpPtrAllocator. llvm-svn: 265520	2016-04-06 06:41:54 +00:00
Duncan P. N. Exon Smith	bdfc984679	IRMover: Steal arguments when moving functions, NFC Instead of copying arguments from the source function to the destination, steal them. This has a few advantages. - The ValueMap doesn't need to be seeded with (or cleared of) Arguments. - Often the destination function won't have created any arguments yet, so this avoids malloc traffic. - Argument names don't need to be copied. Because argument lists are lazy, this required a new Function::stealArgumentListFrom helper. llvm-svn: 265519	2016-04-06 06:38:15 +00:00
Davide Italiano	22680e1c5c	Revert "[IRVerifier] Don't crash on invalid DIFile inside DISubprogram." This reverts commit r265515 as lots of tests need to be fixed before this actually can go in. llvm-svn: 265517	2016-04-06 04:34:38 +00:00
Richard Trieu	f35d4b0928	Add parentheses to silence warning. llvm-svn: 265516	2016-04-06 04:22:00 +00:00
Davide Italiano	2deceb0339	[IRVerifier] Don't crash on invalid DIFile inside DISubprogram. llvm-svn: 265515	2016-04-06 03:57:47 +00:00
Davide Italiano	8dc23a3cb5	[IRVerifier] Avoid crashing on an invalid compile unit. llvm-svn: 265514	2016-04-06 03:07:58 +00:00
Matthias Braun	8e594fdf19	AArch64: Fix compile error Fixed to adapt a use of enterBasicBlock() in my last commit (because I had follow on patches in my repository that change the code). llvm-svn: 265513	2016-04-06 02:59:44 +00:00
Matthias Braun	7dc03f060e	RegisterScavenger: Take a reference as enterBasicBlock() argument. Make it obvious that the argument cannot be nullptr. Remove an unnecessary nullptr check in initRegState. llvm-svn: 265511	2016-04-06 02:47:09 +00:00
Matthias Braun	3bb0fcc118	LivePhysRegs: Remove redundant check llvm-svn: 265509	2016-04-06 02:46:04 +00:00
Duncan P. N. Exon Smith	6f2e37429a	ValueMapper: Fix delayed blockaddress handling after r265273 r265273 added Mapper::mapBlockAddress, which delays mapping a blockaddress value until the function has a body. The condition was backwards, and should be checking Function::empty instead of GlobalValue::isDeclaration. llvm-svn: 265508	2016-04-06 02:25:12 +00:00
Duncan P. N. Exon Smith	29883866a4	AsmParser: Don't crash on unresolved !tbaa Instead of crashing, give a nice error. As a drive-by, fix the location associated with the errors for unresolved metadata (the location was off by one token). llvm-svn: 265507	2016-04-06 02:06:40 +00:00
Chuang-Yu Cheng	2e5973ef74	[ppc64] Enable sibling call optimization on ppc64 ELFv1/ELFv2 abi This patch enables sibling call optimization on ppc64 ELFv1/ELFv2 abi, and adds a couple of test cases. This patch also passed llvm/clang bootstrap test, and spec2006 build/run/result validation. Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617 Great thanks to Tom's (tjablin) help, he contributed a lot to this patch. Thanks Hal and Kit's invaluable opinions! Reviewers: hfinkel kbarton http://reviews.llvm.org/D16315 llvm-svn: 265506	2016-04-06 02:04:38 +00:00
Chuang-Yu Cheng	024a623c55	[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance This patch implements the following instructions: - addpcis subpcis - maddhd maddhdu maddld - modsw moduw modsd modud - darn - extswsli extswsli. - setb - dtstsfi dtstsfiq Total 15 instructions Reviewers: nemanjai hfinkel tjablin amehsan kbarton http://reviews.llvm.org/D17885 llvm-svn: 265505	2016-04-06 01:47:02 +00:00
Chuang-Yu Cheng	eaf4b3d75c	[Power9] Implement copy-paste, msgsync, slb, and stop instructions This patch implements the following Book II and Book III instructions: - copy copy_first cp_abort paste paste. paste_last - msgsync - slbieg slbsync - stop Total 10 instructions Reviewers: nemanjai hfinkel tjablin amehsan kbarton llvm-svn: 265504	2016-04-06 01:46:45 +00:00
Sanjoy Das	99abb2728b	[RS4GC] Add a comment llvm-svn: 265503	2016-04-06 01:33:54 +00:00
Sanjoy Das	65a60670e8	Lower @llvm.experimental.deoptimize as a noreturn call While preserving the return value for @llvm.experimental.deoptimize at the IR level is useful during mid-level optimization, doing so at the machine instruction level requires generating some extra code and a return that is non-ideal. This change has LLVM lower ``` %val = call @llvm.experimental.deoptimize ret %val ``` to effectively ``` call @__llvm_deoptimize() unreachable ``` instead. llvm-svn: 265502	2016-04-06 01:33:49 +00:00
NAKAMURA Takumi	285c8ff753	AArch64CodeGen: Make AArch64RegisterBankInfo.cpp optional along LLVM_BUILD_GLOBAL_ISEL. llvm-svn: 265499	2016-04-06 01:18:08 +00:00
David Majnemer	25d03dbcde	[SLPVectorizer] Vectorize libcalls of sqrt We didn't realize that we could transform the libcall into a vectorized intrinsic. llvm-svn: 265493	2016-04-06 00:14:59 +00:00
Quentin Colombet	5300950f3a	[AArch64] Initial implementation of the targeting of the register bank information. llvm-svn: 265489	2016-04-05 23:34:59 +00:00
Quentin Colombet	06bdd3c914	[RegisterBankInfo] Simplify the API for building a register bank. As part of the TRI argument of addRegBankCoverage we already have access to the TargetRegisterClass through the ID of that register class. Therefore, there is no point in needing a TargetRegisterClass instance, the ID is enough to get to it. llvm-svn: 265487	2016-04-05 23:26:39 +00:00
Sanjoy Das	8d89a2b296	[RS4GC] NFC cleanup of the DeferredReplacement class Instead of constructors use clearly named factory methods. llvm-svn: 265486	2016-04-05 23:18:53 +00:00
Sanjoy Das	49e974b33b	[RS4GC] Better codegen for deoptimize calls Don't emit a gc.result for a statepoint lowered from @llvm.experimental.deoptimize since the call into __llvm_deoptimize is effectively noreturn. Instead follow the corresponding gc.statepoint with an "unreachable". llvm-svn: 265485	2016-04-05 23:18:35 +00:00
Manman Ren	802cd6f9d7	Swift Calling Convention: swiftcc for ARM. Differential Revision: http://reviews.llvm.org/D18769 llvm-svn: 265482	2016-04-05 22:44:44 +00:00
Evgeniy Stepanov	dde29e2799	Faster stack-protector for Android/AArch64. Bionic has a defined thread-local location for the stack protector cookie. Emit a direct load instead of going through __stack_chk_guard. llvm-svn: 265481	2016-04-05 22:41:50 +00:00
Manman Ren	f8bdd88cd9	Swift Calling Convention: add swiftcc. Differential Revision: http://reviews.llvm.org/D17863 llvm-svn: 265480	2016-04-05 22:41:47 +00:00
Quentin Colombet	64bba01a63	[RegisterBank] Implement the verify method to check for the obvious mistakes. llvm-svn: 265479	2016-04-05 22:34:01 +00:00
Quentin Colombet	0195826998	[RegisterBankInfo] Add debug print to check how the initialization is going. llvm-svn: 265475	2016-04-05 21:47:56 +00:00
George Burgess IV	7e5404cc20	[CFLAA] Fix PR27213; incorrect tagging of args/globals Prior to this patch, CFLAA wouldn't tag arguments/globals properly if it didn't find any "interesting" edges on them. This means that, if all you do is store constants to a global or argument, we would never actually treat it as a global/argument. Test case: define void @foo(i32* %A, i32* %B) #0 { entry: store i32 0, i32* %A, align 4 store i32 0, i32* %B, align 4 ret void } CFLAA would say that %A can't alias %B, because neither pointer was used in an interesting way. This patch makes us note whether something is an argument, global, ... regardless of how interesting CFLAA thinks its uses are. (For the record, using a value in an interesting way means loading from it, using it in a GEP, ...) llvm-svn: 265474	2016-04-05 21:40:45 +00:00
Quentin Colombet	c94fbee9f6	[RegisterBank] Add printable capabilities for future debugging. llvm-svn: 265473	2016-04-05 21:40:43 +00:00
Duncan P. N. Exon Smith	818e5f38d2	Try harder to appease MSVC after r265456 r265465 wasn't good enough. I need to spell out all the moves. llvm-svn: 265470	2016-04-05 21:25:33 +00:00
Quentin Colombet	85689d934a	[RegisterBankInfo] Make addRegBankCoverage more capable to ease targeting jobs. Now, addRegBankCoverage also adds the subreg-classes not just the sub-classes of the given register class. llvm-svn: 265469	2016-04-05 21:20:12 +00:00
Junmo Park	53470fc451	Minor code cleanups. NFC. llvm-svn: 265468	2016-04-05 21:14:31 +00:00
Duncan P. N. Exon Smith	1de3c7e790	IR: Introduce ConstantAggregate, NFC Add a common parent class for ConstantArray, ConstantVector, and ConstantStruct called ConstantAggregate. These are the aggregate subclasses of Constant that take operands. This is mainly a cleanup, adding common `isa` target and removing duplicated code. However, it also simplifies caching which constants point transitively at `GlobalValue` (a possible future direction). llvm-svn: 265466	2016-04-05 21:10:45 +00:00
Duncan P. N. Exon Smith	f880d35b80	Try to appease MSVC after r265456 I can't remember if adding `= default` will make MSVC happy, or if I have to spell this out. Let's try the cleaner version first. llvm-svn: 265465	2016-04-05 21:07:01 +00:00
Quentin Colombet	d347d695c2	[RegisterBankInfo] Implement the methods to create register banks. llvm-svn: 265464	2016-04-05 21:06:15 +00:00
Duncan P. N. Exon Smith	db63bda88d	IR: Add missing assertion for ConstantVector::ConstantVector Use the same assertion as ConstantArray. Vectors should have the right number of elements. llvm-svn: 265463	2016-04-05 20:53:47 +00:00
Quentin Colombet	c4db2ad5b8	[RegisterBank] Provide a way to check if a register bank is valid. Change the default constructor to create invalid object. The target will have to properly initialize the register banks before using them. llvm-svn: 265460	2016-04-05 20:48:32 +00:00
Duncan P. N. Exon Smith	91d3cfed78	Revert "Fix Clang-tidy modernize-deprecated-headers warnings in remaining files; other minor fixes." This reverts commit r265454 since it broke the build. E.g.: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_build/22413/ llvm-svn: 265459	2016-04-05 20:45:04 +00:00
Duncan P. N. Exon Smith	ea7df770ae	ValueMapper: Rewrite Mapper::mapMetadata without recursion This commit completely rewrites Mapper::mapMetadata (the implementation of llvm::MapMetadata) using an iterative algorithm. The guts of the new algorithm are in MDNodeMapper::map, the entry function in a new class. Previously, Mapper::mapMetadata performed a recursive exploration of the graph with eager "just in case there's a reason" malloc traffic. The new algorithm has these benefits: - New nodes and temporaries are not created eagerly. - Uniquing cycles are not duplicated (see new unit test). - No recursion. Given a node to map, it does this: 1. Use a worklist to perform a post-order traversal of the transitively referenced unmapped nodes. 2. Track which nodes will change operands, and which will have new addresses in the mapped scheme. Propagate the changes through the POT until fixed point, to pick up uniquing cycles that need to change. 3. Map all the distinct nodes without touching their operands. If RF_MoveDistinctMetadata, they get mapped to themselves; otherwise, they get mapped to clones. 4. Map the uniqued nodes (bottom-up), lazily creating temporaries for forward references as needed. 5. Remap the operands of the distinct nodes. Mehdi helped me out by profiling this with -flto=thin. On his workload (importing/etc. for opt.cpp), MapMetadata sped up by 15%, contributed about 50% less to persistent memory, and made about 100x fewer calls to malloc. The speedup is less than I'd hoped. The profile mainly blames DenseMap lookups; perhaps there's a way to reduce them (e.g., by disallowing remapping of MDString). It would be nice to break the strange remaining recursion on the Value side: MapValue => materializeInitFor => RemapInstruction => MapValue. I think we could do this by having materializeInitFor return a worklist of things to be remapped. llvm-svn: 265456	2016-04-05 20:23:21 +00:00
Eugene Zelenko	1760dc2a23	Fix Clang-tidy modernize-deprecated-headers warnings in remaining files; other minor fixes. Some Include What You Use suggestions were used too. Use anonymous namespaces in source files. Differential revision: http://reviews.llvm.org/D18778 llvm-svn: 265454	2016-04-05 20:19:49 +00:00
Ahmed Bougacha	50e6cd4a3a	[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC. We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fallback to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265450	2016-04-05 20:02:57 +00:00
Quentin Colombet	b235d32e74	[GlobalISel] Add the RegisterBankInfo class for the handling of register banks. llvm-svn: 265449	2016-04-05 20:02:47 +00:00
Ahmed Bougacha	629446ba03	[X86] Simplify early-exit check. NFC. llvm-svn: 265447	2016-04-05 20:02:22 +00:00
Quentin Colombet	bdc3b4d523	[GlobalISel] Add a class, RegisterBank, to represent register banks. llvm-svn: 265445	2016-04-05 19:54:44 +00:00
Sanjay Patel	4c7d094451	fix typo; NFC llvm-svn: 265442	2016-04-05 19:27:39 +00:00
Quentin Colombet	8e8e85c19f	[GlobalISel] Add the skeleton of the RegBankSelect pass. This pass is responsible for assigning the generic virtual registers to register banks. llvm-svn: 265440	2016-04-05 19:06:01 +00:00
Sanjay Patel	f3bb6c51bc	fix documentation comments; NFC llvm-svn: 265434	2016-04-05 18:23:30 +00:00
Manman Ren	e221a870d3	Swift Calling Convention: swifterror target-independent change. At IR level, the swifterror argument is an input argument with type ErrorObject*. For targets that support swifterror, we want to optimize it to behave as an inout value with type ErrorObject; it will be passed in a fixed physical register. The main idea is to track the virtual registers for each swifterror value. We define swifterror values as AllocaInsts with swifterror attribute or a function argument with swifterror attribute. In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals) before handling the basic blocks. When iterating over all basic blocks in RPO, before actually visiting the basic block, we call mergeIncomingSwiftErrors to merge incoming swifterror values when there are multiple predecessors or to simply propagate them. There, we create a virtual register for each swifterror value in the entry block. For predecessors that are not yet visited, we create virtual registers to hold the swifterror values at the end of the predecessor. The assignments are saved in SwiftErrorWorklist and will be materialized at the end of visiting the basic block. When visiting a load from a swifterror value, we copy from the current virtual register assignment. When visiting a store to a swifterror value, we create a virtual register to hold the swifterror value and update SwiftErrorMap to track the current virtual register assignment. Differential Revision: http://reviews.llvm.org/D18108 llvm-svn: 265433	2016-04-05 18:13:16 +00:00
Jacques Pienaar	42991b3e5a	[lanai] LanaiSetflagAluCombiner more conservative Summary: LanaiSetflagAluCombiner could previously combine instructions across basic building blocks even when not legal. Make the LanaiSetflagAluCombiner more conservative to avoid this. Reviewers: eliben Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18746 llvm-svn: 265411	2016-04-05 16:18:13 +00:00
Sam Parker	0d3a3a537c	[ARM] Cleanup of smul and smla instruction descriptions Removed the SDNode argument passed to the AI_smul and AI_smla multiclass definitions as they are always mul. Differential Revision: http://reviews.llvm.org/D18791 llvm-svn: 265409	2016-04-05 16:01:25 +00:00
Konstantin Zhuravlyov	e63e02cb0c	[AMDGPU] Emit linkonce and linkonce_odr symbols Differential Revision: http://reviews.llvm.org/D18726 llvm-svn: 265408	2016-04-05 16:00:58 +00:00
Haicheng Wu	3618fa786f	[BlockPlacement] Remove an unnecessary continue NFC. llvm-svn: 265407	2016-04-05 15:37:08 +00:00
Chuang-Yu Cheng	d3fb38cae5	Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge Presently, CodeGenPrepare deletes all nearly empty (only phi and branch) basic blocks. This pass can delete loop preheaders which frequently creates critical edges. A preheader can be a convenient place to spill registers to the stack. If the entrance to a loop body is a critical edge, then spills may occur in the loop body rather than immediately before it. This patch protects loop preheaders from deletion in CodeGenPrepare even if they are nearly empty. Since the patch alters the CFG, it affects a large number of test cases. In most cases, the changes are merely cosmetic (basic blocks have different names or instruction orders change slightly). I am somewhat concerned about the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not deleted, then the MIPS backend does not take advantage of a branch delay slot. Consequently, I would like some close review by a MIPS expert. The patch also partially subsumes D16893 from George Burgess IV. George correctly notes that CodeGenPrepare does not actually preserve the dominator tree. I think the dominator tree was usually not valid when CodeGenPrepare ran, but I am using LoopInfo to mark preheaders, so the dominator tree is now always valid before CodeGenPrepare. Author: Tom Jablin (tjablin) Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng http://reviews.llvm.org/D16984 llvm-svn: 265397	2016-04-05 14:06:20 +00:00
Peter Zotov	0a2fa0a13b	[llvm-c] Expose LLVM{Get,Set}ModuleIdentifier Patch by Nicole Mazzuca <npmazzuca@gmail.com>. Differential Revision: http://reviews.llvm.org/D18736 llvm-svn: 265394	2016-04-05 13:56:59 +00:00
Simon Dardis	d9d41f531e	[mips] MIPSR6 Compact jump support This patch adds support for compact jumps similar to the previous compact branch support for MIPSR6. Unlike compact branches, compact jumps do not have a forbidden slot. As MipsInstrInfo::getEquivalentCompactForm can determine the correct expansion for jumps and branches for both microMIPS and MIPSR6, remove the unnecessary distinction in the delay slot filler. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders llvm-svn: 265390	2016-04-05 12:50:29 +00:00
Justin Holewinski	c79979299a	[NVPTX] Handle ldg created from sign-/zero-extended load Reviewers: jingyue Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D18053 llvm-svn: 265389	2016-04-05 12:38:01 +00:00
David L Kreitzer	188de5ae69	Adds the ability to use an epilog remainder loop during loop unrolling and makes this the default behavior. Patch by Evgeny Stupachenko (evstupac@gmail.com). Differential Revision: http://reviews.llvm.org/D18158 llvm-svn: 265388	2016-04-05 12:19:35 +00:00
Haojian Wu	591ae46820	Add parentheses around `&&` within `||` to avoid compiler warning message. Summary: The assert code is introduced by r265370. Reviewers: bkramer Subscribers: tejohnson Differential Revision: http://reviews.llvm.org/D18786 llvm-svn: 265383	2016-04-05 09:07:47 +00:00
Dmitry Polukhin	a3d5b0b218	[IFUNC] Use GlobalIndirectSymbol when aliases and ifuncs have something similar Second part extracted from http://reviews.llvm.org/D15525 Use GlobalIndirectSymbol in all cases when aliases and ifuncs have something in common. Differential Revision: http://reviews.llvm.org/D18754 llvm-svn: 265382	2016-04-05 08:47:51 +00:00
Teresa Johnson	fb7c764496	[ThinLTO] Refactor some common code into getGlobalValueInfo method (NFC) Refactor common code that queries the ModuleSummaryIndex for a value's GlobalValueInfo struct into getGlobalValueInfo helper methods, which will also be used by D18763. llvm-svn: 265370	2016-04-05 00:40:16 +00:00
JF Bastien	1c3c223b65	Lanai: fix -Wsign-compare warning llvm-svn: 265368	2016-04-05 00:20:27 +00:00
Teresa Johnson	f4cf1c3eb4	Don't fold double constant to an integer if dest type not integral Summary: I encountered this issue when constant folding during inlining tried to fold away a bitcast of a double to an x86_mmx, which is not an integral type. The test case exposes the same issue with a smaller code snippet during early CSE. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18528 llvm-svn: 265367	2016-04-04 23:50:46 +00:00
JF Bastien	393b79ee00	Lanai: fix -Wpedantic warnings Extra semicolon. llvm-svn: 265365	2016-04-04 23:47:30 +00:00
Reid Kleckner	7de6761561	Fix non-determinism in order of LLVM attributes We were using array_pod_sort on an array of type 'Attribute', which wraps a pointer to AttributeImpl. For the most part this didn't matter because the printing code prints enum attributes in a defined order, but integer attributes such as 'align' and 'dereferenceable' were not ordered. Furthermore, AttributeImpl::operator< was broken for integer attributes. An integer attribute is a kind and an integer value, and both pieces need to be compared. By fixing the comparison operator, we can go back to std::sort, and things look good now. This should fix clang arm-swiftcall.c test failures on Windows. llvm-svn: 265361	2016-04-04 23:06:05 +00:00
Sanjay Patel	e77c7de459	use range loop; NFCI llvm-svn: 265360	2016-04-04 23:05:06 +00:00
Sanjay Patel	769b5fd546	fix typos; NFC llvm-svn: 265356	2016-04-04 22:45:56 +00:00
Amaury Sechet	56f056c01f	Style update in Core.h/Core.cpp . NFC llvm-svn: 265353	2016-04-04 22:00:25 +00:00
Justin Bogner	35c6903f22	Revert "CodeGen: Remove dead code in TailDuplicate" It seems this is reachable after all. It hit on 7zip-benchmark in lnt on ppc64: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/2317 This reverts r265347. llvm-svn: 265352	2016-04-04 21:41:54 +00:00
Matthias Braun	7511abd5c1	MachineScheduler: Ignore COPYs with undef/dead op in CopyConstrain mutation. There is no problem with the code today, but the fix will avoid a crash in test/CodeGen/AMDGPU/subreg-coalescer-undef-use.ll once the DetectDeadLanes pass is added. llvm-svn: 265351	2016-04-04 21:23:46 +00:00
Teresa Johnson	3c35e0999b	Clean up calls to WriteBitcodeToFile (NFC) Remove a default parameter value being passed unnecessarily, which also reduces the changes required when this parameter is changed in D18763. Document the remaining non-default bool value passed for another parameter. llvm-svn: 265348	2016-04-04 21:19:31 +00:00
Justin Bogner	9ab8131a57	CodeGen: Remove dead code in TailDuplicate I noticed that this isn't covered by our existing tests and spent some time trying to come up with an example it actually hits. I tried hand rolling something based on the explanation in the comment, but couldn't get anything that didn't abort tail duplication earlier for one reason or another. Then, I tried cranking tail-dup-size up so this would fire more and ran a bootstrap of clang and the nightly test suite - those don't hit this either. This reverts r132816 and replaces it with an assert. llvm-svn: 265347	2016-04-04 21:11:40 +00:00
Hans Wennborg	a47a692341	Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)" The original commit miscompiled things on 32-bit Windows, e.g. a Clang bootstrap. It turns out that mergeSPUpdates() was a bit too generous in what it interpreted as a stack adjustment, causing the following code: addl $12, %esp leal -4(%ebp), %esp To be "optimized" into simply: addl $8, %esp This commit tightens up mergeSPUpdates() and includes a new test (test14 in movtopush.ll) for this situation. llvm-svn: 265345	2016-04-04 21:02:46 +00:00
Zia Ansari	a82a58a4e5	Enable unroll for constant bound loops when TripCount is not modulo of unroll factor, reducing it to maximum power-of-2 that satisfies threshold limit. Commit for Evgeny Stupachenko (evstupac@gmail.com) Differential Revision: http://reviews.llvm.org/D18290 llvm-svn: 265337	2016-04-04 19:24:46 +00:00
Chandler Carruth	613eec8210	Revert r263460: [SpillPlacement] Fix a quadratic behavior in spill placement. That commit looks wonderful and awesome. Sadly, it greatly exacerbates PR17409 and effectively regresses build time for a lot of (very large) code when compiled with ASan or MSan. We thought this could be fixed forward by landing D15302 which at last fixes that PR, but some issues were discovered and it looks like that got reverted, so reverting this as well temporarily. As soon as the fix for PR17409 lands and sticks, we should re-land this patch as it won't trigger more significant test cases hitting that bug. Many thanks to Quentin and Wei here as they're doing all the awesome hard work!!! llvm-svn: 265331	2016-04-04 18:57:50 +00:00
Betul Buyukkurt	18131c4216	[PGO] Avoid instrumenting direct callees at value sites. Direct callees that are cast to other function prototypes show up in the Call/Invoke instructions as ConstantExprs. Currently llvm::CallSite's getCalledFunction() fails to return the callees in such expressions as direct calls. Value profiling should avoid instrumenting such cases. Mostly NFC. llvm-svn: 265330	2016-04-04 18:56:36 +00:00
Matthias Braun	870c34f0cf	ARM, AArch64, X86: Check preserved registers for tail calls. We can only perform a tail call to a callee that preserves all the registers that the caller needs to preserve. This situation happens with calling conventions like preserve_mostcc or cxx_fast_tls. It was explicitly handled for fast_tls and failing for preserve_most. This patch generalizes the check to any calling convention. Related to rdar://24207743 Differential Revision: http://reviews.llvm.org/D18680 llvm-svn: 265329	2016-04-04 18:56:13 +00:00
Teresa Johnson	916495d894	[ThinLTO] Add option to dump value name to GUID mapping Summary: Useful for debugging since we lose this correlation after the permodule summary/VST is read and until we later materialize source modules in the function importer. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18555 llvm-svn: 265327	2016-04-04 18:52:58 +00:00
Teresa Johnson	0beb858e97	[ThinLTO] Augment FunctionImport dump with value name to GUID map Summary: To aid in debugging, dump out the correlation between value names and GUID for each source module when it is materialized. This will make it easier to comprehend the earlier summary-based function importing debug trace which only has access to and prints the GUIDs. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18556 llvm-svn: 265326	2016-04-04 18:52:23 +00:00
Brendon Cahoon	86f783e315	[DependenceAnalysis] Check if result of getConstantPart is null A seg-fault occurs due to a reference of a null pointer, which is the value returned by getConstantPart. This function returns null if the constant part is not found. The code that calls this function needs to check for the null return value. Differential Revision: http://reviews.llvm.org/D18718 llvm-svn: 265319	2016-04-04 18:13:18 +00:00
Derek Schuff	73900c6876	Replace MachineRegisterInfo::isSSA() with a MachineFunctionProperty Use the MachineFunctionProperty mechanism to indicate whether a MachineFunction is in SSA form instead of a custom method on MachineRegisterInfo. NFC Differential Revision: http://reviews.llvm.org/D18574 llvm-svn: 265318	2016-04-04 18:03:29 +00:00
Wei Mi	fb5252cac1	Revert r265309 and r265312 because they caused some errors I need to investigate. llvm-svn: 265317	2016-04-04 17:45:03 +00:00
Derek Schuff	1dbf7a571f	Add MachineFunctionProperty checks for AllVRegsAllocated for target passes Summary: This adds the same checks that were added in r264593 to all target-specific passes that run after register allocation. Reviewers: qcolombet Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18525 llvm-svn: 265313	2016-04-04 17:09:25 +00:00
Wei Mi	cdaf1df657	Fix unused var warning caused by r265309. llvm-svn: 265312	2016-04-04 17:03:58 +00:00
Wei Mi	ffbc9c7f3b	Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is an N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes a significant compile time issue when N is large, it is also important for performance since it removes redundant spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerialization and keeps it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265309	2016-04-04 16:42:40 +00:00
Daniel Sanders	b3c2764f89	[mips] Range check simm32 and fold MIPS16's imm32 into simm32. Summary: At this point we should be able to enable IAS by default for O32 without breaking check-all, or recursion. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18439 llvm-svn: 265302	2016-04-04 15:32:49 +00:00
Ulrich Weigand	99ac5045ab	[SystemZ] Add compare-and-branch instructions to MC This adds MC support for fused compare + indirect branch instructions, ie. CRB, CGRB, CLRB, CLGRB, CIB, CGIB, CLIB, CLGIB. They aren't actually generated yet -- this is preparation for their use for conditional returns in the next iteration of D17339. Author: koriakin Differential Revision: http://reviews.llvm.org/D18742 llvm-svn: 265296	2016-04-04 14:26:43 +00:00
Ulrich Weigand	a9ac6d6cc2	[SystemZ] Support ATOMIC_FENCE A cross-thread sequentially consistent fence should be lowered into z/Architecture's BCR serialization instruction, instead of causing a fatal error in the back-end. Author: bryanpkc Differential Revision: http://reviews.llvm.org/D18644 llvm-svn: 265292	2016-04-04 12:45:44 +00:00
Ulrich Weigand	f557d08325	[SystemZ] Support llvm.frameaddress/llvm.returnaddress intrinsics Enable the SystemZ back-end to lower FRAMEADDR and RETURNADDR, which previously would cause the back-end to crash. Currently, only a frame count of zero is supported. Author: bryanpkc Differential Revision: http://reviews.llvm.org/D18514 llvm-svn: 265291	2016-04-04 12:44:55 +00:00
Elena Demikhovsky	e99c561391	AVX-512: Truncating store for i1 vectors Implemented truncstore for KNL and skylake-avx512. Covered vectors from v2i1 to v64i1. We save the value in bits (not in bytes) - v32i1 is saved in 4 bytes. Differential Revision: http://reviews.llvm.org/D18740 llvm-svn: 265283	2016-04-04 07:17:47 +00:00
Duncan P. N. Exon Smith	8e65f8ddfd	ValueMapper: Remove old FIXMEs; almost NFC Remove a few old FIXMEs from the original commit of the Metadata/Value split in r223802. These are commented out assertions to the effect that calls between mapValue and mapMetadata never return nullptr. (The only behaviour change is that Mapper::mapSimpleMetadata memoizes the nullptr return.) When I originally rewrote the mapping code, I thought we could be stricter in the new metadata hierarchy and never return nullptr when RF_NullMapMissingGlobalValues was off. It's still not entirely clear to me why these assertions failed (a few months ago, I had a theory that I forgot to write down, but that's helping no one). Understood or not, I no longer see how these commented-out assertions would be useful. I'm relegating them to the annals of source control before making significant changes to ValueMapper.cpp. llvm-svn: 265282	2016-04-04 04:59:56 +00:00
Duncan P. N. Exon Smith	fef609f15e	IR: Lazily create ReplaceableMetadataImpl on MDNode RAUW support on MDNode usually requires an extra allocation for ReplaceableMetadataImpl. This is only strictly necessary if there are tracking references to the MDNode. Make the construction of ReplaceableMetadataImpl lazy, so that we don't get allocations if we don't need them. Since MDNode::isResolved now checks MDNode::isTemporary and MDNode::NumUnresolved instead of whether a ReplaceableMetadataImpl is allocated, the internal changes are intrusive (at various internal checkpoints, isResolved now has a different answer). However, there should be no real functionality change here; just slightly lazier allocation behaviour. The external semantics should be identical. llvm-svn: 265279	2016-04-03 21:23:52 +00:00
Amaury Sechet	7c2883cf85	Various style fixes in Core.h/Core.cpp. NFC llvm-svn: 265277	2016-04-03 21:06:04 +00:00
Duncan P. N. Exon Smith	756e1c3db4	ValueMapper: Disallow metadata mapping recursion through mapValue This adds an assertion to maintain the property from r265273. When Mapper::mapSimpleMetadata calls Mapper::mapValue, it should not find its way back to mapMetadataImpl. This guarantees that mapSimpleMetadata is not involved in any recursion. Since Mapper::mapValue calls out to arbitrary materializers, we need to save a bit on the ValueMap to make this assertion effective. There should be no functionality change here. This co-recursion should already have been impossible. llvm-svn: 265276	2016-04-03 20:54:51 +00:00
Duncan P. N. Exon Smith	a997856b3d	Work around MSVC failure from r265273 http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19726 llvm-svn: 265275	2016-04-03 20:42:21 +00:00
Simon Pilgrim	0edd3d771a	[X86] Removed duplicate code. llvm-svn: 265274	2016-04-03 20:40:35 +00:00
Duncan P. N. Exon Smith	c6065e3a25	ValueMapper: Avoid recursion in mapSimplifiedMetadata, NFC The main change is to delay materializing GlobalValue initializers from Mapper::mapValue until Mapper::~Mapper. This effectively removes all recursion from mapSimplifiedMetadata, as promised in r265270. mapSimplifiedMetadata calls mapValue for ConstantAsMetadata nodes to find the mapped constant, and now it shouldn't be possible for mapValue to indirectly re-invoke mapMetadata. I'll add an assertion to that effect in a follow-up (separated so that the assertion can easily be reverted independently, if it comes to that). This is a step toward a broader goal: converting Mapper::mapMetadataImpl from a recursive to an iterative algorithm. When a BlockAddress points at a BasicBlock inside an unmaterialized function body, we need to delay it until the function body is materialized in Mapper::~Mapper. This commit creates a temporary BasicBlock and returns a new BlockAddress, then RAUWs the BasicBlock once it is known. This situation should be extremely rare since a BlockAddress is usually used from within the function it's referencing (and BlockAddress itself is rare). There should be no observable functionality change. llvm-svn: 265273	2016-04-03 20:17:45 +00:00
Peter Zotov	8efe38a1e2	[CodeGenPrepare] Fix r265264 (again). Don't require TLI for SinkCmpExpression, like it wasn't before r265264. llvm-svn: 265271	2016-04-03 19:32:13 +00:00
Duncan P. N. Exon Smith	ae8bd4bd11	ValueMapper: Split out mapSimpleMetadata, NFC Split out a helper for mapping metadata without operands. This is any metadata that is not an MDNode, and any MDNode where the answer is known without looking at operands. Through some weird twists, this function is co-recursive: mapSimpleMetadata => MapValue => materializeInitFor => linkFunctionBody => RemapInstructions => MapMetadata => mapSimpleMetadata I plan to break the recursion in a follow-up. llvm-svn: 265270	2016-04-03 19:31:01 +00:00
Duncan P. N. Exon Smith	829dc87a68	ValueMapper: Introduce Mapper helper class, NFC Remove a bunch of boilerplate from ValueMapper.cpp by using a new file-local class called Mapper. llvm-svn: 265268	2016-04-03 19:06:24 +00:00
Simon Pilgrim	cd0dfc93eb	[X86][SSE] Support for MOVMSK signbit extraction instructions Add support for lowering with the MOVMSK instruction to extract vector element signbits to a GPR. This is an early step towards more optimal handling of vector comparison results. Differential Revision: http://reviews.llvm.org/D18741 llvm-svn: 265266	2016-04-03 18:22:03 +00:00
Peter Zotov	f87e550e89	[CodeGenPrepare] Fix r265264. The case where there was no TargetLowering was not handled, leading to null pointer dereferences. llvm-svn: 265265	2016-04-03 17:11:53 +00:00
Peter Zotov	0b6d7bc682	[CodeGenPrepare] Avoid sinking soft-FP comparisons Sinking comparisons in CGP can undo the job of hoisting them done earlier by LICM, and soft-FP makes this an expensive mistake. A common pattern that produces floating point comparisons uniform over a loop is an explicit check for division by zero. If the divisor is hoisted out of the loop, the comparison can also be, but hoisting the function that unwinds is never legal, since it may cause side effects in the loop body prior to the unwinding to not be executed. Differential Revision: http://reviews.llvm.org/D18744 llvm-svn: 265264	2016-04-03 16:36:17 +00:00
Simon Pilgrim	20d1d4f045	[X86] Tidied up X86ISD instruction nodes. NFCI. Tidied up comments, stripped trailing whitespace, split apart nodes that aren't related. No change in ordering although there is definitely some scope for it. llvm-svn: 265263	2016-04-03 14:14:32 +00:00
Peter Zotov	0218d0f383	Mark some FP intrinsics as safe to speculatively execute Floating point intrinsics in LLVM are generally not speculatively executed, since most of them are defined to behave the same as libm functions, which set errno. However, the only error that can happen when executing ceil, floor, nearbyint, rint and round libm functions per POSIX.1-2001 is -ERANGE, and that requires the maximum value of the exponent to be smaller than the number of mantissa bits, which is not the case with any of the floating point types supported by LLVM. The trunc and copysign functions never set errno per POSIX.1-2001. Differential Revision: http://reviews.llvm.org/D18643 llvm-svn: 265262	2016-04-03 12:30:46 +00:00
Elena Demikhovsky	5e426f7356	AVX-512: Load and Extended Load for i1 vectors Implemented load+{sign|zero}_extend for i1 vectors Fixed failures in i1 vector load. Covered loading of v2i1, v4i1, v8i1, v16i1, v32i1, v64i1 vectors for KNL and SKX. Differential Revision: http://reviews.llvm.org/D18737 llvm-svn: 265259	2016-04-03 08:41:12 +00:00
Davide Italiano	d4f5a059e0	[SimplifyLibCalls] Garbage collect dead code. We already skip optimizations if the return value of printf() is used, so CI->use_empty() is always true. Differential Revision: http://reviews.llvm.org/D18656 llvm-svn: 265253	2016-04-03 01:46:52 +00:00
Jacques Pienaar	796975d311	[lanai] Fix for LanaiDelaySlotFiller and LanaiMCInstLower.cpp Summary: * Fix to stop delay slot filler from inserting SP modifying instructions in the newly expanded call/return instructions. * In LowerSymbol the outermost type was not LanaiMCExpr if there was a binary expression * Remove printExpr in LanaiInstPrinter Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18734 llvm-svn: 265251	2016-04-03 00:49:27 +00:00
Zoran Jovanovic	2b7cc5a4ae	[mips][microMIPS] Revert commits r264245 and r264248. Commit r264245 was the reason for failing tests in LLVM test suite. Commit r264248 depends on the first one. llvm-svn: 265249	2016-04-02 23:06:13 +00:00
Saleem Abdulrasool	85b43639b1	AArch64: support .cpu directive Add support for the AArch64 .cpu directive. This is a slightly involved directive since the parameter is actually a variable encoded string. The general structure is: <cpu>[[+-]<feature>]* We now map some of the supported string names for features for internal representation of feature flags. If we encounter one which we do not support, bail out as we cannot validate the assembly any longer. Resolves PR27010. llvm-svn: 265240	2016-04-02 19:29:52 +00:00
Duncan P. N. Exon Smith	6d72d166dc	Linker: Split mapUnneededSubprograms into two; almost NFC Split the loop through compile units in mapUnneededSubprograms in two. First, visit imported entities to ensure that we've visited all needed subprograms. Second, visit subprograms, and drop the ones we don't need. Hypothetically this protects against a subprogram from one compile unit being referenced from an imported entity in a different compile unit. I don't think that's valid IR (a debug info expert could confirm), but I think the refactor makes the code more clear. llvm-svn: 265233	2016-04-02 17:54:01 +00:00
Duncan P. N. Exon Smith	751114b39d	Remove redundant assertion after cast, NFC llvm-svn: 265232	2016-04-02 17:41:52 +00:00
Duncan P. N. Exon Smith	0d60a9887f	Linker: Avoid unnecessary work when moving named metadata IRLinker::mapUnneededSubprograms has to be sure that any "needed" subprograms get linked in. Rather than traversing through imported entities using llvm::getSubprogram, call MapMetadata. The latter memoizes the result in the ValueMap (sharing work with IRLinker::linkNamedMDNodes proper), and makes the local SmallPtrSet redundant. llvm-svn: 265231	2016-04-02 17:39:31 +00:00
Mehdi Amini	8958c40430	Rename FunctionIndex into GlobalValueIndex to reflect the recent changes (NFC) The index used to contain only Function, but now contains GlobalValue in general. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265230	2016-04-02 17:29:47 +00:00
Duncan P. N. Exon Smith	4b520e5ef6	Linker: Remove IRMover::isMetadataUnneeded indirection; almost NFC Instead of checking live during MapMetadata whether a subprogram is needed, seed the ValueMap with `nullptr` up-front. There is a small hypothetical functionality change. Previously, calling MapMetadataOp on a node whose "scope:" chain led to an unneeded subprogram would return nullptr. However, if that were ever called, then the subprogram would be needed; a situation that the IRMover is supposed to avoid a priori! Besides cleaning up the code a little, this restores a nice property: MapMetadataOp returns the same as MapMetadata. llvm-svn: 265229	2016-04-02 17:12:00 +00:00
Duncan P. N. Exon Smith	da4a56d1ab	ValueMapper: Add support for seeding metadata with nullptr Support seeding a ValueMap with nullptr for Metadata entries, a situation I didn't consider in the Metadata/Value split. I added a ValueMapper::getMappedMD accessor that returns an Optional<Metadata*> with the mapped (possibly null) metadata. IRMover needs to use this to avoid modifying the map when it's checking for unneeded subprograms. I updated a call from bugpoint since I find the new code clearer. llvm-svn: 265228	2016-04-02 17:04:38 +00:00
Duncan P. N. Exon Smith	520f8542ff	Bitcode: Try to emit metadata in function blocks Whenever metadata is only referenced by a single function, emit the metadata just in that function block. This should improve lazy-loading by reducing the amount of metadata in the global block. For now, this should catch all DILocations, and anything else that happens to be referenced only by a single function. It's also a first step toward a couple of possible future directions (which this commit does not implement): 1. Some debug info metadata is only referenced from compile units and individual functions. If we can drop the link from the compile unit, this optimization will get more powerful. 2. Any uniqued metadata that isn't referenced globally can in theory be emitted in every function block that references it (trading off bitcode size and full-parse time vs. lazy-load time). Note: this assumes the new BitcodeReader error checking from r265223. The metadata stored in function blocks gets purged after parsing each function, which means unresolved forward references will get lost. Since all the global metadata should have already been resolved by the time we get to the function metadata blocks we just need to check for that case. (If for some reason we need to handle bitcode that fails the checks in r265223, the fix is to store about-to-be-dropped unresolved nodes in MetadataList::shrinkTo until they can be handled successfully by a future call to MetadataList::tryToResolveCycles.) llvm-svn: 265226	2016-04-02 15:22:57 +00:00
Duncan P. N. Exon Smith	0b76b723f4	Fix doxygen comments from r265224, NFC llvm-svn: 265225	2016-04-02 15:16:56 +00:00
Duncan P. N. Exon Smith	9342911f31	BitcodeWriter: Further unify function metadata, NFC Further unify the handling of function-local metadata with global metadata, by exposing the same interface in ValueEnumerator. Both contexts use the same accessors: - getMDStrings(): get the strings for this block. - getNonMDStrings(): get the non-strings for this block. A future commit will start adding strings to the function-block. llvm-svn: 265224	2016-04-02 15:09:42 +00:00
Duncan P. N. Exon Smith	8742de9b20	BitcodeReader: Check for unresolved function metadata A follow-up commit will start using function metadata blocks more heavily. This commit adds some error checking to confirm that metadata is fully resolved before (and after) materializing each function. This is valid even when reading very old bitcode from before the metadata/value split. The global metadata block always came before the function blocks. However, in case somehow this causes a regression (i.e., an old LLVM did produce such bitcode after all) I'm committing separately. llvm-svn: 265223	2016-04-02 14:55:01 +00:00
Mehdi Amini	1e5fddda3d	Reverts r265219. Unintentionally committed... time to call the day off! From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265221	2016-04-02 05:35:03 +00:00
Mehdi Amini	89038a1071	Fix "warning: variable 'XX' set but not used" in release build (variable used in assertion, NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265220	2016-04-02 05:34:19 +00:00
Mehdi Amini	5921a3ae66	wip From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265219	2016-04-02 05:34:14 +00:00
Mehdi Amini	b049431bec	constify GlobalValue::getGUID() and GlobalValue::getGlobalIdentifier() (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265217	2016-04-02 05:25:27 +00:00
Mehdi Amini	024a79f780	Revert "ThinLTO: add module caching handling." This reverts commit r265214, unintentionally committed. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265216	2016-04-02 05:08:18 +00:00
Mehdi Amini	ad5741b075	Create a typedef GlobalValue::GUID for uint64_t and RAUW (NFC) Summary: This should make the code more readable, especially all the map declarations. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18721 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265215	2016-04-02 05:07:53 +00:00
Mehdi Amini	2cd609482d	ThinLTO: add module caching handling. Reviewers: tejohnson Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18494 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265214	2016-04-02 05:07:08 +00:00
Mehdi Amini	e70901552c	80 lines column after renaming "shouldDiscardValueNames" (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265212	2016-04-02 03:59:58 +00:00
Mehdi Amini	50af49fcdc	Rename Context::discardValueNames() to shouldDiscardValueNames() (NFC) Suggested by Sean Silva. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265211	2016-04-02 03:46:17 +00:00
Mehdi Amini	27814980a3	Add Cache Pruning support Incremental LTO will use a cache to store object files. This patch handles the pruning part of the cache, exposing a few knobs: - Pruning interval: the implementation keeps a "timestamp" file in the directory and will scan it only after a given interval since the last modification of the timestamp file. This is for performance purposes; we don't want to scan the folder continuously. - Entry expiration: this is the time after which a file that hasn't been used is removed from the cache. - Maximum size: expressed as a percentage of the available disk space, it keeps the cache from exhausting the disk. http://reviews.llvm.org/D18422 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265209	2016-04-02 03:28:26 +00:00
Hans Wennborg	fa6e414eef	Fix -Wpedantic warning about extra semi-colon llvm-svn: 265204	2016-04-02 01:03:41 +00:00
Rong Xu	0eb3603626	[PGO] Use a helper function to find all indirect call-sites Use a helper function to find all the indirect call-sites in a function. Also split the code into a separate file as this will be used by the indirect-call-promotion transformation. Differential Revision: http://reviews.llvm.org/D18704 llvm-svn: 265199	2016-04-01 23:16:44 +00:00
Tim Northover	5dad9df9f7	AArch64: avoid clobbering SP for dead MOVimm pseudos. We were producing ORR, which actually defines a GPR32sp rather than a GPR32. Should fix PR23209. llvm-svn: 265198	2016-04-01 23:14:52 +00:00
Nico Weber	73853ab4f8	Make DIASession work if msdia*.dll isn't registered. This fixes various symbolization test failures for me when I build with a hermetic VS2015 without having run the 2015 installer. http://reviews.llvm.org/D18707 llvm-svn: 265193	2016-04-01 22:21:51 +00:00
Mehdi Amini	5a2e5d324e	ThinLTO: special handling for LinkOnce functions These functions can be dropped by the compiler if they are no longer referenced in the current module. However, there is a chance that another module is still referencing them because of the import. Multiple solutions can be used: - Always import LinkOnce when a caller is imported. This ensures that every module with a call to a LinkOnce has the definition and will be able to emit it if it emits the call. - Turn the LinkOnce into Weak, so that it is always emitted. - Turn all LinkOnce into available_externally and come back after all modules are codegen'ed to emit only one copy of the linkonce, when there is still a reference to it. This patch implements the second option, with an optimization that only one module will turn the LinkOnce into Weak, while the others will turn it into available_externally, so that there is exactly one copy emitted for the whole compilation. http://reviews.llvm.org/D18346 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265190	2016-04-01 21:53:50 +00:00
Manman Ren	9bfd0d03e9	Swift Calling Convention: add swifterror attribute. A ``swifterror`` attribute can be applied to a function parameter or an AllocaInst. This commit does not include any target-specific change. The target-specific optimization will come as a follow-up patch. Differential Revision: http://reviews.llvm.org/D18092 llvm-svn: 265189	2016-04-01 21:41:15 +00:00
Rong Xu	92c2eae4e1	Fix buildbot lldb-amd64-ninja-netbsd7 failure llvm-svn: 265180	2016-04-01 20:15:04 +00:00
James Y Knight	e6a4646372	Remove useless check for ThreadModel==Single in ARMISelLowering. NFC. ThreadModel::Single is already handled by ARMPassConfig adding LowerAtomicPass to the pass list, which lowers all atomics to non-atomic ops and deletes fences. So by the time we get to ISel, there are no atomic fences left, so they don't need special handling. llvm-svn: 265178	2016-04-01 19:33:19 +00:00
Peter Collingbourne	dd711b93e0	LowerBitSets: Move declarations to separate namespace. Should fix modules build. llvm-svn: 265176	2016-04-01 18:46:50 +00:00
Mike Aizatsky	f13cbee12e	[libfuzzer] adding license headers to cpp files Differential Revision: http://reviews.llvm.org/D18705 llvm-svn: 265174	2016-04-01 18:38:58 +00:00
Tom Stellard	354a43c7bc	AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} Summary: Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý. Patch by: Vedran Miletić Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: jvesely, scchan, kanarayan, arsenm Differential Revision: http://reviews.llvm.org/D17280 llvm-svn: 265170	2016-04-01 18:27:37 +00:00
Mike Aizatsky	01c0f8d8a3	[sancov] save entry block from pruning (it is always full dominator) llvm-svn: 265168	2016-04-01 18:13:19 +00:00
Sanjay Patel	9f413364d5	[x86] avoid intermediate splat for non-zero memsets (PR27100) Follow-up to http://reviews.llvm.org/D18566 and http://reviews.llvm.org/D18676 - where we noticed that an intermediate splat was being generated for memsets of non-zero chars. That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply. The 16-byte test that was added in D18566 is now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. Note that the SSE1 path is not changed in this patch. That can be a follow-up. This patch should resolve PR27100. llvm-svn: 265161	2016-04-01 17:36:45 +00:00
Chad Rosier	8787a81023	[AArch64] Fix a typo. NFC. llvm-svn: 265160	2016-04-01 17:34:38 +00:00
David Majnemer	fe3f9d1721	[InstCombine] Don't sink an instr after a catchswitch A catchswitch is a terminator, instructions cannot be inserted after it. llvm-svn: 265158	2016-04-01 17:28:17 +00:00
David Majnemer	6f1f85f0e1	[SLPVectorizer] Don't insert an extractelement before a catchswitch A catchswitch cannot be preceded by another instruction in the same basic block (other than a PHI node). Instead, insert the extract element right after the materialization of the vectorized value. This isn't optimal but is a reasonable compromise given the constraints of WinEH. This fixes PR27163. llvm-svn: 265157	2016-04-01 17:28:15 +00:00
Rong Xu	8e8fe859e0	[PGO] Refactor PGOFuncName meta data code to be used in clang Refactor the code that gets and creates PGOFuncName meta data so that it can be used in clang's value profile annotation. Differential Revision: http://reviews.llvm.org/D18623 llvm-svn: 265149	2016-04-01 16:43:30 +00:00
Sanjay Patel	a05e0ff223	[x86] avoid intermediate splat for non-zero memsets (PR27100) Follow-up to D18566 - where we noticed that an intermediate splat was being generated for memsets of non-zero chars. That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply. The tests that were added in the last patch are now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. In the new tests, the splat via shuffling looks ok to me, but there might be some room for improvement depending on uarch there. Note that the SSE1/2 paths are not changed in this patch. That can be a follow-up. This patch should resolve PR27100. Differential Revision: http://reviews.llvm.org/D18676 llvm-svn: 265148	2016-04-01 16:27:14 +00:00
Valery Pykhtin	5b3559c1ec	[AMDGPU] fix MADAK/MADMK instructions operand namings to match encoding fields. $vsrc1 -> $src1, $k -> $imm Differential Revision: http://reviews.llvm.org/D18659 llvm-svn: 265141	2016-04-01 13:13:12 +00:00
Andrea Di Biagio	8c48841907	[x86] Remove redundant call to setTargetDAGCombine for BUILD_VECTOR node type. Since revision 235394, we no longer perform target specific combines on build_vector nodes. No functional change intended. llvm-svn: 265138	2016-04-01 12:25:44 +00:00
Sagar Thakur	48973d21e1	[MIPS][LLVM-MC] Fix JR encoding for MIPSR6 ISA Summary: The assembler was picking the wrong JR variant because the pre-R6 one was still enabled at R6. Author: nitesh.jain Reviewers: vkalintiris, dsanders Subscribers: dsanders, llvm-commits, mohit.bhakkad, sagar, bhushan, jaydeep Differential: D18387 llvm-svn: 265134	2016-04-01 11:55:33 +00:00
Andrey Turetskiy	958eb46443	[X86] Introduce Lakemont CPU. Add a new Intel MCU CPU Lakemont, which doesn't support X87. Differential Revision: http://reviews.llvm.org/D18650 llvm-svn: 265128	2016-04-01 10:16:15 +00:00
James Molloy	b876c72bcc	Fix for pr24346: arm asm label calculation error in sub Some ARM instructions encode 32-bit immediates as an 8-bit integer (0-255) and a 4-bit rotation (0-30, even) in their least significant 12 bits. The original fixup, FK_Data_4, patches the instruction by the value bit-to-bit, regardless of the encoding. For example, assuming the labels L1 and L2 are 0x0 and 0x104 respectively, the following instruction: add r0, r0, #(L2 - L1) ; expects 0x104, i.e., 260 would be assembled to the following, which adds 1 to r0, instead of 260: e2800104 add r0, r0, #4, 2 ; equivalently 1 The new fixup kind fixup_arm_mod_imm takes care of the encoding: e2800f41 add r0, r0, #260 Patch by Ting-Yuan Huang! llvm-svn: 265122	2016-04-01 09:40:47 +00:00
Oliver Stannard	a5520b02a5	[AArch64] Better errors for out-of-range fixups When a fixup that can be resolved by the assembler is out of range, we should report an error in the source, rather than crashing. Differential Revision: http://reviews.llvm.org/D18402 llvm-svn: 265120	2016-04-01 09:14:50 +00:00
Mehdi Amini	215d59e7b0	ThinLTO: move ObjCARCContractPass in the CodeGen pipeline This is to be coherent with Full LTO. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265118	2016-04-01 08:22:59 +00:00
Mehdi Amini	43b657b5c7	Add a libLTO API to stop/restart ThinLTO between optimizations and CodeGen This allows the linker to instruct ThinLTO to perform only the optimization part or only the codegen part of the process. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265113	2016-04-01 06:47:02 +00:00
Chuang-Yu Cheng	f8b592f213	[PPC64] Bug fix: when enabling sibling-call-opt and shrink-wrapping, the tail call branch instruction might disappear Bug Pattern: # BB#0: # %entry cmpldi 3, 0 beq- 0, .LBB0_2 # BB#1: # %exit lwz 4, 0(3) #TC_RETURNd8 LVComputationKind 0 .LBB0_2: # %cond.false mflr 0 std 0, 16(1) stdu 1, -96(1) .Ltmp0: .cfi_def_cfa_offset 96 .Ltmp1: .cfi_offset lr, 16 bl __assert_fail nop The branch instruction for the tail call return is not generated because the shrink-wrapping pass chooses a new Restore Point, %cond.false, so the %exit block is not sent to emitEpilogue. Thanks for Kit's opinions! Reviewers: nemanjai hfinkel tjablin kbarton http://reviews.llvm.org/D17606 llvm-svn: 265112	2016-04-01 06:44:32 +00:00
Mehdi Amini	d7ad221c16	Add a module Hash in the bitcode and the combined index, implementing a kind of "build-id" This is intended to be used for ThinLTO incremental build. Differential Revision: http://reviews.llvm.org/D18213 This is a recommit of r265095 after fixing the Windows issues. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265111	2016-04-01 05:33:11 +00:00
Mehdi Amini	180441f09a	Fix S390 big endian detection From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265109	2016-04-01 05:12:24 +00:00
Mehdi Amini	4cd5702578	Add support for computing SHA1 in LLVM Provide a class to generate a SHA1 from a sequence of bytes, and a convenience raw_ostream adaptor. This will be used to provide a "build-id" by hashing the Module block when writing bitcode. ThinLTO will use this information for incremental build. Reapply r265094 which was reverted in r265102 because it broke MSVC bots (constexpr is not supported). http://reviews.llvm.org/D16325 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265107	2016-04-01 04:30:16 +00:00
Michael Kuperstein	7bab713188	Use range-based for loops. NFC. llvm-svn: 265105	2016-04-01 03:45:08 +00:00
Mehdi Amini	85fb9e058e	Revert "Add support for computing SHA1 in LLVM" This reverts commit r265096, r265095, and r265094. Windows build is broken, and the validation does not pass. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265102	2016-04-01 03:03:21 +00:00
Sanjoy Das	f83ab6de56	Don't insert stackrestore on deoptimizing returns They're not necessary (since the stack pointer is trivially restored on return), and the way LLVM inserts the stackrestore calls breaks the IR (we get a stackrestore between the deoptimize call and the return). llvm-svn: 265101	2016-04-01 02:51:30 +00:00
Sanjoy Das	18b92968ea	Don't insert lifetime end markers on deoptimizing returns They're not necessary (since the lifetime of the alloca is trivially over due to the return), and the way LLVM inserts the lifetime.end markers breaks the IR (we get a lifetime end marker between the deoptimize call and the return). llvm-svn: 265100	2016-04-01 02:51:26 +00:00
Sanjoy Das	9d41a8f269	Don't use an i64 return type with webkit_jscc Re-enable an assertion enabled by Justin Lebar in rL265092. rL265092 was breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc does not like non-i64 return types. Change the test case to not do that. llvm-svn: 265099	2016-04-01 02:51:21 +00:00
Matthias Braun	cc7fba40fe	AArch64ISelLowering: Remove unused variables/arguments; NFC llvm-svn: 265098	2016-04-01 02:49:17 +00:00
Chuang-Yu Cheng	35c6181982	Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker Previously, HandleLastUse would delete RegRef information for sub-registers if they were dead even if their corresponding super-register were still live. If the super-register were later renamed, then the definitions of the sub-register would not be updated appropriately. This patch alters the behavior so that RegInfo information for sub-registers is only deleted when the sub-register and super-register are both dead. This resolves PR26775. This is the mirror image of Hal's r227311 commit. Author: Tom Jablin (tjablin) Reviewers: kbarton uweigand nemanjai hfinkel http://reviews.llvm.org/D18448 llvm-svn: 265097	2016-04-01 02:05:29 +00:00
Mehdi Amini	4c2ed3337d	Add a module Hash in the bitcode and the combined index, implementing a kind of "build-id" This is intended to be used for ThinLTO incremental build. Differential Revision: http://reviews.llvm.org/D18213 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265095	2016-04-01 01:30:06 +00:00
Mehdi Amini	3689ae14eb	Add support for computing SHA1 in LLVM Provide a class to generate a SHA1 from a sequence of bytes, and a convenience raw_ostream adaptor. This will be used to provide a "build-id" by hashing the Module block when writing bitcode. ThinLTO will use this information for incremental build. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265094	2016-04-01 01:29:54 +00:00
Justin Lebar	98981e5573	Revert "Protect some assertions with NDEBUG rather than DEBUG()." This reverts r265092, because it breaks CodeGen/X86/deopt-intrinsic.ll. llvm-svn: 265093	2016-04-01 01:23:23 +00:00
Justin Lebar	c814e8e4ab	Protect some assertions with NDEBUG rather than DEBUG(). DEBUG() only runs if you pass -debug, but these assertions are generally useful. llvm-svn: 265092	2016-04-01 01:09:12 +00:00
Justin Lebar	96418481bc	[NVPTX] Add a truncate DAG node to some calls. Summary: Previously, we were running afoul of the assertion EVT(CLI.Ins[i].VT) == InVals[i].getValueType() && "LowerCall emitted a value with the wrong type!" in SelectionDAGBuilder.cpp when running the NVPTX/i8-param.ll test. This is because our backend (for some reason) treats small return values as i32, but it wasn't ever truncating the i32 back down to the expected width in the DAG. Unclear to me whether this fixes any actual bugs -- in this test, at least, the generated code is unchanged. Reviewers: jingyue Subscribers: llvm-commits, tra, jholewinski Differential Revision: http://reviews.llvm.org/D17872 llvm-svn: 265091	2016-04-01 01:09:10 +00:00
Justin Lebar	efcc81cbb4	[NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect. Summary: Previously the NVVMReflect pass would read its configuration from command-line flags or a static configuration given to the pass at instantiation time. This doesn't quite work for clang's use-case. It needs to pass a value for __CUDA_FTZ down on a per-module basis. We use a module flag for this, so the NVVMReflect pass needs to be updated to read said flag. Reviewers: tra, rnk Subscribers: cfe-commits, jholewinski Differential Revision: http://reviews.llvm.org/D18672 llvm-svn: 265090	2016-04-01 01:09:07 +00:00
Justin Lebar	645c3014a1	[NVPTX] Annotate some instructions as hasSideEffects = 0. Summary: Tablegen tries to infer this from the selection DAG patterns defined for the instructions, but it can't always. An instructive example is CLZr64. CLZr32 is correctly inferred to have no side-effects, but the selection DAG pattern for CLZr64 is slightly more complicated, and in particular the ctlz DAG node is not at the root of the pattern. Thus tablegen can't infer that CLZr64 has no side-effects. Reviewers: jholewinski Subscribers: jholewinski, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17472 llvm-svn: 265089	2016-04-01 01:09:05 +00:00
Justin Lebar	acc47105f8	[ifcnv] Add brief comment explaining what ifcnv is. llvm-svn: 265088	2016-04-01 01:09:03 +00:00
Mehdi Amini	64719159d0	Fix Windows build (typo in disk_space() implementation) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265087	2016-04-01 00:52:05 +00:00
Akira Hatanaka	e9148dd62f	[LoopVectorize] Don't unconditionally print vectorization diagnostics when compiling with LTO. r244523 added a new class DiagnosticInfoOptimizationRemarkAnalysisAliasing for optimization analysis remarks related to pointer aliasing without guarding it in isDiagnosticEnabled in LLVMContext.cpp. This caused the diagnostic message to be printed unconditionally when compiling with LTO. This commit cleans up isDiagnosticEnabled and makes sure all the vectorization optimization remarks are guarded. rdar://problem/25382153 llvm-svn: 265084	2016-04-01 00:34:39 +00:00
Mehdi Amini	e2d8f1b8fc	Add disk_space() to llvm::fs Summary: Adapted from Boost::filesystem. (This is a reapply by reverting commit r265080 and fixing the WinAPI part) Differential Revision: http://reviews.llvm.org/D18467 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265082	2016-04-01 00:18:08 +00:00
Mehdi Amini	640de72a1e	Revert "Add disk_space() to llvm::fs" This reverts commit r265074 and r265068. Breaks windows build From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265080	2016-04-01 00:13:31 +00:00
Adrian Prantl	b939a25707	Move the DebugEmissionKind enum from DIBuilder into DICompileUnit. This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder into DICompileUnit. DIBuilder is not the right place for this enum to live in — a metadata consumer should not have to include DIBuilder.h. I also added a Verifier check that checks that the emission kind of a DICompileUnit is actually legal. http://reviews.llvm.org/D18612 <rdar://problem/25427165> llvm-svn: 265077	2016-03-31 23:56:58 +00:00
Hans Wennborg	649159df3c	Follow-up to r265036: I got these iterators mixed up llvm-svn: 265076	2016-03-31 23:55:16 +00:00
Mehdi Amini	e503a71df1	Use const ref instead of value for Twine in the disk_space() API Thanks Rui for noticing! From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265074	2016-03-31 23:14:45 +00:00
Mehdi Amini	4c82356ad3	Add disk_space() to llvm::fs Summary: Adapted from Boost::filesystem. (This is a reapply by reverting commit r265062 and fixing the WinAPI part) Differential Revision: http://reviews.llvm.org/D18467 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265068	2016-03-31 23:05:26 +00:00
Peter Collingbourne	f84646cd95	Object: Correctly read thin archives containing absolute paths. Differential Revision: http://reviews.llvm.org/D18666 llvm-svn: 265065	2016-03-31 22:08:31 +00:00
Tim Shen	800ed436e5	[AsmPrinter] Print aliases in topological order Print aliases in topological order, that is, for any alias a = b, b must be printed before a. This is because on some targets (e.g. PowerPC) linker expects aliases in such an order to generate correct TOC information. GCC also prints aliases in topological order. llvm-svn: 265064	2016-03-31 22:08:19 +00:00
Chandler Carruth	b472856a73	Fix PR26940 where compile times regressed massively. Patch by Jonas Paulsson. Original description: Bugfix in buildSchedGraph() to make -dag-maps-huge-region work properly I found that the reduction of the maps did in fact never happen in this test case. This was because all the stores / loads were made with addresses from arguments and they thus became "unknown" stores / loads. Fixed by removing continue statements and making sure that the test for reduction always takes place. Differential Revision: http://reviews.llvm.org/D18673 llvm-svn: 265063	2016-03-31 21:55:58 +00:00
Mehdi Amini	b880144703	Revert "Add disk_space() to llvm::fs" Breaks windows bot. This reverts commit r265050. This reverts commit r265055. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265062	2016-03-31 21:55:35 +00:00
Evgeniy Stepanov	f74f091ea6	Preserve blockaddress use edges in the module splitter. "blockaddress" can not apply to an external function. All blockaddress constant uses must belong to the same module as the definition of the target function. llvm-svn: 265061	2016-03-31 21:55:11 +00:00
David Majnemer	ae272d718e	[NVPTX] Infer __nvvm_reflect as nounwind, readnone This patch simply mirrors the attributes we give to @llvm.nvvm.reflect to the __nvvm_reflect libdevice call. This shaves about 30% of the code in libdevice away because of CSE opportunities. It also helps us figure out that libdevice implementations of transcendental functions don't have side-effects. llvm-svn: 265060	2016-03-31 21:29:57 +00:00
Sanjay Patel	4d71160d5d	fix typo; NFC llvm-svn: 265054	2016-03-31 21:00:48 +00:00
Jun Bum Lim	760afcb338	[AArch64] Allow loads with imp-def to be handled in getMemOpBaseRegImmOfsWidth() Summary: This change will allow loads with imp-def to be clustered in machine-scheduler pass. areMemAccessesTriviallyDisjoint() can also handle loads with imp-def. Reviewers: mcrosier, jmolloy, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18665 llvm-svn: 265051	2016-03-31 20:53:47 +00:00
Mehdi Amini	9defda528e	Add disk_space() to llvm::fs Summary: Adapted from Boost::filesystem. Reviewers: bruno, silvas Subscribers: tberghammer, danalbert, llvm-commits, srhines Differential Revision: http://reviews.llvm.org/D18467 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265050	2016-03-31 20:48:27 +00:00
Hal Finkel	fc35391f2b	[PowerPC] Add a late MI-level pass for QPX load/splat simplification Chapter 3 of the QPX manual states that, "Scalar floating-point load instructions, defined in the Power ISA, cause a replication of the source data across all elements of the target register." Thus, if we have a load followed by a QPX splat (from the first lane), the splat is redundant. This adds a late MI-level pass to remove the redundant splats in some of these cases (specifically when both occur in the same basic block). This optimization is scheduled just prior to post-RA scheduling. It can't happen before anything that might replace the load with some already-computed quantity (i.e. store-to-load forwarding). llvm-svn: 265047	2016-03-31 20:39:41 +00:00
Hans Wennborg	132cd62121	Revert r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)" I think it might have caused these build breakages: http://lab.llvm.org:8011/builders/clang-x86-win2008-selfhost/builds/7234/steps/build%20stage%202/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19566/steps/run%20tests/logs/stdio llvm-svn: 265046	2016-03-31 20:27:30 +00:00
Evgeniy Stepanov	a614ab7b71	Preserve extern_weak linkage in CloneModule. Only force "extern" linkage if the function used to be a definition in the source module. Declarations keep their original linkage. llvm-svn: 265043	2016-03-31 20:21:31 +00:00
Benjamin Kramer	569efd2cfd	[ARM] Expand v1i64 and v2i64 ctpop. The default is legal, which results in 'Cannot select' errors. This is triggered during selfhost due to a recent cost model change. llvm-svn: 265040	2016-03-31 19:42:04 +00:00
Hans Wennborg	e97fb414e8	[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140) For code such as: void f(int, int); void g() { f(1, 2); } compiled for 32-bit X86 Linux, Clang would previously generate: subl $12, %esp subl $8, %esp pushl $2 pushl $1 calll f addl $16, %esp addl $12, %esp retl This patch fixes that by merging adjacent stack adjustments in eliminateCallFramePseudoInstr(). Differential Revision: http://reviews.llvm.org/D18627 llvm-svn: 265039	2016-03-31 19:26:24 +00:00
Hans Wennborg	e1a2e90ffa	Change eliminateCallFramePseudoInstr() to return an iterator This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous and/or next instruction. It also greatly simplifies the calls to this function from PrologEpilogInserter. Previously, that had a bunch of logic to resume iteration after the call; now it just continues with the returned iterator. Note that this changes the behaviour of PEI a little. Previously, it attempted to re-visit the new instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment. Differential Revision: http://reviews.llvm.org/D18627 llvm-svn: 265036	2016-03-31 18:33:38 +00:00
Jacques Pienaar	4badd6aaf3	[lanai] isBrImm should accept any non-constant immediate. isBrImm should accept any non-constant immediate. Previously it was only accepting LanaiMCExpr ones which was wrong. Differential Revision: http://reviews.llvm.org/D18571 llvm-svn: 265032	2016-03-31 17:58:55 +00:00
Ehsan Amiri	99b017ae35	[PPC] basic support for Power 9 direct move instructions http://reviews.llvm.org/D18097 Initial support does not include any patterns to generate these instructions llvm-svn: 265031	2016-03-31 17:47:17 +00:00
Rong Xu	d5a57b5947	[PGO] use emplace_back. NFC. Use emplace_back instead of push_back for simplicity. llvm-svn: 265030	2016-03-31 17:39:33 +00:00
Sanjay Patel	92d5ea5e07	[x86] use SSE/AVX ops for non-zero memsets (PR27100) Move the memset check down to the CPU-with-slow-SSE-unaligned-memops case: this allows fast targets to take advantage of SSE/AVX instructions and prevents slow targets from stepping into a codegen sinkhole while trying to splat a byte into an XMM reg. Follow-on bugs exposed by the current codegen are: https://llvm.org/bugs/show_bug.cgi?id=27141 https://llvm.org/bugs/show_bug.cgi?id=27143 Differential Revision: http://reviews.llvm.org/D18566 llvm-svn: 265029	2016-03-31 17:30:06 +00:00
Xinliang David Li	d0b4cbb9dd	Minor code cleanup /NFC llvm-svn: 265025	2016-03-31 16:22:17 +00:00
Stephan Bergmann	480de227f6	Don't use potentially invalidated iterator If the lhs is evaluated before the rhs, FuncletI's operator-> can trigger the assert(isHandleInSync() && "invalid iterator access!"); at include/llvm/ADT/DenseMap.h:1061. (Happens e.g. when compiled with GCC 6.) Differential Revision: http://reviews.llvm.org/D18440 llvm-svn: 265024	2016-03-31 15:42:01 +00:00
Ulrich Weigand	3707ba8030	[PowerPC] Correctly compute 64-bit offsets in fast isel PPCSimplifyAddress contains this code: IntegerType *OffsetTy = ((VT == MVT::i32) ? Type::getInt32Ty(*Context) : Type::getInt64Ty(*Context)); to determine the type to be used for an index register, if one needs to be created. However, the "VT" here is the type of the data being loaded or stored, *not* the type of an address. This means that if a data element of type i32 is accessed using an index that does not fit into 32 bits, a wrong address is computed here. Note that PPCFastISel is only ever used on 64-bit currently, so the type of an address is actually always MVT::i64. Other parts of the code, even in this same PPCSimplifyAddress routine, already rely on that fact. Thus, this patch changes the code to simply unconditionally use Type::getInt64Ty(*Context) as OffsetTy. llvm-svn: 265023	2016-03-31 15:37:06 +00:00
Nemanja Ivanovic	a621a7f9c3	[PowerPC] Basic support for P9 atomic loads and stores This patch corresponds to review: http://reviews.llvm.org/D18032 This patch provides asm implementation for the following instructions: lwat, ldat, stwat, stdat, ldmx, mcrxrx llvm-svn: 265022	2016-03-31 15:26:37 +00:00
Jun Bum Lim	cf9744367b	[AArch64] Handle missing store pair opportunity Summary: This change will handle missing store pair opportunity where the first store instruction stores zero followed by the non-zero store. For example, this change will convert : str wzr, [x8] str w1, [x8, #4] into: stp wzr, w1, [x8] Reviewers: jmolloy, t.p.northover, mcrosier Subscribers: flyingforyou, aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18570 llvm-svn: 265021	2016-03-31 14:47:24 +00:00
Ulrich Weigand	1931b01a64	[PowerPC] Remove incorrect use of COPY_TO_REGCLASS in fast isel The fast isel pass currently emits a COPY_TO_REGCLASS node to convert from a F4RC to a F8RC register class during conversion of a floating-point number to integer. There is actually no support in the common code instruction printers to emit COPY_TO_REGCLASS nodes, so the PowerPC back-end has special code there to simply ignore COPY_TO_REGCLASS. This is correct if and only if the source and destination registers of COPY_TO_REGCLASS are the same (except for the different register class). But nothing guarantees this to be the case, and if the register allocator does end up allocating source and destination to different registers after all, the back-end simply generates incorrect code. I've included a test case that shows such incorrect code generation. However, it seems that COPY_TO_REGCLASS is actually not intended to be used at the MI layer at all. It is used during SelectionDAG, but always lowered to a plain COPY before emitting MI. Other back-ends' fast isel passes never emit COPY_TO_REGCLASS at all. I suspect it is simply wrong for the PowerPC back-end to emit it here. This patch changes the PowerPC back-end to directly emit COPY instead of COPY_TO_REGCLASS and removes the special handling in the instruction printers. Differential Revision: http://reviews.llvm.org/D18605 llvm-svn: 265020	2016-03-31 14:44:50 +00:00
Daniel Sanders	85fd10bd93	[mips] Range check simm16 Summary: There are too many instructions to exhaustively test so addiu and lwc2 are used as representative examples. It should be noted that many memory instructions that should have simm16 range checking do not because it is also necessary to support the macro of the same name which accepts simm32. The range checks for these occur in the macro expansion. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18437 llvm-svn: 265019	2016-03-31 14:34:00 +00:00
Daniel Sanders	eab3146156	[mips] Range check simm11 and mem_simm11. Summary: ldc2/sdc2 now emit slightly worse diagnostics for MIPS-I. The problem is that they don't trigger the custom parser because all the candidates are disabled by feature bits. On all other subtargets, the diagnostics are accurate but are subject to the usual issues of needing to report multiple ways to correct the code (e.g. smaller offset, enable a CPU feature) but only being able to report one error. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18436 llvm-svn: 265018	2016-03-31 14:23:20 +00:00
Dmitry Polukhin	cd835ad876	[IFUNC] Introduce GlobalIndirectSymbol as a base class for alias and ifunc This patch is a part of http://reviews.llvm.org/D15525 GlobalIndirectSymbol class contains common implementation for both aliases and ifuncs. This patch should be an NFC change that just prepares common code for ifunc support. Differential Revision: http://reviews.llvm.org/D18433 llvm-svn: 265016	2016-03-31 14:16:21 +00:00
Sam Kolton	1048fb1818	[AMDGPU] Disassembler: support for DPP Review: http://reviews.llvm.org/D18642 llvm-svn: 265015	2016-03-31 14:15:04 +00:00
Daniel Sanders	dc0602a2c2	[mips] Split mem_msa into range checked mem_simm10 and mem_simm10_lsl[123] Summary: Also, made test_mi10.s formatting consistent with the majority of the MC tests. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18435 llvm-svn: 265014	2016-03-31 14:12:01 +00:00
Nirav Dave	83ce54aac2	Prevent X86ISelLowering from merging volatile loads Change isConsecutiveLoads to check that loads are non-volatile as this is a requirement for any load merges. Propagate change to two callers. Reviewers: RKSimon Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18546 llvm-svn: 265013	2016-03-31 13:40:55 +00:00
Daniel Sanders	2e9f69d933	[mips] Range check simm9 and fix a bug this revealed. Summary: The bug was that microMIPS's [ls]w[lr]e instructions claimed to support a 12-bit offset when it is only 9-bit. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18434 llvm-svn: 265010	2016-03-31 13:15:23 +00:00
Zlatko Buljan	6221be8e46	[mips][microMIPS] Implement MFC, MFHC and DMFC* instructions Differential Revision: http://reviews.llvm.org/D17334 llvm-svn: 265002	2016-03-31 08:51:24 +00:00
Jonas Paulsson	2ba315218b	Indentation fix in SystemZInstrInfo.cpp llvm-svn: 265000	2016-03-31 08:00:14 +00:00
Sanjoy Das	56df0ec610	[InstCombine] Fix incorrect rule from rL236202 The rule for SMIN introduced in rL236202 doesn't work as advertised: the check for Pred == ICmpInst::ICMP_SGT was missing. llvm-svn: 264996	2016-03-31 05:14:34 +00:00
Sanjoy Das	c9d6d8b106	Delete trailing whitespace llvm-svn: 264995	2016-03-31 05:14:29 +00:00
Sanjoy Das	e12c0e5159	[SCEV] Track NoWrap properties using MatchBinaryOp, NFC This way once we teach MatchBinaryOp to map more things into arithmetic, the non-wrapping add recurrence construction would understand it too. Right now MatchBinaryOp still only understands arithmetic, so this is solely a code-reorganization change. llvm-svn: 264994	2016-03-31 05:14:26 +00:00
Sanjoy Das	118d919a6a	[SCEV] NFC code motion to simplify later change llvm-svn: 264993	2016-03-31 05:14:22 +00:00
Craig Topper	d2aa03a60a	[X86] Use MVT instead of EVT in code called after legalization. llvm-svn: 264992	2016-03-31 04:37:41 +00:00
Hal Finkel	851b33a0b1	[PowerPC] Load two floats directly instead of using one 64-bit integer load When dealing with complex<float>, and similar structures with two single-precision floating-point numbers, especially when such things are being passed around by value, we'll sometimes end up loading both float values by extracting them from one 64-bit integer load. It looks like this: t13: i64,ch = load<LD8[%ref.tmp]> t0, t6, undef:i64 t16: i64 = srl t13, Constant:i32<32> t17: i32 = truncate t16 t18: f32 = bitcast t17 t19: i32 = truncate t13 t20: f32 = bitcast t19 The problem, especially before the P8 where those bitcasts aren't legal (and get expanded via the stack), is that it would have been better to use two floating-point loads directly. Here we add a target-specific DAGCombine to do just that. In short, we turn: ld 3, 0(5) stw 3, -8(1) rldicl 3, 3, 32, 32 stw 3, -4(1) lfs 3, -4(1) lfs 0, -8(1) into: lfs 3, 4(5) lfs 0, 0(5) llvm-svn: 264988	2016-03-31 02:56:05 +00:00
Sanjoy Das	021de058df	Introduce a @llvm.experimental.guard intrinsic Summary: As discussed on llvm-dev[1]. This change adds the basic boilerplate code around having this intrinsic in LLVM: - Changes in Intrinsics.td, and the IR Verifier - A lowering pass to lower @llvm.experimental.guard to normal control flow - Inliner support [1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18527 llvm-svn: 264976	2016-03-31 00:18:46 +00:00
Hans Wennborg	6596977130	[X86] Enable call frame optimization ("mov to push") not only for optsize (PR26325) The size savings are significant, and from what I can tell, both ICC and GCC do this. Differential Revision: http://reviews.llvm.org/D18573 llvm-svn: 264966	2016-03-30 23:38:01 +00:00
Matthias Braun	8d41436004	CodeGen: Factor out code for tail call result compatibility check; NFC llvm-svn: 264959	2016-03-30 22:46:04 +00:00
Matt Arsenault	2fe4fbc184	AMDGPU: Add frexp_exp intrinsic llvm-svn: 264944	2016-03-30 22:28:52 +00:00
Matt Arsenault	5cd4f8f89f	AMDGPU: Constant folding for frexp_mant llvm-svn: 264943	2016-03-30 22:28:26 +00:00
Teresa Johnson	d8d94652b2	Use existing PrintEscapedString in AssemblyWriter r264884 introduced a helper to escape the backslashes in the source file path, but I since discovered an existing mechanism to escape strings. llvm-svn: 264936	2016-03-30 22:17:28 +00:00
Peter Collingbourne	2bc252acd5	Cloning: Reduce complexity of debug info cloning and fix correctness issue. Commit r260791 contained an error in that it would introduce a cross-module reference in the old module. It also introduced O(N^2) complexity in the module cloner by requiring the entire module to be visited for each function. Fix both of these problems by avoiding use of the CloneDebugInfoMetadata function (which is only designed to do intra-module cloning) and cloning function-attached metadata in the same way that we clone all other metadata. Differential Revision: http://reviews.llvm.org/D18583 llvm-svn: 264935	2016-03-30 22:05:13 +00:00
Aaron Ballman	ef0fe1eed8	Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC. llvm-svn: 264929	2016-03-30 21:30:00 +00:00
Matt Arsenault	46ba31650e	LegalizeDAG: Don't replace vector store with integer if not legal For the same reason as the corresponding load change. Note that ExpandStore is completely broken for non-byte sized element vector stores, but preserve the current broken behavior which has tests for it. The behavior should be the same, but now introduces a new typed store that is incorrectly split later rather than doing it directly. llvm-svn: 264928	2016-03-30 21:15:18 +00:00
Matt Arsenault	a4b1b6ea05	LegalizeDAG: Don't replace vector load with integer unless legal On AMDGPU we want to be able to promote i64/f64 loads to v2i32. If the access is unaligned, this would conclude that since i64 is legal, it would convert it back to i64 and there is an endless legalization loop. Extract the logic for scalarizing the load into a new TargetLowering function, where this can also replace the custom function AMDGPU has for this. llvm-svn: 264927	2016-03-30 21:15:10 +00:00
David Majnemer	5d518386b6	[IndVarSimplify] Don't insert after a catchswitch Widening a PHI requires us to insert a trunc. The logical place for this trunc is in the same BB as the PHI. This is not possible if the BB is terminated by a catchswitch. This fixes PR27133. llvm-svn: 264926	2016-03-30 21:12:06 +00:00
Simon Pilgrim	c49bd2ede0	[X86][AVX] Ensure EltsFromConsecutiveLoads tests the entire vector for consecutive loads/zeros Fix for issue introduced in D17297, where we were breaking early from the loop detecting consecutive loads which could leave us thinking a consecutive load with zeros was possible. llvm-svn: 264922	2016-03-30 20:52:24 +00:00
Justin Lebar	e3804cc932	[NVPTX] Make NVVMReflect a function pass. Summary: Currently it's a module pass. Make it a function pass so that we can move it to PassManagerBuilder's EP_EarlyAsPossible extension point, which only accepts function passes. Reviewers: rnk Subscribers: tra, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D18615 llvm-svn: 264919	2016-03-30 20:40:11 +00:00
Justin Lebar	2fe1323112	[PassManager] Make PassManagerBuilder::addExtension take an std::function, rather than a function pointer. Summary: This gives callers flexibility to pass lambdas with captures, which lets callers avoid the C-style void*-ptr closure style. (Currently, callers in clang store state in the PassManagerBuilderBase arg.) No functional change, and the new API is backwards-compatible. Reviewers: chandlerc Subscribers: joker.eph, cfe-commits Differential Revision: http://reviews.llvm.org/D18613 llvm-svn: 264918	2016-03-30 20:39:29 +00:00
Hal Finkel	2e0ff2b244	[LoopVectorize] Don't vectorize loops when everything will be scalarized This change prevents the loop vectorizer from vectorizing when all of the vector types it generates will be scalarized. I've run into this problem on the PPC's QPX vector ISA, which only holds floating-point vector types. The loop vectorizer will, however, happily vectorize loops with purely integer computation. Here's an example: LV: The Smallest and Widest types: 32 / 32 bits. LV: The Widest register is: 256 bits. LV: Found an estimated cost of 0 for VF 1 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 1 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 1 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 1 for VF 1 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 1 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 1 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 1 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Scalar loop costs: 3. 
LV: Found an estimated cost of 0 for VF 2 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 2 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 2 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 2 for VF 2 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 2 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 2 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 2 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Vector loop of width 2 costs: 2. LV: Found an estimated cost of 0 for VF 4 For instruction: %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.body ] LV: Found an estimated cost of 0 for VF 4 For instruction: %arrayidx = getelementptr inbounds [1600 x i32], [1600 x i32]* %a, i64 0, i64 %indvars.iv25 LV: Found an estimated cost of 0 for VF 4 For instruction: %2 = trunc i64 %indvars.iv25 to i32 LV: Found an estimated cost of 4 for VF 4 For instruction: store i32 %2, i32* %arrayidx, align 4 LV: Found an estimated cost of 1 for VF 4 For instruction: %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1 LV: Found an estimated cost of 1 for VF 4 For instruction: %exitcond27 = icmp eq i64 %indvars.iv.next26, 1600 LV: Found an estimated cost of 0 for VF 4 For instruction: br i1 %exitcond27, label %for.cond.cleanup, label %for.body LV: Vector loop of width 4 costs: 1. ... LV: Selecting VF: 8. 
LV: The target has 32 registers LV(REG): Calculating max register usage: LV(REG): At #0 Interval # 0 LV(REG): At #1 Interval # 1 LV(REG): At #2 Interval # 2 LV(REG): At #4 Interval # 1 LV(REG): At #5 Interval # 1 LV(REG): VF = 8 The problem is that the cost model here is not wrong, exactly. Since all of these operations are scalarized, their costs (aside from the uniform ones) are indeed VF*(scalar cost), just as the model suggests. In fact, the larger the VF picked, the lower the relative overhead from the loop itself (and the induction-variable update and check), and so in a sense, picking the largest VF here is the right thing to do. The problem is that vectorizing like this, where all of the vectors will be scalarized in the backend, isn't really vectorizing, but rather interleaving. By itself, this would be okay, but then the vectorizer itself also interleaves, and that's where the problem manifests itself. There aren't actually enough scalar registers to support the normal interleave factor multiplied by a factor of VF (8 in this example). In other words, the problem with this is that our register-pressure heuristic does not account for scalarization. While we might want to improve our register-pressure heuristic, I don't think this is the right motivating case for that work. Here we have a more-basic problem: The job of the vectorizer is to vectorize things (interleaving aside), and if the IR it generates won't generate any actual vector code, then something is wrong. Thus, if every type looks like it will be scalarized (i.e. will be split into VF or more parts), then don't consider that VF. This is not a problem specific to PPC/QPX, however. The problem comes up under SSE on x86 too, and as such, this change fixes PR26837 too. I've added Sanjay's reduced test case from PR26837 to this commit. Differential Revision: http://reviews.llvm.org/D18537 llvm-svn: 264904	2016-03-30 19:37:08 +00:00
Rong Xu	b534166fd4	[PGO] PGOFuncName in LTO optimizations PGOFuncNames are used as the key to retrieve the Function definition from the MD5 stored in the profile. For internal linkage function, we prefix the source file name to the PGOFuncNames. LTO's internalization privatizes many global linkage symbols. This happens after value profile annotation, but those internal linkage functions should not have a source prefix. To differentiate compiler generated internal symbols from original ones, PGOFuncName meta data are created and attached to the original internal symbols in the value profile annotation step. If a symbol does not have the meta data, its original linkage must be non-internal. Also add a new map that maps PGOFuncName's MD5 value to the function definition. Differential Revision: http://reviews.llvm.org/D17895 llvm-svn: 264902	2016-03-30 18:37:52 +00:00
Teresa Johnson	83c517c44e	Restore "[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly" This restores commit 264869, with a fix for windows bots to properly escape '\' in the path when serializing out. Added test. llvm-svn: 264884	2016-03-30 18:15:08 +00:00
Chad Rosier	f7ac5f28ab	[AArch64] Fix warnings pointed out by Hal. llvm-svn: 264882	2016-03-30 18:08:51 +00:00
Rong Xu	311ada11f8	[PGO] Use ArrayRef in annotateValueSite() Use ArrayRef in annotateValueSite's parameter instead of using an array and its size. Differential Revision: http://reviews.llvm.org/D18568 llvm-svn: 264879	2016-03-30 16:56:31 +00:00
Tom Stellard	1d5e6d4bdc	AMDGPU/SI: Improve MachineSchedModel definition This patch contains a few improvements to the model, including: - Using a single resource with a defined buffer size for each memory unit. - Setting the IssueWidth correctly. - Fixing latency values for memory instructions. shader-db stats: 16429 shaders in 3231 tests Totals: SGPRS: 318232 -> 312328 (-1.86 %) VGPRS: 208996 -> 209346 (0.17 %) Code Size: 7147044 -> 7166440 (0.27 %) bytes LDS: 83 -> 83 (0.00 %) blocks Scratch: 1862656 -> 1459200 (-21.66 %) bytes per wave Max Waves: 49182 -> 49243 (0.12 %) Wait states: 0 -> 0 (0.00 %) Differential Revision: http://reviews.llvm.org/D18453 llvm-svn: 264877	2016-03-30 16:35:13 +00:00
Tom Stellard	0bc954e3bc	AMDGPU/SI: Enable lanemask tracking in misched Summary: This results in higher register usage, but should make it easier for the compiler to hide latency. This pass is a prerequisite for some more scheduler improvements, and I think the increased register usage with this patch is acceptable, because when combined with the scheduler improvements, the total register usage will decrease. shader-db stats: 2382 shaders in 478 tests Totals: SGPRS: 48672 -> 49088 (0.85 %) VGPRS: 34148 -> 34847 (2.05 %) Code Size: 1285816 -> 1289128 (0.26 %) bytes LDS: 28 -> 28 (0.00 %) blocks Scratch: 492544 -> 573440 (16.42 %) bytes per wave Max Waves: 6856 -> 6846 (-0.15 %) Wait states: 0 -> 0 (0.00 %) Depends on D18451 Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18452 llvm-svn: 264876	2016-03-30 16:35:09 +00:00
Jonas Paulsson	f76123386a	[SystemZ] Add nop and nopr InstAliases. For compatibility with GAS, nop and nopr are recognized as aliases for bc and bcr, respectively. A mask of 0 turns these instructions effectively into no-operations. Reviewed by Ulrich Weigand. llvm-svn: 264875	2016-03-30 16:11:58 +00:00
Nirav Dave	8dd66e5753	Remove HasFnAttribute guards to getFnAttribute calls These checks are redundant and can be removed Reviewers: hans Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D18564 llvm-svn: 264872	2016-03-30 15:41:12 +00:00
Teresa Johnson	20beeea24a	Revert "[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly" This reverts commit r264869. I am seeing Windows bot failures due to the "\" in the path being mishandled at some point (seems to be interpreted wrongly at some point and llvm-as | llvm-dis is yielding some junk characters). Need to investigate. llvm-svn: 264871	2016-03-30 15:16:04 +00:00
Simon Pilgrim	b87ffe8519	[X86][XOP] BITREVERSE lowering using VPPERM XOP's VPPERM has some great 'permute operations' that it can do as well as part of shuffling the bytes of a 128-bit vector - in this case we use it to perform BITREVERSE in a single instruction. llvm-svn: 264870	2016-03-30 14:14:00 +00:00
Teresa Johnson	832a6790f6	[ThinLTO] Serialize the Module SourceFileName to/from LLVM assembly Summary: This change serializes out and in the SourceFileName to LLVM assembly so that it is preserved through "llvm-dis | llvm-as". This is necessary to ensure that the global identifiers created for local values in the module summary index are the same even if the bitcode is streamed out and read back from LLVM assembly. Serializing the summary itself to LLVM assembly is in progress. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18588 llvm-svn: 264869	2016-03-30 14:00:02 +00:00
Benjamin Kramer	9415e06da7	[NVPTX] Avoid temporary std::string and make single-use function local to the cpp file. No functionality change intended. llvm-svn: 264861	2016-03-30 12:31:51 +00:00
James Molloy	8e46cd05a1	[VectorUtils] Don't try and truncate PHIs to a smaller bitwidth We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means. However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018. This fixes PR27018. llvm-svn: 264852	2016-03-30 10:11:43 +00:00
Chandler Carruth	8e06a10d1f	[x86] Fix a horrible bug in our lowering of x86 floating point atomic operations. Specifically, we had code that tried to badly approximate reconstructing all of the possible variations on addressing modes in two x86 instructions based on those in one pseudo instruction. This is not the first bug uncovered with doing this, so stop doing it altogether. Instead generically and pedantically copy every operand from the address over to both new instructions, and strip kill flags from any register operands. This fixes a subtle bug seen in the wild where we would mysteriously drop parts of the addressing mode, causing for example the index argument in the added test case to just be completely ignored. Hypothetically, this was an extremely bad miscompile because it actually caused a predictable and leverageable write of a 64bit quantity to an unintended offset (the first element of the array instead of whatever other element was intended). As a consequence, in theory this could even have introduced security vulnerabilities. However, this was only something that could happen with an atomic floating point add. No other operation could trigger this bug, so it seems extremely unlikely to have occurred widely in the wild. But it did in fact occur, and frequently in scientific applications which were using relaxed atomic updates of a floating point value after adding a delta. Those would end up being quite badly miscompiled by LLVM, which is how we found this. Of course, this often looks like a race condition in the code, but it was actually a miscompile. I suspect that this whole RELEASE_FADD thing was a complete mistake. There is no such operation, and I worry that anything other than add will get remarkably worse code generation. But that's not for this change.... llvm-svn: 264845	2016-03-30 08:41:59 +00:00
Duncan P. N. Exon Smith	9071729966	IR: Constify LLVMContext::discardValueNames, NFC llvm-svn: 264823	2016-03-30 04:32:29 +00:00
Duncan P. N. Exon Smith	7457ecbebe	BitcodeReader: Fix weird whitespace, NFC llvm-svn: 264822	2016-03-30 04:21:52 +00:00
George Burgess IV	49cad7d70b	[MemorySSA] Make the visitor more careful with calls. Prior to this patch, the MemorySSA caching visitor would cache all calls that it visited. When paired with phi optimization, this can be problematic. Consider: define void @foo() { ; 1 = MemoryDef(liveOnEntry) call void @clobberFunction() br i1 undef, label %if.end, label %if.then if.then: ; MemoryUse(??) call void @readOnlyFunction() ; 2 = MemoryDef(1) call void @clobberFunction() br label %if.end if.end: ; 3 = MemoryPhi(...) ; MemoryUse(?) call void @readOnlyFunction() ret void } When optimizing MemoryUse(?), we visit defs 1 and 2, so we note to cache them later. We ultimately end up not being able to optimize past the Phi, so we set MemoryUse(?) to point to the Phi. We then cache the clobbering call for def 1 to be the Phi. This commit changes this behavior so that we wipe out any calls added to VisitedCalls while visiting the defs of a phi we couldn't optimize. Aside: With this patch, we now can bootstrap clang/LLVM without a single MemorySSA verifier failure. Woohoo. :) llvm-svn: 264820	2016-03-30 03:12:08 +00:00
Chandler Carruth	81c3ddeb1c	[x86] Extract a helper function to compute the full addressing mode from an x86 MachineInstr's operands. This will be super useful to fix some bad atomics code in my next commit. No functionality changed. llvm-svn: 264819	2016-03-30 03:10:24 +00:00
Xinliang David Li	a55fd1a9dc	[PGO] Handle invoke inst in IR based icall instrumentation Differential Revision: http://reviews.llvm.org/D18580 llvm-svn: 264818	2016-03-30 02:16:07 +00:00
George Burgess IV	82ee942a8c	[MemorySSA] Change how the walker views/walks visited phis. This patch teaches the caching MemorySSA walker a few things: 1. Not to walk Phis we've walked before. It seems that we tried to do this before, but it didn't work so well in cases like: define void @foo() { %1 = alloca i8 %2 = alloca i8 br label %begin begin: ; 3 = MemoryPhi({%0,liveOnEntry},{%end,2}) ; 1 = MemoryDef(3) store i8 0, i8* %2 br label %end end: ; MemoryUse(?) load i8, i8* %1 ; 2 = MemoryDef(1) store i8 0, i8* %2 br label %begin } Because we wouldn't put Phis in Q.Visited until we tried to visit them. So, when trying to optimize MemoryUse(?): - We would visit 3 above - ...Which would make us put {%0,liveOnEntry} in Q.Visited - ...Which would make us visit {%0,liveOnEntry} - ...Which would make us put {%end,2} in Q.Visited - ...Which would make us visit {%end,2} - ...Which would make us visit 3 - ...Which would realize we've already visited everything in 3 - ...Which would make us conservatively return 3. In the added test-case, (@looped_visitedonlyonce) this behavior would cause us to give incorrect results. Specifically, we'd visit 4 twice in the same query, but on the second visit, we'd skip while.cond because it had been visited, visit if.then/if.then2, and cache "1" as the clobbering def on the way back. 2. If we try to walk the defs of a {Phi,MemLoc} and see it has been visited before, just hand back the Phi we're trying to optimize. I promise this isn't as terrible as it seems. :) We now insert {Phi,MemLoc} pairs just before walking the Phi's upward defs. So, we check the cache for the {Phi,MemLoc} pair before checking if we've already walked the Phi. The {Phi,MemLoc} pair is (almost?) always guaranteed to have a cache entry if we've already fully walked it, because we cache as we go. 
So, if the {Phi,MemLoc} pair isn't in cache, either: (a) we must be in the process of visiting it (in which case, we can't give a better answer in a cache-as-we-go DFS walker) (b) we visited it, but didn't cache it on the way back (...which seems to require `ModifyingAccess` to not dominate `StartingAccess`, so I'm 99% sure that would be an error. If it's not an error, I haven't been able to get it to happen locally, so I suspect it's rare.) - - - - - As a consequence of this change, we no longer skip upward defs of phis, so we can kill the `VisitedOnlyOne` check. This gives us better accuracy than we had before, at the cost of potentially doing a bit more work when we have a loop. llvm-svn: 264814	2016-03-30 00:26:26 +00:00
Adam Nemet	fb8fbba584	[Aarch64] Turn on the LoopDataPrefetch pass for Cyclone llvm-svn: 264811	2016-03-30 00:21:29 +00:00
Adam Nemet	b81f1e0db3	[PPC] Remove -ppc-loop-prefetch-distance in favor of -prefetch-distance After the previous change, this can now be overridden centrally in the pass. llvm-svn: 264807	2016-03-29 23:45:56 +00:00
Adam Nemet	1428d41f9a	[LoopDataPrefetch] Centralize the tuning cl::opts under the pass This is effectively NFC, minus the renaming of the options (-cyclone-prefetch-distance -> -prefetch-distance). The change was requested by Tim in D17943. llvm-svn: 264806	2016-03-29 23:45:52 +00:00
Anna Zaks	1a470b6f7c	[tsan] Do not instrument reads/writes to instruction profile counters. We have known races on profile counters, which can be reproduced by enabling -fsanitize=thread and -fprofile-instr-generate simultaneously on a multi-threaded program. This patch avoids reporting those races by not instrumenting the reads and writes coming from the instruction profiler. llvm-svn: 264805	2016-03-29 23:19:40 +00:00
Kostya Serebryany	9e1a238357	[libFuzzer] more docs llvm-svn: 264803	2016-03-29 23:07:36 +00:00
Duncan P. N. Exon Smith	e8eb94a9a5	ADCE: Remove debug info intrinsics in dead scopes During ADCE, track which debug info scopes still have live references from the code, and delete debug info intrinsics for the dead ones. These intrinsics describe the locations of variables (in registers or stack slots). If there's no code left corresponding to a variable's scope, then there's no way to reference the variable in the debugger and it doesn't matter what its value is. I add a DEBUG printout when the described location is in an SSA register, in case it helps someone trying to track down why locations get lost. However, we still delete these; the scope itself isn't attached to any real code, so the ship has already sailed. llvm-svn: 264800	2016-03-29 22:57:12 +00:00
Fiona Glaser	44a2f7a298	MachineSink: make shouldSink a TII target hook Some targets may disagree on what they want sunk or not sunk, so make this a target hook instead of hardcoded. llvm-svn: 264799	2016-03-29 22:44:57 +00:00
Adam Nemet	85fba39390	[LoopDataPrefetch] Make more member functions private, NFC. llvm-svn: 264798	2016-03-29 22:40:02 +00:00
Derek Schuff	07636cd5e7	Add a print method to MachineFunctionProperties for better error messages This makes check failures much easier to understand. Make it empty (but leave it in the class) for NDEBUG builds. Differential Revision: http://reviews.llvm.org/D18529 llvm-svn: 264780	2016-03-29 20:28:20 +00:00
James Y Knight	7306cd47d4	[SPARC] Use AtomicExpandPass to expand AtomicRMW instructions. They were previously expanded to CAS loops in a custom isel expansion, but AtomicExpandPass knows how to do that generically. Testing is covered by the existing sparc atomics.ll testcases. llvm-svn: 264771	2016-03-29 19:09:54 +00:00
Matthias Braun	72a58c3e28	MachineVerifier: On dead-def live segments, check that corresponding machine operand has a dead flag llvm-svn: 264769	2016-03-29 19:07:43 +00:00
Matthias Braun	1c20c8280a	LiveVariables: Fix typo and shorten comment llvm-svn: 264768	2016-03-29 19:07:40 +00:00
Duncan P. N. Exon Smith	40b44e1d0a	IR: Add DbgInfoIntrinsic::getVariableLocation Create a common accessor, DbgInfoIntrinsic::getVariableLocation, which doesn't care about the type of debug info intrinsic. Use this to further unify the implementations of DbgDeclareInst::getAddress and DbgValueInst::getValue. Besides being a cleanup, I'm planning to use this to prepare DEBUG output without having to branch on the concrete type. llvm-svn: 264767	2016-03-29 18:56:03 +00:00
Teresa Johnson	b703c77b03	[ThinLTO] Remove post-pass metadata linking support Since we have moved to a model where functions are imported in bulk from each source module after making summary-based importing decisions, there is no longer a need to link metadata as a postpass, and all users have been removed. This essentially reverts r255909 and follow-on fixes. llvm-svn: 264763	2016-03-29 18:24:19 +00:00
Nirav Dave	2aab7f4358	Add support for no-jump-tables Add function soft attribute to the generation of Jump Tables in CodeGen as initial step towards clang support of gcc's no-jump-table support Reviewers: hans, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18321 llvm-svn: 264756	2016-03-29 17:46:23 +00:00
Derek Schuff	42666eeea2	Add MachineVerifier check for AllVRegsAllocated MachineFunctionProperty Summary: Check that any function that has the property set is free of virtual register operands. Also, it is VirtRegMap (and not the register allocators) that actually removes the VReg operands (except for RegAllocFast). Reviewers: qcolombet Subscribers: MatzeB, llvm-commits, qcolombet Differential Revision: http://reviews.llvm.org/D18535 llvm-svn: 264755	2016-03-29 17:40:22 +00:00
Manman Ren	f46262e0b7	Swift Calling Convention: add swiftself attribute. Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754	2016-03-29 17:37:21 +00:00
Sanjoy Das	2381fcd557	[SCEV] Extract out a MatchBinaryOp; NFCI MatchBinaryOp abstracts out the IR instructions from the operations they represent. While this change is NFC, we will use this factoring later to map things like `(extractvalue 0 (sadd.with.overflow X Y))` to `(add X Y)`. llvm-svn: 264747	2016-03-29 16:40:44 +00:00
Sanjoy Das	260ad4dd63	[SCEV] Use Operator::getOpcode instead of manual dispatch; NFC llvm-svn: 264746	2016-03-29 16:40:39 +00:00
Justin Lebar	3db0b85fc8	Make InlineSimple's one-arg constructor explicit. NFC llvm-svn: 264744	2016-03-29 16:26:06 +00:00
Justin Lebar	bd145b3cb2	Reformat a comment in InlineSimple.cpp. NFC llvm-svn: 264743	2016-03-29 16:26:03 +00:00
Konstantin Zhuravlyov	ecc7cbf611	Test commit access llvm-svn: 264736	2016-03-29 15:15:44 +00:00
Teresa Johnson	efeae0e210	[ThinLTO] Use new GlobalValue::getGUID helper (NFC) This was already being used for functions and aliases, but was missed when handling global variables. llvm-svn: 264734	2016-03-29 14:49:26 +00:00
Simon Dardis	9a3f32c00d	[mips] Test commit: Mark insertNoop as dead code (NFC) llvm-svn: 264728	2016-03-29 13:02:19 +00:00
Daniel Sanders	5d3840fdf9	[mips] Correct MIPS16 jal/jalx to have uimm26 offsets and add MC layer range checks. NFC. Summary: However, this has no effect at this time because the instructions affected are marked 'isCodeGenOnly=1' and have no alternative for the MC layer. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18179 llvm-svn: 264712	2016-03-29 09:40:38 +00:00
Elena Demikhovsky	95629caaa9	AVX-512: fixed a bug in fp_to_uint pattern on KNL Fixed fp_to_uint instruction selection on KNL. One pattern was missing for <4 x double> to <4 x i32> Differential Revision: http://reviews.llvm.org/D18512 llvm-svn: 264701	2016-03-29 06:33:41 +00:00
Duncan P. N. Exon Smith	bb7ce3b850	BitcodeReader: Allow METADATA_STRINGS to only have !"" Support parsing a METADATA_STRINGS record that only has a single piece of metadata, !"". Fixes a corner case in r264551. llvm-svn: 264699	2016-03-29 05:25:17 +00:00
Hyojin Sung	4673f10568	[SimplifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loop's exit condition is evaluated only with volatile values, hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop, which prevents later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test of whether the BB actually belongs to a set of loop headers, and not eliminating it if so. llvm-svn: 264697	2016-03-29 04:08:57 +00:00
Matthias Braun	f54530ef00	RegisterPressure: Simplify liveness tracking when lanemasks are not checked. Split RegisterOperands code that collects defs/uses into a variant with and without lanemask tracking. This is a bit of code duplication, but there are enough subtle differences between the two variants that this seems cleaner (and potentially faster). This also fixes a problem where lanes were tracked even though TrackLaneMasks was false. This is part of the fix for http://llvm.org/PR27106. I will commit the testcase when it is completely fixed. llvm-svn: 264696	2016-03-29 03:54:22 +00:00
Matthias Braun	82cff88691	LiveVariables: Do not remove dead flags from vreg operands Also add a FIXME comment on why Mips RDDSP causes bogus dead flags to be added which LiveVariables cleans up by accident. llvm-svn: 264695	2016-03-29 03:08:18 +00:00
Hal Finkel	fa7057a415	[PowerPC] Refactor popcnt[dw] target features Instead of using two feature bits, one to indicate the availability of the popcnt[dw] instructions, and another to indicate whether or not they're fast, use a single enum. This allows more consistent control via target attribute strings, and via Clang's command line. llvm-svn: 264690	2016-03-29 01:36:01 +00:00
Kyle Butt	5e241b11ed	[Codegen] Decrease minimum jump table density. Minimum density for both optsize and non-optsize functions is now controlled by options: -sparse-jump-table-density (default 10) for non-optsize functions, and -dense-jump-table-density (default 40) for optsize functions, which matches the current default. This improves several benchmarks at google at the cost of a small codesize increase. For code compiled with -Os, the old behavior continues. llvm-svn: 264689	2016-03-29 00:23:41 +00:00
Easwaran Raman	6f4903d985	Sample profile summary cleanup Replace references to MaxHeadSamples with MaxFunctionCount Differential Revision: http://reviews.llvm.org/D18522 llvm-svn: 264686	2016-03-28 23:14:29 +00:00
Derek Schuff	ecabac6244	[WebAssembly] Remove duplicate disabling of passes Also put all the disabled passes together llvm-svn: 264684	2016-03-28 22:52:20 +00:00
Hal Finkel	69ada2f514	[PowerPC] Clarify a comment in PPCTTI about vector loads This should say that we could do unaligned vector loads on the P7 using VSX instructions, not that we should. llvm-svn: 264683	2016-03-28 22:39:35 +00:00
Evgeniy Stepanov	f575b2687c	Remove personality for declarations in CloneModule. Personality is copied as part of copyFunctionAttributes, but it is invalid on a declaration. Remove the personality attribute if the function body is not cloned. Also add a verifier run over output modules in the llvm-split tool. llvm-svn: 264667	2016-03-28 21:37:02 +00:00
Simon Pilgrim	d3df400fa9	[X86][SSE] Vectorize a bit (AND/XOR/OR) op if a BUILD_VECTOR has the same op for all its scalar elements. If all a BUILD_VECTOR's source elements are the same bit (AND/XOR/OR) operation type and each has one constant operand, lower to a pair of BUILD_VECTOR and just apply the bit operation to the vectors. The constant operands will form a constant vector meaning that we still only have a single BUILD_VECTOR to lower and we will have replaced all the scalarized operations with a single SSE equivalent. It's not in our interest to start making a general-purpose vectorizer from this, but I'm seeing enough of these scalar bit operations from the later legalization/scalarization stages to support them at least. Differential Revision: http://reviews.llvm.org/D18492 llvm-svn: 264666	2016-03-28 21:33:52 +00:00
Vedant Kumar	86705ba5b1	Reapply (2x) "[PGO] Fix name encoding for ObjC-like functions" Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). What's changed since the original commit? - I fixed up the covmap-V2 binary format tests using a linux VM. - I weakened the CHECK lines in instrprof-comdat.h to account for the fact that there have been bugfixes to clang coverage. These will be fixed up in a follow-up. - I added an assert to make sure we don't get bitten by this again. - I constructed the c-general.profraw file without name compression enabled to appease some bots. Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264658	2016-03-28 21:06:42 +00:00
Adrian Prantl	faebbb053d	Add an IR Verifier check for orphaned DICompileUnits. A DICompileUnit that is not listed in llvm.dbg.cu will cause assertion failures and/or crashes in the backend. The Verifier should reject this. rdar://problem/25369499 llvm-svn: 264657	2016-03-28 21:06:26 +00:00
Evgeniy Stepanov	a023f79db1	Handle section vs global name conflict. This is a fix for PR26941. When there is both a section and a global definition with the same name, the global wins. Section symbols are not added to the symbol table; section references are left undefined and fixed up in the object writer unless they've been satisfied by some other definition. llvm-svn: 264649	2016-03-28 20:36:28 +00:00
Ryan Govostes	653f9d0273	[asan] Support dead code stripping on Mach-O platforms On OS X El Capitan and iOS 9, the linker supports a new section attribute, live_support, which allows dead stripping to remove dead globals along with the ASAN metadata about them. With this change __asan_global structures are emitted in a new __DATA,__asan_globals section on Darwin. Additionally, there is a __DATA,__asan_liveness section with the live_support attribute. Each entry in this section is simply a tuple that binds together the liveness of a global variable and its ASAN metadata structure. Thus the metadata structure will be alive if and only if the global it references is also alive. Review: http://reviews.llvm.org/D16737 llvm-svn: 264645	2016-03-28 20:28:57 +00:00
Vedant Kumar	476a94d9ef	Revert "Reapply "[PGO] Fix name encoding for ObjC-like functions"" This reverts commit r264641 to investigate why c-general.test is failing on the bots. llvm-svn: 264643	2016-03-28 20:20:40 +00:00
Vedant Kumar	f20b6cec1c	Reapply "[PGO] Fix name encoding for ObjC-like functions" Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). This reverts the revert commit beaf3d18. What's changed? - I fixed up the covmap-V2 binary format tests using a linux VM. - I updated the expected counts in instrprof-comdat.h to account for the fact that there have been bugfixes to clang coverage. - I added an assert to make sure we don't get bitten by this again. Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264641	2016-03-28 20:12:07 +00:00
Easwaran Raman	8f6b9efc36	Profile summary cleanup. Differential Revision: http://reviews.llvm.org/D18468 llvm-svn: 264619	2016-03-28 18:58:05 +00:00
Adam Nemet	2f36f05951	[PGO] Comment how function pointers for indirect calls are mapped to function names Summary: Hopefully this will make it easier for the next person to figure all this out... Reviewers: bogner, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18490 llvm-svn: 264611	2016-03-28 18:27:44 +00:00
Matthias Braun	b74eb41d58	MIRParser: Add %subreg.xxx syntax for subregister index operands Differential Revision: http://reviews.llvm.org/D18279 llvm-svn: 264608	2016-03-28 18:18:46 +00:00
Haicheng Wu	6a6bc750d5	[AArch64] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize Mimic what x86 does when optimizing sdiv/udiv for minsize. llvm-svn: 264606	2016-03-28 18:17:07 +00:00
Reid Kleckner	ba85781f58	Revert "[SimplifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops" This reverts commit r264596. It does not compile. llvm-svn: 264604	2016-03-28 18:07:40 +00:00
Hal Finkel	7059d41622	[PowerPC] On the A2, popcnt[dw] are very slow The A2 cores support the popcntw/popcntd instructions, but they're microcoded, and slower than our default software emulation. Specifically, popcnt[dw] take approximately 74 cycles, whereas our software emulation takes only 24-28 cycles. I've added a new target feature to indicate a slow popcnt[dw], instead of just removing the existing target feature from the a2/a2q processor models, because: 1. This allows us to return more accurate information via the TTI interface (I recognize that this currently makes no practical difference) 2. Is hopefully easier to understand (it allows the core's features to match its manual while still having the desired effect). llvm-svn: 264600	2016-03-28 17:52:08 +00:00
David Blaikie	b805f73ad1	Remove else after return llvm-svn: 264599	2016-03-28 17:45:48 +00:00
Eugene Zelenko	35623fb7d5	Fix Clang-tidy modernize-deprecated-headers warnings in some files; other minor fixes. Differential revision: http://reviews.llvm.org/D18469 llvm-svn: 264598	2016-03-28 17:40:08 +00:00
Hyojin Sung	0ada5b0d14	[SimplifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loop's exit condition is evaluated only with volatile values, hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop, which prevents later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test of whether the BB actually belongs to a set of loop headers, and not eliminating it if so. llvm-svn: 264596	2016-03-28 17:22:25 +00:00
Rong Xu	6090afd744	[PGO] Don't set the function hotness attribute when populating counters Don't set the function hotness attribute on the fly. This changes the CFG branch probability of the caller function, which leads to inconsistent BB ordering. This patch moves the attribute setting to a separate loop after the counts in all functions are populated. Fixes PR27024 - PGO instrumentation profile data is not reflected in correct basic blocks. Differential Revision: http://reviews.llvm.org/D18491 llvm-svn: 264594	2016-03-28 17:08:56 +00:00
Derek Schuff	ad154c837e	Introduce MachineFunctionProperties and the AllVRegsAllocated property MachineFunctionProperties represents a set of properties that a MachineFunction can have at particular points in time. Existing examples of this idea are MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which will eventually be switched to use this mechanism. This change introduces the AllVRegsAllocated property; i.e. the property that all virtual registers have been allocated and there are no VReg operands left. With this mechanism, passes can declare that they require a particular property to be set, or that they set or clear properties by implementing e.g. MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class verifies that the requirements are met, and handles the setting and clearing based on the declarations. Passes can also directly query and update the current properties of the MF if they want to have conditional behavior. This change annotates the target-independent post-regalloc passes; future changes will also annotate target-specific ones. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D18421 llvm-svn: 264593	2016-03-28 17:05:30 +00:00
Vedant Kumar	088a726f6f	Revert "[PGO] Fix name encoding for ObjC-like functions" This reverts commit r264587. Reverting to investigate 6 unexpected failures on the ppc bot: http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/2822 llvm-svn: 264590	2016-03-28 16:14:07 +00:00
Tom Stellard	a76bcc2ea1	AMDGPU/SI: Limit load clustering to 16 bytes instead of 4 instructions Summary: This helps prevent load clustering from drastically increasing register pressure by trying to cluster 4 SMRDx8 loads together. The limit of 16 bytes was chosen, because it seems like that was the original intent of setting the limit to 4 instructions, but more analysis could show that a different limit is better. This fix yields small decreases in register usage with shader-db, but also helps avoid a large increase in register usage when lane mask tracking is enabled in the machine scheduler, because lane mask tracking enables more opportunities for load clustering. shader-db stats: 2379 shaders in 477 tests Totals: SGPRS: 49744 -> 48600 (-2.30 %) VGPRS: 34120 -> 34076 (-0.13 %) Code Size: 1282888 -> 1283184 (0.02 %) bytes LDS: 28 -> 28 (0.00 %) blocks Scratch: 495616 -> 492544 (-0.62 %) bytes per wave Max Waves: 6843 -> 6853 (0.15 %) Wait states: 0 -> 0 (0.00 %) Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18451 llvm-svn: 264589	2016-03-28 16:10:13 +00:00
Davide Italiano	6db1dcbf6b	[SimplifyLibCalls] Transform printf("%s", "a") -> putchar('a'). llvm-svn: 264588	2016-03-28 15:54:01 +00:00
Vedant Kumar	e44e0be818	[PGO] Fix name encoding for ObjC-like functions Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264587	2016-03-28 15:52:08 +00:00
Vedant Kumar	43a8565be6	[Coverage] Strip <unknown> from PGO names if no filenames are available Patch suggested by David Li! llvm-svn: 264586	2016-03-28 15:49:08 +00:00
Krzysztof Parzyszek	2d65ea74dc	[Hexagon] Improve handling of unaligned vector loads and stores llvm-svn: 264584	2016-03-28 15:43:03 +00:00
James Y Knight	01f2ca5612	NFC: skip FenceInst up-front in AtomicExpandPass. llvm-svn: 264583	2016-03-28 15:05:30 +00:00
Krzysztof Parzyszek	bb63f66686	[Hexagon] Only use restore functions for single register at -Oz llvm-svn: 264581	2016-03-28 14:52:21 +00:00
Krzysztof Parzyszek	a34901aae9	[Hexagon] Speed up frame lowering when no optimizations are enabled - Do not optimize stack slots in optnone functions. - Get aligned-base register from HexagonMachineFunctionInfo instead of looking for ALIGNA instruction in the function's body. llvm-svn: 264580	2016-03-28 14:42:03 +00:00
Douglas Katzman	d0c11cf7ad	Sparc: silently ignore .proc assembler directive Differential Revision: http://reviews.llvm.org/D18463 llvm-svn: 264579	2016-03-28 14:00:11 +00:00
Jacques Pienaar	fcef3e4617	[lanai] Add Lanai backend. Add the Lanai backend to lib/Target. General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html). Differential Revision: http://reviews.llvm.org/D17011 llvm-svn: 264578	2016-03-28 13:09:54 +00:00
Hal Finkel	5c83a090bc	[SROA] Fix typo in comment llvm-svn: 264573	2016-03-28 11:23:21 +00:00
Hal Finkel	29f5131daf	C++11 is required, remove some preprocessor checks for it We require C++11 to build, so remove a few remaining preprocessor checks for '__cplusplus >= 201103L'. This should always be true. llvm-svn: 264572	2016-03-28 11:13:03 +00:00
Chuang-Yu Cheng	d5eb774eb6	[Power9] Implement new altivec instructions: bcd* series This patch implements the following altivec instructions: - Decimal Convert From/to National/Zoned/Signed-QWord: bcdcfn. bcdcfz. bcdctn. bcdctz. bcdcfsq. bcdctsq. - Decimal Copy-Sign/Set-Sign: bcdcpsgn. bcdsetsgn. - Decimal Shift/Unsigned-Shift/Shift-and-Round: bcds. bcdus. bcdsr. - Decimal (Unsigned) Truncate: bcdtrunc. bcdutrunc. Total 13 instructions Thanks to Amehsan for the advice! Thanks to Kit for the great help! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D17838 llvm-svn: 264568	2016-03-28 09:04:23 +00:00
Chuang-Yu Cheng	80722719eb	[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat This change implements the following vsx instructions: - Scalar Insert/Extract xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp - Vector Insert/Extract xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp xxextractuw xxinsertw - Scalar/Vector Test Data Class xststdcdp xststdcsp xststdcqp xvtstdcdp xvtstdcsp - Maximum/Minimum xsmaxcdp xsmaxjdp xsmincdp xsminjdp - Vector Byte-Reverse/Permute/Splat xxbrd xxbrh xxbrq xxbrw xxperm xxpermr xxspltib 30 instructions Thanks to Nemanja for invaluable discussion! Thanks to Kit for the great help! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16842 llvm-svn: 264567	2016-03-28 08:34:28 +00:00
Elena Demikhovsky	83f0647d85	AVX-512: Fixed ICMP instruction selection for i1 operands ICMP instruction selection fails on SKX and KNL for i1 operand. I use XOR to resolve: (A == B) is equivalent to (A xor B) == 0 Differential Revision: http://reviews.llvm.org/D18511 llvm-svn: 264566	2016-03-28 07:47:58 +00:00
Chuang-Yu Cheng	5663848996	[Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic This change implements the following vsx instructions: - quad-precision move xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp - quad-precision fp-arithmetic xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o) xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o) 22 instructions Thanks Nemanja and Kit for careful review and invaluable discussion! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16110 llvm-svn: 264565	2016-03-28 07:38:01 +00:00
Vedant Kumar	141ed94492	[Coverage] Fix the way we load "<unknown>:func" records When emitting coverage mappings for functions with local linkage and an unknown filename, we use "<unknown>:func" for the PGO function name. The problem is that we don't strip "<unknown>" from the name when loading coverage data, like we do for other file names. Fix that and add a test. llvm-svn: 264559	2016-03-28 01:16:12 +00:00
Duncan P. N. Exon Smith	544e4f97b3	BitcodeWriter: Replace dead code with an assertion, NFC The caller of ValueEnumerator::EnumerateOperandType never sends in metadata. Assert that, and remove the unnecessary logic. llvm-svn: 264558	2016-03-28 00:03:12 +00:00
Duncan P. N. Exon Smith	b42fa2e5c6	BitcodeWriter: Reuse writeMetadataRecords, NFC Change writeFunctionMetadata to call writeMetadataRecords. For now there's no functionality change, but this makes it easy to serialize other types of metadata in the function block in the future. llvm-svn: 264557	2016-03-27 23:59:32 +00:00
Duncan P. N. Exon Smith	cffd8cb9dc	BitcodeWriter: Rename some functions for consistency, NFC To match writeMetadataRecords, writeNamedMetadata and writeMetadataStrings, change: WriteModuleMetadata => writeModuleMetadata WriteFunctionLocalMetadata => writeFunctionMetadata Write##CLASS => write##CLASS The only major change is "FunctionLocal" => "Function". The point is to be less specific, in preparation for emitting normal metadata records inside function metadata blocks (currently we only emit `LocalAsMetadata` there). llvm-svn: 264556	2016-03-27 23:56:04 +00:00
Duncan P. N. Exon Smith	80d153f6aa	BitcodeWriter: Split out writeMetadataRecords, NFC Besides being a nice cleanup, this is preparation for reusing the code in function metadata blocks. llvm-svn: 264555	2016-03-27 23:53:30 +00:00
Duncan P. N. Exon Smith	5465f0adc4	BitcodeWriter: Restructure WriteFunctionLocalMetadata, NFC Use an early return to simplify logic. llvm-svn: 264554	2016-03-27 23:38:36 +00:00
Duncan P. N. Exon Smith	2766e4d488	BitcodeWriter: Simplify tracking of function-local metadata, NFC We don't really need a separate vector here; instead, point at a range inside the main MDs array. This matches how r264551 references the ranges of strings and non-strings. llvm-svn: 264552	2016-03-27 23:22:31 +00:00
Duncan P. N. Exon Smith	6565a0d4b2	Reapply ~"Bitcode: Collect all MDString records into a single blob" Spiritually reapply commit r264409 (reverted in r264410), albeit with a bit of a redesign. Firstly, avoid splitting the big blob into multiple chunks of strings. r264409 imposed an arbitrary limit to avoid a massive allocation on the shared 'Record' SmallVector. The bug with that commit only reproduced when there were more than "chunk-size" strings. A test for this would have been useless long-term, since we're liable to adjust the chunk-size in the future. Thus, eliminate the motivation for chunk-ing by storing the string sizes in the blob. Here's the layout: vbr6: # of strings vbr6: offset-to-blob blob: [vbr6]: string lengths [char]: concatenated strings Secondly, make the output of llvm-bcanalyzer readable. I noticed when debugging r264409 that llvm-bcanalyzer was outputting a massive blob all in one line. Past a small number, the strings were impossible to split in my head, and the lines were way too long. This version adds support in llvm-bcanalyzer for pretty-printing. <STRINGS abbrevid=4 op0=3 op1=9/> num-strings = 3 { 'abc' 'def' 'ghi' } From the original commit: Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this should (a) slightly reduce bitcode size, since there is less record overhead, and (b) greatly improve reading speed, since blobs are super cheap to deserialize. llvm-svn: 264551	2016-03-27 23:17:54 +00:00
Duncan P. N. Exon Smith	456c9968e5	Support: Implement StreamingMemoryObject::getPointer The implementation is fairly obvious. This is preparation for using some blobs in bitcode. For clarity (and perhaps future-proofing?), I moved the call to JumpToBit in BitstreamCursor::readRecord ahead of calling MemoryObject::getPointer, since JumpToBit can theoretically (a) read bytes, which (b) invalidates the blob pointer. This isn't strictly necessary for the two memory objects we have: - The return of RawMemoryObject::getPointer is valid until the memory object is destroyed. - StreamingMemoryObject::getPointer is valid until the next chunk is read from the stream. Since the JumpToBit call is only going ahead to a word boundary, we'll never load another chunk. However, reordering makes it clear by inspection that the blob returned by BitstreamCursor::readRecord will be valid. I added some tests for StreamingMemoryObject::getPointer and BitstreamCursor::readRecord. llvm-svn: 264549	2016-03-27 23:00:59 +00:00
Duncan P. N. Exon Smith	d3be62ddf2	Bitcode: Add SimpleBitstreamCursor::getPointerToByte, etc. Add API to SimpleBitstreamCursor to allow users to translate between byte addresses and pointers. - jumpToPointer: move the bit position to a particular pointer. - getPointerToByte: get the pointer for a particular byte. - getPointerToBit: get the pointer for the byte of the current bit. - getCurrentByteNo: convenience function for assertions and tests. Mainly adds unit tests (getPointerToBit/Byte already has a use), but also preparation for eventually using jumpToPointer. llvm-svn: 264546	2016-03-27 22:45:25 +00:00
Duncan P. N. Exon Smith	d766d136ce	Bitcode: Split out SimpleBitstreamCursor Split out SimpleBitstreamCursor from BitstreamCursor, which is a lower-level cursor with no knowledge of bitcode blocks, abbreviations, or records. It just knows how to read bits and navigate the stream. This is mainly organizational, to separate the API for manipulating raw bits from that for bitcode concepts like Record and Block. llvm-svn: 264545	2016-03-27 22:40:55 +00:00
Teresa Johnson	d29478f70e	[ThinLTO] Add optional import message and statistics Summary: Add a statistic to count the number of imported functions. Also, add a new -print-imports option to emit a trace of imported functions, that works even for an NDEBUG build. Note that emitOptimizationRemark does not work for the above printing as it expects a Function object and DebugLoc, neither of which we have with summary-based importing. This is part 2 of D18487, the first part was committed separately as r264536. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18487 llvm-svn: 264537	2016-03-27 15:27:30 +00:00
Teresa Johnson	9aae395fa8	[ThinLTO] Don't try to import alias unless aliasee can be imported With r264503, aliases are now being added to the GlobalsToImport set even when their aliasees can't be imported due to their linkage type. While the importing worked correctly (the aliases imported as declarations) due to the logic in doImportAsDefinition, there is no point to adding them to the GlobalsToImport set. Additionally, with D18487 it was resulting in incorrectly printing a message indicating that the alias was imported. To avoid this, delay adding aliases to the GlobalsToImport set until after the linkage type of the aliasee is checked. This patch is part of D18487. llvm-svn: 264536	2016-03-27 15:01:11 +00:00
Hal Finkel	0b37175ca6	[PowerPC] Map max/minnum intrinsics and fmax/fmin to ISD nodes for CTR-based loop legality Intrinsic::maxnum and Intrinsic::minnum, along with the associated libc function calls (fmax[f], etc.) generally map to function calls after lowering. For some vector types with QPX at least, however, we can legally lower these, and we don't need to prohibit CTR-based loops on their account. It turned out, however, that the logic that checked the opcodes associated with intrinsics was broken (it would set the Opcode variable, but that variable was later checked only if set for some otherwise-external function call). This fixes the latter problem and adds the FMAX/MINNUM mappings. llvm-svn: 264532	2016-03-27 05:40:56 +00:00
Michael Kruse	ff379b69b2	[Verifier] Reject PHIs using defs from own block. Reject the following IR as malformed (assuming that %entry, %next are not in a loop): next: %y = phi i32 [ 0, %entry ] %x = phi i32 [ %y, %entry ] Such PHI nodes came up in PR26718. While there was no consensus on whether or not this is valid IR, most opinions on that bug and in a discussion on the llvm-dev mailing list tended towards a "strict interpretation" (term by Joseph Tremoulet) of PHI node uses. Also, the language reference explicitly states that "the use of each incoming value is deemed to occur on the edge from the corresponding predecessor block to the current block" and `DominatorTree::dominates(Instruction*, Use&)` uses this definition as well. For the code mentioned in PR15384, clang does not compile to such PHIs (anymore?). The test case still hangs when replacing `%tmp6` with `%tmp` in revisions before r176366 (where PR15384 has been fixed). The occurrence of %tmp6 therefore was probably unintentional. Its value is not used except in other PHIs. Reviewers: majnemer, reames, JosephTremoulet, bkramer, grosser, jdoerfert, kparzysz, sanjoy Differential Revision: http://reviews.llvm.org/D18443 llvm-svn: 264528	2016-03-26 23:32:57 +00:00
Sanjay Patel	796db35f62	[SimplifyCFG] propagate branch metadata when creating select (PR26636) llvm-svn: 264527	2016-03-26 23:30:50 +00:00
Simon Pilgrim	dcdf85033c	[X86][AVX] Enabled SMUL_LOHI/UMUL_LOHI v8i32 vectors on AVX1 targets Correct splitting of v8i32 vectors into v4i32 vectors to prevent scalarization llvm-svn: 264517	2016-03-26 18:32:13 +00:00
JF Bastien	a874d1a40d	Revert "NFC: static_assert instead of comment" This reverts commit fa36fcff16c7d4f78204d6296bf96c3558a4a672. Causes the following Windows failure: C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\lib\CodeGen\MachineInstr.cpp(762): error C2338: must be trivially copyable to memmove llvm-svn: 264516	2016-03-26 18:20:02 +00:00
JF Bastien	d4ff3360ae	NFC: static_assert instead of comment Summary: isPodLike is as close as we have for is_trivially_copyable. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18483 llvm-svn: 264515	2016-03-26 18:14:27 +00:00
Simon Pilgrim	e4dbeb40c6	[X86][AVX] Enabled MULHS/MULHU v16i16 vectors on AVX1 targets Correct splitting of v16i16 vectors into v8i16 vectors to prevent scalarization Differential Revision: http://reviews.llvm.org/D18307 llvm-svn: 264512	2016-03-26 15:44:55 +00:00
Simon Pilgrim	3eef33a806	[X86][SSE] Add MULHS/MULHU custom lowering for i8 vectors Currently this is mainly to prevent scalarization of integer division by constants. Differential Revision: http://reviews.llvm.org/D18307 llvm-svn: 264511	2016-03-26 15:27:20 +00:00
Simon Pilgrim	7379a70677	[X86][AVX512BW] AVX512BW can sign-extend v32i8 to v32i16 for simpler v32i8 multiplies. Only pre-AVX512BW targets need to split v32i8 vectors. llvm-svn: 264509	2016-03-26 09:44:27 +00:00
David Majnemer	b549ab02b4	[PowerPC] Disable the CTR optimization in the presence of {min,max}num The minnum and maxnum intrinsics get lowered to libcalls which invalidates the CTR optimization. This fixes PR27083. llvm-svn: 264508	2016-03-26 09:42:31 +00:00
Simon Pilgrim	ff7b7141cd	[X86][SSE] Don't duplicate Lower256IntArith functionality in LowerMul. NFC. LowerMul v32i8 on AVX2 needs to split the 256-bit sources to allow sign-extension back to v16i16 to occur. Since this is basically the same as Lower256IntArith we simplify by using that here instead. llvm-svn: 264506	2016-03-26 09:29:04 +00:00
Junmo Park	a26e93bcec	Minor code cleanup. NFC. llvm-svn: 264505	2016-03-26 06:04:55 +00:00
Chuang-Yu Cheng	065969ec8e	[Power9] Implement new altivec instructions: permute, count zero, extend sign, negate, parity, shift/rotate, mul10 This change implements the following vector operations: - vclzlsbb vctzlsbb vctzb vctzd vctzh vctzw - vextsb2w vextsh2w vextsb2d vextsh2d vextsw2d - vnegd vnegw - vprtybd vprtybq vprtybw - vbpermd vpermr - vrlwnm vrlwmi vrldnm vrldmi vslv vsrv - vmul10cuq vmul10uq vmul10ecuq vmul10euq 28 instructions Thanks Nemanja, Kit for invaluable hints and discussion! Reviewers: hal, nemanja, kbarton, tjablin, amehsan Phabricator: http://reviews.llvm.org/D15887 llvm-svn: 264504	2016-03-26 05:46:11 +00:00
Mehdi Amini	01e321306b	ThinLTO: use the callgraph from the combined index to drive the FunctionImporter Summary: Now that the summary contains the full reference/call graph, we can replace the existing function importer, which loads and inspects the IR to iteratively walk the call graph, with a traversal based purely on the summary information. Decouple the actual importing decision from any IR manipulation. Reviewers: tejohnson Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18343 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264503	2016-03-26 05:40:34 +00:00
Mehdi Amini	385cf28829	Rename ModuleSummaryIndex::modPathStringEntries() into modulePaths() It now returns the map instead of an iterator. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264489	2016-03-26 03:35:38 +00:00
Lang Hames	d1af8fce0f	[Support] Switch to RAII helper for error-as-out-parameter idiom. As discussed on the llvm-commits thread for r264467. llvm-svn: 264479	2016-03-25 23:54:32 +00:00
Sunil Srivastava	34fce9377e	Improve the reliability of file renaming in Windows by having the compiler retry the rename operation on 3 error conditions of ReplaceFileW() that it was previously bailing out on. Patch by Douglas Yung! Differential Revision: http://reviews.llvm.org/D17903 llvm-svn: 264477	2016-03-25 23:41:28 +00:00
Lang Hames	ff044b1f69	[Object] Make createMachOObjectFile return Expected<...> rather than ErrorOr<...>. llvm-svn: 264473	2016-03-25 23:11:52 +00:00
Philip Reames	b5681138e4	Allow value forwarding past release fences in GVN A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence. We choose to be much more conservative about stores. In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceding (previously fenced) store. Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time. The LangRef indicates only atomic loads and stores are affected by fences. This patch chooses to be far more conservative than that. This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now. Differential Revision: http://reviews.llvm.org/D11436 llvm-svn: 264472	2016-03-25 22:40:35 +00:00
Lang Hames	8262764869	[Object] Make MachOObjectFile's constructor private, provide a static create method instead. This is not quite a named constructor: Construction may fail, and MachOObjectFiles are usually passed by unique_ptr anyway, so create returns an Expected<std::unique_ptr<MachOObjectFile>>. llvm-svn: 264469	2016-03-25 21:59:14 +00:00
David Majnemer	020e890a19	[X86] Emit a proper ADJCALLSTACKDOWN in EmitLoweredTLSAddr We forgot to add the second machine operand to our ADJCALLSTACKDOWN, resulting in crashes in PEI. This fixes PR27071. llvm-svn: 264465	2016-03-25 21:49:11 +00:00
Jun Bum Lim	36c53fe147	[MachineCopyPropagation] Expose more dead copies across instructions with regmasks When encountering instructions with regmasks, instead of cleaning up all the elements in MaybeDeadCopies map, remove only the instructions erased. By keeping more instructions in MaybeDeadCopies, this change will expose more dead copies across instructions with regmasks. llvm-svn: 264462	2016-03-25 21:15:35 +00:00
Nirav Dave	fa250cad37	Prevent construction of cycle in DAG store merge When merging stores in DAGCombiner, add check to ensure that no dependencies exist that would cause the construction of a cycle in our DAG. This may happen if one store has a data dependence on another instruction (e.g. a load) which itself has a (chain) dependence on another store being merged. These stores cannot be merged safely and doing so results in a cycle that is discovered in LegalizeDAG. This test is only done in cases where alias analysis is used (UseAA) as non-AA store merge candidates will be merged logically after all loads which have been checked to not alias. Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight Subscribers: llvm-commits, tberghammer, danalbert, srhines Differential Revision: http://reviews.llvm.org/D18336 llvm-svn: 264461	2016-03-25 21:06:30 +00:00
Kostya Serebryany	f3ab6d9e10	[libFuzzer] use fflush after every Printf llvm-svn: 264459	2016-03-25 20:31:26 +00:00
Richard Smith	3a61a4d104	Remove useless and unused CrashRecoveryContext::getBacktrace(). This function always returned an empty string. llvm-svn: 264458	2016-03-25 20:30:10 +00:00
Sanjoy Das	d4c783335b	[RS4GC] Lower calls to @llvm.experimental.deoptimize This changes RS4GC to lower calls to ``@llvm.experimental.deoptimize`` to gc.statepoints wrapping ``__llvm_deoptimize``, and changes ``callsGCLeafFunction`` to recognize ``@llvm.experimental.deoptimize`` as a non GC leaf function. I've had to hard code the ``"__llvm_deoptimize"`` name in RewriteStatepointsForGC; since ``TargetLibraryInfo`` is available only during codegen. This isn't without precedent in the codebase, so I'm not overtly concerned. llvm-svn: 264456	2016-03-25 20:12:13 +00:00
Justin Bogner	ec0e7d2582	CodeGen: Don't iterate over operands after we've erased an MI This fixes a use-after-free introduced 3 years ago, in r182872 ;) The code more or less worked because the memory that CopyMI was pointing to happened to still be valid, but lots of tests would crash if you ran under ASAN with the recycling allocator changes from llvm.org/PR26808 llvm-svn: 264455	2016-03-25 20:03:28 +00:00
Saleem Abdulrasool	750a90df6a	ARM: maintain BB ordering when expanding WIN__DBZCHK It is possible to have a fallthrough MBB prior to MBB placement. The original addition of the BB would result in reordering the BB as not preceding the successor. Because of the fallthrough nature of the BB, we could end up executing incorrect code or even a constant pool island! Insert the spliced BB into the same location to avoid that. Thanks to Tim Northover for invaluable hints and Fiora for the discussion on what may have been occurring! llvm-svn: 264454	2016-03-25 19:48:06 +00:00
Teresa Johnson	aae2610042	[ThinLTO] Rename edges() to calls() for clarity (NFC) Helps distinguish from refs() which iterates over non-call references. llvm-svn: 264445	2016-03-25 18:59:13 +00:00
Justin Bogner	ec5ea36891	CodeGen: Fix a use-after-free in TII Found by ASAN with the recycling allocator changes from PR26808. llvm-svn: 264443	2016-03-25 18:38:48 +00:00
Justin Bogner	f2a0d349a6	AMDGPU: Fix a use-after-free and a missing break We're erasing MI here, but then immediately using it again inside the `if`. This moves the erase after we're done using it. Doing that reveals a second problem though - this case is missing a break, so we fall through to the default and dereference MI again. This is obviously a bug, though I don't know how to write a test that triggers it - all we do in the error case is print some extra debug output. Both of these issues crash on lots of tests under ASAN with the recycling allocator changes from PR26808 applied. llvm-svn: 264442	2016-03-25 18:33:16 +00:00
Hans Wennborg	5f916d3df4	[X86] Use "and $0" and "orl $-1" to store 0 and -1 when optimizing for minsize 64-bit, 32-bit and 16-bit move-immediate instructions are 7, 6, and 5 bytes, respectively, whereas and/or with 8-bit immediate is only three bytes. Since these instructions imply an additional memory read (which the CPU could elide, but we don't think it does), restrict these patterns to minsize functions. Differential Revision: http://reviews.llvm.org/D18374 llvm-svn: 264440	2016-03-25 18:11:31 +00:00
Reid Kleckner	f6f04f8fc8	Consider regmasks when computing register-based DBG_VALUE live ranges Now register parameters that aren't saved to the stack or CSRs are considered dead after the first call. Previously the debugger would show whatever was in the register. Fixes PR26589 Reviewers: aprantl Differential Revision: http://reviews.llvm.org/D17211 llvm-svn: 264429	2016-03-25 17:54:46 +00:00
Lang Hames	9e964f3728	[Object] Start threading Error through MachOObjectFile construction. llvm-svn: 264425	2016-03-25 17:25:34 +00:00
Jonas Paulsson	5dd1e56de5	[SystemZ] Remove isBranch and isTerminator flags on BRCT and BRCTG. The BranchUnaryRI instruction class already sets these flags. Reviewed by Ulrich Weigand. llvm-svn: 264411	2016-03-25 15:42:30 +00:00
Duncan P. N. Exon Smith	fc8110041f	Revert "Bitcode: Collect all MDString records into a single blob" This reverts commit r264409 since it failed to bootstrap: http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build/8302/ llvm-svn: 264410	2016-03-25 15:22:27 +00:00
Duncan P. N. Exon Smith	fdbf0a5af8	Bitcode: Collect all MDString records into a single blob Optimize output of MDStrings in bitcode. This emits them in big blocks (currently 1024) in a pair of records: - BULK_STRING_SIZES: the sizes of the strings in the block, and - BULK_STRING_DATA: a single blob, which is the concatenation of all the strings. Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this should (a) slightly reduce bitcode size, since there is less record overhead, and (b) greatly improve reading speed, since blobs are super cheap to deserialize. I needed to add support for blobs to streaming input to get the test suite passing. - StreamingMemoryObject::getPointer reads ahead and returns the address of the blob. - To avoid a possible reallocation of StreamingMemoryObject::Bytes, BitstreamCursor::readRecord needs to move the call to JumpToEnd forward so that getPointer is the last bitstream operation. llvm-svn: 264409	2016-03-25 14:40:18 +00:00
Chad Rosier	59bcbba6b4	[AArch64] Fix typo. NFC. llvm-svn: 264408	2016-03-25 14:37:43 +00:00
David L Kreitzer	8d441eb936	Enable non-power-of-2 #pragma unroll counts. Patch by Evgeny Stupachenko. Differential Revision: http://reviews.llvm.org/D18202 llvm-svn: 264407	2016-03-25 14:24:52 +00:00
Simon Pilgrim	ac04923b0f	[X86][SSE] Don't duplicate Lower256IntArith functionality in LowerShift. NFC. LowerShift was using the same code as Lower256IntArith to split 256-bit vectors into 2 x 128-bit vectors, so now we just call Lower256IntArith. llvm-svn: 264403	2016-03-25 14:17:54 +00:00
Elena Demikhovsky	abc9c04ab7	fixed typo llvm-svn: 264395	2016-03-25 10:08:36 +00:00
Mehdi Amini	1e39ef331b	Add lastAccessedTime to file_status Differential Revision: http://reviews.llvm.org/D18456 This is a re-commit of r264387 and r264388 after fixing a typo. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264392	2016-03-25 07:30:21 +00:00
Mehdi Amini	ec68482e53	Revert "Add lastAccessedTime to file_status" This reverts commit r264387. Bots are broken in various ways, I need to take one commit at a time... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264390	2016-03-25 06:51:43 +00:00
Mehdi Amini	5aba49ebc3	Revert "Fix windows build for sys::fs:file_status Access Time added in r264387" This reverts commit r264388. Bots are broken in various ways, I need to take one commit at a time... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264389	2016-03-25 06:43:22 +00:00
Mehdi Amini	e3249fc6ab	Fix windows build for sys::fs:file_status Access Time added in r264387 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264388	2016-03-25 06:06:44 +00:00
Mehdi Amini	b53b351a8e	Add lastAccessedTime to file_status Reviewers: silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18456 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264387	2016-03-25 05:58:11 +00:00
Mehdi Amini	cb708b265d	Query the StringMap only once when creating MDString (NFC) Summary: Loading IR with debug info improves MDString::get() from 19ms to 10ms. This is a rework of D16597 with adding an "emplace" method on the StringMap to avoid requiring the MDString move ctor to be public. Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17920 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264386	2016-03-25 05:58:04 +00:00
Mehdi Amini	be8a57f9bf	Adjust initial size in StringMap constructor to guarantee no grow() Summary: StringMap ctor accepts an initial size, but expects it to be rounded to the next power of 2. The ctor can handle that directly instead of expecting clients to round it. Also, since the map will resize itself when 75% full, take this into account and initialize a larger initial size to avoid any growth. Reviewers: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18344 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264385	2016-03-25 05:57:57 +00:00
Mehdi Amini	4f2bb50b20	Add GUID/getGlobalIdentifier() non-static API to global value Summary: These are just helpers calling their static counter part to simplify client code. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18339 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264382	2016-03-25 05:57:41 +00:00
Duncan P. N. Exon Smith	bdde9e1f21	Bitcode: Use std::stable_partition for reproducible builds Caught by inspection while working on partitioning metadata. It's nice to produce the same bitcode if you run the compiler twice. llvm-svn: 264381	2016-03-25 02:20:28 +00:00
Duncan P. N. Exon Smith	68f5624356	Bitcode: Stop using MODULE_CODE_METADATA_VALUES The motivation for MODULE_CODE_METADATA_VALUES was to enable an -flto=thin scheme where: 1. First, one function is cherry-picked from a bitcode file. 2. Later, another function is cherry-picked. 3. Later, ... 4. Finally, the metadata needed by all the previous functions is loaded. This was abandoned in favour of: 1. Calculate the superset of functions needed from a Module. 2. Link all functions at once. Delayed metadata reading no longer serves a purpose. It also adds a few complications, since we can't count on metadata being properly parsed when exiting the BitcodeReader. After discussing with Teresa, we agreed to remove it. The code that depended on this was removed/updated in r264326. llvm-svn: 264378	2016-03-25 01:29:50 +00:00
Matt Arsenault	8c8fcb2585	AMDGPU: Cost model for basic integer operations This resolves bug 21148 by preventing promotion to i64 induction variables. llvm-svn: 264376	2016-03-25 01:16:40 +00:00
Hans Wennborg	4ae5119eeb	X86: Use push-pop for materializing 8-bit immediates for minsize (take 2) This is the same as r255936, with added logic for avoiding clobbering of the red zone (PR26023). Differential Revision: http://reviews.llvm.org/D18246 llvm-svn: 264375	2016-03-25 01:10:56 +00:00
Matt Arsenault	9651813ee0	AMDGPU: Partially implement getArithmeticInstrCost for FP ops llvm-svn: 264374	2016-03-25 01:00:32 +00:00
Duncan P. N. Exon Smith	efe16c8eb4	IR: Stop upgrading !llvm.loop attachments via MDString Remove logic to upgrade !llvm.loop by changing the MDString tag directly. This old logic would check (and change) arbitrary strings that had nothing to do with loop metadata. Instead, check !llvm.loop attachments directly, and change which strings get attached. Rather than updating the assembly-based upgrade, drop it entirely. It has been quite a while since we supported upgrading textual IR. llvm-svn: 264373	2016-03-25 00:56:13 +00:00
Duncan P. N. Exon Smith	1d15a9f0c9	IR: Reserve an MDKind for !llvm.loop; NFC This reserves an MDKind for !llvm.loop, which allows callers to avoid a string-based lookup. I'm not sure why it was missing. There should be no functionality change here, just a small compile-time speedup. llvm-svn: 264371	2016-03-25 00:35:38 +00:00
Saleem Abdulrasool	0dab98d926	ARM: fix optimised division on WoA We did not have an explicit branch to the continuation BB. When the check was hoisted, this could permit control flow to fall through into the division trap. Add the explicit branch to the continuation basic block to ensure that code execution is correct. llvm-svn: 264370	2016-03-25 00:34:11 +00:00
Matt Arsenault	59767cea79	AMDGPU: TTI: Make insertelement free. We don't want to have a cost to scalarizing operations. llvm-svn: 264364	2016-03-25 00:14:11 +00:00
Reid Kleckner	a15b76b377	Try to fix ODR violation of ErrorInfo::ID This implements my suggestion to Lang. llvm-svn: 264360	2016-03-24 23:49:34 +00:00
Manman Ren	9dd8c14674	CXX TLS: collect return blocks after SelectAllBasicBlocks. It is incorrect to get the corresponding MBB for a ReturnInst before SelectAllBasicBlocks since SelectAllBasicBlocks can change the correspondence between a ReturnInst and the MBB it is in. PR27062 llvm-svn: 264358	2016-03-24 23:21:29 +00:00
Sanjoy Das	fd3eaa8c5c	Reduce code duplication by extracting out a helper function; NFC llvm-svn: 264355	2016-03-24 22:51:49 +00:00
Sanjoy Das	731c67fed2	Lower varargs correctly in deopt bundle lowering Earlier we were ignoring varargs in LowerCallSiteWithDeoptBundle because populateCallLoweringInfo does not set CallLoweringInfo::IsVarArg. llvm-svn: 264354	2016-03-24 22:37:52 +00:00
Sean Silva	a915a1690e	Fix typo: XDS -> XDG Patch by Robert Ma <bob1211@gmail.com>! llvm-svn: 264352	2016-03-24 22:27:27 +00:00
Matthias Braun	ae81c29352	LiveInterval: Fix Distribute() failing on liveranges with unused VNInfos This fixes http://llvm.org/PR26991 llvm-svn: 264345	2016-03-24 21:41:38 +00:00
David Majnemer	e09d035dad	[LoopStrengthReduce] Don't hoist into a catchswitch We try to hoist the insertion point as high as possible to encourage sharing. However, we must be careful not to hoist into a catchswitch as it is both an EHPad and a terminator. llvm-svn: 264344	2016-03-24 21:40:22 +00:00
Eric Christopher	b979d51afa	Finish the incomplete 'd' inline asm constraint support for PPC by making sure we give it a register and mark it as a register constraint. llvm-svn: 264340	2016-03-24 21:04:52 +00:00
Kostya Serebryany	f389ae12c1	[libFuzzer] handle SIGTERM llvm-svn: 264338	2016-03-24 21:03:58 +00:00
Reid Kleckner	01bc66a8ce	Revert "Recommitted r263424 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26942 (the fix is included in this commit)." This reverts commit r264280. This broke building Chromium for iOS. We'll upload a reproducer to the PR soon. llvm-svn: 264334	2016-03-24 20:38:49 +00:00
Krzysztof Parzyszek	01598de3ec	[Hexagon] Be sure to treat subregisters of a CSR as CSRs as well llvm-svn: 264331	2016-03-24 20:31:41 +00:00
Sanjoy Das	df9ae70f49	Add lowering support for llvm.experimental.deoptimize Summary: Only adds support for "naked" calls to llvm.experimental.deoptimize. Support for round-tripping through RewriteStatepointsForGC will come as a separate patch (should be simpler than this one). Reviewers: reames Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18429 llvm-svn: 264329	2016-03-24 20:23:29 +00:00
Krzysztof Parzyszek	c9d4caa32c	[Hexagon] Add support for run-time stack overflow checking Patch by Sundeep Kushwaha. llvm-svn: 264328	2016-03-24 20:20:07 +00:00
Krzysztof Parzyszek	181fdbd174	[Hexagon] Generate PIC-specific versions of save/restore routines In PIC mode, the registers R14, R15 and R28 are reserved for use by the PLT handling code. This causes all functions to clobber these registers. While this is not new for regular function calls, it does also apply to save/restore functions, which do not follow the standard ABI conventions with respect to the volatile/non-volatile registers. Patch by Jyotsna Verma. llvm-svn: 264324	2016-03-24 19:18:48 +00:00
Sanjoy Das	c0c59fe14e	[Statepoints] Fix yet another issue around gc pointer uniqueing Given that StatepointLowering now uniques derived pointers before putting them in the per-statepoint spill map, we may end up with missing entries for derived pointers when we visit a gc.relocate on a pointer that was de-duplicated away. Fix this by keeping two maps, one mapping gc pointers to their de-duplicated values, and one mapping a de-duplicated value to the slot it is spilled in. llvm-svn: 264320	2016-03-24 18:57:39 +00:00
Sanjoy Das	42f91a9959	Minor cosmetic changes (NFC) - Reflow comments - Rename function llvm-svn: 264319	2016-03-24 18:57:31 +00:00
David Blaikie	0b214e4a2a	[debuginfo] Include dwo_name in the split unit to improve dwp diagnostics When multiple DWP files are merged together and duplicate DWO IDs are found it's currently difficult to give an actionable error message - the DW_AT_name of the CU could be provided, but might be identical (if the same source file is built into two different configurations), which doesn't help the user identify the problem. When no intermediate DWP files are generated, the path to the two DWO files could be provided - but is lost once the DWOs are merged into a DWP. So, include the name of the DWO (dwo_name) in the split file so that collisions involving a source CU from a DWP can be better diagnosed. (improvements to llvm-dwp using this to come shortly) llvm-svn: 264316	2016-03-24 18:37:08 +00:00
Adam Nemet	7aba60c853	[LLE] Check for mismatching types between the store and the load earlier isDependenceDistanceOfOne asserts that the store and the load access through the same type. This function is also used by removeDependencesFromMultipleStores so we need to make sure we filter out mismatching types before reaching this point. Now we do this when the initial candidates are gathered. This is a refinement of the fix made in r262267. Fixes PR27048. llvm-svn: 264313	2016-03-24 17:59:26 +00:00
Simon Atanasyan	26fe92d19f	[MC][mips] Add MipsMCInstrAnalysis class and register it as MC instruction analyzer The `MipsMCInstrAnalysis` class overrides the `evaluateBranch` method and calculates target addresses for branch and calls instructions. That allows llvm-objdump to print functions' names in branch instructions in the disassemble mode. Differential Revision: http://reviews.llvm.org/D18209 llvm-svn: 264309	2016-03-24 17:18:14 +00:00
Simon Pilgrim	a6ba27fbde	[X86][XOP] Fixed instruction postfixes to more closely match operands Suggested by Sanjay in D18189 as the multiple folding options in XOP instructions can be tricky llvm-svn: 264305	2016-03-24 16:31:30 +00:00
Duncan P. N. Exon Smith	a5e25a5563	BitcodeWriter: Move abbreviation for GenericDINode; almost NFC Simplify ValueEnumerator and WriteModuleMetadata by shifting the logic for the METADATA_GENERIC_DEBUG abbreviation into WriteGenericDINode. (This is just like r264302, but for GenericDINode.) The only change is that the abbreviation is emitted later in the bitcode, just before the first `GenericDINode` record. This shouldn't be observable though. llvm-svn: 264303	2016-03-24 16:30:18 +00:00
Duncan P. N. Exon Smith	625fda2714	BitcodeWriter: Move abbreviation for DILocation; almost NFC Simplify ValueEnumerator and WriteModuleMetadata by shifting the logic for the METADATA_LOCATION abbreviation into WriteDILocation. The only change is that the abbreviation is emitted later in the bitcode, just before the first `DILocation` record. This shouldn't be observable though. llvm-svn: 264302	2016-03-24 16:25:51 +00:00
Duncan P. N. Exon Smith	f8ecdf5284	BitcodeWriter: Split out named metadata; almost NFC Split writeNamedMetadata out of WriteModuleMetadata to write named metadata, and createNamedMetadataAbbrev for the abbreviation. There should be no effective functionality change, although the layout of the bitcode will change. Previously, the abbreviation was emitted at the top of the block, but now it is delayed until immediately before the named metadata records are emitted. llvm-svn: 264301	2016-03-24 16:16:08 +00:00
Duncan P. N. Exon Smith	0b7243ee38	Bitcode: Module* -> Module&, NFC llvm-svn: 264299	2016-03-24 16:01:46 +00:00
Elena Demikhovsky	95f3173ce9	AVX-512: Generate KTEST instead of TEST for i1 vectors KTEST instruction may be used instead of TEST in this case: %int_sel3 = bitcast <8 x i1> %sel3 to i8 %res = icmp eq i8 %int_sel3, zeroinitializer br i1 %res, label %L2, label %L1 Differential Revision: http://reviews.llvm.org/D18444 llvm-svn: 264298	2016-03-24 15:53:45 +00:00
Tim Northover	4498eff9bb	CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS. If the operation's type has been promoted during type legalization, we need to account for the fact that the high bits of the comparison operand are likely unspecified. The LHS is usually zero-extended, but MIPS sign extends it, so we have to be slightly careful. Patch by Simon Dardis. llvm-svn: 264296	2016-03-24 15:38:38 +00:00
Tom Stellard	9babad25e5	AMDGPU/SI: Add Polaris support Patch By: Sonny Jiang llvm-svn: 264295	2016-03-24 15:31:05 +00:00
Simon Pilgrim	d7c4fce47d	[X86][XOP] Merged 128/256 bit 4op instruction definitions. NFCI. llvm-svn: 264294	2016-03-24 15:28:02 +00:00
NAKAMURA Takumi	e6d29c9928	Define ErrorInfo::ID explicitly. llvm-svn: 264293	2016-03-24 15:26:43 +00:00
Rafael Espindola	e1c42ac12b	Fix another case where we were unconditionally linking linkonce GVs. With this I think that now llvm-link, lld and the gold plugin should agree on which symbol is kept. llvm-svn: 264292	2016-03-24 15:23:01 +00:00
NAKAMURA Takumi	d8c1be66ab	Error.cpp: Fix a warning. [-Wpedantic] llvm-svn: 264291	2016-03-24 15:19:39 +00:00
Rafael Espindola	42e0323768	Fix resolution of linkonce symbols in comdats. After comdat processing, the symbols still go through regular symbol resolution. We were not doing it for linkonce symbols since they are lazy linked. This fixes pr27044. llvm-svn: 264288	2016-03-24 14:58:44 +00:00
Daniel Sanders	15f8fb6f83	[mips] Range check vsplat_simm5 and vsplat_simm10 Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18177 llvm-svn: 264287	2016-03-24 14:53:40 +00:00
Pirama Arumuga Nainar	dc45aef2d8	Remove unsafe AssertZext after promoting result of FP_TO_FP16 Summary: Some target lowerings of FP_TO_FP16, for instance ARM's vcvtb.f16.f32 instruction, do not guarantee that the top 16 bits are zeroed out. Remove the unsafe AssertZext and add tests to exercise this. Reviewers: jmolloy, sbaranga, kristof.beyls, aadg Subscribers: llvm-commits, srhines, aemerson Differential Revision: http://reviews.llvm.org/D18426 llvm-svn: 264285	2016-03-24 14:06:03 +00:00
Nemanja Ivanovic	5ebc92dbe1	[PowerPC] Disable direct moves for extractelement and bitcast in 32-bit mode This patch corresponds to review: http://reviews.llvm.org/D17711 It disables direct moves on these operations in 32-bit mode since the patterns assume 64-bit registers. The final patch is slightly different from the Phabricator review as the bitcast operations needed to be disabled in 32-bit mode as well. This fixes PR26617. llvm-svn: 264282	2016-03-24 13:40:33 +00:00
Amjad Aboud	6ff7e10052	Recommitted r263424 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26942 (the fix is included in this commit). Differential Revision: http://reviews.llvm.org/D18350 llvm-svn: 264280	2016-03-24 13:30:16 +00:00
Daniel Sanders	837f15187b	[mips] Range check simm10 Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18148 llvm-svn: 264279	2016-03-24 13:26:59 +00:00
Simon Pilgrim	572ca71573	[X86][XOP] Support for VPPERM byte shuffle instruction This patch begins adding support for lowering to the XOP VPPERM instruction - adding the X86ISD::VPPERM opcode. Differential Revision: http://reviews.llvm.org/D18189 llvm-svn: 264260	2016-03-24 11:52:43 +00:00
Daniel Sanders	f692130216	[mips] Tidy up cnMIPS tablegen definitions. NFC. Summary: In particular, make the cnMIPS predicates much more obvious and prefer def ... : ... { let Foo = bar; } over: let Foo = bar in def ... : ...; Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18354 llvm-svn: 264258	2016-03-24 11:40:48 +00:00
Vasileios Kalintiris	b8a37205d2	Fix sequence point warning. NFC. llvm-svn: 264255	2016-03-24 10:53:28 +00:00
Zlatko Buljan	94af4cbcf4	[mips][microMIPS] Add CodeGen support for DIV, MOD, DIVU, MODU, DDIV, DMOD, DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D17137 llvm-svn: 264248	2016-03-24 09:22:45 +00:00
Hrvoje Varga	2cb74ac3c3	[mips][microMIPS] Implement MTC, MTHC and DMTC* instructions Differential Revision: http://reviews.llvm.org/D17328 llvm-svn: 264246	2016-03-24 08:02:09 +00:00
Hrvoje Varga	dbea1a1e51	[mips][microMIPS] Fix for "Cannot copy registers" assertion Differential Revision: http://reviews.llvm.org/D17068 llvm-svn: 264245	2016-03-24 06:05:35 +00:00
Adam Nemet	59a6550425	[LAA] Formatting fix in previous change llvm-svn: 264244	2016-03-24 05:15:24 +00:00
Adam Nemet	279784ffc4	[LAA] Support memchecks involving loop-invariant addresses We used to only allow SCEVAddRecExpr for pointer expressions in order to be able to compute the bounds. However this is also trivially possible for loop-invariant addresses (scUnknown) since then the bounds are the address itself. Interestingly, we used to allow this for the special case when the loop-invariant address happens to also be an SCEVAddRecExpr (in an outer loop). There are a couple more loops that are vectorized in SPEC after this. My guess is that the main reason we don't see more is that, for example, a loop-invariant load is vectorized into a splat vector with several vector-inserts. This is likely to make the vectorization unprofitable. I.e. we don't notice that a later LICM will move all of this out of the loop so the cost estimate should really be 0. llvm-svn: 264243	2016-03-24 04:28:47 +00:00
Kostya Serebryany	315167339e	[libFuzzer] don't report memory leaks if we are dying due to a timeout (just use _Exit instead of exit in the timeout callback) llvm-svn: 264237	2016-03-24 01:32:08 +00:00
Kostya Serebryany	6278f933a8	[libFuzzer] use fdopen+vfprintf instead of fsnprintf+write llvm-svn: 264230	2016-03-24 00:57:32 +00:00
Paul Robinson	f81836bd18	[PS4] Guarantee an instruction after a 'noreturn' call. We need the "return address" of a noreturn call to be within the bounds of the calling function; TrapUnreachable turns 'unreachable' into a 'ud2' instruction, which has that desired effect. Differential Revision: http://reviews.llvm.org/D18414 llvm-svn: 264224	2016-03-24 00:10:03 +00:00
Rafael Espindola	1ee9fbd842	Fix lazy linking of comdat members. If not for lazy linking of linkonce GVs, comdats are just a preprocessing before symbol resolution. Lazy linking complicates it since when we pick a visible member of comdat, we have to make sure the rest of it passes symbol resolution too. llvm-svn: 264223	2016-03-24 00:06:03 +00:00
Lang Hames	e7aad357a9	[Support] Make all Errors convertible to std::error_code. This is a temporary crutch to enable code that currently uses std::error_code to be incrementally moved over to Error. Requiring all Error instances be convertible enables clients to call errorToErrorCode on any error (not just ECErrors created by conversion from an error_code). This patch also moves code for Error from ErrorHandling.cpp into a new Error.cpp file. llvm-svn: 264221	2016-03-23 23:57:28 +00:00
Matt Arsenault	ea00b499c7	APFloat: Fix signalling nans for scalbn llvm-svn: 264219	2016-03-23 23:51:45 +00:00
Matt Arsenault	30d37a74da	AMDGPU: Remove atomic inc/dec patterns There is no benefit to these since materializing the constant 1 requires the same number of instructions as materializing uint_max llvm-svn: 264215	2016-03-23 23:23:38 +00:00
Matt Arsenault	0a30e456b4	AMDGPU: Promote alloca should skip volatiles llvm-svn: 264214	2016-03-23 23:17:29 +00:00
Mike Aizatsky	9987f43ffa	[sancov] code readability improvement. Summary: Reply to http://reviews.llvm.org/D18341 Differential Revision: http://reviews.llvm.org/D18406 llvm-svn: 264213	2016-03-23 23:15:03 +00:00
Matt Arsenault	f43c2a0b49	AMDGPU: Insert moves of frame index to value operands Strengthen tests of storing frame indices. Right now this just creates irrelevant scheduling changes. We don't want to have multiple frame index operands on an instruction. There seem to be various assumptions that at least the same frame index will not appear twice in the LocalStackSlotAllocation pass. There's no reason to have this happen, and it just makes it easy to introduce bugs where the immediate offset is applied to the storing instruction when it should really be applied to the value being stored as a separate add. This might not be sufficient. It might still be problematic to have an add fi, fi situation, but that's even less likely to happen in real code. llvm-svn: 264200	2016-03-23 21:49:25 +00:00
Cong Hou	94710840fb	Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed. Currently, AnalyzeBranch() fails on non-equality comparison between floating points on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing unconditional jump if there is a proper fall-through. However, in the case of non-equality comparison between floating points, this can turn the branch "unanalyzable". Consider the following case: jne .BB1 jp .BB1 jmp .BB2 .BB1: ... .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne .BB1 jnp .BB2 .BB1: ... .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there are no defined negation conditions for them. In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 264199	2016-03-23 21:45:37 +00:00
Kevin Enderby	8fb96b958a	One more change needed as part of r264187, where ErrorOr<> was added to getSymbolType(). llvm-svn: 264194	2016-03-23 21:20:16 +00:00
Rafael Espindola	f2e71244c6	Fix logic for which symbols to keep with comdats. If a comdat is dropped, all symbols in it are dropped. If a comdat is kept, the symbols survive to pass regular symbol resolution. With this patch we do that for all global symbols. The added test is a copy of test/tools/gold/X86/comdat.ll that we now pass. llvm-svn: 264192	2016-03-23 21:16:33 +00:00
Kevin Enderby	5afbc1cda7	Fix a crash in running llvm-objdump -t with an invalid Mach-O file already in the test suite. While this is not really an interesting tool and option to run on a Mach-O file to show the symbol table in a generic libObject format, it shouldn’t crash. The reason for the crash was in MachOObjectFile::getSymbolType() when it was calling MachOObjectFile::getSymbolSection() without checking its return value for the error case. What makes this fix require a fair bit of diffs is that the method getSymbolType() is in the class ObjectFile defined without an ErrorOr<> so I needed to add that to all the subclasses. And all of the uses needed to be updated and the return value needed to be checked for the error case. The MachOObjectFile version of getSymbolType() “can” get an error in trying to come up with the libObject’s internal SymbolRef::Type when the Mach-O symbol type is an N_SECT type because the code is trying to select from the SymbolRef::ST_Data or SymbolRef::ST_Function values for the SymbolRef::Type. And it needs the Mach-O section to use isData() and isBSS() to determine if it will return SymbolRef::ST_Data. One other possible fix I considered is to simply return SymbolRef::ST_Other when MachOObjectFile::getSymbolSection() returned an error. But since in the past when I did such changes that “ate an error in the libObject code” I was asked instead to push the error out of the libObject code, I chose not to implement the fix this way. As currently written both the COFF and ELF versions of getSymbolType() can’t get an error. But if isReservedSectionNumber() wanted to check for the two known negative values rather than allowing all negative values or the code wanted to add the same check as in getSymbolAddress() to use getSection() and check for the error then these versions of getSymbolType() could return errors.
At the end of the day the error printed now is the generic “Invalid data was encountered while parsing the file” for object_error::parse_failed. In the future when we thread Lang’s new TypedError for recoverable error handling through libObject this will improve. And where the added // Diagnostic(… comment is, it would be changed to produce an error message like “bad section index (42) for symbol at index 8” for this case. llvm-svn: 264187	2016-03-23 20:27:00 +00:00
Sanjay Patel	7876f180b5	[x86] make peekThroughBitcasts() a helper function This should be hoisted further up so it can be used in DAGCombiner and other backends, but I'm limiting the scope in the interest of patch minimalism. It's not quite NFC because some of the replaced code was using an 'if' check rather than a 'while' loop, so those cases would only look through a single bitcast. llvm-svn: 264186	2016-03-23 20:16:37 +00:00
Chad Rosier	85c8594056	[AArch64] Replace return 0 with return false. NFC. llvm-svn: 264185	2016-03-23 20:07:28 +00:00
Kyle Butt	613112826e	Codegen: [PPC] Word Rotates are Zero Extending. Add Word rotates to the list of instructions that are zero extending. This allows them to be used in dot form to compare with zero. llvm-svn: 264183	2016-03-23 19:51:22 +00:00
George Burgess IV	0e4898685f	Fix bugs in the MemorySSA walker. There are a few bugs in the walker that this patch addresses. Primarily: - Caching can break when we have multiple BBs without phis - We weren't optimizing some phis properly - Because of how the DFS iterator works, there were times where we wouldn't cache any results of our DFS I left the test cases with FIXMEs in, because I'm not sure how much effort it will take to get those to work (read: We'll probably ultimately have to end up redoing the walker, or we'll have to come up with some creative caching tricks), and more test coverage = better. Differential Revision: http://reviews.llvm.org/D18065 llvm-svn: 264180	2016-03-23 18:31:55 +00:00
Easwaran Raman	12b79aa0f1	Add getBlockProfileCount method to BlockFrequencyInfo Differential Revision: http://reviews.llvm.org/D18233 llvm-svn: 264179	2016-03-23 18:18:26 +00:00
Justin Bogner	c35c10593b	SelectionDAG: Remove a tautological dyn_cast. NFC Index is already a StoreSDNode, so this dyn_cast doesn't do anything. llvm-svn: 264177	2016-03-23 18:15:33 +00:00
Artyom Skrobov	e6f1b7f094	Replace a string comparison in ARMSubtarget.h with a tablegen entry in ARM.td (NFC) Reviewers: rengolin, t.p.northover Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D18393 llvm-svn: 264165	2016-03-23 16:18:13 +00:00
Silviu Baranga	d68ed85401	[SCEV] Change the SCEV Predicates interfaces for conversion to AddRecExpr to return SCEVAddRecExpr* instead of SCEV* Summary: This changes the conversion functions from SCEV * to SCEVAddRecExpr from ScalarEvolution and PredicatedScalarEvolution to return a SCEVAddRecExpr* instead of a SCEV* (which removes the need of most clients to do a dyn_cast right after calling these functions). We also don't add new predicates if the transformation was not successful. This is not entirely a NFC (as it can theoretically remove some predicates from LAA when we have an unknown dependence), but I couldn't find an obvious regression test for it. Reviewers: sanjoy Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18368 llvm-svn: 264161	2016-03-23 15:29:30 +00:00
Oliver Stannard	aa77b1e025	[AArch64] Replace some uses of report_fatal_error with reportError in AArch64 ELF object writer If we can't handle a relocation type, report it as an error in the source, rather than asserting. I've added a more descriptive message and a test for the only cases of this that I've been able to trigger. Differential Revision: http://reviews.llvm.org/D18388 llvm-svn: 264156	2016-03-23 13:45:03 +00:00
Andrey Turetskiy	6a3d561ea0	[X86] Introduction of FeatureX87. Add FeatureX87 in X86 backend to be able to define CPUs which don't have x87. Differential Revision: http://reviews.llvm.org/D13979 llvm-svn: 264148	2016-03-23 11:13:54 +00:00
Hrvoje Varga	c45baf212a	[mips][microMIPS] Delay slot filler modifications Differential Revision: http://reviews.llvm.org/D18181 llvm-svn: 264147	2016-03-23 10:29:38 +00:00
Valery Pykhtin	c0a77c5064	[AMDGPU] Fix missing assembler predicates. Differential Revision: http://reviews.llvm.org/D18351 llvm-svn: 264137	2016-03-23 04:27:26 +00:00
Sanjoy Das	a5b2972977	Remove stale comment llvm-svn: 264131	2016-03-23 02:28:35 +00:00
Sanjoy Das	ac53dc7520	[StatepointLowering] Don't do two DenseMap lookups; nfci llvm-svn: 264130	2016-03-23 02:24:15 +00:00
Sanjoy Das	7edbef316b	[StatepointLowering] Minor NFC cleanups - Use auto - Name variables in LLVM style - Use llvm::find instead of std::find - Blank lines between declarations llvm-svn: 264129	2016-03-23 02:24:13 +00:00
Sanjoy Das	4cd746ebe0	[StatepointLowering] Minor nfc refactoring Now that StatepointLoweringInfo represents base pointers, derived pointers and gc relocates as SmallVectors and not ArrayRefs, we no longer need to allocate "backing storage" on stack in LowerStatepoint. So elide the backing storage, and inline the trivial body of getIncomingStatepointGCValues. llvm-svn: 264128	2016-03-23 02:24:10 +00:00
Sanjoy Das	e58ca59cf4	[StatepointLowering] Schedule gc relocates before uniqueing them Otherwise we can see an "unexpected" gc.relocate that we uniqued away. llvm-svn: 264127	2016-03-23 02:24:07 +00:00
Tom Stellard	52ecd2d69b	AMDGPU: Cache information about register pressure sets We can statically decide whether or not a register pressure set is for SGPRs or VGPRs, so we don't need to re-compute this information in SIRegisterInfo::getRegPressureSetLimit(). Differential Revision: http://reviews.llvm.org/D14805 llvm-svn: 264126	2016-03-23 01:53:22 +00:00
Junmo Park	820964e9c6	Minor code cleanup. NFC. llvm-svn: 264124	2016-03-23 01:38:35 +00:00
Davide Italiano	1a911e204d	[ModuleUtils] Use range-based loop. NFC. llvm-svn: 264122	2016-03-23 00:43:35 +00:00
Joerg Sonnenberger	772bb5b65d	Typo llvm-svn: 264110	2016-03-22 22:24:52 +00:00
Justin Bogner	8809c40270	MC: Don't access the filesystem in MCContext's constructor MCContext shouldn't be accessing the filesystem - that's a gross layering violation and makes it awkward to use as a library or in a daemon where it may not even be allowed filesystem access. The CWD lookup here is normally redundant anyway, since the calling context either also looks up the CWD or sets this to something more specific. Here, we fix up the one caller that doesn't already set up a debug compilation dir and make it clear that the responsibility for such set up is in the users of MCContext. llvm-svn: 264109	2016-03-22 22:24:29 +00:00
Rafael Espindola	370d528a05	Drop comdats from the dst module if they are not selected. A really unfortunate design of llvm-link and related libraries is that they operate one module at a time. This means they can copy a GV to the destination module that should not be there in the final result because a later bitcode file takes precedence. We already handled cases like a strong GV replacing a weak for example. One case that is not currently handled is a comdat replacing another. This doesn't happen in ELF, but with COFF largest selection kind it is possible. In "llvm-link a.ll b.ll" if the selected comdat was from a.ll, everything will work and we will not copy the comdat from b.ll. But if we run "llvm-link b.ll a.ll", we fail to delete the already copied comdat from b.ll. This patch fixes that. llvm-svn: 264103	2016-03-22 21:35:47 +00:00
George Burgess IV	d4febd1612	Keep CodeGenPrepare from preserving the domtree. CGP modifies the domtree in some cases, so saying that it preserves the domtree is a lie. We'll be able to selectively preserve it with the new pass manager. Differential Revision: http://reviews.llvm.org/D16893 llvm-svn: 264099	2016-03-22 21:25:08 +00:00
Matthias Braun	68bb2931cc	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This commit broke LTO builds. Reverting it to unbreak the bots while the issue is investigated. See also: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160321/341002.html This reverts r263158 llvm-svn: 264088	2016-03-22 20:24:34 +00:00
Simon Pilgrim	c6f5fe3d69	[SelectionDAG] Ensure constant folded legalized vector element types are compatible with the BUILD_VECTOR type Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected. llvm-svn: 264085	2016-03-22 19:59:53 +00:00
Tim Northover	b49a8a9dbb	CodeGen: check return types match when emitting tail call to builtin. We were just completely ignoring the types when determining whether we could safely emit a libcall as a tail call. This is clearly wrong. Theoretically, we could dig deeper looking for incidental matches (much like the generic code in Analysis.cpp does), but it's probably not worth it for the few libcalls that exist. llvm-svn: 264084	2016-03-22 19:14:38 +00:00
Adam Nemet	8b47e0d0ea	[LoopVersioning] Relax an assert for LCSSA PHIs When you have multiple LCSSA (single-operand) PHIs that are converted into two-operand PHIs due to versioning, only assert that the PHI currently being converted has a single operand. I.e. we don't want to check PHIs that were converted earlier in the loop. Fixes PR27023. Thanks to Karl-Johan Karlsson for the minimized testcase! llvm-svn: 264081	2016-03-22 18:38:15 +00:00
Sanjoy Das	eb5037cadc	Allow lowering call sites with both funclets and deopt state Lowering funclets is a no-op, so we can just go ahead and lower the deopt state. llvm-svn: 264078	2016-03-22 18:10:39 +00:00
Dan Gohman	665d7e3838	[WebAssembly] Implement the rotate instructions. llvm-svn: 264076	2016-03-22 18:01:49 +00:00
Sanjoy Das	6b535630a1	Add a hasOperandBundlesOtherThan helper, and use it; NFC llvm-svn: 264072	2016-03-22 17:51:25 +00:00
Simon Pilgrim	25fb4177fb	[X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Reapplied with a fix for PR26953 (missing vector widening legalization). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 264062	2016-03-22 16:22:08 +00:00
Daniel Sanders	f3599eb683	[mips] Make simm6 consistent with the rest. NFC. Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18147 llvm-svn: 264057	2016-03-22 14:50:22 +00:00
Daniel Sanders	97297770a6	[mips] Range check simm7. Summary: Also renamed li_simm7 to li16_imm since it's not a simm7 and has an unusual encoding (it's a uimm7 except that 0x7f represents -1). Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18145 llvm-svn: 264056	2016-03-22 14:40:00 +00:00
Daniel Sanders	0f17d0da4a	[mips] Range check simm5. Summary: We can't check the error message for this one because there's another lw/sw available that covers a larger range. We therefore check the transition between the two sizes. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18144 llvm-svn: 264054	2016-03-22 14:29:53 +00:00
Daniel Sanders	946dee3b5b	[mips] Range check vsplat_uimm[1234568]. Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18143 llvm-svn: 264053	2016-03-22 14:17:41 +00:00
Daniel Sanders	93fa4ce9b7	[mips] Range check uimm4_ptr, remove uimm6_ptr, and use correctly sized immediates in MSA copy/insert. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18142 llvm-svn: 264052	2016-03-22 13:58:53 +00:00
Zinovy Nis	07ac2bd4d0	[PATCH] Force LoopReroll to reset the loop trip count value after reroll. It's a bug fix. For rerolled loops SE trip count remains unchanged. It leads to incorrect work of the next passes. My patch just resets SE info for rerolled loop forcing SE to re-evaluate it next time it is requested. I also added a verifier call in the existing test to be sure no invalid SE data remain. Without my fix this test would fail with -verify-scev. Differential Revision: http://reviews.llvm.org/D18316 llvm-svn: 264051	2016-03-22 13:50:57 +00:00
Marina Yatsina	33ef7dad18	[ELF][gcc compatibility]: support section names with special characters (e.g. "/") Adding support for section names with special characters in them (e.g. "/"). GCC successfully compiles such section names. This also fixes PR24520. Differential Revision: http://reviews.llvm.org/D15678 llvm-svn: 264038	2016-03-22 11:23:15 +00:00
Mehdi Amini	c04fc7a60f	Rename DenseMap::resize() into DenseMap::reserve() (NFC) This is more coherent with usual containers. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264026	2016-03-22 07:20:00 +00:00
Junmo Park	5ac1a47cad	Minor code cleanup. NFC. llvm-svn: 264024	2016-03-22 04:37:32 +00:00
Sanjoy Das	38bfc22161	Add "first class" lowering for deopt operand bundles Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or sequence of nop, with an associated __llvm_stackmaps entry). This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC. Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin Subscribers: sanjoy, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18257 llvm-svn: 264015	2016-03-22 00:59:13 +00:00
Mike Aizatsky	602f79275d	[sancov] do not instrument nodes that are full pre-dominators Summary: Without tree pruning clang has 2,667,552 points. With only dominators pruning: 1,515,586. With both dominators & predominators pruning: 1,340,534. Resubmit of r262103. Differential Revision: http://reviews.llvm.org/D18341 llvm-svn: 264003	2016-03-21 23:08:16 +00:00
Nicolai Haehnle	0a33abdfd2	AMDGPU: Fix dangling references introduced by r263982 Fixes Valgrind errors on the test cases that were reported as failing by buildbots. llvm-svn: 264000	2016-03-21 22:54:02 +00:00
Simon Pilgrim	b57b002253	[InstCombine] Ensure all undef operands are handled before binary instruction constant folding As noted in PR18355, this patch makes it clear that all cases with undef operands have been handled before further constant folding is attempted. Differential Revision: http://reviews.llvm.org/D18305 llvm-svn: 263994	2016-03-21 22:15:50 +00:00
Duncan P. N. Exon Smith	20be876a64	Fix -Wdocumentation warnings from r263853 Thanks to chapuni for catching this. llvm-svn: 263993	2016-03-21 22:13:44 +00:00
George Burgess IV	3887a41725	[MemorySSA] Consider def-only BBs for live-in calculations. If we have a BB with only MemoryDefs, live-in calculations will ignore it. This means we get results like this: define void @foo(i8* %p) { ; 1 = MemoryDef(liveOnEntry) store i8 0, i8* %p br i1 undef, label %if.then, label %if.end if.then: ; 2 = MemoryDef(1) store i8 1, i8* %p br label %if.end if.end: ; 3 = MemoryDef(1) store i8 2, i8* %p ret void } ...When there should be a MemoryPhi in the `if.end` BB. This patch fixes that behavior. llvm-svn: 263991	2016-03-21 21:25:39 +00:00
Nicolai Haehnle	a56e6b6a53	AMDGPU: Coding style fixes I meant to add these before committing r263982 as per the review, but I forgot to squash. llvm-svn: 263983	2016-03-21 20:39:24 +00:00
Nicolai Haehnle	213e87f2ee	AMDGPU: Add SIWholeQuadMode pass Summary: Whole quad mode is already enabled for pixel shaders that compute derivatives, but it must be suspended for instructions that cause a shader to have side effects (i.e. stores and atomics). This pass addresses the issue by storing the real (initial) live mask in a register, masking EXEC before instructions that require exact execution and (re-)enabling WQM where required. This pass is run before register coalescing so that we can use machine SSA for analysis. The changes in this patch expose a problem with the second machine scheduling pass: target independent instructions like COPY implicitly use EXEC when they operate on VGPRs, but this fact is not encoded in the MIR. This can lead to miscompilation because instructions are moved past changes to EXEC. This patch fixes the problem by adding use-implicit operands to target independent instructions. Some general codegen passes are relaxed to work with such implicit use operands. Reviewers: arsenm, tstellarAMD, mareko Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18162 llvm-svn: 263982	2016-03-21 20:28:33 +00:00
Krzysztof Parzyszek	b14f4fd0de	[Hexagon] Add handling fixups and instruction relaxation llvm-svn: 263981	2016-03-21 20:27:17 +00:00
Krzysztof Parzyszek	c6f1e1a709	[Hexagon] Properly encode registers in duplex instructions llvm-svn: 263980	2016-03-21 20:13:33 +00:00
Krzysztof Parzyszek	6514a887f4	[Hexagon] Fix reserving emergency spill slots for register scavenger - R10 and R11 are not reserved registers. - Check for reserved registers when finding unused caller-saved registers. llvm-svn: 263977	2016-03-21 19:57:08 +00:00
Dan Gohman	c8d7f14506	[WebAssembly] Implement the eqz instructions. llvm-svn: 263976	2016-03-21 19:54:41 +00:00
Chad Rosier	2e5c526bb1	[SLP] Remove unnecessary member variables by using container APIs. This changes the debug output, but still retains its usefulness. Differential Revision: http://reviews.llvm.org/D18324 llvm-svn: 263975	2016-03-21 19:47:44 +00:00
Tom Stellard	92339e888f	AMDGPU/SI: Fix threshold calculation for branching when exec is zero Summary: When control flow is implemented using the exec mask, the compiler will insert branch instructions to skip over the masked section when exec is zero if the section contains more than a certain number of instructions. The previous code would only count instructions in successor blocks, and this patch modifies the code to start counting instructions in all blocks between the start and end of the branch. Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18282 llvm-svn: 263969	2016-03-21 18:56:58 +00:00
Chad Rosier	cf173ffb46	[AArch64] Add a helpful assert. NFC. llvm-svn: 263965	2016-03-21 18:04:10 +00:00
Matt Arsenault	cb38a6bd35	AMDGPU: Remove SignBitIsZero for mubuf scratch offsets These instructions do not have the same negative base address problem that DS instructions do on SI. llvm-svn: 263964	2016-03-21 18:02:18 +00:00
Peter Collingbourne	86b9fbe980	ARM: Better codegen for 64-bit compares. This introduces a custom lowering for ISD::SETCCE (introduced in r253572) that allows us to emit a short code sequence for 64-bit compares. Before: push {r7, lr} cmp r0, r2 mov.w r0, #0 mov.w r12, #0 it hs movhs r0, #1 cmp r1, r3 it ge movge.w r12, #1 it eq moveq r12, r0 cmp.w r12, #0 bne .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} After: push {r7, lr} subs r0, r0, r2 sbcs.w r0, r1, r3 bge .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} Saves around 80KB in Chromium's libchrome.so. Some notes on this patch: - I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I introduced (nothing else needs them). However, they are necessary in order to avoid poor codegen, and they seem similar to existing combines in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to (brcond Compare)). - No support for Thumb-1. This is in principle possible, but we'd need to implement ARMISD::SUBE for Thumb-1. Differential Revision: http://reviews.llvm.org/D15256 llvm-svn: 263962	2016-03-21 18:00:02 +00:00
Renato Golin	2b6b7ffd6c	[ARM] Add Cortex-A32 support Adding Cortex-A32 as an available target in the ARM backend. Patch by Sam Parker. llvm-svn: 263956	2016-03-21 17:29:01 +00:00
Matt Arsenault	c25a71106c	APFloat: Add frexp llvm-svn: 263950	2016-03-21 16:49:16 +00:00
Matt Arsenault	b96b57347a	AMDGPU: Add frexp_mant intrinsic llvm-svn: 263948	2016-03-21 16:11:05 +00:00
Matt Arsenault	155dda9134	Implement constant folding for bitreverse llvm-svn: 263945	2016-03-21 15:00:35 +00:00
Chad Rosier	4aeab5fbf2	[AArch64] Fix a -Wdocumentation warning. NFC. llvm-svn: 263942	2016-03-21 13:43:58 +00:00
Silviu Baranga	f875e4fd92	[IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSA Summary: replaceCongruentIVs can break LCSSA when trying to replace IV increments since it tries to replace all uses of a phi node with another phi node while both of the phi nodes are not necessarily in the processed loop. This will cause an assert in IndVars. To fix this, we add a check to make sure that the replacement maintains LCSSA. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18266 llvm-svn: 263941	2016-03-21 12:44:29 +00:00
Silviu Baranga	46030585b3	[DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern: (and (extract_vector_elt (load ([non_ext\|any_ext\|zero_ext] V))), c) DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to maintain some of the bits in the result to 0, resulting in a miscompile. This change fixes the issue by limiting the transformation only to cases where the extract_vector_elt doesn't perform the implicit extend. Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18247 llvm-svn: 263935	2016-03-21 11:43:46 +00:00
Jingyue Wu	1375560bdb	[NVPTX] Adds a new address space inference pass. Summary: The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable to convert the address space of a pointer induction variable. This patch adds a new pass called NVPTXInferAddressSpaces that overcomes that limitation using a fixed-point data-flow analysis (see the file header comments for details). The new pass is experimental and not enabled by default. Users can turn it on by setting the -nvptx-use-infer-addrspace flag of llc. Reviewers: jholewinski, tra, jlebar Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D17965 llvm-svn: 263916	2016-03-20 20:59:20 +00:00
Simon Pilgrim	fcc4532afa	[X86][SSE] Tidyup setTargetShuffleZeroElements to match computeZeroableShuffleElements Based on feedback for D14261 llvm-svn: 263911	2016-03-20 17:43:07 +00:00
Simon Pilgrim	c44472a5bc	[X86][SSE] Detect zeroable shuffle elements from different value types Improve computeZeroableShuffleElements to be able to peek through bitcasts to extract zero/undef values from BUILD_VECTOR nodes of different element sizes to the shuffle mask. Differential Revision: http://reviews.llvm.org/D14261 llvm-svn: 263906	2016-03-20 15:45:42 +00:00
Igor Breger	3ea8af5108	AVX512BW: Enable v32i1/v64i1 BUILD_VECTOR Differential Revision: http://reviews.llvm.org/D18211 llvm-svn: 263898	2016-03-20 13:09:43 +00:00
Craig Topper	ea87eae4ca	Suppress a -Wunused-variable warning in release builds. llvm-svn: 263892	2016-03-20 01:17:54 +00:00
Michael Kuperstein	048cc3b7a8	Use a range-based for loop. NFC. llvm-svn: 263889	2016-03-20 00:16:13 +00:00
Mehdi Amini	43165d913a	Expose IRBuilder::CreateAtomicCmpXchg as LLVMBuildAtomicCmpXchg in the C API. Summary: Also expose getters and setters in the C API, so that the change can be tested. Reviewers: nhaehnle, axw, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18260 From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> llvm-svn: 263886	2016-03-19 21:28:28 +00:00
Saleem Abdulrasool	2854666263	CodeGen: use range based for loop Convert a loop to use a range based style loop. NFC. llvm-svn: 263884	2016-03-19 16:35:32 +00:00
David Majnemer	abae6b588b	[SimplifyLibCalls] Only consider sinpi/cospi functions within the same function The sinpi/cospi can be replaced with sincospi to remove unnecessary computations. However, we need to make sure that the calls are within the same function! This fixes PR26993. llvm-svn: 263875	2016-03-19 04:53:02 +00:00
David Majnemer	cdf2873e36	[InstCombine] Don't insert instructions before a catch switch CatchSwitches are not splittable, we cannot insert casts, etc. before them. This fixes PR26992. llvm-svn: 263874	2016-03-19 04:39:52 +00:00
Mehdi Amini	9bc362a215	Add a comment on partial hashing of Metadata Following r263866, on D. Blaikie suggestion. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263869	2016-03-19 01:06:24 +00:00
Mehdi Amini	5d99c4efaa	Hash Metadata using pointer for MDString argument instead of value (NFC) MDString are uniqued in the Context on creation, hashing the pointer is less expensive than hashing the String itself. Reviewers: dexonsmith Differential Revision: http://reviews.llvm.org/D16560 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263867	2016-03-19 01:02:34 +00:00
Mehdi Amini	53fc3895e0	Compute some Debug Info Metadata hash key partially (NFC) Summary: This patch changes the computation of the hash key for DISubprogram to be computed on a small subset of the fields. The hash is computed a lot faster, but there might be more collisions in the table. However by carefully selecting the fields, collisions should be rare. Using `opt` to load the IR for FastISelEmitter.cpp.o, with this patch: - DISubprogram::getImpl() goes from 28ms to 15ms. - DICompositeType::getImpl() goes from 6ms to 2ms - DIDerivedType::getImpl() goes from 18 to 12ms Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16571 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263866	2016-03-19 00:59:26 +00:00
Mehdi Amini	8d05185a26	Rework linkInModule(), making it oblivious to ThinLTO Summary: ThinLTO is relying on linkInModule to import selected functions. However a lot of "magic" was hidden in linkInModule and the IRMover, which would rename and promote global variables on the fly. This is moving to an approach where the steps are decoupled and the client is responsible to specify the list of globals to import. As a consequence some tests are changed because they were relying on the previous behavior which was importing the definition of every single global without control on the client side. Now the burden is on the client to decide if a global has to be imported or not. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18122 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263863	2016-03-19 00:40:31 +00:00
Manman Ren	a3a019cf90	[CXX_FAST_TLS] Fix issues in ARM. We need to be careful on which registers can be explicitly handled via copies. Prologue, Epilogue use physical registers and if one belongs to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites it after the explicit copies. llvm-svn: 263857	2016-03-18 23:44:37 +00:00
Manman Ren	4865d89653	[CXX_FAST_TLS] Disable tail call when calling conventions are mismatched. Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller and callee have mismatched calling conventions. llvm-svn: 263856	2016-03-18 23:41:51 +00:00
Manman Ren	2828c57b6f	[CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86. Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855	2016-03-18 23:38:49 +00:00
Duncan P. N. Exon Smith	c3fa1eded2	AArch64: Don't modify other modules in AArch64PromoteConstant Avoid modifying other modules in `AArch64PromoteConstant` when the constant is `ConstantData` (a horrible accident, I'm sure, caught by an experimental follow-up to r261464). Previously, this walked through all the users of a constant, but that reaches into other modules when the constant doesn't depend transitively on a `GlobalValue`! Since we're walking instructions anyway, just modify the instructions we actually see. As a drive-by, instead of storing `Use` and getting the instructions again via `Use::getUser()` (which is not a constant time lookup), store `std::pair<Instruction, unsigned>`. Besides being cheaper, this makes it easier to drop use-lists from `ConstantData` in the future. (I threw this in because I was touching all the code anyway.) Because the patch completely changes the traversal logic, it looks like a rewrite of the pass, but the core logic is all the same (or should be, minus the out-of-module changes). In other words, there should be NFC as long as the LLVMContext only has a single Module. I didn't think of a good way to test this, but I hope to submit a patch eventually that makes walking these use-lists illegal/impossible. llvm-svn: 263853	2016-03-18 23:30:54 +00:00
Mike Aizatsky	759aca01ce	[sancov] clang-formatting SanitizerCoverage.cpp and fully pleasing clang-tidy. Differential Revision: http://reviews.llvm.org/D18288 llvm-svn: 263852	2016-03-18 23:29:29 +00:00
Chandler Carruth	3006115cfe	Revert "Revert "[sancov] specifying sanitizer coverage dependencies."" This reverts commit r263825, re-instating r263797. llvm-svn: 263847	2016-03-18 22:43:42 +00:00
Chandler Carruth	e2b7021a91	[sancov] Fix the sancov pass to initialize itself inside its constructor. This should fix the recent crashes on certain architectures. llvm-svn: 263845	2016-03-18 22:35:58 +00:00
Alexei Starovoitov	7e453bb8be	BPF: emit an error message for unsupported signed division operation Signed-off-by: Yonghong Song <yhs@plumgrid.com> Signed-off-by: Alexei Starovoitov <ast@fb.com> llvm-svn: 263842	2016-03-18 22:02:47 +00:00
Easwaran Raman	26628d3015	Interface to get/set profile summary metadata to module Differential Revision: http://reviews.llvm.org/D17894 llvm-svn: 263835	2016-03-18 21:29:30 +00:00
Kostya Serebryany	49e409068a	[libFuzzer] add a flag close_fd_mask so that we can silence spammy targets by closing stderr/stdout llvm-svn: 263831	2016-03-18 20:58:29 +00:00
Matthias Braun	0d208fc9f6	MILexer: Add ErrorCallbackType typedef; NFC llvm-svn: 263829	2016-03-18 20:41:11 +00:00
Sanjoy Das	74af78e3b0	[IndVars] Make the fix for PR26973 more obvious; NFCI llvm-svn: 263828	2016-03-18 20:37:11 +00:00
Sanjoy Das	60fb899f28	[IndVars] Pass the right loop to isLoopInvariantPredicate The loop on IVOperand's incoming values assumes IVOperand to be an induction variable on the loop over which `S Pred X` is invariant; otherwise loop invariant incoming values to IVOperand are not guaranteed to dominate the comparison. This fixes PR26973. llvm-svn: 263827	2016-03-18 20:37:07 +00:00
Mike Aizatsky	075ed3eec1	Revert "[sancov] specifying sanitizer coverage dependencies." This fails on arm. This reverts commit 52c8e0f7119d1ea1050c0708565a8c92b73386d2. llvm-svn: 263825	2016-03-18 20:34:58 +00:00
Nicolai Haehnle	fa771811b3	AMDGPU: add missing braces around multi-line if block This fixes an issue with rL263658 pointed out by Tom Stellard. llvm-svn: 263823	2016-03-18 20:32:04 +00:00
Chad Rosier	cdfd7e7201	[AArch64] Enable more load clustering in the MI Scheduler. This patch adds unscaled loads and sign-extend loads to the TII getMemOpBaseRegImmOfs API, which is used to control clustering in the MI scheduler. This is done to create more opportunities for load pairing. I've also added the scaled LDRSWui instruction, which was missing from the scaled instructions. Finally, I've added support in shouldClusterLoads for clustering adjacent sext and zext loads that too can be paired by the load/store optimizer. Differential Revision: http://reviews.llvm.org/D18048 llvm-svn: 263819	2016-03-18 19:21:02 +00:00
Reid Kleckner	fbd7787d7e	[codeview] Only emit function ids for inlined functions We aren't referencing any other kind of function currently. Should save a bit on our debug info size. llvm-svn: 263817	2016-03-18 18:54:32 +00:00
Colin LeMahieu	0143146514	[MCParser] Accept uppercase radix variants 0X and 0B Differential Revision: http://reviews.llvm.org/D14781 llvm-svn: 263802	2016-03-18 18:22:07 +00:00
Mike Aizatsky	4f7994c8cb	[sancov] specifying sanitizer coverage dependencies. Summary: These dependencies would be used in the future to reduce the number of instrumented blocks(http://reviews.llvm.org/rL262103) This is submitted as a separate CL because of previous problems with ARM. Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D18227 llvm-svn: 263797	2016-03-18 17:33:21 +00:00
Nicolai Haehnle	95e8ffd398	AMDGPU: Overload return type of llvm.amdgcn.buffer.load.format Summary: Allow the selection of BUFFER_LOAD_FORMAT_x and _XY. Do this now before the frontend patches land in Mesa. Eventually, we may want to automatically reduce the size of loads at the LLVM IR level, which requires such overloads, and in some cases Mesa can generate them directly. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18255 llvm-svn: 263792	2016-03-18 16:24:40 +00:00
Nicolai Haehnle	ad63638f6d	AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsics Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791	2016-03-18 16:24:31 +00:00
Nicolai Haehnle	3003ba00a3	AMDGPU: use ComplexPattern for offsets in llvm.amdgcn.buffer.load/store.format Summary: We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend cannot easily make use of an explicit soffset parameter either. Furthermore, it is likely that in the future, LLVM will be in a better position than the frontend to choose an SGPR offset if possible. Since there aren't any frontend uses of these intrinsics in upstream repositories yet, I would like to take this opportunity to change the intrinsic signatures to a single offset parameter, which is then selected to immediate offsets or voffsets using a ComplexPattern. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18218 llvm-svn: 263790	2016-03-18 16:24:20 +00:00
Sam Kolton	a74cd526e9	[AMDGPU] Assembler: Change dpp_ctrl syntax to match sp3 Review: http://reviews.llvm.org/D18267 llvm-svn: 263789	2016-03-18 15:35:51 +00:00
Benjamin Kramer	d96b0c14fb	[Fuzzer] Guard no_sanitize_memory attributes behind __has_feature. Otherwise GCC fails to build it because it doesn't know the attribute. llvm-svn: 263787	2016-03-18 14:19:19 +00:00
Ehsan Amiri	631ed04af0	adding another optimization opportunity to readme file llvm-svn: 263775	2016-03-18 04:02:25 +00:00
Kostya Serebryany	c43b584c1c	[libFuzzer] read corpus dirs recursively llvm-svn: 263773	2016-03-18 01:36:00 +00:00
Adam Nemet	709e3046ee	[LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone. Reviewers: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17949 llvm-svn: 263772	2016-03-18 00:27:43 +00:00
Adam Nemet	6d8beeca53	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher works up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771	2016-03-18 00:27:38 +00:00
Adam Nemet	53e758fc55	[Aarch64] Add pass LoopDataPrefetch for Cyclone Summary: This wires up the pass for Cyclone but keeps it off for now because we need a few more TTIs. The getPrefetchMinStride value is not very well tuned right now but it works well with CFP2006/433.milc which motivated this. Tests will be added as part of the upcoming large-stride prefetching patch. Reviewers: t.p.northover Subscribers: llvm-commits, aemerson, hfinkel, rengolin Differential Revision: http://reviews.llvm.org/D17943 llvm-svn: 263770	2016-03-18 00:27:29 +00:00
Kostya Serebryany	945761b8c2	[libFuzzer] improve -merge functionality llvm-svn: 263769	2016-03-18 00:23:29 +00:00
Peter Collingbourne	a1f8625662	DebugInfo: Add ability to not emit DW_AT_vtable_elem_location for virtual functions. A virtual index of -1u indicates that the subprogram's virtual index is unrepresentable (for example, when using the relative vtable ABI), so do not emit a DW_AT_vtable_elem_location attribute for it. Differential Revision: http://reviews.llvm.org/D18236 llvm-svn: 263765	2016-03-17 23:58:03 +00:00
Tim Shen	5cdf75084a	[PPC, FastISel] Fix ordered/unordered fcmp For fcmp, major concern about the following 6 cases is NaN result. The comparison result consists of 4 bits, indicating lt, eq, gt and un (unordered), only one of which will be set. The result is generated by fcmpu instruction. However, bc instruction only inspects one of the first 3 bits, so when un is set, bc instruction may jump to an undesired place. More specifically, if we expect an unordered comparison and un is set, we expect to always go to true branch; in such case UEQ, UGT and ULT still give false, which are undesired; but UNE, UGE, ULE happen to give true, since they are tested by inspecting !eq, !lt, !gt, respectively. Similarly, for ordered comparison, when un is set, we always expect the result to be false. In such case OGT, OLT and OEQ are good, since they are actually testing GT, LT, and EQ respectively, which are false. OGE, OLE and ONE are tested through !lt, !gt and !eq, and these are true. llvm-svn: 263753	2016-03-17 22:27:58 +00:00
Adam Nemet	b0c4eae073	[LoopVectorize] Annotate versioned loop with noalias metadata Summary: Use the new LoopVersioning facility (D16712) to add noalias metadata in the vector loop if we versioned with memchecks. This can enable some optimization opportunities further down the pipeline (see the included test or the benchmark improvement quoted in D16712). The test also covers the bug I had in the initial version in D16712. The vectorizer did not previously use LoopVersioning. The reason is that the vectorizer performs its transformations in single shot. It creates an empty single-block vector loop that it then populates with the widened, if-converted instructions. Thus creating an intermediate versioned scalar loop seems wasteful. So this patch (rather than bringing in LoopVersioning fully) adds a special interface to LoopVersioning to allow the vectorizer to add no-alias annotation while still performing its own versioning. As the vectorizer propagates metadata from the instructions in the original loop to the vector instructions we also check the pointer in the original instruction and see if LoopVersioning can add no-alias metadata based on the issued memchecks. Reviewers: hfinkel, nadav, mzolotukhin Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17191 llvm-svn: 263744	2016-03-17 20:32:37 +00:00
Adam Nemet	5eccf07df3	[LoopVersioning] Annotate versioned loop with noalias metadata Summary: If we decide to version a loop to benefit a transformation, it makes sense to record the now non-aliasing accesses in the newly versioned loop. This allows non-aliasing information to be used by subsequent passes. One example is 456.hmmer in SPECint2006 where after loop distribution, we vectorize one of the newly distributed loops. To vectorize we version this loop to fully disambiguate may-aliasing accesses. If we add the noalias markers, we can use the same information in a later DSE pass to eliminate some dead stores which amounts to ~25% of the instructions of this hot memory-pipeline-bound loop. The overall performance improves by 18% on our ARM64. The scoped noalias annotation is added in LoopVersioning. The patch then enables this for loop distribution. A follow-on patch will enable it for the vectorizer. Eventually this should be run by default when versioning the loop but first I'd like to get some feedback whether my understanding and application of scoped noalias metadata is correct. Essentially my approach was to have a separate alias domain for each versioning of the loop. For example, if we first version in loop distribution and then in vectorization of the distributed loops, we have a different set of memchecks for each versioning. By keeping the scopes in different domains they can conveniently be defined independently since different alias domains don't affect each other. As written, I also have a separate domain for each loop. This is not necessary and we could save some metadata here by using the same domain across the different loops. I don't think it's a big deal either way. Probably the best is to review the tests first to see if I mapped this problem correctly to scoped noalias markers. I have plenty of comments in the tests. Note that the interface is prepared for the vectorizer which needs the annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning so we need a way to pass in the versioned instructions. This is also why the maps have to become part of the object state. Also currently, we only have an AA-aware DSE after the vectorizer if we also run the LTO pipeline. Depending how widely this triggers we may want to schedule a DSE toward the end of the regular pass pipeline. Reviewers: hfinkel, nadav, ashutosh.nema Subscribers: mssimpso, aemerson, llvm-commits, mcrosier Differential Revision: http://reviews.llvm.org/D16712 llvm-svn: 263743	2016-03-17 20:32:32 +00:00
Justin Bogner	ae341c6e9b	Bitcode: Error out instead of crashing on corrupt metadata I hit a crash in the bitcode reader on some corrupt input where an MDString had somehow been attached to an instruction instead of an MDNode. This input is pretty bogus, but we shouldn't be crashing on bad input here. This change adds error handling in all of the places where we currently have unchecked casts from Metadata to MDNode, which means we'll error out instead of crashing for that sort of input. Unfortunately, I don't have tests. Hitting this requires flipping bits in the input bitcode, and committing corrupt binary files to catch these cases is a bit too opaque and unmaintainable. llvm-svn: 263742	2016-03-17 20:12:06 +00:00
Tim Northover	498c56c240	ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering. llvm-svn: 263741	2016-03-17 20:10:28 +00:00
Kostya Serebryany	c5575aabd6	[libFuzzer] deprecate several flags llvm-svn: 263739	2016-03-17 19:59:39 +00:00
Kostya Serebryany	23dbc390af	[libFuzzer] add __attribute__((no_sanitize_memory)) to two functions that may be called from signal handler(s) or from msan. This will hopefully avoid msan false reports which I can't reproduce llvm-svn: 263737	2016-03-17 19:42:35 +00:00
Guozhi Wei	7b390ec4cd	[InstCombine] Combine A->B->A BitCast This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 263734	2016-03-17 18:47:20 +00:00
Sanjoy Das	c9058ca9e0	[Statepoints] Export a magic constant into a header; NFC llvm-svn: 263733	2016-03-17 18:42:17 +00:00
Petar Jovanovic	0b44f24033	[PowerPC] Disable CTR loops optimization for soft float operations This patch prevents CTR loops optimization when using soft float operations inside loop body. Soft float operations use function calls, but function calls are not allowed inside CTR optimized loops. Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D17600 llvm-svn: 263727	2016-03-17 17:11:33 +00:00
Derek Schuff	d4207ba0f6	[WebAssembly] Stackify code emitted by eliminateFrameIndex and SP writeback Summary: MRI::eliminateFrameIndex can emit several instructions to do address calculations; these can usually be stackified. Because instructions with FI operands can have subsequent operands which may be expression trees, find the top of the leftmost tree and insert the code before it, to keep the LIFO property. Also use stackified registers when writing back the SP value to memory in the epilog; it's unnecessary because SP will not be used after the epilog, and it results in better code. Differential Revision: http://reviews.llvm.org/D18234 llvm-svn: 263725	2016-03-17 17:00:29 +00:00
David Majnemer	511391feaa	[COFF] Refactor section alignment calculation Section alignment isn't completely trivial, let it live in one place so that we may reuse it in LLVM. llvm-svn: 263722	2016-03-17 16:55:18 +00:00
David Majnemer	62fed0c354	Forgot to commit this with r263692 llvm-svn: 263721	2016-03-17 16:55:11 +00:00
Changpeng Fang	234fcb81d3	AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute Summary: ds_permute/ds_bpermute do not read memory so s_waitcnt is not needed. Reviewers: arsenm, tstellarAMD Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18197 llvm-svn: 263720	2016-03-17 16:43:50 +00:00
Nicolai Haehnle	79cad857a0	AMDGPU: mark atomic instructions as sources of divergence Summary: As explained by the comment, threads will typically see different values returned by atomic instructions even if the arguments are equal. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18156 llvm-svn: 263719	2016-03-17 16:21:59 +00:00
Simon Pilgrim	0f37fbac51	[X86][SSE] Simplified blend-with-zero combining We were being too aggressive in trying to combine a shuffle into a blend-with-zero pattern, often resulting in an endless loop of contrasting combines. This patch stops the combine if we already have a blend in place (means we miss some domain corrections) llvm-svn: 263717	2016-03-17 15:59:36 +00:00
Sanjay Patel	9e23fedaf0	propagate 'unpredictable' metadata on select instructions This is similar to D18133 where we allowed profile weights on select instructions. This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects. A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at: http://reviews.llvm.org/rL263648 Differential Revision: http://reviews.llvm.org/D18220 llvm-svn: 263716	2016-03-17 15:30:52 +00:00
Saleem Abdulrasool	071a099102	ARM: Revert SVN r253865, 254158, fix windows division The two changes together weakened the test and caused a regression with division handling in MSVC mode. They were applied to avoid an assertion being triggered in the block frequency analysis. However, the underlying problem was simply being masked rather than solved properly. Address the actual underlying problem and revert the changes. Rather than analyze the cause of the assertion, the division failure was assumed to be an overflow. The underlying issue was a subtle bug in the BB construction in the emission of the div-by-zero check (WIN__DBZCHK). We did not construct the proper successor information in the basic blocks, nor did we update the PHIs associated with the basic block when we split them. This would result in assertions being triggered in the block frequency analysis pass. Although the original tests are being removed, the tests themselves performed very little in terms of validation but merely tested that we did not assert when generating code. Update this with new tests that actually ensure that we do not regress on the code generation. llvm-svn: 263714	2016-03-17 14:10:49 +00:00
Simon Atanasyan	58ee875296	[mips] Use `formatImm` call to print immediate value in the `MipsInstPrinter` That allows, for example, to print hex-formatted immediates using llvm-objdump --print-imm-hex command line option. Differential Revision: http://reviews.llvm.org/D18195 llvm-svn: 263704	2016-03-17 10:43:36 +00:00
Scott Egerton	d65377da78	[mips] Eliminate instances of "potentially uninitialised local variable" warnings, NFC Summary: This should eliminate all occurrences of this within LLVMMipsAsmParser. This patch is in response to http://reviews.llvm.org/D17983. I was unable to reproduce the warnings on my machine so please advise if this fixes the warnings. Reviewers: ariccio, vkalintiris, dsanders Subscribers: dblaikie, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18087 llvm-svn: 263703	2016-03-17 10:37:51 +00:00
Sanjoy Das	312038872d	[Statepoints] Separate out logic for statepoint directives; NFC This splits out the logic that maps the `"statepoint-id"` attribute into the actual statepoint ID, and the `"statepoint-num-patch-bytes"` attribute into the number of patchable bytes the statepoint is lowered into. The new home of this logic is in IR/Statepoint.cpp, and this refactoring will support similar functionality when lowering calls with deopt operand bundles in the future. llvm-svn: 263685	2016-03-17 01:56:10 +00:00
Sanjoy Das	d6fc46ea03	[Statepoints] Minor NFC cleanups Mostly code simplifications, and bringing IR/Statepoints.cpp up to LLVM coding style. llvm-svn: 263683	2016-03-17 00:47:18 +00:00
Sanjoy Das	3a02019fbc	[SelectionDAG] Remove visitStatepoint; NFC This way we have a single entry point into StatepointLowering. The method was a direct dispatch to LowerStatepoint anyway. llvm-svn: 263682	2016-03-17 00:47:14 +00:00
Chris Bieneman	671d0dda7d	Upgrade TBAA before upgrading intrinsics Summary: If TBAA is on an intrinsic and it gets upgraded and drops the TBAA we hit an odd assert. We should just upgrade the TBAA first because it doesn't have side-effects. Reviewers: reames, apilipenko, manmanren Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18229 llvm-svn: 263673	2016-03-16 23:17:54 +00:00
Sanjoy Das	43e33d61c6	Fix indentation; NFC llvm-svn: 263672	2016-03-16 23:11:21 +00:00
Sanjoy Das	70697ff74d	Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC Summary: This is a step towards implementing "direct" lowering of calls and invokes with deopt operand bundles into STATEPOINT nodes (as opposed to having them mandatorily pass through RewriteStatepointsForGC, which is the case today). This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint` helper function that is able to lower a "statepoint like thing", and uses it to lower `gc.statepoint` calls. This is an NFC now, but in a later change we will use `LowerAsStatepoint` to directly lower calls and invokes with operand bundles without going through an intermediate `gc.statepoint` IR representation. FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add support for lowering non gc.statepoints, right now it is fairly tightly coupled with an IR level `gc.statepoint`. Reviewers: reames, pgavlin, JosephTremoulet Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18106 llvm-svn: 263671	2016-03-16 23:08:00 +00:00
Xinliang David Li	897d2923a2	Variable name cleanup /NFC llvm-svn: 263666	2016-03-16 22:13:41 +00:00
James Y Knight	f44fc5219f	Tweak some atomics functions in preparation for larger changes; NFC. - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665	2016-03-16 22:12:04 +00:00
Sanjoy Das	19c6159833	[SelectionDAG] Extract out populateCallLoweringInfo; NFC SelectionDAGBuilder::populateCallLoweringInfo is now used instead of SelectionDAGBuilder::lowerCallOperands. The populateCallLoweringInfo interface is more composable in face of design changes like http://reviews.llvm.org/D18106 llvm-svn: 263663	2016-03-16 20:49:31 +00:00
Vedant Kumar	aa0cae6208	[ProfileData] Make a utility method public, NFC The swift frontend needs to be able to look up PGO function name variables based on the original raw function name. That's because it's not possible to create PGO function name variables while emitting swift IR. Instead, we have to create the name variables while lowering swift IR to llvm IR, at which point we fix up all calls to the increment intrinsic to point to the right name variable. llvm-svn: 263662	2016-03-16 20:49:26 +00:00
Nicolai Haehnle	ef160de3e5	AMDGPU: Prevent uniform loops from becoming infinite Summary: Uniform loops where the branch leaving the loop is predicated on VCCNZ must be skipped if EXEC = 0, otherwise they will be infinite. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18137 llvm-svn: 263658	2016-03-16 20:14:33 +00:00
Colin LeMahieu	bb0cdfb9f7	[Hexagon] Adding missing break in switch statement. Extra operands would have been appended to the end. llvm-svn: 263657	2016-03-16 20:00:38 +00:00
Chad Rosier	fea398188c	[SLP] Make DataLayout a member variable. llvm-svn: 263656	2016-03-16 19:48:42 +00:00
Geoff Berry	56fabf9b55	Revert "[LSR] Create fewer redundant instructions." This reverts commit r263644. Investigating bootstrap failures. llvm-svn: 263655	2016-03-16 19:21:47 +00:00
Simon Pilgrim	b5a20f0fec	Removed trailing whitespace llvm-svn: 263650	2016-03-16 18:37:44 +00:00
Sanjay Patel	be37e62e0c	fix function names; NFC llvm-svn: 263646	2016-03-16 18:00:09 +00:00
Evgeniy Stepanov	4b96ed693a	[msan] Add a comment with a bug link. llvm-svn: 263645	2016-03-16 17:39:17 +00:00
Geoff Berry	459b750871	[LSR] Create fewer redundant instructions. Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Reviewers: atrick Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18001 llvm-svn: 263644	2016-03-16 17:29:49 +00:00
Michel Danzer	302f83ac4e	AMDGPU: Verify instructions in non-debug builds as well And emit an error if it fails. This prevents illegal instructions from getting sent to the GPU, which would potentially result in a hang. This is a candidate for the stable branch(es). Reviewed-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 263627	2016-03-16 09:10:42 +00:00
Michel Danzer	beb79ceb19	AMDGPU/SI: Clean up indentation in SIInstrInfo::getDefaultRsrcDataFormat Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 263626	2016-03-16 09:10:35 +00:00
Igor Breger	0ba7b04f5f	AVX512BW: Fix SRA v64i8 lowering. Use PCMPGTM (cmp result in k register) for 512bit vector because PCMPGT supported only for 128/256bit. Differential Revision: http://reviews.llvm.org/D18204 llvm-svn: 263624	2016-03-16 08:48:26 +00:00
Haicheng Wu	7873857a88	[JumpThreading] See through Cast Instructions To capture more jump-thread opportunity. llvm-svn: 263618	2016-03-16 04:52:52 +00:00
Lang Hames	f7f6d3e93f	[Support] Add the 'Error' class for structured error handling. This patch introduces the Error class for lightweight, structured, recoverable error handling. It includes utilities for creating, manipulating and handling errors. The scheme is similar to exceptions, in that errors are described with user-defined types. Unlike exceptions however, errors are represented as ordinary return types in the API (similar to the way std::error_code is used). For usage notes see the LLVM programmer's manual, and the Error.h header. Usage examples can be found in unittests/Support/ErrorTest.cpp. Many thanks to David Blaikie, Mehdi Amini, Kevin Enderby and others on the llvm-dev and llvm-commits lists for lots of discussion and review. llvm-svn: 263609	2016-03-16 01:02:46 +00:00
Haicheng Wu	64d9d7c3f7	Revert "[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()" Not sure it handles undef properly. llvm-svn: 263605	2016-03-15 23:38:47 +00:00
Adam Nemet	c979c6e123	Turn LoopLoadElimination on again The latent bug that LLE exposed in the LoopVectorizer was resolved (PR26952). The pass can be disabled with -mllvm -enable-loop-load-elim=0 llvm-svn: 263595	2016-03-15 22:26:12 +00:00
Mike Aizatsky	298516ffa9	[libfuzzer] speeding up corpus load llvm-svn: 263591	2016-03-15 21:47:21 +00:00
Bjorn Steinbrink	37ca462508	Also handle the new Rust pers fn to isCatchAll() llvm-svn: 263585	2016-03-15 20:57:07 +00:00
Bjorn Steinbrink	59fdec673d	Add Rust's personality function to the list of known personality functions Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18192 llvm-svn: 263581	2016-03-15 20:35:45 +00:00
Evgeniy Stepanov	d6e91369d8	[msan] Don't put module constructors in comdats. There is something strange going on with debug info (.eh_frame_hdr) disappearing when msan.module_ctor are placed in comdat sections. Moving this functionality under flag, disabled by default. llvm-svn: 263579	2016-03-15 20:25:47 +00:00
Teresa Johnson	1396809b17	[ThinLTO] Record all global variable defs in the summary Record all variable defs with a summary record to aid in building a complete reference graph and locating constant variable defs to import. llvm-svn: 263576	2016-03-15 19:35:45 +00:00
Chris Bieneman	ef43d448d4	[CMake] Add PACKAGE_VENDOR for customizing version output Summary: This change adds a PACKAGE_VENDOR variable. When set it makes the version output more closely resemble the clang version output. Reviewers: aprantl, bogner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18159 llvm-svn: 263566	2016-03-15 18:07:46 +00:00
Adam Nemet	fdb20595a1	[LV] Preserve LoopInfo when store predication is used This was a latent bug that got exposed by the change to add LoopSimplify as a dependence to LoopLoadElimination. Since LoopInfo was corrupted after LV, LoopSimplify mis-compiled nbench in the test-suite (more details in the PR). The problem was that when we create the blocks for predicated stores we didn't add those to any loops. The original testcase for store predication provides coverage for this assuming we verify LI on the way out of LV. Fixes PR26952. llvm-svn: 263565	2016-03-15 18:06:20 +00:00
Davide Italiano	dfdf278ebf	[MC] Rename TLSDESC as it's not ARM specific. Similarly to what was done for TLSCALL in r263515. llvm-svn: 263564	2016-03-15 17:29:52 +00:00
Changpeng Fang	01f6062227	AMDGPU/SI: Implement GroupStaticSize Intrinsic for Dynamic LDS Summary: Static LDS size is saved in MachineFunctionInfo::LDSSize. We define a pseudo instruction with the usesCustomInserter bit set. Then, in EmitInstrWithCustomInserter, we replace this pseudo instruction with a mov of MachineFunctionInfo::LDSSize. Reviewers: arsenm tstellarAMD Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18064 llvm-svn: 263563	2016-03-15 17:28:44 +00:00
Douglas Katzman	708eeb0519	Myriad: Add new sparc CPU kinds. llvm-svn: 263557	2016-03-15 16:41:47 +00:00
Benjamin Kramer	96f4b12880	[GlobalOpt] Don't look through aliases when sorting names of globals. If both are different aliases to the same value the sorting becomes non-deterministic as array_pod_sort is not stable. llvm-svn: 263550	2016-03-15 14:18:26 +00:00
Chad Rosier	ebe559019b	[SLP] Update comment to reflect reality. NFC. llvm-svn: 263548	2016-03-15 13:27:58 +00:00
Lang Hames	abda4d2526	[MachO] Extend the alt_entry support for aliases added in r263521 to expressions of the form 'a = .' and 'a = Ltmp'. llvm-svn: 263528	2016-03-15 04:20:49 +00:00
Eric Christopher	257338ff0f	Use some braces to format this a little better. llvm-svn: 263527	2016-03-15 03:01:31 +00:00
Teresa Johnson	2794f71575	BitcodeWriter dyn_cast cleanup for r263275 (NFC) Address review suggestions from dblaikie: change a few dyn_cast to cast and fold a cast into if condition. llvm-svn: 263526	2016-03-15 02:41:29 +00:00
Eric Christopher	ee00abe5e6	Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest parentheses around '&&' within '||' [-Werror=parentheses]. llvm-svn: 263525	2016-03-15 02:19:06 +00:00
Teresa Johnson	b43027d1e0	Move global ID computation from Function to GlobalValue (NFC) Since the static getGlobalIdentifier and getGUID methods are now called for global values other than functions, reflect that by moving these methods to the GlobalValue class. llvm-svn: 263524	2016-03-15 02:13:19 +00:00
Lang Hames	1b640e05ba	[MachO] Add MachO alt-entry directive support. This patch adds support for the MachO .alt_entry assembly directive, and uses it for global aliases with non-zero GEP offsets. The alt_entry flag indicates that a symbol should be laid out immediately after the preceding symbol. Conceptually it introduces an alternate entry point for a function or data structure. E.g.: safe_foo: // check preconditions for foo .alt_entry fast_foo fast_foo: // body of foo, can assume preconditions. The .alt_entry flag is also implicitly set on assembly aliases of the form: a = b + C where C is a non-zero constant, since these have the same effect as an alt_entry symbol: they introduce a label that cannot be moved relative to the preceding one. Setting the alt_entry flag on aliases of this form fixes http://llvm.org/PR25381. llvm-svn: 263521	2016-03-15 01:43:05 +00:00
Kostya Serebryany	0c5e3af862	[libFuzzer] use max_len exactly equal to the max size of input. Fix 32-bit build llvm-svn: 263518	2016-03-15 01:28:00 +00:00
Sanjoy Das	c11460e051	[StatepointLowering] Move an assertion; NFCI Instead of running an explicit loop over `gc.relocate` calls hanging off of a `gc.statepoint`, assert the validity of the type of the value being relocated in `visitRelocate`. llvm-svn: 263516	2016-03-15 01:16:31 +00:00
Davide Italiano	249c45d92e	[MC] Rename TLSCALL as it's not ARM specific. `MCSymbolRefExpr` variant kind for TLSCALL is prefixed with _ARM_ since this is how it was originally implemented. The X86_64 version is exactly the same so there's no reason to create a new variant, we can just rename the existing one to be machine-independent. This generalization is the first step to implement support for GNU2 TLS dialect in MC. Differential Revision: http://reviews.llvm.org/D18160 llvm-svn: 263515	2016-03-15 00:25:22 +00:00
Teresa Johnson	26ab5772b0	[ThinLTO] Renaming of function index to module summary index (NFC) (Resubmitting after fixing missing file issue) With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. llvm-svn: 263513	2016-03-15 00:04:37 +00:00
Eric Christopher	da8b3f1914	Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware" as it seems to be causing crashes during code generation in halide. PR forthcoming. This reverts commit r263303. llvm-svn: 263512	2016-03-14 23:59:57 +00:00
Justin Lebar	6827de19b2	[LoopUnroll] Respect the convergent attribute. Summary: Specifically, when we perform runtime loop unrolling of a loop that contains a convergent op, we can only unroll k times, where k divides the loop trip multiple. Without this change, we'll happily unroll e.g. the following loop for (int i = 0; i < N; ++i) { if (i == 0) convergent_op(); foo(); } into int i = 0; if (N % 2 == 1) { convergent_op(); foo(); ++i; } for (; i < N - 1; i += 2) { if (i == 0) convergent_op(); foo(); foo(); }. This is unsafe, because we've just added a control-flow dependency to the convergent op in the prelude. In general, runtime unrolling loops that contain convergent ops is safe only if we don't have to emit a prelude, which occurs when the unroll count divides the trip multiple. Reviewers: resistor Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17526 llvm-svn: 263509	2016-03-14 23:15:34 +00:00
Amaury Sechet	bdb261b4c0	Improve load to store => memcpy Summary: This now tries to reorder instructions in order to help create the optimizable pattern. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer Differential Revision: http://reviews.llvm.org/D16523 llvm-svn: 263503	2016-03-14 22:52:27 +00:00
Manuel Jacob	6be355961e	Re-add ConstantFoldInstOperands form taking opcode and return type. Summary: This form was replaced by a form taking an instruction instead of opcode and return type in r258391. After committing this change (and some depending, follow-up changes) it turned out in the review thread to be controversial. The discussion didn't come to a conclusion yet. I'm re-adding the old form to fix the API regression and to provide a better base for discussion, possibly on llvm-dev. A difference to the original function is that it can't be called with GEPs (similarly to how it was already the case for compares). In order to support opaque pointers in the future, folding GEPs needs to be passed the source element type, which is not possible with the current API. Reviewers: dberlin, reames Subscribers: dblaikie, eddyb Differential Revision: http://reviews.llvm.org/D17901 llvm-svn: 263501	2016-03-14 22:34:17 +00:00
Amaury Sechet	eae09c2c2a	Factor out MachineBlockPlacement::fillWorkLists. NFC Summary: There are places in MachineBlockPlacement where a worklist is filled in a pretty much identical way. The code is duplicated. This refactors it so that the same code is used in both scenarios. Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18077 llvm-svn: 263495	2016-03-14 21:24:11 +00:00
Teresa Johnson	cec0cae313	Revert "[ThinLTO] Renaming of function index to module summary index (NFC)" This reverts commit r263490. Missed a file. llvm-svn: 263493	2016-03-14 21:18:10 +00:00
Teresa Johnson	892920b358	[ThinLTO] Renaming of function index to module summary index (NFC) With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. llvm-svn: 263490	2016-03-14 21:05:56 +00:00
Adam Nemet	bb45810e4f	Revert "Turn LoopLoadElimination on again" This reverts commit r263472. There is an LNT failure on clang-ppc64be-linux-lnt. Turn this off, while I am investigating. llvm-svn: 263485	2016-03-14 20:38:55 +00:00
Sanjay Patel	ee52b6e77d	allow branch weight metadata on select instructions (PR26636) As noted in: https://llvm.org/bugs/show_bug.cgi?id=26636 This doesn't accomplish anything on its own. It's the first step towards preserving and using branch weights with selects. The next step would be to make sure we're propagating the info in all of the other places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's an easy fix to make this happen; we have to look at each transform individually to determine how to correctly propagate the weights. Along with that step, we need to then use the weights when making subsequent transform decisions such as discussed in http://reviews.llvm.org/D16836. The inliner test is independent but closely related. It verifies that metadata is preserved when both branches and selects are cloned. Differential Revision: http://reviews.llvm.org/D18133 llvm-svn: 263482	2016-03-14 20:18:59 +00:00
Justin Lebar	9d94397859	[attrs] Handle convergent CallSites. Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Originally landed as r261544, then reverted in r261544 for (incidental) build breakage. Re-landed here with no changes. Reviewers: chandlerc, jingyue Subscribers: llvm-commits, tra, jhen, hfinkel Differential Revision: http://reviews.llvm.org/D17739 llvm-svn: 263481	2016-03-14 20:18:54 +00:00
Ulrich Weigand	52aa7fba3f	[SystemZ] Add missing isBranch flags to certain instruction Some instructions were missing isBranch, isCall, or isTerminator flags. This didn't really affect code generation since most of the affected patterns were used only for the AsmParser and/or disassembler. However, it could affect tools using the MC layer to disassemble and parse binary code (e.g. via MCInstrDesc::mayAffectControlFlow). llvm-svn: 263478	2016-03-14 20:16:30 +00:00
Keno Fischer	a91ae8336b	[SLPVectorizer] Fix dependency list Summary: DemandedBits was added to the requirements of SLPVectorizer in rL261212 (and various earlier versions of it), but the appropriate initialization statement was accidentally forgotten. Ref [[ https://github.com/JuliaLang/julia/issues/14998 | JuliaLang/julia#14998 ]]. Patch by Yichao Yu. Reviewers: mssimpso Differential Revision: http://reviews.llvm.org/D18152 llvm-svn: 263476	2016-03-14 20:04:24 +00:00
Adam Nemet	5a19ae917b	Turn LoopLoadElimination on again The two issues that were discovered got fixed (r263058, r263173). The pass can be disabled with -mllvm -enable-loop-load-elim=0 llvm-svn: 263472	2016-03-14 19:40:25 +00:00
Michael Kuperstein	b7860fedd4	[AliasSetTracker] Do not strip pointer casts when processing MemSetInst This fixes PR26843. llvm-svn: 263462	2016-03-14 18:34:29 +00:00
Chad Rosier	27c352d26d	[AArch64] Refactor AArch64FrameLowering::emitPrologue. NFC. http://reviews.llvm.org/D18125 Patch by Aditya Kumar. llvm-svn: 263461	2016-03-14 18:24:34 +00:00
Quentin Colombet	40ce25b68b	[SpillPlacement] Fix a quadratic behavior in spill placement. The bad behavior happens when we have a function with a long linear chain of basic blocks, and have a live range spanning most of this chain, but with very few uses. Let's say we have only 2 uses. The Hopfield network is only seeded with two active blocks where the uses are, and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two new nodes to the network due to the completely linear shape of the CFG. Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes, which adds up to a quadratic algorithm. This is a historical accident from r129188. When the Hopfield network is expanding, most of the action is happening on the frontier where new nodes are being added. The internal nodes in the network are not likely to be flip-flopping much, or they will at least settle down very quickly. This means that while `SpillPlacer->iterate()` is recomputing all the nodes in the network, it is probably only the two frontier nodes that are changing their output. Instead of recomputing the whole network on each iteration, we can maintain a SparseSet of nodes that need to be updated: - `SpillPlacement::activate()` adds the node to the todo list. - When a node changes value (i.e., `update()` returns true), its neighbors are added to the todo list. - `SpillPlacement::iterate()` only updates the nodes in the list. The result of Hopfield iterations is not necessarily exact. It should converge to a local minimum, but there is no guarantee that it will find a global minimum. It is possible that updating nodes in a different order will cause us to switch to a different local minimum. In other words, this is not NFC, but although I saw a few runtime improvements and regressions when I benchmarked this change, those were side effects and actually the performance change is in the noise as expected. Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedback, guidance and time for the review. llvm-svn: 263460	2016-03-14 18:21:25 +00:00
Chad Rosier	6d98655070	[AArch64] Break the dependency between FP and SP when possible. When the SP is not changed because of realignment/VLAs etc., we restore the SP by using the previous value of SP and not the FP. Breaking the dependency will help in cases when the epilog of a callee is close to the epilog of the caller; for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of the callee. http://reviews.llvm.org/D18060 Patch by Aditya Kumar and Evandro Menezes. llvm-svn: 263458	2016-03-14 18:17:41 +00:00
Chad Rosier	7a21bb196b	[Mips] Fix -Wunused-private-field warning after r263444. llvm-svn: 263454	2016-03-14 18:10:20 +00:00
Sanjay Patel	7506852709	[DAG] use !isUndef() ; NFCI llvm-svn: 263453	2016-03-14 18:09:43 +00:00
Sanjay Patel	5719584129	[DAG] use isUndef() ; NFCI llvm-svn: 263448	2016-03-14 17:28:46 +00:00
Tom Stellard	331f981cc9	AMDGPU/SI: Handle wait states required for DPP instructions Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17543 llvm-svn: 263447	2016-03-14 17:05:56 +00:00
Sanjay Patel	62d707c8d9	[x86, AVX] replace masked load with full vector load when possible Converting masked vector loads to regular vector loads for x86 AVX should always be a win. I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any objections. 1. x86 already does this kind of optimization for multiple scalar loads -> vector load. 2. If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner. Differential Revision: http://reviews.llvm.org/D18094 llvm-svn: 263446	2016-03-14 16:54:43 +00:00
Daniel Sanders	e8efff373a	[mips] MIPS32R6 compact branch support Summary: MIPSR6 introduces a class of branches called compact branches. Unlike the traditional MIPS branches which have a delay slot, compact branches do not have a delay slot. The instruction following the compact branch is only executed if the branch is not taken and must not be a branch. It works by generating compact branches for MIPS32R6 when the delay slot filler cannot fill a delay slot. Then, inspecting the generated code for forbidden slot hazards (a compact branch with an adjacent branch or other CTI) and inserting nops to clear this hazard. Patch by Simon Dardis. Reviewers: vkalintiris, dsanders Subscribers: MatzeB, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D16353 llvm-svn: 263444	2016-03-14 16:24:05 +00:00
Marek Olsak	ed2213e6ef	AMDGPU/SI: Incomplete shader binaries need to finish execution at the end Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D18058 llvm-svn: 263441	2016-03-14 15:57:14 +00:00
Nicolai Haehnle	74127fe8d7	AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergence Summary: When multiple threads perform an atomic op with the same arguments, they will usually see different return values. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18101 llvm-svn: 263440	2016-03-14 15:37:18 +00:00
Vasileios Kalintiris	42db3ff47f	[mips] Use range-based for loops. NFC. llvm-svn: 263438	2016-03-14 15:05:30 +00:00
Benjamin Kramer	1082fa66a5	Revert "Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379." This reverts commit r263424. Breaks self-host. llvm-svn: 263437	2016-03-14 14:58:36 +00:00
Ulrich Weigand	cdce026b4d	[SystemZ] Avoid LER on z13 due to partial register dependencies On the z13, it turns out to be more efficient to access a full floating-point register than just the upper half (as done e.g. by the LE and LER instructions). Current code already takes this into account when loading from memory by using the LDE instruction in place of LE. However, we still generate LER, which shows the same performance issues as LE in certain circumstances. This patch changes the back-end to emit LDR instead of LER to implement FP32 register-to-register copies on z13. llvm-svn: 263431	2016-03-14 13:50:03 +00:00
Chad Rosier	00bd82cade	[CVP] Replace nonnegative with positive, per Philip's request. NFC. llvm-svn: 263430	2016-03-14 13:48:00 +00:00
Zlatko Buljan	fba68931ed	[mips] Fix an issue with long double when function roundl is defined Differential Revision: http://reviews.llvm.org/D17760 llvm-svn: 263428	2016-03-14 12:50:23 +00:00
Daniel Sanders	127d2d2b46	[mips] Range check uimm16_64 Summary: Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D17725 llvm-svn: 263427	2016-03-14 12:44:44 +00:00
Amjad Aboud	ab0378b16c	Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379. llvm-svn: 263424	2016-03-14 12:03:20 +00:00
Daniel Sanders	cfa3483c8e	[mips] Simplify ordering of range checked immediate classes. Summary: With the addition of checks to ensure that operands have a strict ordering it has become tricky to manage the order in the way I originally intended. This patch linearizes the ordering which simplifies the implementation but requires an order that is arbitrary in places. Here are some examples: * uimm4 < uimm5 < uimm6 * simm4 < uimm4 < simm5 < uimm5 * uimm5 < uimm5_plus1 (1..32) < uimm5_plus32 (32..63) < uimm6 The term 'superset' starts to break down here since the _plus classes are not true supersets of uimm5 (but they are still subsets of uimm6). * uimm5 < uimm5_64, and uimm5 < vsplat_uimm5 This is entirely arbitrary. We need an ordering and what we pick is unimportant since only one is possible for a given mnemonic. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D17723 llvm-svn: 263423	2016-03-14 11:46:30 +00:00
Nikolay Haustov	79af6b33e0	[AMDGPU] Assembler: SOP* instruction fixes s_bitset0_b64, s_bitset1_b64 has 32-bit src0, not 64-bit. s_rfe_b64 has just one destination operand and no source. Uncomment S_BITCMP* and S_SETVSKIP, adjust SOPC_* classes for that. Add s_memrealtime test and change comments in smem.s to follow common style. Change test for s_memtime to use non-zero register to make it really test encoding. Add tests for s_buffer_load*. Add tests for SOPC instructions (same for SI and VI) Differential Revision: http://reviews.llvm.org/D18040 llvm-svn: 263420	2016-03-14 11:17:19 +00:00
Daniel Sanders	19b7f76afa	[mips] Range check uimm6_lsl2. Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D17291 llvm-svn: 263419	2016-03-14 11:16:56 +00:00
Hans Wennborg	369ebfe4c9	Try to fix build of WebAssemblyRegStackify.cpp on Windows It's failing to build on VS2015 with: C:\b\build\slave\ClangToTWin\build\src\third_party\llvm\lib\Target\WebAssembly\WebAssemblyRegStackify.cpp(520): error C2668: 'llvm::make_reverse_iterator': ambiguous call to overloaded function C:\b\build\slave\ClangToTWin\build\src\third_party\llvm\include\llvm/ADT/STLExtras.h(217): note: could be 'std::reverse_iterator<llvm::MachineBasicBlock::iterator> llvm::make_reverse_iterator<llvm::MachineInstrBundleIterator<llvm::MachineInstr>>(IteratorTy)' with [ IteratorTy=llvm::MachineInstrBundleIterator<llvm::MachineInstr> ] C:\b\depot_tools\win_toolchain\vs_files\391bbf1220d3edcd3cc3fccdb56224181e3b13a7\win_sdk\bin\..\..\VC\include\xutility(1217): note: or 'std::reverse_iterator<llvm::MachineBasicBlock::iterator> std::make_reverse_iterator<llvm::MachineInstrBundleIterator<llvm::MachineInstr>>(_RanIt)' [found using argument-dependent lookup] with [ _RanIt=llvm::MachineInstrBundleIterator<llvm::MachineInstr> ] I don't have VS2015 locally at the moment, but hopefully this will help. llvm-svn: 263418	2016-03-14 11:04:15 +00:00
Igor Breger	a949100532	AVX512: icmp operation should be always lowered to CMPM (AVX-512) instruction on SKX. implemented by delena Differential Revision: http://reviews.llvm.org/D18054 llvm-svn: 263417	2016-03-14 10:26:39 +00:00
Valery Pykhtin	0f97f17152	[AMDGPU] AsmParser: Factor out parseRegister. NFC. llvm-svn: 263411	2016-03-14 07:43:42 +00:00
Valery Pykhtin	9e33c7f5d3	[AMDGPU] AsmParser: refactor post push_back vector access. NFC. llvm-svn: 263409	2016-03-14 05:25:44 +00:00
David Majnemer	b9456a5eb3	[CodeView] Consistently handle overly large symbol names Overly large symbol names weren't correctly handled for leaf function records. llvm-svn: 263408	2016-03-14 05:15:09 +00:00
Valery Pykhtin	f91911c3ae	[AMDGPU] AsmParser: remove redundant isReg checks. NFC. llvm-svn: 263407	2016-03-14 05:01:45 +00:00
Haicheng Wu	d60ae33d29	[CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative The motivating example is this for (j = n; j > 1; j = i) { i = j / 2; } The signed division can safely be changed to an unsigned division (j is known to be larger than 1 from the loop guard) and later turned into a single shift without considering the sign bit. llvm-svn: 263406	2016-03-14 03:24:28 +00:00
Amaury Sechet	7b05a4c2cb	Add facility to add/remove/check attributes on functions and arguments. Summary: This comes from work to make attributes manipulable via the C API. Reviewers: gottesmm, hfinkel, baldrick, echristo, tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18128 llvm-svn: 263404	2016-03-14 01:37:29 +00:00
Mehdi Amini	ba9fba81d6	Remove PreserveNames template parameter from IRBuilder This reapplies r263258, which was reverted in r263321 because of issues on Clang side. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263393	2016-03-13 21:05:13 +00:00
Simon Pilgrim	035b19ecf5	[X86][SSE41] Avoid variable blend for constant v8i16 shifts The SSE41 v8i16 shift lowering using (v)pblendvb is great for non-constant shift amounts, but if it is constant then we can efficiently reduce the VSELECT to shuffles with the pre-SSE41 lowering. llvm-svn: 263383	2016-03-13 18:35:59 +00:00
Amjad Aboud	62f6f5cc80	Fixed DIBuilder to verify that same imported entity will not be added twice to the "imports" list of the DICompileUnit. Differential Revision: http://reviews.llvm.org/D17884 llvm-svn: 263379	2016-03-13 11:11:39 +00:00
David Majnemer	1256125fb7	[CodeView] Truncate display names Fundamentally, the length of a variable or function name is bound by the maximum size of a record: 0xffff. However, the name doesn't live in a vacuum; other data is associated with the name, lowering the bound further. We would naively attempt to emit the name, causing us to assert because the record would no longer fit in 16 bits. Instead, truncate the name but preserve as much as we can. While I have tested this locally, I've decided to not commit it due to the test's size. N.B. While this behavior is undesirable, it is better than MSVC's behavior. They seem to truncate to ~4000 characters. llvm-svn: 263378	2016-03-13 10:53:30 +00:00
David Majnemer	90a021fb02	[Bitcode] Make writeComdats less strange It had a weird artificial limitation on the write side: the comdat name couldn't be bigger than 2**16. However, the reader had no such limitation. Make the reader and the writer agree. llvm-svn: 263377	2016-03-13 08:01:03 +00:00
Fiona Glaser	2e5c0c2858	ConstantFoldInstruction: avoid wasted calls to ConstantFoldConstantExpression Check to see if all operands are constant before calling simplify on them so that we don't perform wasted simplifications. llvm-svn: 263374	2016-03-13 05:36:15 +00:00
Matt Arsenault	69fdf9b2e4	APFloat: Fix ilogb for denormals llvm-svn: 263370	2016-03-13 05:12:32 +00:00
Matt Arsenault	afa31cf4cc	APFloat: Fix scalbn handling of denormals This was incorrect for denormals, and also failed on longer exponent ranges. llvm-svn: 263369	2016-03-13 05:11:51 +00:00
Craig Topper	955308fbee	[X86] Remove many operands that represent memory stores from outs to ins. These operands are the registers and immediates that specify the memory address not the memory itself thus they are inputs. llvm-svn: 263354	2016-03-13 02:56:31 +00:00
Amaury Sechet	006ce6327e	Use templated version of unwrap instead of casts in Core.cpp. NFC llvm-svn: 263349	2016-03-13 00:54:40 +00:00
Amaury Sechet	c78768f17d	Move LLVMConstStructInContext so that declaration and definition order match. NFC llvm-svn: 263348	2016-03-13 00:40:12 +00:00
Sanjay Patel	9da9c76627	fix documentation comments; NFC llvm-svn: 263346	2016-03-12 20:44:58 +00:00
Sanjay Patel	5658b58936	remove unnecessary cast; NFC llvm-svn: 263343	2016-03-12 18:17:41 +00:00
Sanjay Patel	2e0027706a	fix formatting; NFC llvm-svn: 263342	2016-03-12 18:05:53 +00:00
Sanjay Patel	5781d840dd	use range loops; NFCI llvm-svn: 263341	2016-03-12 16:52:17 +00:00
Sanjay Patel	c4acbae63f	[x86, InstCombine] delete x86 SSE2 masked store with zero mask This follows up on the related AVX instruction transforms, but this one is too strange to do anything more with. Intel's behavioral description of this instruction in its Software Developer's Manual is tragi-comic. llvm-svn: 263340	2016-03-12 15:16:59 +00:00
Nemanja Ivanovic	bd56e4e25a	Fix for PR 26378 This patch corresponds to review: http://reviews.llvm.org/D17712 We were not clearing the TOC vector in PPCAsmPrinter when initializing it. This caused duplicate definition asserts when the pass is reused on the module (i.e. with -compile-twice or in JIT contexts). llvm-svn: 263338	2016-03-12 10:23:07 +00:00
Sanjoy Das	ecf96c9516	Make gc relocates more strongly typed; NFC Don't use a `Value *` where we can use a stronger `GCRelocateInst *` type. llvm-svn: 263327	2016-03-12 02:54:27 +00:00
Quentin Colombet	cf9732b417	[X86] Make sure we do not clobber RBX with cmpxchg when used as a base pointer. cmpxchg[8|16]b uses RBX as one of its arguments. In other words, using this instruction clobbers RBX as it is defined to hold one of the inputs. When the backend uses dynamically allocated stack, RBX is used as a reserved register for the base pointer. Reserved registers have special semantics that only the target understands and enforces; because of that, the register allocator doesn't use them, but also doesn't try to make sure they are used properly (remember it does not know how they are supposed to be used). Therefore, when RBX is used as a reserved register but defined by something that is not compatible with that use, the register allocator will not fix the surrounding code to make sure it gets saved and restored properly around the broken code. It is the responsibility of the target to do the right thing with its reserved register. To fix that, when the base pointer needs to be preserved, we use a different pseudo instruction for cmpxchg that saves rbx. That pseudo takes two more arguments than the regular instruction: - One is the value to be copied into RBX to set the proper value for the comparison. - The other is the virtual register holding the save of the value of RBX as the base pointer. This saving is done as part of isel (i.e., we emit a copy from rbx). cmpxchg_save_rbx <regular cmpxchg args>, input_for_rbx_reg, save_of_rbx_as_bp This gets expanded into: rbx = copy input_for_rbx_reg cmpxchg <regular cmpxchg args> rbx = save_of_rbx_as_bp Note: The actual modeling of the pseudo is a bit more complicated to make sure the interferences that appear after the pseudo gets expanded are properly modeled before that expansion. This fixes PR26883. llvm-svn: 263325	2016-03-12 02:25:27 +00:00
Kostya Serebryany	64d24578d8	[libFuzzer] try to use max_len based on the items of the corpus instead of blindly defaulting to 64 bytes. llvm-svn: 263323	2016-03-12 01:57:04 +00:00
Eric Christopher	35abd051c0	Temporarily revert: commit ae14bf6488e8441f0f6d74f00455555f6f3943ac Author: Mehdi Amini <mehdi.amini@apple.com> Date: Fri Mar 11 17:15:50 2016 +0000 Remove PreserveNames template parameter from IRBuilder Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258 91177308-0d34-0410-b5e6-96231b3b80d8 until we can figure out what to do about clang and Release build testing. This reverts commit 263258. llvm-svn: 263321	2016-03-12 01:47:22 +00:00
Mehdi Amini	33661070c5	Minor cleanup and documentation to IRMover (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263304	2016-03-11 22:19:06 +00:00
Simon Pilgrim	33d57c7547	[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 263303	2016-03-11 22:18:05 +00:00
Ahmed Bougacha	171f7b9986	[AArch64] Don't blindly lower f16/f128 FCCMPs. Instead, extend f16 (like we do when lowering a standalone SETCC), and let f128 be legalized to the RT calls. Fixes PR26803. llvm-svn: 263301	2016-03-11 22:02:58 +00:00
Dan Gohman	da323e88ea	[WebAssembly] Add `final` keywords to a few more subclasses, for consistency. llvm-svn: 263287	2016-03-11 19:45:37 +00:00
George Burgess IV	b42b762bca	[MemorySSA] Make a return type reflect reality. NFC. llvm-svn: 263286	2016-03-11 19:34:03 +00:00
Sanjoy Das	b51325dbdb	Introduce @llvm.experimental.deoptimize Summary: This intrinsic, together with deoptimization operand bundles, allow frontends to express transfer of control and frame-local state from one (typically more specialized, hence faster) version of a function into another (typically more generic, hence slower) version. In languages with a fully integrated managed runtime this intrinsic can be used to implement "uncommon trap" like functionality. In unmanaged languages like C and C++, this intrinsic can be used to represent the slow paths of specialized functions. Note: this change does not address how `@llvm.experimental.deoptimize` is lowered. That will be done in a later change. Reviewers: chandlerc, rnk, atrick, reames Subscribers: llvm-commits, kmod, mjacob, maksfb, mcrosier, JosephTremoulet Differential Revision: http://reviews.llvm.org/D17732 llvm-svn: 263281	2016-03-11 19:08:34 +00:00
Vedant Kumar	e5a9a275d3	[PGO] Skip value profile instrumentation of inline asm Value profile instrumentation treats inline asm calls like they are indirect calls. This causes problems when the 'Callee' is passed to a ptrtoint cast -- the verifier rightly claims that this is bogus and crashes opt. llvm-svn: 263278	2016-03-11 18:57:48 +00:00
Teresa Johnson	76a1c1d0ba	[ThinLTO] Support for reference graph in per-module and combined summary. Summary: This patch adds support for including a full reference graph including call graph edges and other GV references in the summary. The reference graph edges can be used to make importing decisions without materializing any source modules, can be used in the plugin to make file staging decisions for distributed build systems, and is expected to have other uses. The call graph edges are recorded in each function summary in the bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when there is PGO, where the ValueId can be mapped to the function GUID via the ValueSymbolTable. In the function index in memory, the call graph edges reference the target via the CalleeGUID instead of the CalleeValueId. The reference graph edges are recorded in each summary record with a list of referenced value IDs, which can be mapped to value GUID via the ValueSymbolTable. Additionally, a new summary record type is added to record references from global variable initializers. A number of bitcode records and data structures have been renamed to reflect the newly expanded scope of the summary beyond functions. More cleanup will follow. Reviewers: joker.eph, davidxl Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D17212 llvm-svn: 263275	2016-03-11 18:52:24 +00:00
Simon Pilgrim	7b2164ffe0	Fix spelling. llvm-svn: 263266	2016-03-11 17:31:43 +00:00
Quentin Colombet	dd4b137364	[IRTranslator] Translate unconditional branches. llvm-svn: 263265	2016-03-11 17:28:03 +00:00
Quentin Colombet	f9b4934d1d	[MachineIRBuilder] Rework buildInstr API to maximize code reuse. llvm-svn: 263264	2016-03-11 17:27:58 +00:00
Quentin Colombet	e225e2541b	[IRTranslator] Update getOrCreateVReg API to use references. A value that we want to keep in a virtual register cannot be null. Reflect that in the API. llvm-svn: 263263	2016-03-11 17:27:54 +00:00
Quentin Colombet	000b580b13	[MachineIRBuilder] Rename the setter of MF for consistency with the getter. llvm-svn: 263262	2016-03-11 17:27:51 +00:00
Quentin Colombet	91ebd71e26	[MachineIRBuilder] Rename the setter for MBB for consistency with the getter. llvm-svn: 263261	2016-03-11 17:27:47 +00:00
Quentin Colombet	53237a9e64	[IRTranslator] Update getOrCreateBB API to use references. A null basic block is invalid, so just pass a reference. llvm-svn: 263260	2016-03-11 17:27:43 +00:00
Mehdi Amini	99eab3dd06	Remove PreserveNames template parameter from IRBuilder Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263258	2016-03-11 17:15:50 +00:00
Mehdi Amini	1e9c925182	Do not specialize IRBuilder to strip names in SROA Summary: Following r263086, we are replacing this by a runtime check. More cleanup will follow on the IRBuilder itself, but I submitted this patch separately as SROA has a fancy "prefixInserter" class that needs extra-love. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18022 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263256	2016-03-11 17:15:34 +00:00
Chad Rosier	ac216fd9d5	[misched] Fix a truncation issue from r263021. The truncation was causing the sorting algorithm to behave oddly when comparing positive and negative offsets. Fortunately, this doesn't currently happen in practice and was exposed by a WIP. Thus, I can't test this change now, but the follow on patch will. llvm-svn: 263255	2016-03-11 16:54:07 +00:00
Chandler Carruth	ace8c8f765	[PM] Sink the "Expression" type for GVN into the class as a private member type. Because of how this type is used by the ValueTable, it cannot actually have hidden visibility. GCC actually nicely warns about this but Clang just silently ... I don't even know. =/ We should do a better job either way though. This should resolve a bunch of the GCC warnings about visibility that the port of GVN triggered and make the visibility story a bit more correct. llvm-svn: 263250	2016-03-11 16:25:19 +00:00
Marianne Mailhot-Sarrasin	7423f40674	More UTF string conversion wrappers Added new string conversion wrappers that convert between `std::string` (of UTF-8 bytes) and `std::wstring`, which is particularly useful for Win32 interop. Also fixed a missing string conversion for `getenv` on Win32, using these new wrappers. The motivation behind this is to provide the support functions required for LLDB to work properly on Windows with non-ASCII data; however, the functions are not LLDB specific. Patch by cameron314 Differential Revision: http://reviews.llvm.org/D17549 llvm-svn: 263247	2016-03-11 15:59:32 +00:00
Valery Pykhtin	a7f480b4e9	[AMDGPU] Fix VOPC instruction operand namings Differential Revision: http://reviews.llvm.org/D17966 llvm-svn: 263242	2016-03-11 14:53:28 +00:00
Simon Pilgrim	7ca9614c71	[X86][AVX] Fixed issue where a long chain of shuffles could attempt to combine to a single (illegal) PSHUFB instruction. Its not enough that we test for SSSE3 - that's only OK for 128-bit vectors - we also need to test for AVX2 / AVX512BW for 256/512 bit vector cases. llvm-svn: 263239	2016-03-11 14:39:10 +00:00
Chandler Carruth	5bfbc3f941	[AA] Make BasicAA just require domtree. This doesn't change how many times we construct domtrees in the normal pipeline, and it removes fragility and instability where basic-aa may not be run in time to see domtrees because they happen to be constructed afterward. This isn't quite as clean as the change to memdep because there is a mode where basic-aa specifically runs without domtrees -- in the hacking version used by function-attrs with the legacy pass manager. llvm-svn: 263234	2016-03-11 13:53:18 +00:00
Chandler Carruth	aef32bd319	[memdep] Just require domtree for memdep. This doesn't cause us to construct dominator trees any more often in the normal pipeline, and removes an entire mode of memdep that needed to be reasoned about and maintained. Perhaps more importantly, it removes the ability for the results of memdep to be different because of accidental pass scheduling goofs or the order of evaluation of 'getResult' calls. Essentially, 'getCachedResult', unless across IR-unit boundaries, is extremely dangerous. We need to work much harder to avoid it (or its analog in the old pass manager). llvm-svn: 263232	2016-03-11 13:46:00 +00:00
Chandler Carruth	3bc9c7fb45	[PM] The order of evaluation of these analyses is actually significant, much to my horror, so use variables to fix it in place. This terrifies me. Both basic-aa and memdep will provide more precise information when the domtree and/or the loop info is available. Because of this, if your pass (like GVN) requires domtree, and then queries memdep or basic-aa, it will get more precise results. If it does this in the other order, it gets less precise results. All of the ideas I have for fixing this are, essentially, terrible. Here I've just caused us to stop having unspecified behavior as different implementations evaluate the order of these arguments differently. I'm actually rather glad that they do, or the fragility of memdep and basic-aa would have gone on unnoticed. I've left comments so we don't immediately break this again. This should fix bots whose host compilers evaluate the order of arguments differently from Clang. llvm-svn: 263231	2016-03-11 13:26:47 +00:00
Vasileios Kalintiris	e2cbc21b6f	[mips] MIPSR6 Instruction itineraries Summary: Defines instruction itineraries for common MIPSR6 instructions. Patch by Simon Dardis. Reviewers: vkalintiris Subscribers: MatzeB, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D17198 llvm-svn: 263229	2016-03-11 13:05:06 +00:00
Daniel Sanders	78e8902097	[mips] Range check simm4. Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D16811 llvm-svn: 263220	2016-03-11 11:37:50 +00:00
Chandler Carruth	b47f8010a9	[PM] Make the AnalysisManager parameter to run methods a reference. This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, many parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219	2016-03-11 11:05:24 +00:00
Chandler Carruth	30a073029c	[PM] Rename the CRTP mixin base classes for the new pass manager to clarify their purpose. Firstly, call them "...Mixin" types so it is clear that there is no type hierarchy being formed here. Secondly, use the term 'Info' to clarify that they aren't adding any interesting semantics to the passes or analyses, just exposing APIs used by the management layer to get information about the pass or analysis. Thanks to Manuel for helping pin down the naming confusion here and come up with effective names to address it. In case you already have some out-of-tree stuff, the following should be roughly what you want to update: perl -pi -e 's/\b(Pass|Analysis)Base\b/\1InfoMixin/g' llvm-svn: 263217	2016-03-11 10:33:22 +00:00
Chandler Carruth	b4faf13c15	[PM] Implement the final conclusion as to how the analysis IDs should work in the face of the limitations of DLLs and templated static variables. This requires passes that use the AnalysisBase mixin provide a static variable themselves. So as to keep their APIs clean, I've made these private and befriended the CRTP base class (which is the common practice). I've added documentation to AnalysisBase for why this is necessary and at what point we can go back to the much simpler system. This is clearly a better pattern than the extern template as it caught numerous places where the template magic hadn't been applied and things were "just working" but would eventually have broken mysteriously. llvm-svn: 263216	2016-03-11 10:22:49 +00:00
Benjamin Kramer	c126353473	[InstCombine] Use Twines to generate names. Since the names are used in a loop this does more work in debug builds. In release builds value names are generally discarded so we don't have to do the concatenation at all. It's also simpler code, no functional change intended. llvm-svn: 263215	2016-03-11 10:20:56 +00:00
Nikolay Haustov	6560781c4f	[AMDGPU] Assembler: change v_madmk operands to have same order as mad. The constant is now at source operand 1 (previously at 2). This is also how it is in legacy AMD sp3 assembler. Update tests. Differential Revision: http://reviews.llvm.org/D17984 llvm-svn: 263212	2016-03-11 09:27:25 +00:00
Chandler Carruth	45a9c203a0	[PM/AA] Teach the AAManager how to handle module analyses in addition to function analyses, and use it to wire up globals-aa to the new pass manager. llvm-svn: 263211	2016-03-11 09:15:11 +00:00
Chandler Carruth	89c45a162f	[PM] Port GVN to the new pass manager, wire it up, and teach a couple of tests to run GVN in both modes. This is mostly the boring refactoring just like SROA and other complex transformation passes. There is some trickiness in that GVN's ValueNumber class requires hand holding to get to compile cleanly. I'm open to suggestions about a better pattern there, but I tried several before settling on this. I was trying to balance my desire to sink as much implementation detail into the source file as possible without introducing overly many layers of abstraction. Much like with SROA, the design of this system is made somewhat more cumbersome by the need to support both pass managers without duplicating the significant state and logic of the pass. The same compromise is struck here. I've also left a FIXME in a doxygen comment as the GVN pass seems to have pretty woeful documentation within it. I'd like to submit this with the FIXME and let those more deeply familiar backfill the information here now that we have a nice place in an interface to put that kind of documentation. Differential Revision: http://reviews.llvm.org/D18019 llvm-svn: 263208	2016-03-11 08:50:55 +00:00
Matt Arsenault	bafc9dc591	AMDGPU: Don't use InstVisitor for AMDGPUPromoteAlloca Frontend authors are strongly encouraged to keep allocas in the entry block, so don't bother visiting every instruction in the other blocks of the function. llvm-svn: 263206	2016-03-11 08:20:50 +00:00
Matt Arsenault	6b6a2c37bc	AMDGPU: R600 code splitting cleanup Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204	2016-03-11 08:00:27 +00:00
Matt Arsenault	9a19c240c0	AMDGPU: Materialize sign bits with bfrev If a constant is the same as the reverse of an inline immediate, this is 4 bytes smaller than having to embed a 32-bit literal. llvm-svn: 263201	2016-03-11 07:42:49 +00:00
Junmo Park	6098cbbd2c	Minor code cleanups. NFC. llvm-svn: 263200	2016-03-11 07:05:32 +00:00
Junmo Park	4ba6cf69e4	Minor code cleanup. NFC. llvm-svn: 263196	2016-03-11 05:07:07 +00:00
Pete Cooper	adebb9379a	Remove llvm::getDISubprogram in favor of Function::getSubprogram llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself. Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function. Ideally this should be NFC, but in reality its possible that a function: has no !dbg (in which case there's likely a bug somewhere in an opt pass), or that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will Reviewed by Duncan Exon Smith. Differential Revision: http://reviews.llvm.org/D18074 llvm-svn: 263184	2016-03-11 02:14:16 +00:00
Adam Nemet	efb234135c	[LLE] Add missed LoopSimplify dependence The code assumed that we always had a preheader without making the pass dependent on LoopSimplify. Thanks to Mattias Eriksson V for reporting this. llvm-svn: 263173	2016-03-10 23:54:39 +00:00
Tim Northover	6092de5075	AArch64: only try to use scaled fcvt ops on legal vector types. Before we ended up calling getSimpleVectorType on a <3 x float>, which asserted. llvm-svn: 263169	2016-03-10 23:02:21 +00:00
Sanjay Patel	0181943b89	[x86] don't use a shuffle when a vselect will do; NFCI Looking at the IR definition of a masked load made me realize there was no reason to use a shuffle here, so we don't need to convert the format of the mask at all. llvm-svn: 263167	2016-03-10 22:35:33 +00:00
Marianne Mailhot-Sarrasin	eddc5b130e	Test commit access llvm-svn: 263165	2016-03-10 21:54:25 +00:00
Simon Pilgrim	8c9f00f788	Strip trailing whitespace. llvm-svn: 263162	2016-03-10 20:58:11 +00:00
Simon Pilgrim	61eb49e437	[X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion). Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 263159	2016-03-10 20:40:26 +00:00
Artur Pilipenko	3c8fc57e16	Support arbitrary addrspace pointers in masked load/store intrinsics This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 263158	2016-03-10 20:39:22 +00:00
Peter Collingbourne	aba16fca5d	ARM: Support relative references using the PREL31 symbol variant. Differential Revision: http://reviews.llvm.org/D17937 llvm-svn: 263156	2016-03-10 19:30:18 +00:00

89803 Commits