llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	897eee4187	AMDGPU: Remove unused intrinsics llvm-svn: 275371	2016-07-14 05:23:19 +00:00
Matt Arsenault	aa94c1e7ee	AMDGPU: Fix test not actually testing anything It wasn't actually running the pass, and since it is missing the llvm prefix, the eh intrinsic was not really an IntrinsicInst. Also add missing test for lifetime markers. llvm-svn: 275370	2016-07-14 05:23:15 +00:00
Dean Michael Berris	52735fc435	XRay: Add entry and exit sleds Summary: In this patch we implement the following parts of XRay: - Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches. - Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts). - X86-specific nop sleds as described in the white paper. - A machine function pass that adds the different instrumentation marker instructions at a very late stage. - A way of identifying which return opcode is considered "normal" for each architecture. There are some caveats here: 1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not available yet. 2) The generated section for X86 is different from what is described in the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library. Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D19904 llvm-svn: 275367	2016-07-14 04:06:33 +00:00
Davide Italiano	7dac027ed7	[IPSCCP] Constant fold struct argument/instructions when all the lattice values are constant. This now should also work with the interprocedural variant of the pass. Slightly easier now that the yak is shaved. Differential Revision: http://reviews.llvm.org/D22329 llvm-svn: 275363	2016-07-14 02:51:41 +00:00
Nico Weber	af7e8465e1	Teach fast isel about thiscall (and callee-pop) calls. http://reviews.llvm.org/D22315 llvm-svn: 275360	2016-07-14 01:52:51 +00:00
Mehdi Amini	8484f92f7f	[Scalarizer] PR28108: Skip over nullptr rather than crashing on it. Summary: In Scalarizer::gather we see if we already have a scattered form of Op, and in that case use the new form. In the particular case of PR28108, the found ValueVector SV has size 2, where the first Value is nullptr, and the second is indeed a proper Value. The nullptr then caused an assert to blow when we tried to do cast<Instruction>(SV[I]). With this patch we check SV[I] before doing the cast, and if it's nullptr we just skip over it. I don't know the Scalarizer well enough to know if this is the best fix or if something should be done elsewhere to prevent the nullptr from being in the ValueVector at all, but at least this avoids the crash and looking at the test case output it looks reasonable. Reviewers: hfinkel, frasercrmck, wala, mehdi_amini Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21518 llvm-svn: 275359	2016-07-14 01:31:25 +00:00
Mehdi Amini	9e332a7719	Add missing test for r275347 "[IPRA] Set callee saved registers to none for local function when IPRA is enabled." llvm-svn: 275358	2016-07-14 01:31:20 +00:00
Adrian Prantl	0418ef2691	Synchronize LLVM and clang's ObjCDeclSpec::ObjCPropertyAttributeKind. This adds Clang-specific DWARF constants for nullability and ObjC class properties that are already generated by clang. This patch adds dwarfdump support and a more comprehensive testcase. <rdar://problem/27335745> llvm-svn: 275354	2016-07-14 00:41:18 +00:00
David Majnemer	7f781aba97	[ConstantFolding] Fold masked loads We can constant fold a masked load if the operands are appropriately constant. Differential Revision: http://reviews.llvm.org/D22324 llvm-svn: 275352	2016-07-14 00:29:50 +00:00
David Majnemer	f89660aba7	[ConstantFolding] Extend FoldReinterpretLoadFromConstPtr to handle negative offsets Treat loads which clip before the start of a global initializer the same way we treat clipping beyond the end of the initializer: use zeros. llvm-svn: 275345	2016-07-13 23:33:07 +00:00
Michael Kuperstein	be837fa40f	[DAG] Correctly chain masked loads If a masked load is not added to the chain, it should not reset the chain's root. This fixes the remaining part of PR28515. llvm-svn: 275340	2016-07-13 23:23:40 +00:00
Quentin Colombet	68a84587c5	[MIR] Fix one GlobalISel test case that I missed in r275314. llvm-svn: 275333	2016-07-13 22:35:33 +00:00
Nico Weber	b888555bcc	Add a triple to fix test on bots after 275320. llvm-svn: 275327	2016-07-13 22:19:40 +00:00
Nico Weber	eb9488b151	Fix a TODO in X86CallFrameOptimization to not rely on a codegen artifact. This happens to make X86CallFrameOptimization run in -O0 / FastISel builds as well, but it's not clear if the pass should run in that setup. http://reviews.llvm.org/D22314 llvm-svn: 275320	2016-07-13 21:38:27 +00:00
Alina Sbirlea	640a61cd8b	Extended LoadStoreVectorizer to vectorize subchains. Summary: LSV used to abort vectorizing a chain for interleaved load/store accesses that alias. Allow a valid prefix of the chain to be vectorized, mark just the prefix and retry vectorizing the remaining chain. Reviewers: llvm-commits, jlebar, arsenm Subscribers: mzolotukhin Differential Revision: http://reviews.llvm.org/D22119 llvm-svn: 275317	2016-07-13 21:20:01 +00:00
Quentin Colombet	545e558b82	[MIR] Print on the given output instead of stderr. Currently the MIR framework prints all its outputs (errors and actual representation) on stderr. This patch fixes that by printing the regular output in the output specified with -o. Differential Revision: http://reviews.llvm.org/D22251 llvm-svn: 275314	2016-07-13 20:36:03 +00:00
Matt Arsenault	f071102647	AMDGPU: Remove last AMDIL intrinsics llvm-svn: 275309	2016-07-13 19:42:06 +00:00
Andrew Kaylor	346dd7f1bd	Reverting r275284 due to platform-specific test failures llvm-svn: 275304	2016-07-13 19:09:16 +00:00
Sanjay Patel	eff2aa70fc	add more tests for zexty xor sandwiches ...mmm sandwiches llvm-svn: 275302	2016-07-13 18:58:55 +00:00
Simon Pilgrim	5d664af3c3	[X86][SSE] Regenerate truncated shift test Check SSE2 and AVX2 implementations llvm-svn: 275300	2016-07-13 18:50:10 +00:00
Simon Pilgrim	631643e7d9	Regenerate test llvm-svn: 275299	2016-07-13 18:46:37 +00:00
Sanjay Patel	904a88025a	add test for zexty xor sandwich llvm-svn: 275297	2016-07-13 18:40:38 +00:00
Krzysztof Parzyszek	cb4dd7656b	Move mempcpy_call.ll to X86 subdirectory llvm-svn: 275294	2016-07-13 18:28:45 +00:00
Sanjay Patel	c00e48a3db	[InstCombine] extend vector select matching for non-splat constants In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could move that helper function to a higher level. There is an open question as to which of these forms should be considered the canonical IR: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3> Differential Revision: http://reviews.llvm.org/D22114 llvm-svn: 275289	2016-07-13 18:07:02 +00:00
Andrew Kaylor	12cccdd731	Fix for Bug 26903, adds support to inline __builtin_mempcpy Patch by Sunita Marathe Differential Revision: http://reviews.llvm.org/D21920 llvm-svn: 275284	2016-07-13 17:25:11 +00:00
Matthias Braun	512424f28a	PatchableFunction: Skip pseudos that do not create code This fixes http://llvm.org/PR28524 llvm-svn: 275278	2016-07-13 16:37:29 +00:00
Teresa Johnson	b907d06151	[ThinLTO/gold] Enable symbol resolution in distributed backend case While testing a follow-on change to enable index-based symbol resolution and internalization in the distributed backends, I realized that a test case change I made in r275247 was only required because we were not analyzing symbols in the claimed files in thinlto-index-only mode. In the fixed test case there should be no internalization because we are linking in -shared mode, so f() is in fact exported, which is detected properly when we analyze symbols in thinlto-index-only mode. Note that this is not (yet) a correctness issue (because we are not yet performing the index-based linkage optimizations in the distributed backends - that's coming in a follow-on patch). llvm-svn: 275277	2016-07-13 16:35:56 +00:00
Sanjay Patel	610a2f6525	[x86][SSE/AVX] optimize pcmp results better (PR28484) We know that pcmp produces all-ones/all-zeros bitmasks, so we can use that behavior to avoid unnecessary constant loading. One could argue that load+and is actually a better solution for some CPUs (Intel big cores) because shifts don't have the same throughput potential as load+and on those cores, but that should be handled as a CPU-specific later transformation if it ever comes up. Removing the load is the more general x86 optimization. Note that the uneven usage of vpbroadcast in the test cases is filed as PR28505: https://llvm.org/bugs/show_bug.cgi?id=28505 Differential Revision: http://reviews.llvm.org/D22225 llvm-svn: 275276	2016-07-13 16:04:07 +00:00
Simon Pilgrim	a99368fa35	[X86][AVX512] Add support for VPERMILPD/VPERMILPS variable shuffle mask comments llvm-svn: 275272	2016-07-13 15:45:36 +00:00
Simon Pilgrim	48d8340760	[X86][AVX] Add support for target shuffle combining to VPERMILPS variable shuffle mask Added AVX512F VPERMILPS shuffle decoding support llvm-svn: 275270	2016-07-13 15:10:43 +00:00
Tom Stellard	418beb7671	AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21484 llvm-svn: 275268	2016-07-13 14:23:33 +00:00
Nirav Dave	8ea792db60	[MC] Fix lexing ordering in assembly label parsing to preserve same line comment placement. llvm-svn: 275265	2016-07-13 14:03:12 +00:00
Matt Arsenault	0056868c4a	AMDGPU: Fold out no-op kill intrinsics llvm-svn: 275253	2016-07-13 06:04:22 +00:00
David Majnemer	1b3db33e3d	[ConstantFolding] Don't treat negative GEP offsets as positive GEP offsets are signed, don't treat them as huge positive numbers. llvm-svn: 275251	2016-07-13 05:16:16 +00:00
Adam Nemet	c2f791d8a7	[BFI] Add new LazyBFI analysis pass Summary: This is necessary for D21771. In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. However we don't want to pay for computing BFI unless the hotness attribute is requested. This is achieved by making BFI lazy at the very high-level through a new analysis pass -- BFI is not calculated unless requested. I am adding a test to check the laziness under D21771 where the first user of the analysis is added. Reviewers: hfinkel, dexonsmith, davidxl Subscribers: davidxl, dexonsmith, llvm-commits Differential Revision: http://reviews.llvm.org/D22141 llvm-svn: 275250	2016-07-13 05:01:48 +00:00
Teresa Johnson	27694571b1	[ThinLTO/gold] ThinLTO internalization fixes Internalization was missing cases where we originally had a local symbol that was promoted eagerly but not actually exported. This is because we were only internalizing the set of global (non-local) symbols that were PREVAILING_DEF_IRONLY. Instead, collect the set of global symbols that are referenced outside of a single IR file, and skip internalization for those. llvm-svn: 275247	2016-07-13 03:42:41 +00:00
David Majnemer	a7b6c973e5	[ConstantFold] Don't incorrectly infer inbounds on array GEP The many levels of nesting inside the responsible code made it easy for bugs to sneak in. Flattening the logic makes it easier to see what's going on. llvm-svn: 275244	2016-07-13 03:24:41 +00:00
Keno Fischer	1efc3b70c5	Fix ScalarEvolutionExpander step scaling bug The expandAddRecExprLiterally function incorrectly transforms `[Start + Step * X]` into `Step * [Start + X]` instead of the correct transform of `[Step * X] + Start`. This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219 due to what appeared to be sufficiently complicated loop interactions. Patch by Jameson Nash (jameson@juliacomputing.com). Reviewers: sanjoy Differential Revision: http://reviews.llvm.org/D16505 llvm-svn: 275239	2016-07-13 01:28:12 +00:00
Dehao Chen	9cba1f4e7e	New pass manager for LICM. Summary: Port LICM to the new pass manager. Reviewers: davidxl, silvas Subscribers: krasin, vitalybuka, silvas, davide, sanjoy, llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21772 llvm-svn: 275222	2016-07-12 22:37:48 +00:00
Tim Northover	72eebfa4b0	GlobalISel: freeze reserved regs after IRTranslator. We can freeze the registers after the MachineFrameInfo has been configured (by telling it about calls, inline asm, ...). This doesn't happen at all yet, but will be part of IR translation. Fixes -verify-machineinstrs assertion. llvm-svn: 275221	2016-07-12 22:23:42 +00:00
Matt Arsenault	786724a22e	AMDGPU: Follow up to r275203 I meant to squash this into it. llvm-svn: 275220	2016-07-12 21:41:32 +00:00
Nemanja Ivanovic	f0407e3902	The test case I added is PowerPC specific but I accidentally had it in the wrong directory. Moved it to CodeGen/PowerPC. Sorry about the noise. llvm-svn: 275218	2016-07-12 21:24:08 +00:00
Michael Kuperstein	a99c46cc73	[LV] Remove wrong assumption about LCSSA The LCSSA pass itself will not generate several redundant PHI nodes in a single exit block. However, such redundant PHI nodes don't violate LCSSA form, and may be introduced by passes that preserve LCSSA, and/or preserved by the LCSSA pass itself. So, assuming a single PHI node per exit block is not safe. llvm-svn: 275217	2016-07-12 21:24:06 +00:00
Nemanja Ivanovic	b43bb6141e	[Power9] Add codegen for VSX word insert/extract instructions This patch corresponds to review: http://reviews.llvm.org/D20239 It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that are useful in some cases for inserting and extracting vector elements of v4[if]32 vectors. llvm-svn: 275215	2016-07-12 21:00:10 +00:00
Piotr Padlewski	fa0cdb371b	Review fixes to lit documentation Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22245 llvm-svn: 275214	2016-07-12 20:59:17 +00:00
Simon Pilgrim	6fa71da4a4	[X86][AVX] Add support for target shuffle combining to VPERM2F128/VPERM2I128 llvm-svn: 275212	2016-07-12 20:27:32 +00:00
Davide Italiano	0080269342	[SCCP] Constant fold structs if all the lattice values are constant. Differential Revision: http://reviews.llvm.org/D22269 llvm-svn: 275208	2016-07-12 19:54:19 +00:00
Matthias Braun	96ec47db74	X86FixupBWInsts: No need for forward liveness analysis. With r274952 and r275201 in place there are no cases left where a forward liveness analysis yields different results than a backward one. So we can remove the forward stepping logic. Differential Revision: http://reviews.llvm.org/D22083 llvm-svn: 275204	2016-07-12 19:04:30 +00:00
Matt Arsenault	657f871a4e	AMDGPU: Fix verifier error with kill intrinsic Don't create a terminator in the middle of the block. We should probably get rid of this intrinsic. llvm-svn: 275203	2016-07-12 19:01:23 +00:00
Dehao Chen	b9f8e29290	[PM] Port LoopIdiomRecognize Pass to new PM Summary: Port LoopIdiomRecognize Pass to new PM Reviewers: davidxl Subscribers: davide, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D22250 llvm-svn: 275202	2016-07-12 18:45:51 +00:00
Wei Ding	5b2636a152	AMDGPU: Add LLVM IR Intrinsic for v_lerp_u8 Differential Revision: http://reviews.llvm.org/D22239 llvm-svn: 275197	2016-07-12 18:02:14 +00:00
Xinliang David Li	9eb472ba4b	[PGO] Don't include full file path in static function profile counter names Patch by Jake VanAdrighem Differential Revision: http://reviews.llvm.org/D22028 llvm-svn: 275193	2016-07-12 17:14:51 +00:00
Sanjay Patel	4a6a751dce	add tests for missing DeMorgan's Law folds llvm-svn: 275192	2016-07-12 17:05:04 +00:00
Sanjay Patel	3900191ecc	auto-generate checks llvm-svn: 275188	2016-07-12 16:21:55 +00:00
Sanjay Patel	93dffe629a	auto-generate checks llvm-svn: 275187	2016-07-12 16:17:30 +00:00
Sanjay Patel	6d1f227e6b	auto-generate checks llvm-svn: 275186	2016-07-12 16:13:04 +00:00
Haicheng Wu	711ca868fc	[AArch64] Set FMOVS0 and FMOVD0 as isAsCheapAsAMove when needed. If a subtarget has both ZCZeroing and CustomCheapAsMoveHandling features (now only Kryo has both), set FMOVS0 and FMOVD0 isAsCheapAsAMove. Differential Revision: http://reviews.llvm.org/D22256 llvm-svn: 275178	2016-07-12 15:31:41 +00:00
Nemanja Ivanovic	eebbcb6d57	[PowerPC] Canonicalize applicable vector shift immediates as swaps This patch corresponds to review: http://reviews.llvm.org/D21358 Vector shifts that have the same semantics as a vector swap are canonicalized as such to provide additional opportunities for swap removal optimization to remove unnecessary swaps. llvm-svn: 275168	2016-07-12 12:16:27 +00:00
Amjad Aboud	acee568545	[codeview] Improved array type support. Added support for: 1. Multi dimension array. 2. Array of structure type, which previously was declared incompletely. 3. Dynamic size array. 4. Array where element type is a typedef, volatile or constant (this should resolve PR28311). Differential Revision: http://reviews.llvm.org/D21526 llvm-svn: 275167	2016-07-12 12:06:34 +00:00
Nicolai Haehnle	7968c34586	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it misled the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217 llvm-svn: 275160	2016-07-12 08:12:16 +00:00
Vitaly Buka	204dc533c5	Revert "New pass manager for LICM." Summary: This reverts commit r275118. Subscribers: sanjoy, mehdi_amini Differential Revision: http://reviews.llvm.org/D22259 llvm-svn: 275156	2016-07-12 06:25:32 +00:00
Craig Topper	a6e6febe2c	[AVX512] Remove masked logic op intrinsics and autoupgrade them to native IR. llvm-svn: 275155	2016-07-12 05:27:53 +00:00
Ivan Krasin	5474645dc8	Print remarks from WholeProgramDevirt pass for each call site. Summary: It's useful to have some visibility about which call sites are devirtualized, especially for debug purposes. Another use case is a regression test on the application side (like, Chromium). Reviewers: pcc Differential Revision: http://reviews.llvm.org/D22252 llvm-svn: 275145	2016-07-12 02:38:37 +00:00
NAKAMURA Takumi	e92e2124f6	llvm/test/CodeGen/AMDGPU/selected-stack-object.ll REQUIRES +Asserts, since it expects assertion failure. llvm-svn: 275144	2016-07-12 02:18:09 +00:00
Haicheng Wu	1e39574e9f	[Kryo] Enable ZCZeroing feature This feature uses immediate #0 to zero a register. Differential Revision: http://reviews.llvm.org/D19985 llvm-svn: 275143	2016-07-12 02:04:01 +00:00
Nico Weber	c7bf646a99	Teach FastISel about thiscall (and, hence, about callee-pop). http://reviews.llvm.org/D22115 llvm-svn: 275135	2016-07-12 01:30:35 +00:00
Matt Arsenault	45f8216cee	AMDGPU: Remove superfluous string attributes from tests Also fix v_mac.ll not testing right thing for fneg llvm-svn: 275129	2016-07-11 23:35:48 +00:00
Mehdi Amini	e75aa6f674	Add a libLTO API to query a memory buffer and check if it contains ObjC categories The linker supports a feature to force load an object from a static archive if it defines an Objective-C category. This API supports this feature by looking at every section in the module to find if a category is defined in the module. llvm-svn: 275125	2016-07-11 23:10:18 +00:00
Dehao Chen	7ef5820fa3	New pass manager for LICM. Summary: Port LICM to the new pass manager. Reviewers: davidxl, silvas Subscribers: silvas, davide, sanjoy, llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21772 llvm-svn: 275118	2016-07-11 22:45:24 +00:00
Alina Sbirlea	cbc6ac2afd	Correct ordering of loads/stores. Summary: Aiming to correct the ordering of loads/stores. This patch changes the insert point for loads to the position of the first load. It updates the ordering method for loads to insert before, rather than after. Before this patch the following sequence: "load a[1], store a[1], store a[0], load a[2]" Would incorrectly vectorize to "store a[0,1], load a[1,2]". The correctness check was assuming the insertion point for loads is at the position of the first load, when in practice it was at the last load. An alternative fix would have been to invert the correctness check. The current fix changes insert position but also requires reordering of instructions before the vectorized load. Updated testcases to reflect the changes. Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm Subscribers: mzolotukhin Differential Revision: http://reviews.llvm.org/D22071 llvm-svn: 275117	2016-07-11 22:34:29 +00:00
Tim Northover	3e0361710a	ARM: validate immediate branch targets in AsmParser. Immediate branch targets aren't commonly used, but if they are we should make sure they can actually be encoded. This means they must be divisible by 2 when targeting Thumb mode, and by 4 when targeting ARM mode. Also do a little naming cleanup while I was changing everything around anyway. llvm-svn: 275116	2016-07-11 22:29:37 +00:00
Nicolai Haehnle	c06bfa1daa	AMDGPU: Treat texture gather instructions more like other MIMG instructions Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210 llvm-svn: 275113	2016-07-11 21:59:43 +00:00
Zachary Turner	dbeaea7b35	Refactor the PDB writing to use a builder approach llvm-svn: 275110	2016-07-11 21:45:26 +00:00
Zachary Turner	f6b9382467	[pdb] Add a pdb2yaml option to not dump file headers. This will be useful once we start adding the ability to dump type records and symbol records, since it will allow us to generate mergeable information instead of information that specifies an entire file. llvm-svn: 275109	2016-07-11 21:45:09 +00:00
Nicolai Haehnle	f52c3cf272	AMDGPU: fix local stack slot allocation bugs Summary: The main bug fix here is using the 32-bit encoding of V_ADD_I32 in materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary immediates work. The second part is that we may now require the SegmentWaveByteOffset even when there are initially no stack objects and VGPR spilling isn't enabled, for stack slots that are allocated later. This means that some bits become effectively dead and can be cleaned up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21551 llvm-svn: 275108	2016-07-11 21:44:40 +00:00
Michael Kuperstein	f0c59330e9	[X86] Make some cast costs more precise Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 llvm-svn: 275106	2016-07-11 21:39:44 +00:00
Quentin Colombet	fb82c7bc94	[X86] Fix tailcall return address clobber bug. This bug (llvm.org/PR28124) was introduced by r237977, which refactored the tail call sequence to be generated in two passes instead of one. Unfortunately, the stack adjustment produced by the first pass was not recognized by X86FrameLowering::mergeSPUpdates() in all cases, causing code such as the following, which clobbers the return address, to be generated: popl %edi popl %edi pushl %eax jmp tailcallee # TAILCALL To fix the problem, the entire stack adjustment is performed in X86ExpandPseudo::ExpandMI() for tail calls. Patch by Magnus Lång <margnus1@gmail.com> Differential Revision: http://reviews.llvm.org/D21325 llvm-svn: 275103	2016-07-11 21:03:03 +00:00
Alina Sbirlea	327955e057	Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains. Add additional parameters: AddressSpace, Alignment, Fast. Reviewers: llvm-commits, jlebar Subscribers: arsenm, mzolotukhin Differential Revision: http://reviews.llvm.org/D21935 llvm-svn: 275100	2016-07-11 20:46:17 +00:00
Michael Kuperstein	cfbac5f361	[X86] Disable FixupSetCC for CodeGenOpt::None It is an optimization pass, and should not run at -O0. Especially since Fast RA will not do the required register coalescing anyway, so it's a loss even from the optimization standpoint. This also works around (but doesn't quite fix) PR28489. llvm-svn: 275099	2016-07-11 20:40:44 +00:00
Chad Rosier	4f0dad1674	[IPRA] Properly compute register usage at call sites. Differential Revision: http://reviews.llvm.org/D21395 Patch by Vivek Pandya. PR28144 llvm-svn: 275087	2016-07-11 18:45:49 +00:00
Zhan Jun Liau	def708a0f9	[SystemZ] Recognize Load On Condition Immediate (LOCHI/LOCGHI) opportunities Summary: Add support for the z13 instructions LOCHI and LOCGHI which conditionally load immediate values. Add target instruction info hooks so that if conversion will allow predication of LHI/LGHI. Author: RolandF Reviewers: uweigand Subscribers: zhanjunl Committing on behalf of Roland. Differential Revision: http://reviews.llvm.org/D22117 llvm-svn: 275086	2016-07-11 18:45:03 +00:00
Jingyue Wu	641cfee976	[SLSR] Call getPointerSizeInBits with the correct address space. llvm-svn: 275083	2016-07-11 18:13:28 +00:00
Davide Italiano	e8ae0b5eb4	[PM/IPO] Port LowerTypeTests to the new PassManager. There's a little bit of churn in this patch because the initialization mechanism is now shared between the old and the new PM. Other than that, it's just a pretty mechanical translation. llvm-svn: 275082	2016-07-11 18:10:06 +00:00
Jacques Pienaar	c3a162c451	[lanai] Add more tests for assembly of conditional ALU ops llvm-svn: 275081	2016-07-11 17:58:16 +00:00
Dehao Chen	9232f98279	Implement callsite-hotness based inline cost for Sample-based PGO Summary: For sample-based PGO, using BFI to calculate callsite count is sometimes not accurate. This is because with a sampling-based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A10 are all 100% taken, and callsite has 1000 samples and thus is considered hot. Because the loop's trip count is huge, it's normal that all branches outside the loop have no samples at all. As a result, we can only use static branch probability to derive the frequency of the loop header. Assuming that the static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get a more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073	2016-07-11 16:48:54 +00:00
Dehao Chen	29d2641f52	Tune the weight propagation algorithm for sample profile. Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22180 llvm-svn: 275072	2016-07-11 16:40:17 +00:00
Sanjay Patel	8f1d408c74	[x86] make some of the tests 256-bit for testing diversity llvm-svn: 275070	2016-07-11 15:08:37 +00:00
Nirav Dave	8603062ee4	Fix branch relaxation in 16-bit mode. Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 llvm-svn: 275068	2016-07-11 14:23:53 +00:00
Sanjay Patel	b428951990	[x86] specify triple to avoid bot failures llvm-svn: 275067	2016-07-11 14:17:54 +00:00
Nicolai Haehnle	889a20cf40	[Sink] Don't move calls to readonly functions across stores Summary: Reviewers: hfinkel, majnemer, tstellarAMD, sunfish Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17279 llvm-svn: 275066	2016-07-11 14:11:51 +00:00
Sanjay Patel	0d38830aca	[x86] update checks llvm-svn: 275064	2016-07-11 14:07:31 +00:00
Nirav Dave	53a72f4d3c	Provide support for preserving assembly comments Preserve assembly comments from input in output assembly and flags to toggle property. This is on by default for inline assembly and off in llvm-mc. Parsed comments are emitted immediately before an EOL which generally places them on the expected line. Reviewers: rtrieu, dwmw2, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20020 llvm-svn: 275058	2016-07-11 12:42:14 +00:00
Artem Tamazov	53c9de08d2	[AMDGPU][llvm-mc] Quickfix for r272748 to enable labels in branch instructions. Fixes issue mentioned at: https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra/issues/13. Lit tests added. Differential Revision: http://reviews.llvm.org/D22133 llvm-svn: 275054	2016-07-11 12:07:18 +00:00
Zlatko Buljan	cba9f80ba8	[mips][microMIPS] Implement LDC1, SDC1, LDC2, SDC2, LWC1, SWC1, LWC2 and SWC2 instructions and add CodeGen support Differential Revision: http://reviews.llvm.org/D18824 llvm-svn: 275050	2016-07-11 07:41:56 +00:00
Elena Demikhovsky	d84f337953	AVX-512: DAG lowering for scalar MIN/MAX commutable ops DAG lowering was missing for the scalar FMINC, FMAXC nodes. The nodes are generated only in the "unsafe-fp-math" mode. Added tests. llvm-svn: 275048	2016-07-11 06:08:06 +00:00
Craig Topper	7ee070e7bc	[AVX512] Add support for 512-bit ANDN now that all ones build vectors survive long enough to allow the matching. llvm-svn: 275046	2016-07-11 05:36:53 +00:00
Craig Topper	516e14cd8e	[AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors. llvm-svn: 275045	2016-07-11 05:36:48 +00:00
Hal Finkel	02012bcfee	Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot. llvm-svn: 275042	2016-07-11 04:51:23 +00:00
Hal Finkel	2cac58f604	Pointer-comparison folding should look through returned-argument functions For functions which are known to return a specific argument, pointer-comparison folding can look through the function calls as part of its analysis. Differential Revision: http://reviews.llvm.org/D9387 llvm-svn: 275039	2016-07-11 03:37:59 +00:00
Hal Finkel	bf3957a553	Teach isDereferenceablePointer to look through returned-argument functions For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 llvm-svn: 275038	2016-07-11 03:08:49 +00:00
Hal Finkel	e186debb8b	Teach SCEV to look through returned-argument functions When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 llvm-svn: 275037	2016-07-11 02:48:23 +00:00
Hal Finkel	6fd5e1f02b	Teach computeKnownBits to look through returned-argument functions If a function is known to return one of its arguments, we can use that in order to compute known bits of the return value. Differential Revision: http://reviews.llvm.org/D9397 llvm-svn: 275036	2016-07-11 02:25:14 +00:00
Hal Finkel	5c12d8fe8f	BasicAA should look through functions with returned arguments Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't lose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 llvm-svn: 275035	2016-07-11 01:32:20 +00:00
Hal Finkel	d66a7b05db	Let FuncAttrs infer the 'returned' argument attribute A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. Differential Revision: http://reviews.llvm.org/D22202 llvm-svn: 275027	2016-07-10 22:02:55 +00:00
Jan Vesely	2fa28c330c	AMDGPU/R600: Add implicitarg.ptr intrinsic Differential Revision: http://reviews.llvm.org/D21622 llvm-svn: 275024	2016-07-10 21:20:29 +00:00
Simon Pilgrim	2191faa433	[X86][SSE] Add support for target shuffle combining to PSHUFLW/PSHUFHW llvm-svn: 275022	2016-07-10 21:02:47 +00:00
Sanjay Patel	ccd08fc8c4	[x86, SSE, AVX] add tests for icmp+zext (PR28484) Note the inconsistent vpbroadcast generation for AVX2; another bug. llvm-svn: 275020	2016-07-10 20:45:14 +00:00
Simon Pilgrim	51c786bd91	[X86][SSE] Added tests for combining shuffles to PSHUFLW/PSHUFHW llvm-svn: 275019	2016-07-10 20:19:56 +00:00
Marcin Koscielnicki	cf7cc724a7	[SystemZ] Utilize Test Data Class instructions. This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 llvm-svn: 275016	2016-07-10 14:41:22 +00:00
Craig Topper	0b0954570a	[AVX512] Add support for lowering to 512-bit SHUFPS. llvm-svn: 275011	2016-07-10 05:55:53 +00:00
Sean Silva	db90d4d9c1	[PM] Port LoopVectorize to the new PM. llvm-svn: 275000	2016-07-09 22:56:50 +00:00
Simon Pilgrim	606126e848	[X86][SSE] Add support for target shuffle combining to INSERTPS llvm-svn: 274990	2016-07-09 21:47:55 +00:00
Simon Pilgrim	890b415902	[X86][SSE] Regenerate vector shift tests llvm-svn: 274987	2016-07-09 20:55:20 +00:00
David Majnemer	28c3646f82	[COFF, Dwarf] Don't emit DW_AT_location for dllimported entities There exists no relocation which can describe the address of a dllimported variable: do not try to describe their location. llvm-svn: 274986	2016-07-09 20:47:48 +00:00
Jingyue Wu	debce55ac3	[SLSR] Fix crash on handling 128-bit integers. ConstantInt::getSExtValue may fail on >64-bit integers. Add checks to call getSExtValue only on narrow integers. As a minor aside, simplify slsr-gep.ll to remove unnecessary load instructions. llvm-svn: 274982	2016-07-09 19:13:18 +00:00
Jacques Pienaar	b32a912f72	[lanai] Treat .t as optional in assembly parser for RR operands and add predicate operand to ShiftRR llvm-svn: 274980	2016-07-09 18:26:04 +00:00
Matt Arsenault	c1e6a45f2e	AMDGPU: Merge / reorganize tests llvm-svn: 274972	2016-07-09 08:02:28 +00:00
Matt Arsenault	b2cb5f8105	AMDGPU: Simplify tests with per function subtargets llvm-svn: 274971	2016-07-09 07:55:03 +00:00
Matt Arsenault	dfec5ce032	AMDGPU: Fix fdiv lowering when f32 denormals supported Also fix test not actually using function labels. llvm-svn: 274969	2016-07-09 07:48:11 +00:00
Craig Topper	70610cf7b6	[X86] Remove and autoupgrade 512-bit non-temporal store intrinsics. llvm-svn: 274966	2016-07-09 04:38:27 +00:00
Davide Italiano	92b933a55c	[PM] Port CrossDSOCFI to the new pass manager. llvm-svn: 274962	2016-07-09 03:25:35 +00:00
Davide Italiano	cd96cfd8df	[PM] Port LoopSimplify to the new pass manager. While here move simplifyLoop() function to the new header, as suggested by Chandler in the review. Differential Revision: http://reviews.llvm.org/D21404 llvm-svn: 274959	2016-07-09 03:03:01 +00:00
Matt Arsenault	1322b6f8bb	AMDGPU: Improve offset folding for register indexing llvm-svn: 274954	2016-07-09 01:13:56 +00:00
Matthias Braun	152e7c8b12	VirtRegMap: Replace some identity copies with KILL instructions. An identity COPY like this: %AL = COPY %AL, %EAX<imp-def> has no semantic effect, but encodes liveness information: Further users of %EAX only depend on this instruction even though it does not define the full register. Replace the COPY with a KILL instruction in those cases to maintain this liveness information. (This reverts a small part of r238588 but this time adds a comment explaining why a KILL instruction is useful). llvm-svn: 274952	2016-07-09 00:19:07 +00:00
Piotr Padlewski	7a298c1df0	Added REQUIRES to TestingGuide documentation Reviewers: alexfh, wolfgangp, rengolin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22172 llvm-svn: 274949	2016-07-08 23:47:29 +00:00
Piotr Padlewski	3b77612839	Add 'thinlto_src_module' md with asserts or -enable-import-metadata Summary: This way the metadata will be only generated when asserts enabled, or when -enable-import-metadata specified FIXED missing colon on requires. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D22167 llvm-svn: 274947	2016-07-08 23:01:49 +00:00
Piotr Padlewski	d4b792346c	Revert "Add 'thinlto_src_module' md with asserts or -enable-import-metadata" Reverting because of 17463 http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17463 This reverts commit d20cb431bba2ba43b4c65a8556cff445bfefbb7c. llvm-svn: 274946	2016-07-08 22:55:48 +00:00
Jacques Pienaar	9e70127b0a	[lanai] Update test to use peephole-opt and not peephole-opts llvm-svn: 274945	2016-07-08 22:28:29 +00:00
Anna Thomas	9ad45adfd7	Revert "InstCombine rule to fold truncs whose value is available" This reverts commit r274853. Caused failure in ppcBE build llvm-svn: 274943	2016-07-08 22:15:08 +00:00
David Majnemer	230bbfbeec	[MC, COFF] Permit a variable to be redefined Our assertions in WinCOFFStreamer had unexpected side effects resulting in symbols getting unexpectedly marked as used. This fixes PR28462. llvm-svn: 274941	2016-07-08 21:54:16 +00:00
Piotr Padlewski	d6efefa2b8	Add 'thinlto_src_module' md with asserts or -enable-import-metadata Summary: This way the metadata will be only generated when asserts enabled, or when -enable-import-metadata specified Reviewers: tejohnson, eraman, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D22167 llvm-svn: 274938	2016-07-08 21:25:39 +00:00
Matt Arsenault	3fb8f9eabf	Reapply r274829 with fix for FP vectors llvm-svn: 274937	2016-07-08 21:25:33 +00:00
Adam Nemet	f836067cc0	[LAA] Port test to the new PM This is a follow-on to r274452. The LAA with the new PM is a loop pass so we go from inner to outer loops. Also using a CHECK-NOT didn't make much sense because we print something in either case; whether an invariant is 'found' or 'not found'. llvm-svn: 274935	2016-07-08 21:24:06 +00:00
Sanjay Patel	664514f7fe	[InstCombine] don't form select from bitcasted logic ops if bitcasts have >1 use This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?), but we'll try to err on the conservative side by going with the case that has less IR instructions. Note: This question came up in http://reviews.llvm.org/D22114 , but this part is independent of that patch proposal, so I'm making this small change ahead of that one. See also: http://reviews.llvm.org/rL274926 llvm-svn: 274932	2016-07-08 21:17:51 +00:00
Sanjay Patel	5246482c7a	add another multi-use test for logic->select transform llvm-svn: 274929	2016-07-08 21:08:16 +00:00
Sanjay Patel	f4a08ede03	[InstCombine] don't form select from logic ops if it's unlikely that we'll eliminate any ops llvm-svn: 274926	2016-07-08 20:53:29 +00:00
Sanjay Patel	297a0e67b6	adjust test so it won't completely optimize away llvm-svn: 274925	2016-07-08 20:35:53 +00:00
Sanjay Patel	0733e6b61c	add tests for multi-use folding to select llvm-svn: 274922	2016-07-08 20:22:27 +00:00
Dehao Chen	429f5c735f	Remove inline hints computation from SampleProfile.cpp Summary: As we will move to use uniformed hotness check in inliner, we do not need inline hints in SampleProfile pass any more. Reviewers: dnovillo, davidxl Subscribers: eraman, llvm-commits Differential Revision: http://reviews.llvm.org/D19287 llvm-svn: 274918	2016-07-08 20:12:44 +00:00
Nico Weber	28410c6846	Revert r274829, it caused PR28472. llvm-svn: 274916	2016-07-08 19:52:19 +00:00
Simon Pilgrim	0a0e0d4e8e	[X86] Regenerated bitreverse tests to demonstrate what is going on. llvm-svn: 274915	2016-07-08 19:51:08 +00:00
Simon Pilgrim	aaaeedb8cb	[X86] Added bitreverse tests for non-legal types Requested on D21578 llvm-svn: 274914	2016-07-08 19:48:33 +00:00
Simon Pilgrim	950419f948	[X86][AVX2] Add support for target shuffle combining to VPERMPD/VPERMQ llvm-svn: 274908	2016-07-08 19:23:29 +00:00
Davide Italiano	d555bde59f	[SCCP] Fold constants as we build them when visiting cast instructions. This should be slightly more efficient and could avoid spurious overdefined markings, as Eli pointed out. Differential Revision: http://reviews.llvm.org/D22122 llvm-svn: 274905	2016-07-08 19:13:40 +00:00
Sanjay Patel	1b6b824548	[InstCombine] check for one-use before turning simple logic op into a select llvm-svn: 274891	2016-07-08 17:26:47 +00:00
Simon Pilgrim	4ca42e232d	[SLPVectorizer][X86] Added fma vectorization tests llvm-svn: 274889	2016-07-08 17:19:13 +00:00
Sanjay Patel	910ce0d511	add test to show multi-use output llvm-svn: 274887	2016-07-08 17:12:27 +00:00
Simon Pilgrim	b600ba3b79	[X86][AVX] Added combine test that should simplify to insertps llvm-svn: 274884	2016-07-08 17:01:42 +00:00
Sanjay Patel	cbfca9e8ef	[InstCombine] allow or(sext(A), B) --> A ? -1 : B transform for vectors llvm-svn: 274883	2016-07-08 17:01:15 +00:00
Zhan Jun Liau	7d4d436c74	[SystemZ] Add support for the .word directive. Summary: Branch off the work to add support for the .word directive, using addAliasForDirective. Reviewers: koriakin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22142 llvm-svn: 274878	2016-07-08 16:50:02 +00:00
Sanjay Patel	647174c8a4	add vector tests to show missing transform llvm-svn: 274876	2016-07-08 16:39:53 +00:00
Matt Arsenault	44540a3db2	PeepholeOptimizer: Make pass name match DEBUG_TYPE llvm-svn: 274874	2016-07-08 16:29:11 +00:00
Zhan Jun Liau	3b4c3f4d51	[SystemZ] Add support for missing instructions Summary: Add support to allow clang integrated assembler to recognize some missing instructions, for openssl. Instructions are: LM, LMH, LMY, STM, STMH, STMY, ICM, ICMH, ICMY, SLA, SLAK, TML, TMH, EX, EXRL. Reviewers: uweigand Subscribers: koriakin, llvm-commits Differential Revision: http://reviews.llvm.org/D22050 llvm-svn: 274869	2016-07-08 16:18:40 +00:00
Sanjay Patel	46df968326	minimize tests The cmp and load aren't required. llvm-svn: 274864	2016-07-08 16:11:48 +00:00
Sanjay Patel	e1acad9b61	regenerate checks llvm-svn: 274860	2016-07-08 16:06:38 +00:00
Chris Dewhurst	3202f065b8	[Sparc] Leon errata fix passes. Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor. The nature of the errata are listed in the comments preceding the errata fix passes. Relevant unit tests are implemented for each of these. Note: Running clang-format has changed a few other lines too, unrelated to the implemented errata fixes. These have been left in as this keeps the code formatting consistent. Differential Revision: http://reviews.llvm.org/D21960 llvm-svn: 274856	2016-07-08 15:33:56 +00:00
Sjoerd Meijer	1ee119f897	Do not expand SDIV when compiling for minimum code size Differential Revision: http://reviews.llvm.org/D22139 llvm-svn: 274855	2016-07-08 15:32:01 +00:00
Anna Thomas	3124f6273a	InstCombine rule to fold truncs whose value is available We can fold truncs whose operand feeds from a load, if the trunc value is available through a prior load/store. This change is from: http://reviews.llvm.org/D21246, which folded the trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW call, when the load type didn't match the prior load/store type. Differential Revision: http://reviews.llvm.org/D21791 llvm-svn: 274853	2016-07-08 15:18:56 +00:00
Valery Pykhtin	68853ab2c5	[AMDGPU] fix ds_swizzle_b32 opcode for VI (bz 28371) Differential Revision: http://reviews.llvm.org/D22049 llvm-svn: 274852	2016-07-08 15:12:46 +00:00
Sjoerd Meijer	46c4c3d31c	Addressing post-commit comments regarding not expanding UDIV; we don't expand only when compiling for minimum code size. llvm-svn: 274847	2016-07-08 14:17:09 +00:00
Simon Pilgrim	4f1877fb57	[X86][SSE] Improve constant folding tests for CVTSD/CVTSS/CVTTSD/CVTTSS As discussed on D22106, improve the testing for constant folding sse scalar conversion intrinsics to ensure we are correctly handling special/out of range cases llvm-svn: 274846	2016-07-08 13:28:34 +00:00
Sjoerd Meijer	a625af3feb	Code size optimisation: don't expand a div to a mul and a shift sequence. As a result, the urem instruction will not be expanded to a sequence of umull, lsrs, muls and sub instructions, but just a call to __aeabi_uidivmod. Differential Revision: http://reviews.llvm.org/D22131 llvm-svn: 274843	2016-07-08 12:54:43 +00:00
Simon Pilgrim	828c731880	[X86][SSE] Accept any shuffle mask that is all zeroes Until we have a better way to extract constants through bitcasted build vectors (and how to handle undefs of partial lanes etc.) at least accept build vectors that are all zeroes. llvm-svn: 274833	2016-07-08 10:39:12 +00:00
Matt Arsenault	c3a6fe6ecd	Bug 28444: Fix assertion when extract_vector_elt has mismatched type For some reason extract_vector_elt is sometimes allowed to have a different result type than the vector element type. llvm-svn: 274829	2016-07-08 07:05:00 +00:00
Craig Topper	f7bf6de0af	[AVX512] Remove and autoupgrade a duplicate set of 512-bit masked shift intrinsics. I'm not sure if clang ever used these builtin names or not. llvm-svn: 274827	2016-07-08 06:14:47 +00:00
Wei Mi	90d195a5fd	[PM] Port UnreachableBlockElim to the new Pass Manager Differential Revision: http://reviews.llvm.org/D22124 llvm-svn: 274824	2016-07-08 03:32:49 +00:00
Saleem Abdulrasool	eb059b0e0a	ARM: support high registers in __builtin_longjmp on WoA Windows on ARM uses a pure thumb-2 environment. This means that it can select a high register when doing a __builtin_longjmp. We would use a tLDRi which would truncate the register to a low register. Use a t2LDRi12 to get the full register file access. Tweak the code to just load into PC, as that is an interworking branch on all supported cores anyways. llvm-svn: 274815	2016-07-08 00:48:22 +00:00
Andrew Kaylor	3387074ae9	Temporarily remove a test case to unblock PPC bots. llvm-svn: 274813	2016-07-08 00:35:39 +00:00
Andrew Kaylor	8b8805c94c	Temporarily remove one test run line to unblock PPC bots. llvm-svn: 274812	2016-07-08 00:32:58 +00:00
Jacques Pienaar	6d3eecc843	[lanai] Use peephole optimizer to generate more conditional ALU operations. Summary: * Similar to the ARM backend, use the peephole optimizer to generate more conditional ALU operations; * Add predicated type with default always true to RR instructions in LanaiInstrInfo.td; * Move LanaiSetflagAluCombiner into optimizeCompare; * The ASM parser can currently only handle explicitly specified CC, so specify ".t" (true) where needed in the ASM test; * Remove unused MachineOperand flags; Reviewers: eliben Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D22072 llvm-svn: 274807	2016-07-07 23:36:04 +00:00
Michael Kuperstein	3e3652aef2	Recommit r274692 - [X86] Transform setcc + movzbl into xorl + setcc xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD) which was not appreciated by fast regalloc on 32-bit. llvm-svn: 274802	2016-07-07 22:50:23 +00:00
Vedant Kumar	0fdffd3709	[tsan] Try harder to not instrument gcov counters GCOVProfiler::emitProfileArcs() can create many variables with names starting with "__llvm_gcov_ctr", so llvm appends a numeric suffix to most of them. Teach tsan about this. llvm-svn: 274801	2016-07-07 22:45:28 +00:00
Kevin Enderby	1851a827a0	Add checks to the MachOObjectFile() constructor to make sure load commands sizes are the correct multiple. llvm-svn: 274798	2016-07-07 22:11:42 +00:00
Davide Italiano	16284df8ec	[PM] Port InstSimplify to the new pass manager. llvm-svn: 274796	2016-07-07 21:14:36 +00:00
Anna Thomas	6a78c78a03	[DSE] Remove dead stores in end blocks containing fence We can remove dead stores in the presence of fence instructions. Fence does not change an otherwise thread local store to visible. reviewers: reames, dexonsmith, jfb Differential Revision: http://reviews.llvm.org/D22001 llvm-svn: 274795	2016-07-07 20:51:42 +00:00
Chad Rosier	112d0e996b	[AArch64] Change the preferred alignment for char and short to word alignment. The commit reinstates r273279, which was informally approved. Original Review: http://reviews.llvm.org/D21414 This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1. llvm-svn: 274790	2016-07-07 20:02:18 +00:00
Andrew Kaylor	65fa0704aa	Include SelectionDAGISel in the opt-bisect process Differential Revision: http://reviews.llvm.org/D21143 llvm-svn: 274786	2016-07-07 18:55:02 +00:00
Peter Collingbourne	73589f321b	ThinLTO: Do not take into account whether a definition has multiple copies when promoting. We currently do not touch a symbol's linkage in the case where a definition has a single copy. However, this code is effectively unnecessary: either the definition is not exported, in which case the internalize phase sets its linkage to internal, or it is exported, in which case we need to promote linkage to weak. Those two cases are already handled by existing code. I believe that the only real functional change here is in the case where we have a single definition which does not prevail (e.g. because the definition in a native object file prevails). In that case we now lower linkage to available_externally following the existing code path for that case. As a result we can remove the isExported function parameter from the thinLTOResolveWeakForLinkerInIndex function. Differential Revision: http://reviews.llvm.org/D21883 llvm-svn: 274784	2016-07-07 18:31:51 +00:00
Tim Northover	1d106c5fc2	tests: accept different TargetOpcode values. These tests don't actually care about the internal opcode number, but have to be updated whenever we add a new one for GlobalISel. That's bad. llvm-svn: 274774	2016-07-07 17:51:42 +00:00
Michael Kuperstein	edb38a94f8	Revert r274692 to check whether this is what breaks windows selfhost. llvm-svn: 274771	2016-07-07 16:55:35 +00:00
Justin Bogner	a466cc33fa	NVPTX: Remove the legacy ptx intrinsics - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0. There's a fair amount of code motion here, but it's all very mechanical. llvm-svn: 274769	2016-07-07 16:40:17 +00:00
Chad Rosier	3972953efd	Revert "[AArch64] Change the preferred alignment for char and short to word alignment" This reverts commit r273279 as the change was not properly approved. llvm-svn: 274768	2016-07-07 16:37:29 +00:00
Valery Pykhtin	af8b1bddbd	[AMDGPU] fix ds_write_src2 encoding (bz26027) Differential revision: http://reviews.llvm.org/D22041 llvm-svn: 274756	2016-07-07 14:23:38 +00:00
Rafael Espindola	b34cba97b7	Don't crash trying to relax 32 loads on COFF. Fixes pr28452. llvm-svn: 274754	2016-07-07 14:00:07 +00:00
Sjoerd Meijer	17c08dc701	Code size optimisation: don't rewrite fputs to fwrite when optimising for size because fwrite requires more arguments and thus extra MOVs are required. llvm-svn: 274753	2016-07-07 13:56:23 +00:00
David Majnemer	7afb46d3c8	[LoopAccessAnalysis] Fix an integer overflow We were inappropriately using 32-bit types to account for quantities that can be far larger. Fixed in PR28443. llvm-svn: 274737	2016-07-07 06:24:36 +00:00
Craig Topper	d5d2a35013	[AVX512] Zero extend the result of vpcmpeq/vpcmpgt and similar intrinsics in the autoupgrade code. This currently results in worse codegen but is needed for correctness. llvm-svn: 274736	2016-07-07 06:11:07 +00:00
Elena Demikhovsky	fc1e969dfc	Fixed a bug in vectorizing GEP before gather/scatter intrinsic. Vectorizing GEP was incorrect and broke SSA in some cases. The patch fixes PR27997 https://llvm.org/bugs/show_bug.cgi?id=27997. Differential revision: http://reviews.llvm.org/D22035 llvm-svn: 274735	2016-07-07 06:06:46 +00:00
David Majnemer	a54fe1acdc	[CodeView] Implement support for thread-local variables llvm-svn: 274734	2016-07-07 05:14:21 +00:00
Qin Zhao	c35b2cba6f	[esan:cfrag] Add option -esan-aux-field-info Summary: Adds option -esan-aux-field-info to control generating binary with auxiliary struct field information. Extracts code for creating auxiliary information from createCacheFragInfoGV into createCacheFragAuxGV. Adds test struct_field_small.ll for -esan-aux-field-info test. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D22019 llvm-svn: 274726	2016-07-07 03:20:16 +00:00
Peter Collingbourne	730c82e6b8	ThinLTO: Remove check for multiple modules before applying weak resolutions. This check is not only unnecessary, it can produce the wrong result. If we are linking a single module and it has an exported linkonce symbol, we need to promote to weak in order to avoid PR19901-style problems. Differential Revision: http://reviews.llvm.org/D21917 llvm-svn: 274722	2016-07-07 01:51:11 +00:00
Sean Silva	284b0324e2	[PM] Avoid getResult on a higher level in LoopAccessAnalysis Note that require<domtree> and require<loops> aren't needed because they come in implicitly via the loop pass manager. llvm-svn: 274712	2016-07-07 01:01:53 +00:00
Sean Silva	59fe82f4ce	[PM] Port TailCallElim llvm-svn: 274708	2016-07-06 23:48:41 +00:00
Sean Silva	b025d375a1	[PM] Port CorrelatedValuePropagation llvm-svn: 274705	2016-07-06 23:26:29 +00:00
Peter Collingbourne	d1d2614ee1	ThinLTO: Add test cases for promote+internalize. This tests the effect of both promotion and internalization on a module, and helps show that D21883 is NFC wrt promotion+internalization. Differential Revision: http://reviews.llvm.org/D21915 llvm-svn: 274699	2016-07-06 22:53:02 +00:00
Sanjay Patel	65a51c25c1	[InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we allow these transforms for splat vectors. Differential Revision: http://reviews.llvm.org/D21899 llvm-svn: 274696	2016-07-06 22:23:01 +00:00
Manman Ren	524ca27b90	Add testing coverage for r274582. llvm-svn: 274693	2016-07-06 22:01:28 +00:00
Michael Kuperstein	1ef6c59b1d	[X86] Transform setcc + movzbl into xorl + setcc xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. Differential Revision: http://reviews.llvm.org/D21774 llvm-svn: 274692	2016-07-06 21:56:18 +00:00
Vedant Kumar	4c01092a25	[llvm-cov] Add support for creating html reports Based on a patch by Harlan Haskins! Differential Revision: http://reviews.llvm.org/D18278 llvm-svn: 274688	2016-07-06 21:44:05 +00:00
Matthias Braun	ad0032a649	AArch64: Change modeling of zero cycle zeroing. On CPUs with the zero cycle zeroing feature enabled "movi v.2d" should be used to zero a vector register. This was previously done at instruction selection time, however the register coalescer sometimes widened multiple vregs to the Q width because of that leading to extra spills. This patch leaves the decision on how to zero a register to the AsmPrinter phase where it doesn't affect register allocation anymore. This patch also sets isAsCheapAsAMove=1 on FMOVS0, FMOVD0. This fixes http://llvm.org/PR27454, rdar://25866262 Differential Revision: http://reviews.llvm.org/D21826 llvm-svn: 274686	2016-07-06 21:39:33 +00:00
Chad Rosier	232e29ebea	[MemorySSA] Reinstate the legacy printer and verifier. Differential Revision: http://reviews.llvm.org/D22058 llvm-svn: 274679	2016-07-06 21:20:47 +00:00
Rafael Espindola	a29971faeb	Add initial support for R_386_GOT32X. This adds it only for movl mov@GOT(%reg), %reg. llvm-svn: 274678	2016-07-06 21:19:11 +00:00
David Majnemer	7abd269aa9	[CodeView] Emit an appropriate symbol kind for globals We emitted debug info for globals/functions as if they all had external linkage. Instead, emit local symbol records when appropriate. llvm-svn: 274676	2016-07-06 21:07:47 +00:00
David Majnemer	e1e7372e93	[CodeView] Unions are always sealed It is impossible to inherit from a union. We are missing a way to represent this in IR for classes/structs... llvm-svn: 274675	2016-07-06 21:07:42 +00:00
Justin Lebar	6f9d01bbd5	[NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D22068 llvm-svn: 274674	2016-07-06 21:06:10 +00:00
Haicheng Wu	a95cd1267f	[LIR] Fix mis-compilation with unwinding. To fix PR27859, bail out if there is an instruction that may throw. Differential Revision: http://reviews.llvm.org/D20638 llvm-svn: 274673	2016-07-06 21:05:40 +00:00
Piotr Padlewski	6deaa6afae	Add 'thinlto_src_module' metadata to imported function Added metadata to be able to make statistics on how many functions that have been imported have been removed. Also module name might be helpful when debugging. Reviewers: tejohnson, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D21943 llvm-svn: 274668	2016-07-06 20:26:25 +00:00
Derek Bruening	d712a3c10e	[esan|wset] Fix incorrect memory size assert Summary: Fixes an incorrect assert that fails on 128-bit-sized loads or stores. Augments the wset tests to include this case. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D22062 llvm-svn: 274666	2016-07-06 20:13:53 +00:00
Justin Bogner	a463537a36	NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 Everywhere where cuda.syncthreads or __syncthreads is used, use the properly namespaced nvvm.barrier0 instead. llvm-svn: 274664	2016-07-06 20:02:45 +00:00
Adrian McCarthy	820ca5404c	Retry: "Emit CodeView type records for nested classes." Now with a corrected test to account for a recently supported properties bit in the debug info of a struct. Original review: http://reviews.llvm.org/D21939 This reverts commit 970c3fd497a28d25dd69526eb52594a696c37968. llvm-svn: 274661	2016-07-06 19:49:51 +00:00
Chad Rosier	dcfce2d0ec	[DSE] Avoid iterator invalidation bugs. The dse_with_dbg_value.ll test committed with r273141 is removed because we no longer perform any type of backtracking, which is what was causing the codegen differences with and without debug information. Differential Revision: http://reviews.llvm.org/D21613 llvm-svn: 274660	2016-07-06 19:48:52 +00:00
Sanjay Patel	04b3496d9b	[x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434) This is "cvtdq2ps" which does not appear to be particularly slow on any CPU according to Agner's tables. Choosing "5" as a cost here as suggested in: https://llvm.org/bugs/show_bug.cgi?id=21356 ...but it seems very conservative given that the instruction is fully pipelined, and I think these costs are supposed to model throughput. Note that related costs are also most likely too high, but this fixes PR21356 and partly fixes PR28434. llvm-svn: 274658	2016-07-06 19:15:54 +00:00
Sean Silva	f50d4b6cdc	Work around PR28400 a bit harder. We were still crashing in the "no change" case because LVI was not getting invalidated. See the thread "Should analyses be able to hold AssertingVH to IR? (related to PR28400)" for more discussion. llvm-svn: 274656	2016-07-06 19:05:41 +00:00
Elliot Colp	bc2cfc2291	[SystemZ] Remove AND mask of bottom 6 bits when result is used for shift/rotate On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount. Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we can remove the AND operation entirely. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 274650	2016-07-06 18:13:11 +00:00
Zachary Turner	8848a7a6b2	[pdb] Round trip the PDB stream between YAML and binary PDB. This gets writing of the PDB stream working. llvm-svn: 274647	2016-07-06 18:05:57 +00:00
Kit Barton	f9d0a40573	Ensure all uses of permute instructions feed vector stores There is a problem in VSXSwapRemoval where it is incorrectly removing permute instructions. In this case, the permute is feeding both a vector store and also a non-store instruction. In this case, the permute cannot be removed. The fix is to simply look at all the uses of the vector register defined by the permute and ensure that all the uses are vector store instructions. This problem was reported in PR 27735 (https://llvm.org/bugs/show_bug.cgi?id=27735). Test case based on the original problem reported. Phabricator Review: http://reviews.llvm.org/D21802 llvm-svn: 274645	2016-07-06 18:03:52 +00:00
Tim Shen	1c3c0afc53	[DAGCombiner] Fix visitSTORE to continue processing current SDNode, if findBetterNeighborChains doesn't actually CombineTo it. Summary: findBetterNeighborChains may or may not find a better chain for each node it finds, which include the node ("St") that visitSTORE is currently processing. If no better chain is found for St, visitSTORE should continue instead of return SDValue(St, 0), as if it's CombinedTo'ed. This fixes bug 28130. There might be other ways to make the test pass (see D21409). I think both of the patches are fixing actual bugs revealed by the same testcase. Reviewers: echristo, wschmidt, hfinkel, kbarton, amehsan, arsenm, nemanjai, bogner Subscribers: mehdi_amini, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D21692 llvm-svn: 274644	2016-07-06 17:44:03 +00:00
Michael Kuperstein	aa71bdd3af	[TTI] The cost model should not assume vector casts get completely scalarized The cost model should not assume vector casts get completely scalarized, since on targets that have vector support, the common case is a partial split up to the legal vector size. So, when a vector cast gets split, the resulting casts end up legal and cheap. Instead of pessimistically assuming scalarization, base TTI can use the costs the concrete TTI provides for the split vector, plus a fudge factor to account for the cost of the split itself. This fudge factor is currently 1 by default, except on AMDGPU where inserts and extracts are considered free. Differential Revision: http://reviews.llvm.org/D21251 llvm-svn: 274642	2016-07-06 17:30:56 +00:00
Adrian McCarthy	7649d8388a	Revert "Emit CodeView type records for nested classes." This reverts commit 256b29322c827a2d94da56468c936596f5509032. llvm-svn: 274632	2016-07-06 15:14:10 +00:00
Simon Pilgrim	118da63a9d	[X86][SSE] Added test cases for missed opportunities to combine pshufb to pslldq/psrldq llvm-svn: 274631	2016-07-06 15:09:48 +00:00
Adrian McCarthy	024a7b6358	Emit CodeView type records for nested classes. Differential Revision: http://reviews.llvm.org/D21939 llvm-svn: 274629	2016-07-06 14:47:32 +00:00
Matthew Simpson	433cb1dfe3	[LV] Don't widen trivial induction variables We currently always vectorize induction variables. However, if an induction variable is only used for counting loop iterations or computing addresses with getelementptr instructions, we don't need to do this. Vectorizing these trivial induction variables can create vector code that is difficult to simplify later on. This is especially true when the unroll factor is greater than one, and we create vector arithmetic when computing step vectors. With this patch, we check if an induction variable is only used for counting iterations or computing addresses, and if so, scalarize the arithmetic when computing step vectors instead. This allows for greater simplification. This patch addresses the suboptimal pointer arithmetic sequence seen in PR27881. Reference: https://llvm.org/bugs/show_bug.cgi?id=27881 Differential Revision: http://reviews.llvm.org/D21620 llvm-svn: 274627	2016-07-06 14:26:59 +00:00
Elena Demikhovsky	ad0a56f3da	Re-commit of 274613. The prev commit failed on compilation. A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure. llvm-svn: 274626	2016-07-06 14:15:43 +00:00
Sam Kolton	3c21a69077	[AMDGPU] Assembler: regression tests for bug 28413. NFC llvm-svn: 274623	2016-07-06 12:52:20 +00:00
Diana Picus	b772e409ba	[ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 2 flags. This is a follow-up for r273544. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. This commit also removes two command-line flags that weren't used in any of the tests: widen-vmovs and swift-partial-update-clearance. The former may be easily replaced with the mattr mechanism, but the latter may not (as it is a subtarget property, and not a proper feature). Differential Revision: http://reviews.llvm.org/D21797 llvm-svn: 274620	2016-07-06 11:22:11 +00:00
Elena Demikhovsky	02ced295aa	Reverted 274613 due to compilation failure. llvm-svn: 274615	2016-07-06 09:11:49 +00:00
Elena Demikhovsky	5a4f2476fd	AVX-512: Optimization for patterns with i1 scalar type The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc". I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction. I also changed zext, aext and trunc patterns in the .td file. This allows removing extra "kmov" instructions. This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173. Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails). Differential revision: http://reviews.llvm.org/D21956 llvm-svn: 274613	2016-07-06 09:01:20 +00:00
Nicolai Haehnle	e40530ea7b	AMDGPU: Fix return of non-void-returning shaders Summary: Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that ensures that a non-void-returning shader falls off the end of the last basic block was effectively disabled, since SI_RETURN is now used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21975 llvm-svn: 274612	2016-07-06 08:35:17 +00:00
Elena Demikhovsky	971fbfda1e	Vector GEP test: renamed + some comments Differential revision: http://reviews.llvm.org/D21957 llvm-svn: 274611	2016-07-06 08:11:23 +00:00
Daniel Berlin	fc7e651bfd	Fix handling of forward unreachable but reverse-reachable blocks in MemorySSA construction llvm-svn: 274606	2016-07-06 05:32:05 +00:00
George Burgess IV	bfa401e5ad	[CFLAA] Split into Anders+Steens analysis. StratifiedSets (as implemented) is very fast, but its accuracy is also limited. If we take a more aggressive andersens-like approach, we can be way more accurate, but we'll also end up being slower. So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA. Long-term, we want to end up in a place where CFLSteens is queried first; if it can provide an answer, great (since queries are basically map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc. This patch splits everything out so we can try to do something like that when we get a reasonable CFLAnders implementation. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21910 llvm-svn: 274589	2016-07-06 00:26:41 +00:00
Ryan Govostes	e51401bdab	[asan] Add a hidden option for Mach-O global metadata liveness tracking llvm-svn: 274578	2016-07-05 21:53:08 +00:00
Tim Northover	e6ae6767d9	AArch64: TableGenerate system instruction operands. The way the named arguments for various system instructions are handled at the moment has a few problems: - Large-scale duplication between AArch64BaseInfo.h and AArch64BaseInfo.cpp - That weird Mapping class that I have no idea what I was on when I thought it was a good idea. - Searches are performed linearly through the entire list. - We print absolutely all registers in upper-case, even though some are canonically mixed case (SPSel for example). - The ARM ARM specifies sysregs in terms of 5 fields, but those are relegated to comments in our implementation, with a slightly opaque hex value indicating the canonical encoding LLVM will use. This adds a new TableGen backend to produce efficiently searchable tables, and switches AArch64 over to using that infrastructure. llvm-svn: 274576	2016-07-05 21:23:04 +00:00
Balaram Makam	d4acd7ed10	Revert r259387: "AArch64: Implement missed conditional compare sequences." This reverts commit r259387 because it inserts illegal code after legalization in some backends where i64 OR type is illegal for example. llvm-svn: 274573	2016-07-05 20:24:05 +00:00
Simon Pilgrim	bec6543d17	[X86][AVX2] Add support for target shuffle combining to BROADCAST Only support broadcast from vector register so far - memory folding support will have to wait. llvm-svn: 274572	2016-07-05 20:11:29 +00:00
Simon Pilgrim	48adedffb7	[X86][AVX512] Fixed decoding of permd/permpd variable mask shuffles + enabled them for target shuffle combining Corrected element mask masking to extract the bottom index bits (now matches the perm2 implementation but for unary inputs). llvm-svn: 274571	2016-07-05 18:31:17 +00:00
Saleem Abdulrasool	4d950ef892	ARM: fix `-mlong-calls` for WoA Not all code-paths set the relocation model to static for Windows. This currently breaks on Windows ARM with `-mlong-calls` when built with clang. Loosen the assertion to what it was previously. We would ideally ensure that all the configuration sets Windows to static relocation model. llvm-svn: 274570	2016-07-05 18:30:52 +00:00
Matt Arsenault	2d79389508	DAGCombiner: Fold away vector extract of insert with the same index This only really matters when the index is non-constant since the constant case already gets taken care of by other combines. llvm-svn: 274569	2016-07-05 18:25:02 +00:00
Tim Northover	01dff9d18a	AArch64: use correct SDValue # when looking for bitfield placement. The other use really does only care about the SDNode (it checks the opcode against a whitelist), but bitFieldPlacement can be misled if the node produces multiple results. Patch by Ismail Badawi. llvm-svn: 274567	2016-07-05 18:02:57 +00:00
Matt Arsenault	ffc8275f2b	AMDGPU: Fix folding SGPRs into madak/madmk src0 Because of the special immediate operand, the constant bus is already used so SGPRs are never useful. r263212 changed the name of the immediate operand, which broke the verifier check for the restriction. llvm-svn: 274564	2016-07-05 17:09:01 +00:00
Sam Kolton	a9cd6aa895	[AMDGPU] Assembler: Fix parsing error with floating-point literals passed to integer instructions Differential Revision: http://reviews.llvm.org/D21972 llvm-svn: 274551	2016-07-05 14:01:11 +00:00
Simon Pilgrim	4e96fbf3c1	[X86][AVX512] Autoupgrade the BROADCAST intrinsics llvm-svn: 274550	2016-07-05 13:58:47 +00:00
Simon Pilgrim	1e91654b38	[X86][AVX512BW] Added BROADCAST intrinsics fast-isel generic IR tests llvm-svn: 274545	2016-07-05 13:16:05 +00:00
James Molloy	ae5ff990ae	[Thumb] Reapply r272251 with a fix for PR28348 (mk 2) The important thing I was missing was ensuring newly added constants were kept in topological order. Repositioning the node is correct if the constant is newly added (so it has no topological ordering) but wrong if it already existed - positioning it next in the worklist would break the topological ordering. Original commit message: [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 274543	2016-07-05 12:37:13 +00:00
Simon Pilgrim	20ede63a33	[X86][AVX512] Added BROADCAST intrinsics fast-isel generic IR tests llvm-svn: 274537	2016-07-05 10:15:14 +00:00
Nemanja Ivanovic	44513e545f	[PowerPC] - Legalize vector types by widening instead of integer promotion This patch corresponds to review: http://reviews.llvm.org/D20443 It changes the legalization strategy for illegal vector types from integer promotion to widening. This only applies for vectors with elements of width that is a multiple of a byte since we have hardware support for vectors with 1, 2, 4, 8 and 16 byte elements. Integer promotion for vectors is quite expensive on PPC due to the sequence of breaking apart the vector, extending the elements and reconstituting the vector. Two of these operations are expensive. This patch causes between minor and major improvements in performance on most benchmarks. There are very few benchmarks whose performance regresses. These regressions can be handled in a subsequent patch with a DAG combine (similar to how this patch handles int -> fp conversions of illegal vector types). llvm-svn: 274535	2016-07-05 09:22:29 +00:00
Simon Pilgrim	dea33cc2f3	[X86][AVX512] Added VSHUFPD intrinsics fast-isel generic IR tests llvm-svn: 274534	2016-07-05 09:10:07 +00:00
Simon Pilgrim	8a01915bd2	[X86][AVX512VL] Added VSHUFPD/VSHUFPS intrinsics fast-isel generic IR tests llvm-svn: 274533	2016-07-05 09:09:41 +00:00
Saleem Abdulrasool	4d3626ed31	test: relax the match on the timestamp llvm-svn: 274529	2016-07-05 01:14:53 +00:00
Saleem Abdulrasool	aecbdf70bf	Object: support empty UID/GID fields Normal archives do not have empty UID/GID fields. However, the Microsoft Import library format is a customized archive (it just uses an alternate symbol index format). When the import library is constructed by lib.exe, the UID and GID fields are left empty. Do not abort on such an input. llvm-svn: 274528	2016-07-05 00:23:05 +00:00
Simon Pilgrim	3ad040909a	[X86][AVX512] Add support for lowering shuffles to VSHUFPD llvm-svn: 274520	2016-07-04 20:41:24 +00:00
James Molloy	c3b4ed4a70	Revert "[Thumb] Reapply r272251 with a fix for PR28348" This reverts commit r274510 - it made green dragon unhappy. llvm-svn: 274512	2016-07-04 17:14:24 +00:00
James Molloy	9f019835ef	[Thumb] Reapply r272251 with a fix for PR28348 We were using DAG->getConstant instead of DAG->getTargetConstant. This meant that we could inadvertently increase the use count of a constant if stars aligned, which it did in this testcase. Increasing the use count of the constant could cause ISel to fall over (because DAGToDAG lowering assumed the constant had only one use!) Original commit message: [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 274510	2016-07-04 16:35:41 +00:00
Simon Pilgrim	02d435d2f4	[X86][AVX512] Autoupgrade the VPERMPD/VPERMQ intrinsics llvm-svn: 274506	2016-07-04 14:19:05 +00:00
Simon Pilgrim	8b82fce537	[X86][AVX512] Added VPERMPD/VPERMQ intrinsics fast-isel generic IR tests llvm-svn: 274503	2016-07-04 13:43:10 +00:00
Simon Pilgrim	9fca300cbe	[X86][AVX512] Autoupgrade the VPERMILPD/VPERMILPS intrinsics llvm-svn: 274498	2016-07-04 12:40:54 +00:00
Simon Pilgrim	c8cf2ddb6d	[X86][AVX512] Added VPERMILPD/VPERMILPS intrinsics fast-isel generic IR tests Added PSHUFD tests as well llvm-svn: 274493	2016-07-04 11:07:50 +00:00
Nicolai Haehnle	84c9f9919a	Add writeonly IR attribute Summary: This complements the earlier addition of IntrWriteMem and IntrWriteArgMem LLVM intrinsic properties, see D18291. Also start using the attribute for memset, memcpy, and memmove intrinsics, and remove their special-casing in BasicAliasAnalysis. Reviewers: reames, joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18714 llvm-svn: 274485	2016-07-04 08:01:29 +00:00
Craig Topper	d83f818a3e	[CodeGen] Make the code that detects if a shuffle is really a concatenation of the inputs more general purpose. We can now handle concatenation of each source multiple times. The previous code just checked for each source to appear once in either order. This also now handles an entire source vector sized piece having undef indices correctly. We now concat with UNDEF instead of using one of the sources. This is responsible for the test case change. llvm-svn: 274483	2016-07-04 06:19:35 +00:00
Simon Pilgrim	7f096de0b8	[X86][AVX512] Add support for 512-bit shuffle lowering to VPERMPD/VPERMQ llvm-svn: 274473	2016-07-03 19:50:06 +00:00
Craig Topper	d1eca0f32c	[CodeGen] Teach OR combine of shuffles involving zero vectors to better handle undef indices. Undef indices can now be treated as zeros. Or if it's undef ORed with zero, we will keep the undef. llvm-svn: 274472	2016-07-03 19:37:12 +00:00
Craig Topper	8e826d5abe	[X86] Add tests to show that the DAG combine for OR of shuffles with zero vectors doesn't handle undefs as well as it could. Fix coming in another commit. llvm-svn: 274471	2016-07-03 19:37:10 +00:00
Haicheng Wu	b71b2f622a	[MBB] add a missing corner case in UpdateTerminator() After block placement, if a block ends with a conditional branch but the next block is not its successor, the conditional branch should be changed to an unconditional branch. This patch fixes PR28307, PR28297, PR28402. Differential Revision: http://reviews.llvm.org/D21811 llvm-svn: 274470	2016-07-03 19:14:17 +00:00
Simon Pilgrim	68ea80649b	[X86][AVX512] Add support for VPERMPD/VPERMQ masked shuffle comments llvm-svn: 274469	2016-07-03 18:40:24 +00:00
Simon Pilgrim	a0d73835b2	[X86][AVX512] Add support for 512-bit shuffle decoding of VPERMPD/VPERMQ llvm-svn: 274468	2016-07-03 18:27:37 +00:00
Simon Pilgrim	dbd6db0dc7	[X86][AVX512] Add support for VPALIGNR/PSHUFD/PSHUFHW/PSHUFLW masked shuffle comments llvm-svn: 274466	2016-07-03 15:00:51 +00:00
Sanjay Patel	cbaac41856	[InstCombine] enable vector select of bools -> logic folds llvm-svn: 274465	2016-07-03 14:34:39 +00:00
Simon Pilgrim	598bdb6bfe	[X86][AVX512] Add support for UNPCK masked shuffle comments llvm-svn: 274464	2016-07-03 14:26:21 +00:00
Simon Pilgrim	1f59076196	[X86][AVX512] Add support for VPERM/VSHUF masked shuffle comments llvm-svn: 274462	2016-07-03 13:55:41 +00:00
Simon Pilgrim	68f438a036	[X86][AVX512] Add support for PMOVZX masked shuffle comments llvm-svn: 274461	2016-07-03 13:33:28 +00:00
Sanjay Patel	42396ae0ea	add vector bool select tests and regenerate checks for scalar bool select tests llvm-svn: 274460	2016-07-03 13:26:02 +00:00
Simon Pilgrim	7c2fbdc101	[X86][AVX512] Add support for masked shuffle comments This patch adds support for including the avx512 mask register information in the mask/maskz versions of shuffle instruction comments. This initial version just adds support for MOVDDUP/MOVSHDUP/MOVSLDUP to reduce the mass of test regenerations, other shuffle instructions can be added in due course. Differential Revision: http://reviews.llvm.org/D21953 llvm-svn: 274459	2016-07-03 13:08:29 +00:00
Simon Pilgrim	129b720c18	[X86][AVX512] Add support for lowering shuffles to VPERMILPS llvm-svn: 274458	2016-07-03 12:47:21 +00:00
Sean Silva	45835e731d	Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC. This actually uncovered a surprisingly large chain of ultimately unused TLI args. From what I can gather, this argument is a remnant of when isKnownNonNull would look at the TLI directly. The current approach seems to be that InferFunctionAttrs runs early in the pipeline and uses TLI to annotate the TLI-dependent non-null information as return attributes. This also removes the dependence of functionattrs on TLI altogether. llvm-svn: 274455	2016-07-02 23:47:27 +00:00
Xinliang David Li	8a021317a2	[PM] Port LoopAccessInfo analysis to new PM It is implemented as a LoopAnalysis pass as discussed and agreed upon. llvm-svn: 274452	2016-07-02 21:18:40 +00:00
Simon Pilgrim	99e8a1aa0b	[X86][AVX512] Add support for lowering shuffles to VPERMILPD llvm-svn: 274450	2016-07-02 20:20:12 +00:00
Simon Pilgrim	72052f6de9	[X86][AVX512VL] Add fast-isel MOVDDUP/MOVSLDUP/MOVSHDUP shuffle tests llvm-svn: 274448	2016-07-02 19:22:46 +00:00
Simon Pilgrim	cde7c54baa	[X86][AVX512] Add support for 512-bit PSHUFB lowering llvm-svn: 274444	2016-07-02 18:14:31 +00:00
Simon Pilgrim	77dda7c2e0	[X86][AVX512] Converted the MOVDDUP/MOVSLDUP/MOVSHDUP masked intrinsics to generic IR llvm-svn: 274443	2016-07-02 17:16:41 +00:00
Simon Pilgrim	19adee9d84	[X86][AVX512] Autoupgrade the MOVDDUP/MOVSLDUP/MOVSHDUP intrinsics llvm-svn: 274439	2016-07-02 14:42:35 +00:00
Simon Pilgrim	f040d8c061	[X86][AVX512] Add support for lowering shuffles to MOVDDUP/MOVSLDUP/MOVSHDUP llvm-svn: 274436	2016-07-02 12:45:03 +00:00
Simon Pilgrim	5e95390957	[X86][AVX512] Add test cases that should lower to MOVSLDUP/MOVSHDUP llvm-svn: 274435	2016-07-02 12:20:35 +00:00
Simon Pilgrim	a6f262a1f9	[X86][AVX512] Add fast-isel shuffle tests It's not worth trying to write out tests for all the avx512f builtins yet, just adding tests for lowering of generic IR as we transition to it (shuffles mainly right now). llvm-svn: 274434	2016-07-02 12:13:29 +00:00
Qin Zhao	b463c23c10	[esan\|cfrag] Add counters for struct array accesses Summary: Adds one counter to the struct counter array for counting struct array accesses. Adds instrumentation to insert counter update for struct array accesses. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D21594 llvm-svn: 274420	2016-07-02 03:25:37 +00:00
Michael Kuperstein	071d8306b0	[PM] Port ConstantHoisting to the new Pass Manager Differential Revision: http://reviews.llvm.org/D21945 llvm-svn: 274411	2016-07-02 00:16:47 +00:00
Reid Kleckner	e092dad72c	[codeview] Set the Nested and Scoped ClassOptions based on the scope chain These are set on both the declaration record and the definition record. llvm-svn: 274410	2016-07-02 00:11:07 +00:00
Matt Arsenault	accddacb70	TII: Fix inlineasm size counting comments as insts The main problem was counting comments on their own line as instructions. llvm-svn: 274405	2016-07-01 23:26:50 +00:00
David Majnemer	08bd744c2c	[CodeView] Include the offset of nested members Given something like: struct S { int a; struct { int b; }; }; We would fail to give 'b' offset 4. Instead, we would give it the offset it has inside of its struct. llvm-svn: 274400	2016-07-01 23:12:48 +00:00
David Majnemer	6bdc24e7b6	[CodeView] Pretty print anonymous scopes A namespace without a name should be written out as `anonymous namespace' while a tag type without a name should be written out as <unnamed-tag>. llvm-svn: 274399	2016-07-01 23:12:45 +00:00
Matt Arsenault	7f681ac7a9	AMDGPU: Add feature for unaligned access llvm-svn: 274398	2016-07-01 23:03:44 +00:00
Matt Arsenault	8af47a09e5	AMDGPU: Expand unaligned accesses early Due to visit order problems, in the case of an unaligned copy the legalized DAG fails to eliminate extra instructions introduced by the expansion of both unaligned parts. llvm-svn: 274397	2016-07-01 22:55:55 +00:00
Evgeniy Stepanov	b736335dc3	[msan] Fix __msan_maybe_ for non-standard type sizes. Fix incorrect calculation of the type size for __msan_maybe_warning_N call that resulted in an invalid (narrowing) zext instruction and "Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed." Only happens in very large functions (with more than 3500 MSan checks) operating on integer types that are not power-of-two. llvm-svn: 274395	2016-07-01 22:49:59 +00:00
Matt Arsenault	327bb5ad82	AMDGPU: Improve load/store of illegal types. There was a combine before to handle the simple copy case. Split this into handling loads and stores separately. We might want to change how this handles some of the vector extloads, since this can result in large code size increases. llvm-svn: 274394	2016-07-01 22:47:50 +00:00
Reid Kleckner	ad56ea3129	[codeview] Don't record UDTs for anonymous structs MSVC makes up names for these anonymous structs, but we don't (yet). Eventually Clang should use getTypedefNameForAnonDecl() to put some name in the debug info, and we can update the test case when that happens. llvm-svn: 274391	2016-07-01 22:24:51 +00:00
Alina Sbirlea	8d8aa5dd6c	Address two correctness issues in LoadStoreVectorizer Summary: GetBoundryInstruction returns the last instruction as the instruction which follows or end(). Otherwise the last instruction in the boundary set is not being tested by isVectorizable(). Partially solve reordering of instructions. More extensive solution to follow. Reviewers: tstellarAMD, llvm-commits, jlebar Subscribers: escha, arsenm, mzolotukhin Differential Revision: http://reviews.llvm.org/D21934 llvm-svn: 274389	2016-07-01 21:44:12 +00:00
Dehao Chen	7b2c997736	Specify mtriple for the frame-order.ll test. Summary: The original test may have different behavior on different bots; specifically, it broke llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21931 llvm-svn: 274368	2016-07-01 17:35:13 +00:00
Dehao Chen	ad2b4e1334	Do not count debug instructions when counting number of uses to reorder frame objects. Summary: The code generation should be independent of the debug info. Reviewers: zansari, davidxl, mkuper, majnemer Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D21911 llvm-svn: 274357	2016-07-01 15:40:25 +00:00
Nikolay Haustov	beb24f5b20	Resubmit r268719 - AMDGPU/SI: Add amdgpu_kernel calling convention. Part 2. This was reverted in r268740 because of problems with corresponding Clang change. Clang change was updated and resubmitted in r274220. Check calling convention in AMDGPUMachineFunction::isKernel This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF. Also, in the future unused non-kernels may be optimized. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19917 llvm-svn: 274341	2016-07-01 10:00:58 +00:00
Sam Kolton	5196b88f07	[AMDGPU] Assembler: support SDWA for VOPC instructions Summary: dst_sel and dst_unused disabled for VOPC as they have no effect on result Reviewers: artem.tamazov, tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21376 llvm-svn: 274340	2016-07-01 09:59:21 +00:00
Duncan P. N. Exon Smith	e60719b3fa	Revert "add tests for bugs fixed by the GVN hoist pass" This reverts commit r274327 since the tests fail. E.g.: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17240 It looks like this commit is building on r274305, but that commit caused a miscompile and was reverted in r274320. llvm-svn: 274332	2016-07-01 04:55:13 +00:00
Sebastian Pop	196ba4f844	add tests for bugs fixed by the GVN hoist pass https://llvm.org/bugs/show_bug.cgi?id=20242 https://llvm.org/bugs/show_bug.cgi?id=22005 llvm-svn: 274327	2016-07-01 03:03:19 +00:00
Reid Kleckner	b5af11dfa3	[codeview] Add DISubprogram::ThisAdjustment Summary: This represents the adjustment applied to the implicit 'this' parameter in the prologue of a virtual method in the MS C++ ABI. The adjustment is always zero unless multiple inheritance is involved. This increases the size of DISubprogram by 8 bytes, unfortunately. The adjustment really is a signed 32-bit integer. If this size increase is too much, we could probably win it back by splitting out a subclass with info specific to virtual methods (virtuality, vindex, thisadjustment, containingType). Reviewers: aprantl, dexonsmith Subscribers: aaboud, amccarth, llvm-commits Differential Revision: http://reviews.llvm.org/D21614 llvm-svn: 274325	2016-07-01 02:41:21 +00:00
Matt Arsenault	0101ecade0	LoadStoreVectorizer: Don't increase alignment with no align set If no alignment was set on the load/stores, it would vectorize to the new type even though this increases the default alignment. llvm-svn: 274323	2016-07-01 02:09:38 +00:00
Matt Arsenault	370e8226c7	LoadStoreVectorizer: Check TTI for vec reg bit width llvm-svn: 274322	2016-07-01 02:07:22 +00:00
Matt Arsenault	42ad17059a	LoadStoreVectorizer: Fix assert when merging pointer ops This needs to use inttoptr/ptrtoint if combining an int and pointer load. If a pointer is used always do an integer load. llvm-svn: 274321	2016-07-01 01:55:52 +00:00
Duncan P. N. Exon Smith	9d1f156418	Revert "code hoisting pass based on GVN" This reverts commit r274305, since it breaks self-hosting: http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/22349/ http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17232 Note that the blamelist on lab.llvm.org:8011 is incorrect. The previous build was r274299, but somehow r274305 wasn't included in the blamelist: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules llvm-svn: 274320	2016-07-01 01:51:40 +00:00
Matt Arsenault	241f34cde8	LoadStoreVectorizer: Use AA metadata This was not passing the full instruction with metadata to the alias query. llvm-svn: 274318	2016-07-01 01:47:46 +00:00
Matt Arsenault	d7e8898bdd	LoadStoreVectorizer: if one element of a vector is integer, default to integer. Fixes issues on some architectures where we use arithmetic ops to build vectors, which can cause bad things to happen for loads/stores of mixed types. Patch by Fiona Glaser llvm-svn: 274307	2016-07-01 00:37:01 +00:00
Matt Arsenault	8a4ab5e19f	LoadStoreVectorizer: Fix crashes on sub-byte types llvm-svn: 274306	2016-07-01 00:36:54 +00:00
Sebastian Pop	5c5798c57c	code hoisting pass based on GVN This pass hoists duplicated computations in the program. The primary goal of gvn-hoist is to reduce the size of functions before inline heuristics to reduce the total cost of function inlining. Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki. Important algorithmic contributions by Daniel Berlin under the form of reviews. Differential Revision: http://reviews.llvm.org/D19338 llvm-svn: 274305	2016-07-01 00:24:31 +00:00
Matt Arsenault	079d0f19a2	LoadStoreVectorizer: Check skipFunction first. Also add test I forgot to add to r274296. llvm-svn: 274299	2016-06-30 23:50:18 +00:00
Matt Arsenault	2cbe52b990	LoadStoreVectorizer: Skip optnone functions llvm-svn: 274296	2016-06-30 23:30:29 +00:00
Matt Arsenault	08debb0244	Add LoadStoreVectorizer pass This was contributed by Apple, and I've been working on minimal cleanups and generalizing it. llvm-svn: 274293	2016-06-30 23:11:38 +00:00
Matt Arsenault	727e279ac4	SLPVectorizer: Move propagateMetadata to VectorUtils This will be re-used by the LoadStoreVectorizer. Fix handling of range metadata and testcase by Justin Lebar. llvm-svn: 274281	2016-06-30 21:17:59 +00:00
Yunzhong Gao	b386955adc	Add an artificial line-0 debug location when the compiler emits a call to __stack_chk_fail(). This avoids a compiler crash. Differential Revision: http://reviews.llvm.org/D21818 llvm-svn: 274263	2016-06-30 18:49:04 +00:00
Wei Mi	95685faeee	Refine the set of UniformAfterVectorization instructions. Except for the seed uniform instructions (conditional branch and consecutive ptr instructions), dependencies to be added into the uniform set should only be used by existing uniform instructions or instructions outside of the current loop. Differential Revision: http://reviews.llvm.org/D21755 llvm-svn: 274262	2016-06-30 18:42:56 +00:00
Etienne Bergeron	078d8f69b6	revert http://reviews.llvm.org/D21101 llvm-svn: 274251	2016-06-30 17:52:24 +00:00
Zachary Turner	ab58ae8730	[pdb] Re-add code to write PDB files. Somehow all the functionality to write PDB files got removed, probably accidentally: when uploading the patch, perhaps the wrong one got uploaded. This re-adds all the code, as well as the corresponding test. llvm-svn: 274248	2016-06-30 17:43:00 +00:00
Zachary Turner	a30bd1a1bc	Update llvm-pdbdump to use subcommands. llvm-svn: 274247	2016-06-30 17:42:48 +00:00
Etienne Bergeron	47cf4eabe6	[exceptions] Upgrade exception handlers when stack protector is used Summary: MSVC provide exception handlers with enhanced information to deal with security buffer feature (/GS). To be more secure, the security cookies (GS and SEH) are validated when unwinding the stack. The following code: ``` void f() {} void foo() { __try { f(); } __except(1) { f(); } } ``` Reviewers: majnemer, rnk Subscribers: thakis, llvm-commits, chrisha Differential Revision: http://reviews.llvm.org/D21101 llvm-svn: 274239	2016-06-30 15:36:59 +00:00
Jun Bum Lim	596a3bd9ec	[DSE] Fix bug in partial overwrite tracking Summary: Found cases where DSE incorrectly add partially-overwritten intervals. Please see the test case for details. Reviewers: mcrosier, eeckstein, hfinkel Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D21859 llvm-svn: 274237	2016-06-30 15:32:20 +00:00
Sanjay Patel	7c6eab5777	[InstCombine] shrink switch conditions better (PR24766) https://llvm.org/bugs/show_bug.cgi?id=24766#c2 This removes a hack that was added for the benefit of x86 codegen. It prevented shrinking the switch condition even to smaller legal (DataLayout) types. We have a safety mechanism in CGP after: http://reviews.llvm.org/rL251857 ...so we're free to use the optimal (smallest) IR type now. Differential Revision: http://reviews.llvm.org/D12965 llvm-svn: 274233	2016-06-30 14:51:21 +00:00
Sanjay Patel	7ad98babfa	[InstCombine] extend matchSelectFromAndOr() to work with i1 scalar types If the incoming types are i1, then we don't have to pattern match any sext ops. Differential Revision: http://reviews.llvm.org/D21740 llvm-svn: 274228	2016-06-30 14:18:18 +00:00
Jonas Paulsson	25e193da4c	[SystemZ] Let z13 also support FeatureMiscellaneousExtensions. This processor feature had been left out by mistake from the z13 ProcessorModel. This time with updated test case. Thanks, Hans. Reviewed by Ulrich Weigand. llvm-svn: 274216	2016-06-30 07:13:56 +00:00
David Majnemer	9319cbc045	[CodeView] Implement support for bitfields in LLVM CodeView needs to know the offset of the storage allocation for a bitfield. Encode this via the "extraData" field in DIDerivedType and introduce a new flag, DIFlagBitField, to indicate whether or not a member is a bitfield. This fixes PR28162. Differential Revision: http://reviews.llvm.org/D21782 llvm-svn: 274200	2016-06-30 03:00:20 +00:00
Sanjoy Das	0da2d14766	[SCEV] Compute max be count from shift operator only if all else fails In particular, check to see if we can compute a precise trip count by exhaustively simulating the loop first. llvm-svn: 274199	2016-06-30 02:47:28 +00:00
George Burgess IV	d86e38e1db	[CFLAA] Add support for ModRef queries. This patch makes CFLAA answer some ModRef queries. Because we don't distinguish between reading/writing when making StratifiedSets, we're unable to offer any of the readonly-related answers. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21858 llvm-svn: 274197	2016-06-30 02:11:26 +00:00
Sanjay Patel	348111f4b9	add vector tests to show missing transform llvm-svn: 274192	2016-06-30 00:09:13 +00:00
Sanjay Patel	c3701e8b92	regenerate checks llvm-svn: 274188	2016-06-29 23:58:39 +00:00
Vedant Kumar	d6d192cd12	[llvm-cov] Use relative paths to file reports in -output-dir mode This makes it possible to e.g. copy a report to another filesystem. llvm-svn: 274173	2016-06-29 21:55:46 +00:00
Artem Belevich	4d5d7be8cc	Revert r273313 "[NVPTX] Improve lowering of byval args of device functions." The change causes llvm crash in some unoptimized builds. llvm-svn: 274163	2016-06-29 20:51:15 +00:00
Evgeniy Stepanov	a5da256f92	StackColoring for SafeStack. This is a fix for PR27842. An IR-level implementation of stack coloring tailored to work with SafeStack. It is a bit weaker than the MI implementation in that it does not implement the "lifetime start at first access" logic. This can be improved in the future. This patch also replaces the naive implementation of stack frame layout with a greedy algorithm that can split existing stack slots and even fit small objects inside the alignment padding of other objects. llvm-svn: 274162	2016-06-29 20:37:43 +00:00
Tim Shen	aec68b263d	[InstCombine] Simplify and correct folding fcmps with the same children Summary: Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp , x, y) | (fcmp , x, y) and (fcmp , x, y) & (fcmp , x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants. Currently InstCombine wrongly folds (fcmp ogt, x, y) | (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that. Reviewers: spatel Subscribers: llvm-commits, iteratee, echristo Differential Revision: http://reviews.llvm.org/D21775 llvm-svn: 274156	2016-06-29 20:10:17 +00:00
Tim Shen	860a67eb4c	[InstCombine, NFC] Change the generated variable names by creating new instructions This removes some noise for D21775's test changes. llvm-svn: 274155	2016-06-29 20:10:13 +00:00
Nirav Dave	8e10380b73	Permit memory operands in ins/outs instructions [x86] (PR15455) While (ins|outs)[bwld] instructions do not take %dx as a memory operand, various unofficial references do and objdump disassembles to this format. Extend special treatment of similar (in|out)[bwld] operations. Reviewers: craig.topper, rnk, ab Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18837 llvm-svn: 274152	2016-06-29 19:54:27 +00:00
Rafael Espindola	2211f015cc	Don't verify inputs to the Linker if ODR merging. This fixes pr28072. The point, as Duncan pointed out, is that the file is already partially linked by just reading it. Long term I think the solution is to make metadata owned by the module and then the linker will lazily read it and be in charge of all the linking. Running a verifier in each input will defeat the lazy loading, but will be legal. Right now we are in the unfortunate position that to support odr merging we cannot verify the inputs, which is mildly annoying (see test update). llvm-svn: 274148	2016-06-29 18:31:48 +00:00
Tim Shen	4561e784f4	[InstCombine] Add full tests for FoldAndOfFCmps and FoldOrOfFCmps Summary: This adds tests for covering all cases that FoldAndOfFCmps and FoldOrOfFCmps handle. Reviewers: spatel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21844 llvm-svn: 274144	2016-06-29 17:55:11 +00:00
Vedant Kumar	c1561cb2fa	[llvm-cov] Change some FileCheck prefixes to make tests reusable (NFC) I'm planning on extending these two tests with checks that validate html coverage reports. Make it easier to extend them by not using a prefix called "CHECK". llvm-svn: 274143	2016-06-29 17:47:08 +00:00
Nico Weber	0a480b2c05	Add a regression test for PR28348. llvm-svn: 274142	2016-06-29 17:34:31 +00:00
Nico Weber	12fdf60b75	Revert r272251, it caused PR28348. llvm-svn: 274141	2016-06-29 17:33:41 +00:00
Ahmed Bougacha	15a2f6d58c	[X86] Lower blended PACKUSes using appropriate types. When lowering two blended PACKUS, we used to disregard the types of the PACKUS inputs, indiscriminately generating a v16i8 PACKUS. This leads to non-selectable things like: (v16i8 (PACKUS (v4i32 v0), (v4i32 v1))) Instead, check that the PACKUSes have the same type, and use that as the final result type. llvm-svn: 274138	2016-06-29 16:56:09 +00:00
Vedant Kumar	84cfb884e2	[llvm-cov] Disable PGO name compression in a test file Some bots do not configure llvm with zlib enabled. Should fix: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/15571 llvm-svn: 274137	2016-06-29 16:34:57 +00:00
Vedant Kumar	2ca5eaa85a	Fix a typo; NFC llvm-svn: 274136	2016-06-29 16:23:34 +00:00
Vedant Kumar	4a54abeacd	[llvm-cov] Do not allow ".." to escape the coverage sub-directory In -output-dir mode, file reports are placed into a "coverage" directory. If filenames in the coverage mapping contain "..", they might escape out of this directory. Fix the problem by removing ".." from source filenames (expand the path component). llvm-svn: 274135	2016-06-29 16:22:12 +00:00
Rafael Espindola	c4cabb8054	Update tests to use at least darwin9. llvm-svn: 274129	2016-06-29 14:51:10 +00:00
Simon Pilgrim	f9c5908ffd	[X86][SSE2] Added _mm_loadu_si64 test to match llvm\tools\clang\test\CodeGen\sse2-builtins.c llvm-svn: 274127	2016-06-29 14:05:33 +00:00
Simon Pilgrim	851019175b	[X86] Regenerated popcnt combine tests llvm-svn: 274124	2016-06-29 13:54:03 +00:00
Elena Demikhovsky	5e21c94f25	Reverted patch 273864 llvm-svn: 274115	2016-06-29 10:01:06 +00:00
Marcin Koscielnicki	518cbc7cc3	[SystemZ] Add floating-point test data class instructions. These are not used by CodeGen yet - ISD combiners creating the new node will come in subsequent patches. llvm-svn: 274108	2016-06-29 07:29:07 +00:00
Craig Topper	df7454f94b	Revert "[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition." This is breaking an optimization remark test in clang. I've identified a couple fixes for that, but want to understand it better before I commit to anything. llvm-svn: 274102	2016-06-29 04:57:00 +00:00
Craig Topper	2cc199baff	[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition. If an operation for a recurrence is an addition with no signed wrap and both input sign bits are 0, then the result sign bit must also be 0. Similar for the negative case. I found this deficiency while playing around with a loop in the x86 backend that contained a signed division that could be optimized into an unsigned division if we could prove both inputs were positive. One of them being the loop induction variable. With this patch we can perform the conversion for this case. One of the test cases here is a contrived variation of the loop I was looking at. Differential revision: http://reviews.llvm.org/D21493 llvm-svn: 274098	2016-06-29 03:46:47 +00:00
Craig Topper	3a011de10c	[DAGCombine] Teach DAG combine to handle ORs of shuffles involving zero vectors where the zero vector is the first operand to the shuffle instead of the second. llvm-svn: 274097	2016-06-29 03:29:12 +00:00
Craig Topper	1e7e36e7e6	[DAGCombine] Add test cases to show that DAG combining an OR of two shuffles with zero vectors doesn't work if the zero vector is the first operand of the shuffle. Fix coming in a follow up patch. llvm-svn: 274096	2016-06-29 03:29:09 +00:00
Eric Christopher	0c58837b1f	Revert "[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions" Revert "[InstCombine] Combine A->B->A BitCast" as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847 This reverts commits r270135 and r263734. llvm-svn: 274094	2016-06-29 03:05:58 +00:00
Vedant Kumar	8d74cb27e8	[llvm-cov] Minor cleanups to prepare for the html format patch - Add renderView{Header,Footer}, renderLineSuffix, and hasSubViews to support creating tables with nested views. - Move the 'Format' cl::opt to make it easier to extend. - Just create one function view file, instead of overwriting the same file for every new function. Add a regression test for this. llvm-svn: 274086	2016-06-29 00:38:21 +00:00
Kevin Enderby	42398051d8	Finish cleaning up most of the error handling in libObject’s MachOUniversalBinary and its clients to use the new llvm::Error model for error handling. Changed getAsArchive() from ErrorOr<...> to Expected<...> so now all interfaces there use the new llvm::Error model for return values. In the two places it had if (!Parent) this is actually a program error so changed from returning errorCodeToError(object_error::parse_failed) to calling report_fatal_error() with a message. In getObjectForArch() added error messages to its two llvm::Error return values instead of returning errorCodeToError(object_error::arch_not_found) with no error message. For the llvm-objdump, llvm-nm and llvm-size clients since the only binary files in Mach-O Universal Binaries that are supported are Mach-O files or archives with Mach-O objects, updated their logic to generate an error when a slice contains something like an ELF binary instead of ignoring it. And added a test case for that. The last error stuff to be cleaned up for libObject’s MachOUniversalBinary is the use of errorOrToExpected(Archive::create(ObjBuffer)) which needs Archive::create() to be changed from ErrorOr<...> to Expected<...> first, which I’ll work on next. llvm-svn: 274079	2016-06-28 23:16:13 +00:00
Weiming Zhao	5410edddb1	[ARM] Fix 28282: cost computation for constant hoisting Summary: This fixes bug: https://llvm.org/bugs/show_bug.cgi?id=28282 Currently the cost model of constant hoisting checks the bit width of the data type of the constants. However, the actual immediate value may be small enough that it does not need to be hoisted. This patch checks for the actual bit width needed for the constant. Reviewers: t.p.northover, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D21668 llvm-svn: 274073	2016-06-28 22:30:45 +00:00
Dehao Chen	8cd84aaa6f	Relax the clearance calculating for breaking partial register dependency. Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, 16 clearance is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert XORs that are necessary to break partial register dependency. Reviewers: wmi, davidxl, stoklund, zansari, myatsina, RKSimon, DavidKreitzer, mkuper, joerg, spatel Subscribers: davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D21560 llvm-svn: 274068	2016-06-28 21:19:34 +00:00
Chris Bieneman	92b2e8a295	[YAML] Fix YAML tags appearing before the start of sequence elements Our existing yaml::Output code writes tags immediately when mapTag is called, without any state handling. This results in tags on sequence elements being written before the element itself. For example, we see this: SomeArray: !elem_type - key1: 1 key2: 2 !elem_type2 - key3: 3 key4: 4 We should instead see: SomeArray: - !elem_type key1: 1 key2: 2 - !elem_type2 key3: 3 key4: 4 Our reader handles reading properly, so this bug only impacts writing yaml sequences with tagged elements. As a test for this I've modified the Mach-O yaml encoding to always apply the !mach-o tag when encoding MachOYAML::Object entries. This results in the !mach-o tag appearing as expected in dumped fat files. llvm-svn: 274067	2016-06-28 21:10:26 +00:00
Zhan Jun Liau	347db3e18e	[SystemZ] Use NILL instruction instead of NILF where possible Summary: SystemZ shift instructions only use the last 6 bits of the shift amount. When the result of an AND operation is used as a shift amount, this means that we can use the NILL instruction (which operates on the last 16 bits) rather than NILF (which operates on the last 32 bits) for a 16-bit savings in instruction size. Reviewers: uweigand Subscribers: llvm-commits Author: colpell Committing on behalf of Elliot. Differential Revision: http://reviews.llvm.org/D21686 llvm-svn: 274066	2016-06-28 21:03:19 +00:00
Matthias Braun	0b9a07883d	X86FrameLowering: Check subregs when deciding prolog kill flags llvm-svn: 274057	2016-06-28 20:31:56 +00:00
Sanjay Patel	3a0f2606ec	minimize regression tests and update checks llvm-svn: 274047	2016-06-28 18:40:08 +00:00
Sanjay Patel	8ce43c098b	minimize regression tests and update checks llvm-svn: 274046	2016-06-28 18:33:10 +00:00
Artur Pilipenko	7ad95ec22d	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmission of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details). This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 274043	2016-06-28 18:27:25 +00:00
Jacques Pienaar	f43266b868	[lanai] Update ELF number to correspond to the assigned number. Change EM_LANAI to correspond to machine number assigned by Xinuos. llvm-svn: 274042	2016-06-28 18:22:22 +00:00
Michael Kuperstein	a118acb82f	[X86] Update a test with more explicit checks. NFC. llvm-svn: 274040	2016-06-28 17:42:13 +00:00
Vedant Kumar	9cbad2c2b8	[llvm-cov] Create an index of reports in -output-dir mode This index lists the reports available in the 'coverage' sub-directory. This will help navigate coverage output from large projects. This commit factors the file creation code out of SourceCoverageView and into CoveragePrinter. llvm-svn: 274029	2016-06-28 16:12:24 +00:00
Vedant Kumar	64d8a029e9	[llvm-cov] Minor cleanups (NFC) - Test the '-o' alias for -output-dir. - Use a helper method in a conditional. - Add a period. llvm-svn: 274028	2016-06-28 16:12:20 +00:00
David Majnemer	1c7d532cde	[X86] Make WRPKRU/RDPKRU pass -verify-machineinstrs The original implementation attempted to zero registers using XOR %foo, %foo. This is problematic because it constitutes a read-modify-write of a register which might not be defined. Instead, use MOV32r0 to avoid these problems; expandPostRAPseudo does the right thing here. llvm-svn: 274024	2016-06-28 16:04:46 +00:00
Marcin Koscielnicki	234e5a809b	[SystemZ] Save/restore r6 and r7 if function contains landing pad. This fixes PR27102. Differential Revision: http://reviews.llvm.org/D18541 llvm-svn: 274017	2016-06-28 14:13:11 +00:00
Simon Pilgrim	5f71c909f0	[X86][AVX] Peek through bitcasts to find the source of broadcasts (reapplied) AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors. This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected. Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node. As we're being more aggressive with bitcasts, we also need to ensure that the broadcast type is correctly bitcasted. Differential Revision: http://reviews.llvm.org/D21660 llvm-svn: 274013	2016-06-28 13:24:05 +00:00
Arnaud A. de Grandmaison	eee4711fbe	[gold] Really fix test to run on non x86 platforms. Address post-commit comment from H.J. Lu. llvm-svn: 274000	2016-06-28 08:12:09 +00:00
Simon Pilgrim	c15d217831	[X86][SSE] Added support for combining target shuffles to (V)PSHUFD/VPERMILPD/VPERMILPS immediate permutes This patch allows target shuffles to be combined to single input immediate permute instructions - (V)PSHUFD/VPERMILPD/VPERMILPS - allowing more general pattern matching than what we currently do and improves the likelihood of memory folding compared to existing patterns which tend to reuse the input in multiple arguments. Further permute instructions (V)PSHUFLW/(V)PSHUFHW/(V)PERMQ/(V)PERMPD may be added in the future but it's proven tricky to create test cases for them so far. (V)PSHUFLW/(V)PSHUFHW is already handled quite well in combineTargetShuffle so it may be that removing some of that code may allow us to perform more of the combining in one place without duplication. Differential Revision: http://reviews.llvm.org/D21148 llvm-svn: 273999	2016-06-28 08:08:15 +00:00
Elena Demikhovsky	a727f3cfde	[X86 Target Lowering] Merged ICMP test. llvm-svn: 273995	2016-06-28 06:25:38 +00:00
Adam Nemet	bd861acf29	[LLE] Don't hoist conditionally executed loads If the load is conditional we can't hoist its 0-iteration instance to the preheader because that would make it unconditional. Thus we would access a memory location that the original loop did not access. llvm-svn: 273991	2016-06-28 04:02:47 +00:00
Vedant Kumar	7937ef3796	Reapply "[llvm-cov] Add an -output-dir option for the show sub-command"" Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if it doesn't already exist, and prints reports into that directory. In function view mode, all views are written into path/to/dir/functions.$EXTENSION. In file view mode, all views are written into path/to/dir/coverage/$PATH.$EXTENSION. Changes since the initial commit: - Avoid accidentally closing stdout twice. llvm-svn: 273985	2016-06-28 02:09:39 +00:00
Nick Lewycky	9980075133	NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'. Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test. llvm-svn: 273984	2016-06-28 01:45:05 +00:00
Vedant Kumar	a48d9fe86a	Revert "[llvm-cov] Add an -output-dir option for the show sub-command" This reverts commit r273971. test/profile/instrprof-visibility.cpp is failing because of an uncaught error in SafelyCloseFileDescriptor. llvm-svn: 273978	2016-06-28 01:14:04 +00:00
Matt Arsenault	b4d9503171	AMDGPU: Fix out of bounds indirect indexing errors This was producing accesses to registers beyond the super register's limits, resulting in verifier failures. llvm-svn: 273977	2016-06-28 01:09:00 +00:00
Vedant Kumar	02507c435c	[llvm-cov] Add an -output-dir option for the show sub-command Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if it doesn't already exist, and prints reports into that directory. In function view mode, all views are written into path/to/dir/functions.$EXTENSION. In file view mode, all views are written into path/to/dir/coverage/$PATH.$EXTENSION. llvm-svn: 273971	2016-06-28 00:18:57 +00:00
Vedant Kumar	dcbf4d68b2	[llvm-cov] Use -check-prefixes in a test (NFC) llvm-svn: 273970	2016-06-28 00:18:53 +00:00
Vedant Kumar	635c83c1b4	[llvm-cov] Add a format option for the 'show' sub-command (mostly NFC) llvm-svn: 273968	2016-06-28 00:15:54 +00:00
Chandler Carruth	dca834089a	[PM] Improve the debugging and logging facilities of the CGSCC bits of the new pass manager. This adds operator<< overloads for the various bits of the LazyCallGraph, dump methods for use from the debugger, and debug logging using them to the CGSCC pass manager. Having this was essential for debugging the call graph update patch, and I've extracted what I could from that patch here to minimize the delta. llvm-svn: 273961	2016-06-27 23:26:08 +00:00
Easwaran Raman	22eb80a114	Fix size computation of array allocation in inline cost analysis Differential revision: http://reviews.llvm.org/D21690 llvm-svn: 273952	2016-06-27 22:31:53 +00:00
Sanjay Patel	59ed2ffca3	[InstCombine] shrink type of sdiv if dividend is sexted and constant divisor is small enough (PR28153) This should fix PR28153: https://llvm.org/bugs/show_bug.cgi?id=28153 Differential Revision: http://reviews.llvm.org/D21769 llvm-svn: 273951	2016-06-27 22:27:11 +00:00
Kevin Enderby	1051909df1	Change all but the last ErrorOr<...> use for MachOUniversalBinary to Expected<...> to allow a good error message to be produced. I added the one test case that the object file tools could produce an error message. The other two errors can’t be triggered if the input file is passed through sys::fs::identify_magic(). But the malformedError("bad magic number") does get triggered by the logic in llvm-dsymutil when dealing with a normal Mach-O file. The other "File too small ..." error would take a logic error currently to produce and is not tested for. llvm-svn: 273946	2016-06-27 21:39:39 +00:00
Matt Arsenault	59c0ffa22a	AMDGPU: Implement per-function subtargets llvm-svn: 273940	2016-06-27 20:48:03 +00:00
Matt Arsenault	03d8584590	AMDGPU: Move subtarget feature checks into passes llvm-svn: 273937	2016-06-27 20:32:13 +00:00
Sanjay Patel	5cdf699daa	add tests for PR28153 llvm-svn: 273936	2016-06-27 20:28:59 +00:00
Justin Holewinski	cb29fb4a98	Only emit extension for zeroext/signext arguments if type is < 32 bits Reviewers: jingyue, jlebar Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D21756 llvm-svn: 273922	2016-06-27 20:22:22 +00:00
Rafael Espindola	8121becac3	Teach shouldAssumeDSOLocal about tls. Fixes a fixme about handling other visibilities. llvm-svn: 273921	2016-06-27 20:19:14 +00:00
Elena Demikhovsky	6f2ec8104a	Fixed crash of SLP Vectorizer on KNL The bug is connected to vector GEPs. https://llvm.org/bugs/show_bug.cgi?id=28313 llvm-svn: 273919	2016-06-27 20:07:00 +00:00
Chris Bieneman	e5cc1fd498	[yaml2obj] Missed updating a few test cases in r273915 This should fix the broken bots. llvm-svn: 273918	2016-06-27 20:02:49 +00:00
Matt Arsenault	21a4625a16	AMDGPU: Fix verifier errors with undef vector indices Also fix pointlessly adding exec to liveins. llvm-svn: 273916	2016-06-27 19:57:44 +00:00
Chris Bieneman	8ff0c11357	[yaml2obj] Remove --format option in favor of YAML tags Summary: Our YAML library's handling of tags isn't perfect, but it is good enough to get rid of the need for the --format argument to yaml2obj. This patch does exactly that. Instead of requiring --format, it infers the format based on the tags found in the object file. The supported tags are: !ELF !COFF !mach-o !fat-mach-o I have a corresponding patch that is quite large that fixes up all the in-tree test cases. Reviewers: rafael, Bigcheese, compnerd, silvas Subscribers: compnerd, llvm-commits Differential Revision: http://reviews.llvm.org/D21711 llvm-svn: 273915	2016-06-27 19:53:53 +00:00
Matt Arsenault	82f41518ed	Verifier: Reject non-float !fpmath Code already assumes this is float. getFPAccuracy() crashes on any other type. llvm-svn: 273912	2016-06-27 19:43:15 +00:00
Matt Arsenault	f0f721a682	DAGCombiner: Don't narrow volatile vector loads + extract llvm-svn: 273909	2016-06-27 19:31:04 +00:00
Elena Demikhovsky	ad3929cc64	X86 Lowering - Fixed a crash in ICMP scalar instruction Fixed a bug in EmitTest() function in combining shl + icmp. https://llvm.org/bugs/show_bug.cgi?id=28119 llvm-svn: 273899	2016-06-27 18:07:16 +00:00
Sanjay Patel	c6ada53be5	[InstCombine] use m_APInt for div --> ashr fold The APInt matcher works with splat vectors, so we get this fold for vectors too. llvm-svn: 273897	2016-06-27 17:25:57 +00:00
Artur Pilipenko	72f76b8805	Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures. llvm-svn: 273895	2016-06-27 16:54:33 +00:00
Easwaran Raman	1832bf6aee	[PM] Port PartialInlining to the new PM Differential revision: http://reviews.llvm.org/D21699 llvm-svn: 273894	2016-06-27 16:50:18 +00:00
Artur Pilipenko	a36aa41519	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmission of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details). This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 273892	2016-06-27 16:29:26 +00:00
Simon Pilgrim	476e8ceed3	[X86][SSE] Added extra broadcast tests to cover PR28327 llvm-svn: 273891	2016-06-27 16:15:37 +00:00
Zhan Jun Liau	4f130b4410	[SystemZ] Avoid generating 2 XOR instructions for (and (xor x, -1), y) Summary: Created a pattern to match 64-bit mode (and (xor x, -1), y) to a shorter sequence of instructions. Before the change, the canonical form is translated to: xihf %r3, 4294967295 xilf %r3, 4294967295 ngr %r2, %r3 After the change, the canonical form is translated to: ngr %r3, %r2 xgr %r2, %r3 Reviewers: zhanjunl, uweigand Subscribers: llvm-commits Author: assem Committing on behalf of Assem. Differential Revision: http://reviews.llvm.org/D21693 llvm-svn: 273887	2016-06-27 15:55:30 +00:00
Krzysztof Parzyszek	5da24e5495	[Hexagon] Equally-sized vectors are equivalent in ISel (except vNi1) llvm-svn: 273885	2016-06-27 15:08:22 +00:00
Nico Weber	1e058160dd	Revert 273848, it caused PR28329 llvm-svn: 273879	2016-06-27 14:36:46 +00:00
Simon Pilgrim	9c2f378587	Removed duplicate assertions note llvm-svn: 273874	2016-06-27 13:06:18 +00:00
Elena Demikhovsky	f65e865e33	Removed extra test from the prev commit. llvm-svn: 273865	2016-06-27 11:40:49 +00:00
Elena Demikhovsky	4c58b2761a	Fixed consecutive memory access detection in Loop Vectorizer. It did not correctly handle cases without GEP. The following loop wasn't vectorized: for (int i=0; i<len; i++) *to++ = *from++; I use getPtrStride() to find Stride for memory access and return 0 if the Stride is not 1 or -1. Re-commit rL273257 - revision: http://reviews.llvm.org/D20789 llvm-svn: 273864	2016-06-27 11:19:23 +00:00
Arnaud A. de Grandmaison	efb0b899d3	[gold] Fix test to not assume it runs on x86 hardware. llvm-svn: 273854	2016-06-27 09:13:03 +00:00
Hrvoje Varga	24b975dc66	[mips][micromips] Implement LD, LLD, LWU, SD, DSRL, DSRL32 and DSRLV instructions Differential Revision: http://reviews.llvm.org/D16625 llvm-svn: 273850	2016-06-27 08:23:28 +00:00
Simon Pilgrim	a45da385f8	[X86][AVX] Peek through bitcasts to find the source of broadcasts AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors. This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected. Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node. Differential Revision: http://reviews.llvm.org/D21660 llvm-svn: 273848	2016-06-27 07:44:32 +00:00
Igor Breger	7357849dca	[ConstantFolding] Fix bitcast vector of i1. Differential Revision: http://reviews.llvm.org/D21735 llvm-svn: 273845	2016-06-27 06:42:54 +00:00
Rafael Espindola	1ac1fa818e	Mips: Fix access to private functions. llvm-svn: 273843	2016-06-27 03:19:40 +00:00
Sanjay Patel	1d745384da	add tests for potential select transforms llvm-svn: 273833	2016-06-26 23:44:21 +00:00
Nico Weber	d8db1e172c	Revert r273807 (and r273809, r273810), it caused PR28311 llvm-svn: 273815	2016-06-26 15:10:34 +00:00
Amjad Aboud	ff976c99c7	[codeview] Improved array type support. Added support for: 1. Multi dimension array. 2. Array of structure type, which previously was declared incompletely. 3. Dynamic size array. Differential Revision: http://reviews.llvm.org/D21526 llvm-svn: 273807	2016-06-26 11:44:45 +00:00
Sanjoy Das	a37bb4a65d	[LoopUnswitch] Unswitch on conditions feeding into guards Summary: This is a straightforward extension of what LoopUnswitch does to branches to guards. That is, we unswitch ``` for (;;) { ... guard(loop_invariant_cond); ... } ``` into ``` if (loop_invariant_cond) { for (;;) { ... // There is no need to emit guard(true) ... } } else { for (;;) { ... guard(false); // SimplifyCFG will clean this up by adding an // unreachable after the guard(false) ... } } ``` Reviewers: majnemer Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D21725 llvm-svn: 273801	2016-06-26 05:10:45 +00:00
Jan Vesely	3bc1af2be4	AMDGPU/R600: Fix GlobalValue regressions. Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary expressions by including offset even if the offset is 0 (we haven't hit this sooner since tested workloads don't include static offsets). We don't really care about the type of expression, so set it directly. Fixes: r272705 Consider section relative relocations. Since all const as data is in one buffer, section relative is equivalent to abs32. Fixes: r273166 Differential Revision: http://reviews.llvm.org/D21633 llvm-svn: 273785	2016-06-25 18:24:16 +00:00
Sanjay Patel	51ff79fd82	update tests to use FileCheck llvm-svn: 273784	2016-06-25 17:39:10 +00:00
David Majnemer	e14e7bc4b8	Revert "[SimplifyCFG] Stop inserting calls to llvm.trap for UB" This reverts commit r273778, it seems to break UBSan :/ llvm-svn: 273779	2016-06-25 08:19:55 +00:00
David Majnemer	d346a37737	[SimplifyCFG] Stop inserting calls to llvm.trap for UB SimplifyCFG had logic to insert calls to llvm.trap for two very particular IR patterns: stores and invokes of undef/null. While InstCombine canonicalizes certain undefined behavior IR patterns to stores of undef, phase ordering means that this cannot be relied upon in general. There are much better tools than llvm.trap: UBSan and ASan. N.B. I could be argued into reverting this change if a clear argument is made as to why it is important that we synthesize llvm.trap for stores; I'd be hard pressed to see why it'd be useful for invokes... llvm-svn: 273778	2016-06-25 08:04:19 +00:00
David Majnemer	bb53d23ef8	[InstSimplify] Replace calls to null with undef Calling null is undefined behavior, we can simplify the resulting value to undef. llvm-svn: 273777	2016-06-25 07:37:30 +00:00
David Majnemer	1fea77c6fc	[SimplifyCFG] Replace calls to null/undef with unreachable Calling null is undefined behavior, a call to undef can be trivially treated as a call to null. llvm-svn: 273776	2016-06-25 07:37:27 +00:00
Konstantin Zhuravlyov	f2f3d14774	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue. Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format: - offset 0: work group ID x - offset 4: work group ID y - offset 8: work group ID z - offset 16: work item ID x - offset 20: work item ID y - offset 24: work item ID z Set - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled Differential Revision: http://reviews.llvm.org/D20335 llvm-svn: 273769	2016-06-25 03:11:28 +00:00
Saleem Abdulrasool	92d33bd2af	llvm-ar: add some tests for llvm-ar default selection This adds some tests for the smarter llvm-ar selection mode as well as some additional tests as per Rafael's post commit review comments. llvm-svn: 273768	2016-06-25 03:05:56 +00:00
Tom Stellard	b164a9843b	AMDGPU/SI: Make sure not to fold offsets into local address space globals Summary: Offset folding only works if you are emitting relocations, and we don't emit relocations for local address space globals. Reviewers: arsenm, nhaustov Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21647 llvm-svn: 273765	2016-06-25 01:59:16 +00:00
Sanjoy Das	f63768cbfc	[PlaceSafepoints] Don't call undef in test case; NFC llvm-svn: 273764	2016-06-25 01:40:54 +00:00
Sanjoy Das	d850068282	[LoopUnswitch] Avoid exponential behavior Summary: (No semantic change intended). Reviewers: majnemer, bogner, mzolotukhin Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D21707 llvm-svn: 273763	2016-06-25 01:14:19 +00:00
David Majnemer	0f45572761	The absence of noreturn doesn't ensure mayReturn There are two separate issues: - LLVM doesn't consider infinite loops to be side effects: we happily hoist/sink above/below loops whose bounds are unknown. - The absence of the noreturn attribute is insufficient for us to know if a function will definitely return. Relying on noreturn in the middle-end for any property is an accident waiting to happen. llvm-svn: 273762	2016-06-25 00:55:12 +00:00
Peter Collingbourne	0312f614b1	IR: Introduce llvm.type.checked.load intrinsic. This intrinsic safely loads a function pointer from a virtual table pointer using type metadata. This intrinsic is used to implement control flow integrity in conjunction with virtual call optimization. The virtual call optimization pass will optimize away llvm.type.checked.load intrinsics associated with devirtualized calls, thereby removing the type check in cases where it is not needed to enforce the control flow integrity constraint. This patch also introduces the capability to copy type metadata between global variables, and teaches the virtual call optimization pass to do so. Differential Revision: http://reviews.llvm.org/D21121 llvm-svn: 273756	2016-06-25 00:23:04 +00:00
Matthias Braun	6ad3d05b68	MachineScheduler: Fully compare top/bottom candidates In bidirectional scheduling this gives more stable results than just comparing the "reason" fields of the top/bottom node because the reason field may be higher depending on what other nodes are in the queue. Differential Revision: http://reviews.llvm.org/D19401 llvm-svn: 273755	2016-06-25 00:23:00 +00:00
David Majnemer	b8da3a2bb2	Reinstate r273711 r273711 was reverted by r273743. The inliner needs to know about any call sites in the inlined function. These were obscured if we replaced a call to undef with an undef but kept the call around. This fixes PR28298. llvm-svn: 273753	2016-06-25 00:04:10 +00:00
Matthias Braun	1e374a7aa6	AMDGPU: Define a schedule class for COPY. COPY was lacking a scheduling class, define it to avoid regressions in the upcoming change to the bidirectional MachineScheduler. Approved by tstellar on IRC. Differential Revision: http://reviews.llvm.org/D21540 llvm-svn: 273751	2016-06-24 23:52:11 +00:00
Michael Kuperstein	83b753d430	[PM] Port float2int to the new pass manager Differential Revision: http://reviews.llvm.org/D21704 llvm-svn: 273747	2016-06-24 23:32:02 +00:00
Dehao Chen	c66a06ad0e	Hookup ProfileSummary with SampleProfilerLoader Summary: Set ProfileSummary in SampleProfilerLoader. Reviewers: davidxl, eraman Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21702 llvm-svn: 273745	2016-06-24 22:57:06 +00:00
Nico Weber	ae2ef4ccd4	Revert r273711, it caused PR28298. llvm-svn: 273743	2016-06-24 22:52:39 +00:00
Adrian Prantl	29ce701a06	Fix the type signature of DwarfExpression::Add.*Constant to support values >32 bits. This fixes an embarrassing bug when emitting .debug_loc entries for 64-bit+ constants, which were previously silently truncated to 32 bits. <rdar://problem/26843232> llvm-svn: 273736	2016-06-24 21:35:09 +00:00
Krzysztof Parzyszek	709a626015	[Hexagon] Simplify (+fix) instruction selection for indexed loads/stores llvm-svn: 273733	2016-06-24 21:27:17 +00:00
Peter Collingbourne	7efd750607	IR: New representation for CFI and virtual call optimization pass metadata. The bitset metadata currently used in LLVM has a few problems: 1. It has the wrong name. The name "bitset" refers to an implementation detail of one use of the metadata (i.e. its original use case, CFI). This makes it harder to understand, as the name makes no sense in the context of virtual call optimization. 2. It is represented using a global named metadata node, rather than being directly associated with a global. This makes it harder to manipulate the metadata when rebuilding global variables, summarise it as part of ThinLTO and drop unused metadata when associated globals are dropped. For this reason, CFI does not currently work correctly when both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable globals, and fails to associate metadata with the rebuilt globals. As I understand it, the same problem could also affect ASan, which rebuilds globals with a red zone. This patch solves both of those problems in the following way: 1. Rename the metadata to "type metadata". This new name reflects how the metadata is currently being used (i.e. to represent type information for CFI and vtable opt). The new name is reflected in the name for the associated intrinsic (llvm.type.test) and pass (LowerTypeTests). 2. Attach metadata directly to the globals that it pertains to, rather than using the "llvm.bitsets" global metadata node as we are doing now. This is done using the newly introduced capability to attach metadata to global variables (r271348 and r271358). See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html Differential Revision: http://reviews.llvm.org/D21053 llvm-svn: 273729	2016-06-24 21:21:32 +00:00
Rafael Espindola	a895a0cd01	Add support for musl-libc on ARM Linux. Patch by Lei Zhang! llvm-svn: 273726	2016-06-24 21:14:33 +00:00
Chris Bieneman	93e7119380	[obj2yaml] [yaml2obj] Support for MachO Universal binaries This patch adds round-trip support for MachO Universal binaries to obj2yaml and yaml2obj. Universal binaries have a header and list of architecture structures, followed by the individual object files at specified offsets. llvm-svn: 273719	2016-06-24 20:42:28 +00:00
Michael Kuperstein	82d5da5aac	[PM] Port PreISelIntrinsicLowering to the new PM llvm-svn: 273713	2016-06-24 20:13:42 +00:00
David Majnemer	3b3e954ea2	SimplifyInstruction does not imply DCE We cannot remove an instruction with no uses just because SimplifyInstruction succeeds. It may have side effects. llvm-svn: 273711	2016-06-24 19:34:46 +00:00
Rafael Espindola	88ae09e9be	Use shouldAssumeDSOLocal in isOffsetFoldingLegal. This makes it slightly more powerful for dynamic-no-pic. llvm-svn: 273704	2016-06-24 18:48:36 +00:00
Reid Kleckner	fbd5eef691	Revert "InstCombine rule to fold trunc when value available" This reverts commit r273608. Broke building code with sanitizers, where apparently these kinds of loads, casts, and truncations are common: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24502 http://crbug.com/623099 llvm-svn: 273703	2016-06-24 18:42:58 +00:00
Sanjay Patel	f8b08f7179	[InstCombine] consolidate commutation variants of matchSelectFromAndOr() in one place; NFCI By putting all the possible commutations together, we simplify the code. Note that this is NFCI, but I'm adding tests that actually exercise each commutation pattern because we don't have this anywhere else. llvm-svn: 273702	2016-06-24 18:26:02 +00:00
Kyle Butt	267164df0a	Codegen: Fix broken assumption in Tail Merge. Tail merge was making the assumption that a layout successor or predecessor was always a cfg successor/predecessor. Remove that assumption. Changes to tests are necessary because the errant cfg edges were preventing optimizations. llvm-svn: 273700	2016-06-24 18:16:36 +00:00
Rafael Espindola	955d3569e7	Use FileCheck. NFC. llvm-svn: 273699	2016-06-24 18:04:39 +00:00
Reid Kleckner	10dd55c548	[codeview] Emit parameter variables in the right order Clang emits them in reverse order to conform to the ABI, which requires left-to-right destruction. As a result, the order doesn't fall out naturally, and we have to sort things out in the backend. Fixes PR28213 llvm-svn: 273696	2016-06-24 17:55:40 +00:00
Peter Collingbourne	4f7c16dd53	Linker: Copy metadata when linking declarations. Differential Revision: http://reviews.llvm.org/D21624 llvm-svn: 273692	2016-06-24 17:42:21 +00:00
Reid Kleckner	9f7f3e1e64	[codeview] Emit base class information from DW_TAG_inheritance nodes There are two remaining issues here: 1. No vbptr information 2. Need to mention indirect virtual bases Getting indirect virtual bases is just a matter of adding an "indirect" flag, emitting them in the frontend, and ignoring them when appropriate for DWARF. All virtual bases use the same artificial vbptr field, so I think the vbptr offset will be best represented by an implicit __vbptr$ClassName member similar to our existing __vptr$ member. llvm-svn: 273688	2016-06-24 16:24:24 +00:00
Matthew Simpson	e794678404	[LV] Preserve order of dependences in interleaved accesses analysis The interleaved access analysis currently assumes that the inserted run-time pointer aliasing checks ensure the absence of dependences that would prevent its instruction reordering. However, this is not the case. Issues can arise from how code generation is performed for interleaved groups. For a load group, all loads in the group are essentially moved to the location of the first load in program order, and for a store group, all stores in the group are moved to the location of the last store. For groups having members involved in a dependence relation with any other instruction in the loop, this reordering can violate the dependence. This patch teaches the interleaved access analysis how to avoid breaking such dependences, and should fix PR27626. An assumption of the original analysis was that the accesses had been collected in "program order". The analysis was then simplified by visiting the accesses bottom-up. However, this ordering was never guaranteed for anything other than single basic block loops. Thus, this patch also enforces the desired ordering. Reference: https://llvm.org/bugs/show_bug.cgi?id=27626 Differential Revision: http://reviews.llvm.org/D19984 llvm-svn: 273687	2016-06-24 15:33:25 +00:00
Artur Pilipenko	6c7a8abf5c	Remangle intrinsics names when types are renamed This is a resubmission of previously reverted rL273568. This is a fix for the problem mentioned in "LTO and intrinsics mangling" llvm-dev mail thread: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098387.html Reviewers: mehdi_amini, reames Differential Revision: http://reviews.llvm.org/D19373 llvm-svn: 273686	2016-06-24 15:10:29 +00:00
Saleem Abdulrasool	f6b5f0fffd	ExecutionEngine: add preliminary support for COFF ARM This adds rudimentary support for COFF ARM to the dynamic loader for the execution engine. This can be used by lldb to JIT code into a COFF ARM environment. This lays the foundation for the loader, though a few of the relocation types are yet unhandled. llvm-svn: 273682	2016-06-24 14:11:44 +00:00
Chad Rosier	fd342808e0	[MachineDominatorTree] Add a MDT verifier. Differential Revision: http://reviews.llvm.org/D21657 llvm-svn: 273678	2016-06-24 13:32:22 +00:00
Daniel Sanders	0d97270ae5	[mips] Use --check-prefixes where appropriate. NFC. llvm-svn: 273669	2016-06-24 12:23:17 +00:00
Matt Arsenault	86de486d31	AMDGPU: Add stub custom CodeGenPrepare pass This will do various things including ones CodeGenPrepare does, but with knowledge of uniform values. llvm-svn: 273657	2016-06-24 07:07:55 +00:00
George Burgess IV	2efbb3f394	Remove hack introduced by r273641. Hopefully the buildbots have had enough time to pick this up by now. llvm-svn: 273656	2016-06-24 06:58:15 +00:00
Matt Arsenault	0534f4aa79	AMDGPU: Un-xfail and add tests Un XFAIL a few tests plus a few more I had lying around in my tree, which seem to all work now but I don't see tests that quite test the same things. llvm-svn: 273655	2016-06-24 06:58:01 +00:00
Matt Arsenault	c581611e11	AMDGPU: Remove disable-irstructurizer subtarget feature The only real reason to use it is for testing, so replace it with a command line option instead of a potentially function dependent feature. llvm-svn: 273653	2016-06-24 06:30:22 +00:00
Vedant Kumar	2c96e88ed4	[llvm-cov] Fix two warnings They were using output streams inconsistently. One also had a grammar bug. I noticed these while trying to pare down D18278. llvm-svn: 273642	2016-06-24 02:33:01 +00:00
George Burgess IV	43e9ba0e5a	Temporary hack to clean a file from buildbots. Some buildbots are complaining about a .s file under test/ that was inadvertently created by a test earlier today, and is still hanging around. I'll undo this change in ~1hr (or whenever the bots are done getting rid of said file). llvm-svn: 273641	2016-06-24 02:19:11 +00:00
Chuang-Yu Cheng	68f7f1cf00	Teaching SimplifyCFG to recognize the Or-Mask trick that InstCombine uses to reduce the number of comparisons. Specifically, InstCombine can turn: (i == 5334 || i == 5335) into: ((i | 1) == 5335) SimplifyCFG was already able to detect the pattern: (i == 5334 || i == 5335) to: ((i & -2) == 5334) This patch supersedes D21315 and resolves PR27555 (https://llvm.org/bugs/show_bug.cgi?id=27555). Thanks to David and Chandler for the suggestions! Author: Thomas Jablin (tjablin) Reviewers: majnemer chandlerc halfdan cycheng http://reviews.llvm.org/D21397 llvm-svn: 273639	2016-06-24 01:59:00 +00:00
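The equivalence that the SimplifyCFG commit above relies on can be checked with a short sketch. This is an illustration in Python rather than LLVM code, and the function names are hypothetical; it only shows that the two-comparison form, the Or-Mask form, and the And-Mask form accept the same values.

```python
def is_pair_compare(i: int) -> bool:
    """Original form: two equality comparisons joined by a logical or."""
    return i == 5334 or i == 5335


def is_pair_or_mask(i: int) -> bool:
    """Or-Mask form: force the low bit on, then compare once."""
    return (i | 1) == 5335


def is_pair_and_mask(i: int) -> bool:
    """And-Mask form: clear the low bit (-2 is ...11110), then compare once."""
    return (i & -2) == 5334


def forms_agree(lo: int, hi: int) -> bool:
    """Check that all three forms agree on every integer in [lo, hi)."""
    return all(
        is_pair_compare(i) == is_pair_or_mask(i) == is_pair_and_mask(i)
        for i in range(lo, hi)
    )
```

The trick works here because 5334 and 5335 differ only in their lowest bit, so one masked comparison can stand in for both equality tests.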
Peter Collingbourne	b19924a425	BitcodeWriter: Remove redundant (and incorrect) check for whether to emit module summary. The function name Module::empty() is slightly misleading in that it only tests for the presence of functions in the module. However we still want to emit the module summary if the module contains only global variables or aliases. The presence of such entities can be determined simply by checking the summary directly, as we are doing below. Differential Revision: http://reviews.llvm.org/D21669 llvm-svn: 273638	2016-06-24 01:58:02 +00:00
George Burgess IV	a3d62be733	[CFLAA] Propagate StratifiedAttrs in interproc. analysis. This patch also has a refactor that kills StratifiedAttr, and leaves us with StratifiedAttrs, because having both was mildly redundant. This patch makes us correctly handle stratified attributes when doing interprocedural analysis. It also adds another attribute, AttrCaller, which acts like AttrUnknown. We can filter out AttrCaller values during interprocedural analysis, since the caller should have information about what arguments it's passing to its callee. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21645 llvm-svn: 273636	2016-06-24 01:00:03 +00:00
Ahmed Bougacha	f0b46ee0aa	[ARM] Use aapcs_vfp for ___truncdfhf2 on v7k. r215348 overrode the f16 libcalls to be soft-float, but v7k uses the default (hard-float) calling convention. llvm-svn: 273631	2016-06-24 00:08:01 +00:00
Tom Stellard	14416ae6cd	Support/ELF: Add R_AMDGPU_GOTPCREL relocation Summary: We will start generating this in a future patch. Reviewers: arsenm, kzhuravl, rafael, ruiu, tony-tye Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21482 llvm-svn: 273628	2016-06-23 23:11:29 +00:00
Hans Wennborg	4b63a98de3	[codeview] Add classes and unions to the Local/Global UDTs lists Differential Revision: http://reviews.llvm.org/D21655 llvm-svn: 273626	2016-06-23 22:57:25 +00:00
Chris Bieneman	10dcd3bd3b	[yaml2macho] Removing asserts in favor of explicit yaml parse error 32-bit Mach headers don't have reserved fields. When generating the mapping for 32-bit headers leaving off the reserved field will result in parse errors if the field is present in the yaml. Added a CHECK-NOT line to ensure that mach_header.yaml isn't adding a reserved field, and a test to ensure that the parser error gets hit with 32-bit headers. llvm-svn: 273623	2016-06-23 22:36:31 +00:00
Kyle Butt	991df7889b	Codegen: [X86] preserve memory refs for folded umul_lohi Memory references were not being propagated for this folded load. This prevented optimizations like LICM from hoisting the load. Added test to verify that this allows LICM to proceed. llvm-svn: 273617	2016-06-23 21:40:35 +00:00
Kyle Butt	178314ab52	Codegen: LICM Remove check for exactly 1 register def. When considering whether to split an instruction with a memory operand into an explicit load and a register-based instruction, we currently check that the resulting instruction has exactly 1 def. This prevents 2 important LICM optimizations: compares with memory operands, and double indirect calls. All the tests and the test-suite pass without the check. My guess as to original intent is to limit the additional register pressure created by the new instruction, but given that we only split out a single register, it is already limited. The licm-dominance test now checks actual memory loads for hoisting instead of undef, and it tests compares. hoist-invariant-load.ll now checks for 2 hoists, the intended hoist, and a bonus from calling a got-relative function in a loop. llvm-svn: 273616	2016-06-23 21:38:49 +00:00
Rafael Espindola	2d3cce71ee	Uses shouldAssumeDSOLocal. With that SystemZ knows to avoid a GOT for PIE. llvm-svn: 273614	2016-06-23 21:18:59 +00:00
Rafael Espindola	f2898d73a5	Convert test to FileCheck. llvm-svn: 273609	2016-06-23 20:37:49 +00:00
Anna Thomas	31a0b2088f	InstCombine rule to fold trunc when value available Summary: This instcombine rule folds away trunc operations that have value available from a prior load or store. This kind of code can be generated as a result of GVN widening the load or from source code as well. Reviewers: reames, majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21246 llvm-svn: 273608	2016-06-23 20:22:22 +00:00
George Burgess IV	1f99da54c2	[CFLAA] Use better interprocedural function summaries. Previously, we just unified any arguments that seemed to be related to each other. With this patch, we now respect dereference levels, etc. which should make us substantially more accurate. Proper handling of StratifiedAttrs will be done in a later patch. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21536 llvm-svn: 273596	2016-06-23 18:55:23 +00:00
Hans Wennborg	a21a263101	[codeview] Fix letter casing in FileCheck regexes We print those hex numbers with uppercase letters. llvm-svn: 273594	2016-06-23 18:23:28 +00:00
Michael Kuperstein	0194d30e09	[X86] Extract HiPE prologue constants into metadata X86FrameLowering::adjustForHiPEPrologue() contains a hard-coded offset into an Erlang Runtime System-internal data structure (the PCB). As the layout of this data structure is prone to change, this poses problems for maintaining compatibility. To address this problem, the compiler can produce this information as module-level named metadata. For example (where P_NSP_LIMIT is the offending offset): !hipe.literals = !{ !2, !3, !4 } !2 = !{ !"P_NSP_LIMIT", i32 152 } !3 = !{ !"X86_LEAF_WORDS", i32 24 } !4 = !{ !"AMD64_LEAF_WORDS", i32 24 } Patch by Magnus Lang Differential Revision: http://reviews.llvm.org/D20363 llvm-svn: 273593	2016-06-23 18:17:25 +00:00
Nirav Dave	38bb1c15fd	Prevent generation of temp file in test from r273585. llvm-svn: 273588	2016-06-23 18:06:35 +00:00
Nirav Dave	bfdb483755	Preserve DebugInfo when replacing values in DAGCombiner Recommitting after correcting over-eager Debug Value transfer fixing PR28270. [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 273585	2016-06-23 17:52:57 +00:00
Pablo Barrio	7a64346533	[ARM] Lower (select_cc k k (select_cc ~k ~k x)) into (SSAT l_k x) Summary: SSAT saturates an integer, making sure that its value lies within an interval [~k, k]. Since the constant is given to SSAT as the number of bits set to one, k + 1 must be a power of 2, otherwise the optimization is not possible. Also, the select_cc must use < and > respectively so that they define an interval. Reviewers: mcrosier, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D21372 llvm-svn: 273581	2016-06-23 16:53:49 +00:00
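The saturation pattern in the SSAT commit above can be sketched as follows. This is a Python illustration with hypothetical function names, not the ARM lowering itself; it assumes the lower bound is ~k, that is -(k + 1), as written in the commit title.

```python
def is_ssat_bound(k: int) -> bool:
    """k can serve as the upper bound only if k + 1 is a power of two."""
    return k > 0 and ((k + 1) & k) == 0


def saturate(x: int, k: int) -> int:
    """Clamp x to [~k, k], mirroring the nested select_cc pattern."""
    if not is_ssat_bound(k):
        raise ValueError("k + 1 must be a power of two")
    lower = ~k  # equals -(k + 1) in two's complement
    if x > k:
        return k
    if x < lower:
        return lower
    return x
```

For example, with k = 127 the sketch clamps to the signed 8-bit range, while k = 100 is rejected because 101 is not a power of two.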
Artur Pilipenko	80771b9ad9	Upgrade other old memset/memcpy signatures in tests causing buildbot failures with rL273568. llvm-svn: 273580	2016-06-23 16:34:52 +00:00
Hans Wennborg	b510b458b9	[codeview] Emit retained types Differential Revision: http://reviews.llvm.org/D21630 llvm-svn: 273579	2016-06-23 16:33:53 +00:00
Hans Wennborg	a63b50afb8	Revert r273568 "Remangle intrinsics names when types are renamed" It broke 2008-07-15-Bswap.ll and 2009-09-01-PostRAProlog.ll llvm-svn: 273574	2016-06-23 16:13:23 +00:00
Artur Pilipenko	4fec7b7131	Fix an old memset signature in 2009-09-01-PostRAProlog.ll test causing a buildbot failure llvm-svn: 273573	2016-06-23 16:07:10 +00:00
Artur Pilipenko	f0c9f81379	Remangle intrinsics names when types are renamed This is a fix for the problem mentioned in "LTO and intrinsics mangling" llvm-dev mail thread: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098387.html Reviewers: mehdi_amini, reames Differential Revision: http://reviews.llvm.org/D19373 llvm-svn: 273568	2016-06-23 15:25:09 +00:00
Michael Zolotukhin	2d3592d481	[LoopUnrollAnalyzer] Fix a bug in UnrolledInstAnalyzer::visitLoad. When simplifying a load we need to make sure that the type of the simplified value matches the type of the instruction we're processing. In theory, we can handle casts here as we deal with constant data, but since it's not implemented at the moment, we at least need to bail out. This fixes PR28262. llvm-svn: 273562	2016-06-23 14:31:31 +00:00
Valery Pykhtin	a852d695b8	[AMDGPU] Enable absolute expression initializer for amd_kernel_code_t fields. Differential Revision: http://reviews.llvm.org/D21380 llvm-svn: 273561	2016-06-23 14:13:06 +00:00
Simon Pilgrim	595dddb103	[X86][AVX512] Added AVX512F vector sign extend tests Now that Elena has confirmed that PR26474 has been fixed llvm-svn: 273560	2016-06-23 14:01:45 +00:00
Hal Finkel	a1271036c5	Allow DeadStoreElimination to track combinations of partial later writes DeadStoreElimination can currently remove a small store rendered unnecessary by a later larger one, but could not remove a larger store rendered unnecessary by a series of later smaller ones. This adds that capability. It works by keeping a map, which is used as an effective interval map, for each store later overwritten only partially, and filling in that interval map as more such stores are discovered. No additional walking or aliasing queries are used. If the map forms an interval covering the entire earlier store, then it is dead and can be removed. The map is used as an interval map by storing a mapping between the ending offset and the beginning offset of each interval. I discovered this problem when investigating a performance issue with code like this on PowerPC: #include <complex> using namespace std; complex<float> bar(complex<float> C); complex<float> foo(complex<float> C) { return bar(C)*C; } which produces this: define void @_Z4testSt7complexIfE(%"struct.std::complex" noalias nocapture sret %agg.result, i64 %c.coerce) { entry: %ref.tmp = alloca i64, align 8 %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"* %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32 %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32 %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32 %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce) %2 = bitcast %"struct.std::complex"* %agg.result to i64* %3 = load i64, i64* %ref.tmp, align 8 store i64 %3, i64* %2, align 4 ; <--- *** THIS SHOULD NOT BE HERE ** %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0 %4 = lshr i64 %3, 32 %5 = trunc i64 %4 to i32 %6 = bitcast i32 %5 to float 
%_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1 %7 = trunc i64 %3 to i32 %8 = bitcast i32 %7 to float %mul_ad.i.i = fmul fast float %6, %1 %mul_bc.i.i = fmul fast float %8, %0 %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i %mul_ac.i.i = fmul fast float %6, %0 %mul_bd.i.i = fmul fast float %8, %1 %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4 store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4 ret void } the problem here is not just that the i64 store is unnecessary, but also that it blocks further backend optimizations of the other uses of that i64 value in the backend. In the future, we might want to add a special case for handling smaller accesses (e.g. using a bit vector) if the map mechanism turns out to be noticeably inefficient. A sorted vector is also a possible replacement for the map for small numbers of tracked intervals. Differential Revision: http://reviews.llvm.org/D18586 llvm-svn: 273559	2016-06-23 13:46:39 +00:00
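The interval-map idea in the DeadStoreElimination commit above can be sketched in a few lines. This is a Python illustration under stated assumptions, not the LLVM implementation: the function names are hypothetical, byte ranges are half-open, and the map is keyed by ending offset with the beginning offset as the value, as the commit message describes.

```python
from typing import Dict


def add_interval(intervals: Dict[int, int], start: int, end: int) -> None:
    """Record a later partial store covering [start, end).

    `intervals` maps each interval's ending offset to its beginning offset.
    Overlapping or adjacent intervals are merged into one entry.
    """
    for e in sorted(intervals):
        s = intervals[e]
        if s <= end and start <= e:  # overlaps or touches the new range
            start, end = min(start, s), max(end, e)
            del intervals[e]
    intervals[end] = start


def is_fully_overwritten(intervals: Dict[int, int], start: int, end: int) -> bool:
    """True if one merged interval covers the whole earlier store [start, end)."""
    return any(s <= start and end <= e for e, s in intervals.items())
```

In the example from the commit message, the earlier 8-byte store is dead once the two later 4-byte float stores have been recorded: adding [0, 4) and then [4, 8) leaves a single interval that covers [0, 8).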
Daniel Sanders	de393329b9	[mips] Don't derive the default ABI from the CPU in the backend. Summary: The backend has no reason to behave like a driver and should generally do as it's told (and error out if it can't) instead of trying to figure out what the API user meant. The default ABI is still derived from the arch component as a concession to backwards compatibility. API-users that previously passed an explicit CPU and a triple that was inconsistent with the CPU (e.g. mips-linux-gnu and mips64r2) may get a different ABI to what they got before. However, it's expected that there are no such users on the basis that CodeGen has been asserting that the triple is consistent with the selected ABI for several releases. API-users that were consistent or passed '' or 'generic' as the CPU will see no difference. Reviewers: sdardis, rafael Subscribers: rafael, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21466 llvm-svn: 273557	2016-06-23 12:42:53 +00:00
Daniel Sanders	8e17bea7d5	[mips][ias] Integers are not registers. Summary: When parseAnyRegister() encounters a symbol alias, it parses integers and adds a corresponding expression to the operand list. This is clearly wrong since the only operands that parseAnyRegister() should be accepting are registers. It's not clear why this code was added and there are no test cases that cover it. I think it might be leftover from when searchSymbolAlias() was more widely used. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21377 llvm-svn: 273555	2016-06-23 10:54:09 +00:00
Diana Picus	e440f99913	[AMDGPU] Remove exit-on-error in test (PR27761) The exit-on-error flag was necessary in order to avoid an assertion when handling DYNAMIC_STACKALLOC nodes in SelectionDAGLegalize. We can avoid the assertion by creating some dummy nodes. This enables us to remove the exit-on-error flag on the first 2 run lines (SI), but on the third run line (R600) we would run into another assertion when trying to reserve indirect registers. This patch also replaces that assertion with an early exit from the function. Fixes PR27761. Differential Revision: http://reviews.llvm.org/D20852 llvm-svn: 273550	2016-06-23 09:19:16 +00:00
Simon Dardis	724e530296	[mips] Fix dext/dins definitions dext and dins, along with their 'm' and 'u' variants are defined in mips64r2, not mips64. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D21608 llvm-svn: 273549	2016-06-23 09:06:20 +00:00
Craig Topper	597aa42fec	[AVX512] Remove masked unpack intrinsics and autoupgrade to vectorshuffle and selects. llvm-svn: 273543	2016-06-23 07:37:33 +00:00
David Majnemer	d1fbf48566	[SCCP] Don't assume all Constants are ConstantInt This fixes PR28269. llvm-svn: 273521	2016-06-23 00:14:29 +00:00
Peter Collingbourne	6717803485	Revert r273456, "Preserve DebugInfo when replacing values in DAGCombiner" as it caused pr28270. llvm-svn: 273518	2016-06-23 00:06:17 +00:00
Vedant Kumar	cae1cff94a	[llvm-cov] Fix a buggy lit test There is no check prefix for "WHOLE-FILE": this particular line was supposed to use the "ALL" prefix. llvm-svn: 273517	2016-06-22 23:58:03 +00:00
Matt Arsenault	3cb4ddeb4e	AMDGPU: Fix liveness when expanding m0 loop llvm-svn: 273514	2016-06-22 23:40:57 +00:00
Sanjoy Das	e57bf680ec	[ImplicitNullChecks] Hoist trivial dependencies if possible When trying to convert a loading instruction into a FAULTING_LOAD, we sometimes face code like this: if %R10 is not null: %R9<def> = MOV32ri Immediate %R9<def, tied> = AND32rm %R9, 0x20(%R10) else: goto TRAP In these cases we would like to use the AND32rm instruction as the faulting operation by hoisting the "dependency" def-ing %R9 also above the control flow, transforming the program into: %R9<def> = MOV32ri Immediate %R9<def, tied> = FAULTING_LOAD_OP(AND32rm %R9, 0x20(%R10), FailPath: TRAP) This change teaches ImplicitNullChecks to do the above, when safe. llvm-svn: 273501	2016-06-22 22:16:51 +00:00
Rafael Espindola	928a95d0b0	Use shouldAssumeDSOLocal. With this it handles -fPIE. llvm-svn: 273499	2016-06-22 22:09:17 +00:00
Changpeng Fang	47efe1f6db	AMDGPU/SI: Define an intrinsic to expose ds_swizzle_b32 Reviewers: tstellarAMD, arsenm Differential Revision: http://reviews.llvm.org/D21533 llvm-svn: 273496	2016-06-22 21:33:49 +00:00
Hans Wennborg	9a519a099e	[codeview] Write LF_UDT_SRC_LINE records (PR28251) Differential Revision: http://reviews.llvm.org/D21621 llvm-svn: 273495	2016-06-22 21:22:13 +00:00
Reid Kleckner	ac460619d2	[codeview] Fix the alignment padding that we add to list records Tweak the big-types.ll test case to catch this bug. We just need an enumerator name that doesn't have a length that is a multiple of 4. llvm-svn: 273477	2016-06-22 20:59:17 +00:00
Davide Italiano	ec7e29e941	[IRObjectFile] Propagate .weak attribute correctly for ASM symbols. PR: 28256 Differential Revision: http://reviews.llvm.org/D21616 llvm-svn: 273474	2016-06-22 20:48:15 +00:00
Peter Collingbourne	6d88fde3af	IR: Introduce Module::global_objects(). This is a convenience iterator that allows clients to enumerate the GlobalObjects within a Module. Also start using it in a few places where it is obviously the right thing to use. Differential Revision: http://reviews.llvm.org/D21580 llvm-svn: 273470	2016-06-22 20:29:42 +00:00
Matt Arsenault	9babdf4265	AMDGPU: Fix verifier errors in SILowerControlFlow The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking. Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return. llvm-svn: 273467	2016-06-22 20:15:28 +00:00
Krzysztof Parzyszek	f7f7068109	[Hexagon] Add SDAG preprocessing step to expose shifted addressing modes Transform: (store ch addr (add x (add (shl y c) e))) to: (store ch addr (add x (shl (add y d) c))), where e = (shl d c) for some integer d. The purpose of this is to enable generation of loads/stores with shifted addressing mode, i.e. mem(x+y<<#c). For that, the shift value c must be 0, 1 or 2. llvm-svn: 273466	2016-06-22 20:08:27 +00:00
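The address rewrite in the Hexagon commit above rests on a simple identity: if e = (d << c), then (y << c) + e equals (y + d) << c. A minimal Python sketch of that arithmetic follows; the function names are hypothetical and this is not the SelectionDAG code.

```python
from typing import Optional


def split_offset(e: int, c: int) -> Optional[int]:
    """Return d with e == (d << c), or None if the rewrite does not apply.

    The shifted addressing mode only allows shift amounts 0, 1 or 2.
    """
    if c not in (0, 1, 2):
        return None
    if e % (1 << c) != 0:
        return None
    return e >> c


def address_before(x: int, y: int, c: int, e: int) -> int:
    """Address as written before the transform: x + ((y << c) + e)."""
    return x + ((y << c) + e)


def address_after(x: int, y: int, c: int, d: int) -> int:
    """Address after the transform: x + ((y + d) << c)."""
    return x + ((y + d) << c)
```

For instance, with c = 2 and e = 12 the sketch finds d = 3, and both forms compute the same address; an offset such as e = 10 is not a multiple of 4, so no integer d exists and the rewrite is skipped.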
Sanjay Patel	a06d989552	[ValueTracking] improve ComputeNumSignBits for vector constants This is similar to the computeKnownBits improvement in rL268479. There's probably more we can do for vector logic instructions, but this should let us see non-splat constant masking ops that can become vector selects instead of and/andn/or sequences. Differential Revision: http://reviews.llvm.org/D21610 llvm-svn: 273459	2016-06-22 19:20:59 +00:00
Chad Rosier	8c106bcbe8	[AArch64] Remove an overly aggressive assert. llvm-svn: 273458	2016-06-22 19:18:52 +00:00
Rafael Espindola	8474fdf90d	Start using shouldAssumeDSOLocal on Hexagon. Include a token test showing that access to private is now the same as to internal. llvm-svn: 273457	2016-06-22 19:09:14 +00:00
Nirav Dave	96beb7dee5	Preserve DebugInfo when replacing values in DAGCombiner Recommitting after fixing over-aggressive assertion [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 273456	2016-06-22 19:03:26 +00:00
Wei Ding	0526e7f8d9	AMDGPU: Add convergent flag to INLINEASM instruction. Differential Revision: http://reviews.llvm.org/D21214 llvm-svn: 273455	2016-06-22 18:51:08 +00:00
Reid Kleckner	156a7239c1	[codeview] Add IntroducingVirtual debug info flag CodeView needs to know if a virtual method was introduced in the current class, and base classes may not have complete type information, so we need to thread this bit through from the frontend. llvm-svn: 273453	2016-06-22 18:31:14 +00:00
Vedant Kumar	f5ac6d49e4	[asan] Do not instrument accesses to profiling globals It's only useful to asan-itize profiling globals while debugging llvm's profiling instrumentation passes. Enabling asan along with instrprof or gcov instrumentation shouldn't incur extra overhead. This patch is in the same spirit as r264805 and r273202, which disabled tsan instrumentation of instrprof/gcov globals. Differential Revision: http://reviews.llvm.org/D21541 llvm-svn: 273444	2016-06-22 17:30:58 +00:00
Reid Kleckner	643dd83661	[codeview] Defer emission of all referenced complete records This is the motivating example: struct B { int b; }; struct A { B *b; }; int f(A *p) { return p->b->b; } Clang emits complete types for both A and B because they are required to be complete, but our CodeView emission would only emit forward declarations of A and B. This was a consequence of the fact that the A* type must reference the forward declaration of A, which doesn't reference B at all. We can't eagerly emit complete definitions of A and B when we request the forward declaration's type index because of recursive types like linked lists. If we did that, our stack usage could get out of hand, and it would be possible to lower a type while attempting to lower a type, and we would need to double check if our type is already present in the TypeIndexMap after all recursive getTypeIndex calls. Instead, defer complete type emission until after all type lowering has completed. This ensures that all referenced complete types are emitted, and that type lowering is not re-entrant. llvm-svn: 273443	2016-06-22 17:15:28 +00:00
Zhan Jun Liau	0df350589f	[SystemZ] Recognize RISBG opportunities involving a truncate Summary: Recognize RISBG opportunities where the end result is narrower than the original input - where a truncate separates the shift/and operations. The motivating case is some code in postgres which looks like: srlg %r2, %r0, 11 nilh %r2, 255 Reviewers: uweigand Author: RolandF Differential Revision: http://reviews.llvm.org/D21452 llvm-svn: 273433	2016-06-22 16:16:27 +00:00
Krzysztof Parzyszek	f228c95f87	[Hexagon] Handle expansion of cmpxchg llvm-svn: 273432	2016-06-22 16:07:10 +00:00
Artur Pilipenko	1cec4fdddf	Upgrade old memset/memcpy signatures (without isVolatile argument) in tests We no longer have corresponding code in autoupgrade and the vast majority of the tests were fixed a long time ago. Fix the remaining few. One of the verifier test cases is marked as XFAIL because it was passing only because the signature was incorrect. llvm-svn: 273428	2016-06-22 15:16:06 +00:00
Sanjay Patel	c6cacd6067	[InstSimplify] add ashr tests including vector types llvm-svn: 273421	2016-06-22 14:18:04 +00:00
Simon Pilgrim	bc35f9f702	[SLPVectorizer][X86] Added ceil/floor/nearbyint/rint/trunc vectorization tests llvm-svn: 273420	2016-06-22 14:07:46 +00:00
Sanjay Patel	21579bb39a	[InstSimplify] regenerate checks llvm-svn: 273419	2016-06-22 14:00:16 +00:00
George Rimar	ff8b539f7b	[llvm-readobj] - Teach llvm-readobj to print dependencies of SHT_GNU_verdef and refactor dumping method. This patch changes a single method of llvm-readobj. It teaches the SHT_GNU_verdef dumper to print version dependencies; it also removes a few fields from the output that can be dumped with other keys and slightly refactors the code. The testcase was also modified to match the changes. The change is required for testcases of upcoming lld patches. Differential revision: http://reviews.llvm.org/D21552 llvm-svn: 273417	2016-06-22 13:43:38 +00:00
Simon Pilgrim	1536c19642	Regenerated test llvm-svn: 273404	2016-06-22 12:58:15 +00:00
Peter Zotov	ee462ca194	[OCaml] Add functions for accessing metadata nodes. Patch by Xinyu Zhuang. Differential Revision: http://reviews.llvm.org/D19309 llvm-svn: 273370	2016-06-22 03:30:24 +00:00
Reid Kleckner	0c5d874bea	[codeview] Improve names of types in scopes and member function ids We now include namespace scope info in LF_FUNC_ID records and we emit LF_MFUNC_ID records for member functions as we should. Class names are now fully qualified, which is what MSVC does. Add a little bit of scaffolding to handle ThisAdjustment when it arrives in DISubprogram. llvm-svn: 273358	2016-06-22 01:32:56 +00:00
Reid Kleckner	fd3b35ad84	[codeview] Add missing test from r273294 llvm-svn: 273355	2016-06-22 01:17:05 +00:00
Anna Zaks	644d9d3a44	[asan] Do not instrument pointers with address space attributes Do not instrument pointers with address space attributes since we cannot track them anyway. Instrumenting them results in false positives in ASan and a compiler crash in TSan. (The compiler should not crash in any case, but that's a different problem.) llvm-svn: 273339	2016-06-22 00:15:52 +00:00
Peter Collingbourne	21521891a2	IR: Allow metadata attachments on declarations, and fix lazy loaded metadata issue with globals. This change is motivated by an upcoming change to the metadata representation used for CFI. The indirect function call checker needs type information for external function declarations in order to correctly generate jump table entries for such declarations. We currently associate such type information with declarations using a global metadata node, but I plan [1] to move all such metadata to global object attachments. In bitcode, metadata attachments for function declarations appear in the global metadata block. This seems reasonable to me because I expect metadata attachments on declarations to be uncommon. In the long term I'd also expect this to be the case for CFI, because we'd want to use some specialized bitcode format for this metadata that could be read as part of the ThinLTO thin-link phase, which would mean that it would not appear in the global metadata block. To solve the lazy loaded metadata issue I was seeing with D20147, I use the same bitcode representation for metadata attachments for global variables as I do for function declarations. Since there's a use case for metadata attachments in the global metadata block, we might as well use that representation for global variables as well, at least until we have a mechanism for lazy loading global variables. In the assembly format, the metadata attachments appear after the "declare" keyword in order to avoid a parsing ambiguity. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html Differential Revision: http://reviews.llvm.org/D21052 llvm-svn: 273336	2016-06-21 23:42:48 +00:00
Haicheng Wu	a783bac50b	[Kryo] Enable loop prefetcher. Differential Revision: http://reviews.llvm.org/D21535 llvm-svn: 273329	2016-06-21 22:47:56 +00:00
Kevin Enderby	606a338db9	Update llvm-objdump(1) to print FAT_MAGIC_64 for Darwin’s 64-bit universal files with the -macho and -universal-headers flags. Just a follow-on to r273207; I missed updating the printing of the fat magic number when the universal file is a 64-bit universal file. rdar://26899493 llvm-svn: 273324	2016-06-21 21:55:01 +00:00
Jan Vesely	fea814d531	AMDGPU: Add implicitarg.ptr intrinsic. Points to the start of implicit arguments (appended after explicit arguments) Differential Revision: http://reviews.llvm.org/D20297 llvm-svn: 273317	2016-06-21 20:46:20 +00:00
Michael Kuperstein	78028b84d2	[X86] Make arithmetic operations cost model test saner. NFC. llvm-svn: 273316	2016-06-21 20:41:40 +00:00
Artem Belevich	d7ebcfb291	[NVPTX] Improve lowering of byval args of device functions. Avoid unnecessary spills of such vars to local space on SASS level and pointer space conversion. Instead, make a local copy with appropriate addrspacecasts and let LLVM optimize them away when possible. This allows loading value of the argument using [symbol+offset] instead of converting argument to general space pointer and using it for indexing (which also implicitly converts param space pointer to local space one on SASS level and triggers copying of argument into local space in the process). This reduces call overhead, uses fewer registers and reduces overall SASS size by 2-4%. Differential Review: http://reviews.llvm.org/D21421 llvm-svn: 273313	2016-06-21 20:30:26 +00:00
Easwaran Raman	8bceb9d210	Fix PR28219: Use profile summary from reader and not compute it Differential revision: http://reviews.llvm.org/D21546 llvm-svn: 273301	2016-06-21 19:29:49 +00:00
Reid Kleckner	5b335b864b	[codeview] Add support for splitting field list records over 64KB The basic structure is that once a list record goes over 64K, the last subrecord of the list is an LF_INDEX record that refers to the next record. Because the type record graph must be topologically sorted, this means we have to emit them in reverse order. We build the type record in order of declaration, so this means that if we don't want extra copies, we need to detect when we were about to split a record, and leave space for a continuation subrecord that will point to the eventual split top-level record. Also adds dumping support for these records. Next we should make sure that large method overload lists work properly. llvm-svn: 273294	2016-06-21 18:33:01 +00:00
Silviu Baranga	03b6a4fc88	[AArch64] Fix merge-store.ll regression test after r273271 r273271 changed the RUN line of the regression test to use -march=cyclone instead of -mtriple=aarch64-none-none. This caused a change in the output syntax for the ext instruction, causing the test to fail. Change this test back to using -mtriple=aarch64-none-none. llvm-svn: 273286	2016-06-21 17:15:49 +00:00
Etienne Bergeron	f6be62f2c8	[StackProtector] Fix computation of GSCookieOffset and EHCookieOffset with SEH4 Summary: Fix the computation of the offsets present in the scopetable when using the SEH (__except_handler4). This patch added an intrinsic to track the position of the allocation on the stack of the EHGuard. This position is needed when producing the ScopeTable. ``` struct _EH4_SCOPETABLE { DWORD GSCookieOffset; DWORD GSCookieXOROffset; DWORD EHCookieOffset; DWORD EHCookieXOROffset; _EH4_SCOPETABLE_RECORD ScopeRecord[1]; }; struct _EH4_SCOPETABLE_RECORD { DWORD EnclosingLevel; long (*FilterFunc)(); union { void (*HandlerAddress)(); void (*FinallyFunc)(); }; }; ``` The code to generate the EHCookie is added in `X86WinEHState.cpp`. Which is adding these instructions when using SEH4. ``` Lfunc_begin0: # BB#0: # %entry pushl %ebp movl %esp, %ebp pushl %ebx pushl %edi pushl %esi subl $28, %esp movl %ebp, %eax <<-- Loading FramePtr movl %esp, -36(%ebp) movl $-2, -16(%ebp) movl $L__ehtable$use_except_handler4_ssp, %ecx xorl ___security_cookie, %ecx movl %ecx, -20(%ebp) xorl ___security_cookie, %eax <<-- XOR FramePtr and Cookie movl %eax, -40(%ebp) <<-- Storing EHGuard leal -28(%ebp), %eax movl $__except_handler4, -24(%ebp) movl %fs:0, %ecx movl %ecx, -28(%ebp) movl %eax, %fs:0 movl $0, -16(%ebp) calll _may_throw_or_crash LBB1_1: # %cont movl -28(%ebp), %eax movl %eax, %fs:0 addl $28, %esp popl %esi popl %edi popl %ebx popl %ebp retl ``` And the corresponding offset is computed: ``` Luse_except_handler4_ssp$parent_frame_offset = -36 .p2align 2 L__ehtable$use_except_handler4_ssp: .long -2 # GSCookieOffset .long 0 # GSCookieXOROffset .long -40 # EHCookieOffset <<---- .long 0 # EHCookieXOROffset .long -2 # ToState .long _catchall_filt # FilterFunction .long LBB1_2 # ExceptionHandler ``` Clang is not yet producing function using SEH4, but it's a work in progress. This patch is a step toward having a valid implementation of SEH4. Unfortunately, it is not yet fully working. The EH registration block is not allocated at the right offset on the stack. Reviewers: rnk, majnemer Subscribers: llvm-commits, chrisha Differential Revision: http://reviews.llvm.org/D21231 llvm-svn: 273281	2016-06-21 15:58:55 +00:00
Evandro Menezes	230083ff9d	[AArch64] Change the preferred alignment for char and short to word alignment Differential Revision: http://reviews.llvm.org/D21414 llvm-svn: 273279	2016-06-21 15:55:18 +00:00
Silviu Baranga	dc43d61a25	[AArch64] Switch regression tests to test features not CPUs Summary: We have switched to using features for all heuristics, but the tests for these are still using -mcpu, which means we are not directly testing the features. This converts at least some of the existing regression tests to use the new features. This still leaves the following features untested: merge-narrow-ld predictable-select-expensive alternate-sextload-cvt-f32-pattern disable-latency-sched-heuristic Reviewers: mcrosier, t.p.northover, rengolin Subscribers: MatzeB, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D21288 llvm-svn: 273271	2016-06-21 15:16:34 +00:00
Daniel Sanders	bf2c03ee69	[arm+x86] Make GNU variants behave like GNU w.r.t combining sin+cos into sincos. Summary: canCombineSinCosLibcall() would previously combine sin+cos into sincos for GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath were set or not. However, GNU would only combine them for UnsafeFPMath because sincos does not set errno like sin and cos do. It seems likely that this is an oversight. Reviewers: t.p.northover Subscribers: t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D21431 llvm-svn: 273259	2016-06-21 12:29:03 +00:00
Elena Demikhovsky	a266cf0518	reverted the prev commit due to assertion failure llvm-svn: 273258	2016-06-21 12:10:11 +00:00
Elena Demikhovsky	9823c995bc	Fixed consecutive memory access detection in Loop Vectorizer. It did not correctly handle cases without GEP. The following loop wasn't vectorized: for (int i=0; i<len; i++) *to++ = *from++; I use getPtrStride() to find Stride for memory access and return 0 if the Stride is not 1 or -1. Differential revision: http://reviews.llvm.org/D20789 llvm-svn: 273257	2016-06-21 11:32:01 +00:00
Craig Topper	283418fbb6	[AVX512] Add patterns for any-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX. llvm-svn: 273253	2016-06-21 07:37:32 +00:00
Craig Topper	9038aa3001	[AVX512] Use update_llc_test_checks.py to regenerate a test in preparation for a future commit. llvm-svn: 273252	2016-06-21 07:37:27 +00:00
James Y Knight	03c1415b8f	Revert "Change RelaxELFRelocations for llc." This reverts commit r273019. From email I sent to list: > I don't think this makes sense. Either the linker you're using supports > this feature, or it doesn't. Having it enabled for llc if your linker > doesn't support it is not fun. > > Further note that this also affects basically all other code using llvm > libraries -- other than Clang, which explicitly sets it back to false by > default, unless you set the ENABLE_X86_RELAX_RELOCATIONS cmake flag to > true. > > If you want to enable the relax mode across all llvm tools in some > circumstances, I think it should be via moving the cmake flag from clang > down into llvm. > > I'm going to revert this commit, since I both think it intrinsically > doesn't make sense to do this, and because it's breaking some of our > tools. llvm-svn: 273245	2016-06-21 05:40:41 +00:00
Craig Topper	0a0fb0fda1	[AVX512] Remove the masked vpcmpeq/vcmpgt intrinsics and autoupgrade them to native icmps. llvm-svn: 273240	2016-06-21 03:53:24 +00:00
George Burgess IV	9fdbfe17a8	[CFLAA] Be more aggressive with interprocedural analysis. This patch makes us perform interprocedural analysis on functions that don't have internal linkage. It also removes a test that should've been deleted in an earlier commit (since other tests now cover everything that the newly-removed test covers). Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21513 llvm-svn: 273229	2016-06-21 01:42:47 +00:00
George Burgess IV	87b2e41416	[CFLAA] Add interprocedural function summaries. This patch adds function summaries, so that we don't need to recompute various properties about function parameters/return values at each callsite of a function. It also adds many interprocedural tests for CFLAA. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21475#inline-182390 llvm-svn: 273219	2016-06-20 23:10:56 +00:00
Simon Pilgrim	356e823b51	[X86][SSE] Add cost model for BSWAP of vectors The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization. Differential Revision: http://reviews.llvm.org/D21521 llvm-svn: 273217	2016-06-20 23:08:21 +00:00
Simon Pilgrim	225b2e37a0	[X86][X87] Fix issue with sitofp i64 -> fp128 on 32-bit targets Fix for PR27726 - sitofp i64 to fp128 was loading the merged load i64 to a x87 register preventing legalization for conversion to fp128. Added 32-bit tests for fp128 cast/conversions. llvm-svn: 273210	2016-06-20 22:41:17 +00:00
Kevin Enderby	d0a814ccec	Forgot to svn add one of my test files for the change in r273207. llvm-svn: 273208	2016-06-20 22:27:49 +00:00
Kevin Enderby	eb6d110c1d	Add support for Darwin’s 64-bit universal files with 64-bit offsets and sizes for the objects. Darwin added support in its Xcode 8.0 tools (released in the beta) for universal files where offsets and sizes for the objects are 64 bits, to allow objects contained in universal files to be larger than 4 GB. The change is very straightforward. There is a new magic number that differs by one bit, much like the 64-bit Mach-O files. Then there is a new structure that follows the fat_header that has the same layout but with the offset and size fields using 64-bit values instead of 32-bit values. rdar://26899493 llvm-svn: 273207	2016-06-20 22:16:18 +00:00
Vedant Kumar	0222adbcd2	[tsan] Do not instrument accesses to the gcov counters array There is a known intended race here. This is a follow-up to r264805, which disabled tsan instrumentation for updates to instrprof counters. For more background on this please see the discussion in D18164. llvm-svn: 273202	2016-06-20 21:24:26 +00:00
Sanjay Patel	9ad8fb68f7	[InstSimplify] analyze (optionally casted) icmps to eliminate obviously false logic (PR27869) By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question raised by PR27869: https://llvm.org/bugs/show_bug.cgi?id=27869 ...where InstCombine turns an icmp+zext into a shift causing us to miss the fold. Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp. Differential Revision: http://reviews.llvm.org/D21512 llvm-svn: 273200	2016-06-20 20:59:59 +00:00
Dehao Chen	071bb9d7af	Pass AssumptionCacheTracker from SampleProfileLoader to Inliner Summary: Inliner needs ACT when calling InlineFunction. Instead of nullptr, we need to pass it in from SampleProfileLoader Reviewers: davidxl Subscribers: eraman, vsk, danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D21205 llvm-svn: 273199	2016-06-20 20:53:40 +00:00
Matt Arsenault	802ebcb4bb	InstCombine: Don't strip convergent from intrinsic callsites Specific instances of intrinsic calls may want to be convergent, such as certain register reads but the intrinsic declaration is not. llvm-svn: 273188	2016-06-20 19:04:44 +00:00
Sanjay Patel	445d7ecf89	[InstCombine] consolidate some icmp+logic tests and improve checks llvm-svn: 273186	2016-06-20 18:40:37 +00:00
Matt Arsenault	2209625387	AMDGPU: Preserve undef flag on vcc when shrinking v_cndmask_b32 The implicit operand is added by the initial instruction construction, so this was adding an additional vcc use. The original one was missing the undef flag the original condition had, so the verifier would complain. llvm-svn: 273182	2016-06-20 18:34:00 +00:00
Matt Arsenault	b6d8c37e1a	AMDGPU: Fold more custom nodes to undef This will help sneak undefs past GVN into the DAG for some tests. Also add missing intrinsic for rsq_legacy, even though the node was already selected to the instruction. Also start passing the debug location to intrinsic errors. llvm-svn: 273181	2016-06-20 18:33:56 +00:00
Sanjay Patel	14dcb042bc	[InstCombine] update to use FileCheck with autogenerated exact checking llvm-svn: 273180	2016-06-20 18:23:40 +00:00
Matt Arsenault	ff98241f37	Generalize DiagnosticInfoStackSize to support other limits Backends may want to report errors on resources other than stack size. llvm-svn: 273177	2016-06-20 18:13:04 +00:00
Sanjay Patel	06918ad79e	[InstCombine] update to use FileCheck with autogenerated exact checking llvm-svn: 273173	2016-06-20 17:56:13 +00:00
Matt Arsenault	a9720c67f1	AMDGPU: Use correct method for determining instruction size llvm-svn: 273172	2016-06-20 17:51:32 +00:00
Sanjay Patel	a038240660	[InstCombine] regenerate checks llvm-svn: 273170	2016-06-20 17:48:48 +00:00
Rafael Espindola	959e9c8d01	Use shouldAssumeDSOLocal. With this, ARM fast isel knows that PIE variables are not preemptable. llvm-svn: 273169	2016-06-20 17:45:33 +00:00
Tom Stellard	5350894265	AMDGPU: Add support for R_AMDGPU_REL32 relocations Reviewers: arsenm, kzhuravl, rafael Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21401 llvm-svn: 273168	2016-06-20 17:33:43 +00:00
Tom Stellard	1c89eb7db0	AMDGPU: Emit R_AMDGPU_ABS32_{HI,LO} for scratch buffer relocations Reviewers: arsenm, rafael, kzhuravl Subscribers: rafael, arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21400 llvm-svn: 273166	2016-06-20 16:59:44 +00:00
Sam Parker	d616cf07b2	[ARM] Enable isel of UMAAL TargetLowering and DAGToDAG are used to combine ADDC, ADDE and UMLAL dags into UMAAL. Selection is split into the two phases because it is easier to match the two patterns at those different times. Differential Revision: http://reviews.llvm.org/D21461 llvm-svn: 273165	2016-06-20 16:47:09 +00:00
David Majnemer	c5601df9fd	Reapply "[LoopIdiom] Don't remove dead operands manually" This reverts commit r273160, reapplying r273132. RecursivelyDeleteTriviallyDeadInstructions cannot be called on a parentless Instruction. llvm-svn: 273162	2016-06-20 16:03:25 +00:00
Cong Liu	1c28b6d733	Revert "[LoopIdiom] Don't remove dead operands manually" This reverts commit r273132. Breaks multiple tests under /llvm/test:Transforms (e.g. llvm/test:Transforms/LoopIdiom/basic.ll.test) under asan. llvm-svn: 273160	2016-06-20 15:22:15 +00:00
Simon Pilgrim	0a81b13f31	[X86][F16C] Added half <-> double conversion tests llvm-svn: 273153	2016-06-20 12:51:55 +00:00
Pankaj Gode	0aab2e398a	[AARCH64] Add support for Broadcom Vulcan Adding core tuning support for new Broadcom Vulcan core (ARMv8.1A). Differential Revision: http://reviews.llvm.org/D21500 llvm-svn: 273148	2016-06-20 11:13:31 +00:00
Patrik Hagglund	7205215591	Fix for PR27940 After a store has been eliminated, when making sure that the instruction iterator points to a valid instruction, dbg intrinsics are now ignored as a new instruction. Patch by Henric Karlsson. Reviewed by Daniel Berlin. Differential Revision: http://reviews.llvm.org/D21076 llvm-svn: 273141	2016-06-20 09:10:10 +00:00
Igor Breger	e59165ca63	[AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering. Differential Revision: http://reviews.llvm.org/D20897 llvm-svn: 273138	2016-06-20 07:05:43 +00:00
David Majnemer	a705843f23	[LoopIdiom] Don't remove dead operands manually Removing dead instructions requires remembering which operands have already been removed. RecursivelyDeleteTriviallyDeadInstructions has this logic, don't partially reimplement it in LoopIdiomRecognize. This fixes PR28196. llvm-svn: 273132	2016-06-20 02:33:29 +00:00
Sanjay Patel	a4b052c7d1	[InstSimplify] add tests for PR27689; regenerate checks llvm-svn: 273128	2016-06-19 21:40:12 +00:00
Simon Pilgrim	0887d5b02e	[X86][AVX512] Added 512-bit BITREVERSE tests and enabled AVX512BW lowering support llvm-svn: 273125	2016-06-19 20:59:19 +00:00
Simon Pilgrim	3d881a0230	[X86][SSE] Allow target shuffle combining to match masks with SM_Sentinel values We currently only allow exact matches of shuffle mask patterns during target shuffle combining. This patch relaxes this to permit SM_SentinelUndef in the combined shuffle to always be accepted as well as allowing exact matching of the SM_SentinelZero value. I've adjusted some tests that were requiring exact shuffle masks to now include undef values. Differential Revision: http://reviews.llvm.org/D21495 llvm-svn: 273119	2016-06-19 18:03:52 +00:00
Chris Dewhurst	a294541c05	[SPARC] Correcting out-of-date unit tests checked in as part of r273108 llvm-svn: 273110	2016-06-19 12:52:39 +00:00
Chris Dewhurst	0c1e0026aa	[SPARC] Fixes for hardware errata on LEON processor. Passes to fix three hardware errata that appear on some LEON processor variants. The instructions FSMULD, FMULS and FDIVS do not work as expected on some LEON processors. This change allows those instructions to be replaced with alternative instruction sequences that are known to work. These passes only run when selected individually, or as part of a processor definition. They are not included in general SPARC processor compilations for non-LEON processors or for those LEON processors that do not have these hardware errata. llvm-svn: 273108	2016-06-19 11:03:28 +00:00
David Majnemer	3119599475	[LoadCombine] Combine Loads formed from GEPS with negative indexes Change the underlying offset and comparisons to use int64_t instead of uint64_t. Patch by River Riddle! Differential Revision: http://reviews.llvm.org/D21499 llvm-svn: 273105	2016-06-19 06:14:56 +00:00
Simon Pilgrim	9a09652a3a	[X86][AVX] Added test case for PR28136 llvm-svn: 273098	2016-06-18 22:59:08 +00:00
Simon Pilgrim	cd6d4352bc	[X86][SSSE3] Added examples of target shuffle combining failing to match undefs in shuffle masks llvm-svn: 273097	2016-06-18 21:18:21 +00:00
Simon Pilgrim	ab009e9f41	[X86][XOP] Added fast-isel tests matching tools/clang/test/CodeGen/xop-builtins.c llvm-svn: 273096	2016-06-18 21:07:31 +00:00
Simon Pilgrim	b201678763	[X86][TBM] Added fast-isel tests matching tools/clang/test/CodeGen/tbm-builtins.c llvm-svn: 273087	2016-06-18 17:20:52 +00:00
Vasileios Kalintiris	0cf68df6cc	[mips] Emit a JALR with $rd equal to $zero, instead of a JR in MIPS32R6. Summary: JR is an alias of JALR with $rd=0 in the R6 ISA. Also, this fixes recursive builds in MIPS32R6. Reviewers: dsanders, sdardis Subscribers: jfb, dschuff, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21370 llvm-svn: 273085	2016-06-18 15:39:43 +00:00
Amjad Aboud	76c9eb99a7	[codeview] Emit non-virtual method type. Differential Revision: http://reviews.llvm.org/D21011 llvm-svn: 273084	2016-06-18 10:25:07 +00:00
Marcin Koscielnicki	3feda222c6	[sanitizers] Disable target-specific lowering of string functions. CodeGen has hooks that allow targets to emit specialized code instead of calls to memcmp, memchr, strcpy, stpcpy, strcmp, strlen, strnlen. When ASan/MSan/TSan/ESan is in use, this sidesteps its interceptors, resulting in uninstrumented memory accesses. To avoid that, make these sanitizers mark the calls as nobuiltin. Differential Revision: http://reviews.llvm.org/D19781 llvm-svn: 273083	2016-06-18 10:10:37 +00:00
Matt Arsenault	a466a7cf62	Add looping testcase that broke in r272987 llvm-svn: 273081	2016-06-18 05:15:58 +00:00
Matt Arsenault	e935f05a94	AMDGPU: Fix kernel argument alignment impacting stack size Don't use AllocateStack because kernel arguments have nothing to do with the stack. The ensureMaxAlignment call was still changing the stack alignment. llvm-svn: 273080	2016-06-18 05:15:53 +00:00
Sanjoy Das	e8fd9561cb	[SCEV] Fix incorrect trip count computation The way we elide max expressions when computing trip counts is incorrect -- it breaks cases like this: ``` static int wrapping_add(int a, int b) { return (int)((unsigned)a + (unsigned)b); } void test() { volatile int end_buf = 2147483548; // INT_MIN - 100 int end = end_buf; unsigned counter = 0; for (int start = wrapping_add(end, 200); start < end; start++) counter++; print(counter); } ``` Note: the `NoWrap` variable that was being tested has little to do with the values flowing into the max expression; it is a property of the induction variable. test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test functionality I'm reverting in this change, so I've deleted the test fully. llvm-svn: 273079	2016-06-18 04:38:31 +00:00
Simon Pilgrim	f4b2af1b9f	[X86][SSE4A] Autoupgrade and remove MOVNTSD/MOVNTSS intrinsics Required better annotation of the instruction defs upon removal of the builtin intrinsic pattern. llvm-svn: 273077	2016-06-18 02:38:26 +00:00
Rafael Espindola	88bb8ce821	Add a test for r273022. llvm-svn: 273073	2016-06-18 00:24:49 +00:00
Matt Arsenault	8fd5978811	Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""" This seems to be causing an infinite loop / crash in instcombine on some bots. llvm-svn: 273069	2016-06-17 23:36:38 +00:00
Tom Stellard	f8db61c5f0	Support/ELF: Add AMDGPU relocation definitions to match documentation Reviewers: arsenm, kzhuravl, rafael Subscribers: llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21443 llvm-svn: 273066	2016-06-17 22:38:08 +00:00
Adam Nemet	a9f09c6245	[LAA] Enable symbolic stride speculation for all LAA clients This is a functional change for LLE and LDist. The other clients (LV, LVerLICM) already had this explicitly enabled. The temporary boolean parameter to LAA is removed that allowed turning off speculation of symbolic strides. This makes LAA's caching interface LAA::getInfo only take the loop as the parameter. This makes the interface more friendly to the new Pass Manager. The flag -enable-mem-access-versioning is moved from LV to a LAA which now allows turning off speculation globally. llvm-svn: 273064	2016-06-17 22:35:41 +00:00
Matt Arsenault	0bb294b224	AMDGPU: Temporarily select trap to s_endpgm This should select to s_trap, but that requires additional work to set up and enable the trap handler. For now emit s_endpgm so bugpoint stops getting stuck on the unsupported call to abort. Emit a warning that this will only terminate the wave and not really trap. llvm-svn: 273062	2016-06-17 22:27:03 +00:00
Kevin Enderby	ae108ffb9a	Add support for Darwin’s static library table of contents with 64-bit offsets to the archive members. Darwin added support in its Xcode 8.0 tools (released in the beta) for static library table of contents with 64-bit offsets to the archive members. The change is very straightforward. The table of contents member is named ___.SYMDEF_64 or "___.SYMDEF_64 SORTED" and the same layout is used but with fields using 64-bit values instead of 32-bit values. rdar://26869808 llvm-svn: 273058	2016-06-17 22:16:06 +00:00
Reid Kleckner	6fa1546ad9	[codeview] Emit incomplete member pointer types with the unknown model An incomplete member pointer type will always have a size of zero, so we don't need an extra flag. Credit to David Majnemer for the idea. llvm-svn: 273057	2016-06-17 22:14:39 +00:00
Reid Kleckner	604105bb90	[codeview] Add DIFlags for pointer to member representations Summary: This seems like the least intrusive way to pass this information through. Fixes PR28151 Reviewers: majnemer, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21444 llvm-svn: 273053	2016-06-17 21:31:33 +00:00
Matt Arsenault	8885910f8e	AMDGPU: Remove llvm.SI.tid intrinsic Mesa doesn't emit this for llvm >= 3.8 anymore. llvm-svn: 273050	2016-06-17 21:18:41 +00:00
Matt Arsenault	d76efc14b9	Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."" Reapply r272987. Condition should be in terms of the destination type, and the flags should not be copied. llvm-svn: 273045	2016-06-17 20:33:53 +00:00
Marcin Koscielnicki	fd4b6b9e51	[SelectionDAG] Don't treat library calls specially if marked with nobuiltin. To be used by D19781. Differential Revision: http://reviews.llvm.org/D19801 llvm-svn: 273039	2016-06-17 20:24:07 +00:00
Michael Kuperstein	18d6d3d95e	[X86] Add missing AVX512 anyext patterns. Add AVX512 anyext patterns for i16 and i64, modeled on the existing i8 and i32 patterns. llvm-svn: 273038	2016-06-17 20:21:17 +00:00
Davide Italiano	b49aa5c0c4	[PM] Port MergedLoadStoreMotion to the new pass manager, take two. This is indeed a much cleaner approach (thanks to Daniel Berlin for pointing out), and also David/Sean for review. Differential Revision: http://reviews.llvm.org/D21454 llvm-svn: 273032	2016-06-17 19:10:09 +00:00
Tim Northover	28a9e7f4ba	ARM: take account of possible bundle when erasing an instruction. Fortunately this appears to be the only ARM-specific pass that runs while bundles might be in play, so no other cases need modifying. llvm-svn: 273029	2016-06-17 18:40:46 +00:00
Davide Italiano	16bfa13a77	[IRObjectFile] Handle .weak in RecordStreamer. Differential Revision: http://reviews.llvm.org/D21476 llvm-svn: 273027	2016-06-17 18:20:14 +00:00
James Y Knight	148a6469dc	Support expanding partial-word cmpxchg to full-word cmpxchg in AtomicExpandPass. Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1 or 2-byte. For those, you need to mask and shift the 1 or 2 byte values appropriately to use the 4-byte instruction. This change adds support for cmpxchg-based instruction sets (only SPARC, in LLVM). The support can be extended for LL/SC-based PPC and MIPS in the future, supplanting the ISel expansions those architectures currently use. Tests added for the IR transform and SPARCv9. Differential Revision: http://reviews.llvm.org/D21029 llvm-svn: 273025	2016-06-17 18:11:48 +00:00
Rafael Espindola	9f86baebe0	Change RelaxELFRelocations for llc. As a developer tool it makes sense for it to use the new relocations. llvm-svn: 273019	2016-06-17 17:43:41 +00:00
Rafael Espindola	e021e85166	Change the default of -relax-relocations. llvm-mc is a developer tool, as such it makes sense for it to use new features by default. This doesn't change the user facing clang, which still defaults to non relaxable relocations. llvm-svn: 273014	2016-06-17 17:04:56 +00:00
Sanjay Patel	216d8cf720	[InstCombine] allow more than one use for vector bitcast folding with selects The motivating example for this transform is similar to D20774 where bitcasts interfere with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to produce min and max ops: define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) { %cmp = fcmp olt <4 x float> %a, %b %bc1 = bitcast <4 x float> %a to <4 x i32> %bc2 = bitcast <4 x float> %b to <4 x i32> %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2 %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1 %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>* store <4 x i32> %sel1, <4 x i32>* %bc3 %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>* store <4 x i32> %sel2, <4 x i32>* %bc4 ret void } With this patch, we move the selects up to use the input args which allows getting rid of all of the bitcasts: define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) { %cmp = fcmp olt <4 x float> %a, %b %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16 store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16 ret void } The asm for x86 SSE then improves from: movaps %xmm0, %xmm2 cmpltps %xmm1, %xmm2 movaps %xmm2, %xmm3 andnps %xmm1, %xmm3 movaps %xmm2, %xmm4 andnps %xmm0, %xmm4 andps %xmm2, %xmm0 orps %xmm3, %xmm0 andps %xmm1, %xmm2 orps %xmm4, %xmm2 movaps %xmm0, (%rdi) movaps %xmm2, (%rsi) To: movaps %xmm0, %xmm2 minps %xmm1, %xmm2 maxps %xmm0, %xmm1 movaps %xmm2, (%rdi) movaps %xmm1, (%rsi) The TODO comments show that we're limiting this transform only to vectors and only to bitcasts because we need to improve other transforms or risk creating worse codegen. Differential Revision: http://reviews.llvm.org/D21190 llvm-svn: 273011	2016-06-17 16:46:50 +00:00
David Majnemer	da9548f949	[CodeView] Refactor enumerator emission This addresses Amjad's review comments on D21442. llvm-svn: 273010	2016-06-17 16:13:21 +00:00
Reid Kleckner	ac945e27dd	[codeview] Make function names more consistent with MSVC Names in function id records don't include nested name specifiers or template arguments, but names in the symbol stream include both. For the symbol stream, instead of having Clang put the fully qualified name in the subprogram display name, recreate it from the subprogram scope chain. For the type stream, take the unqualified name and chop off any template arguments. This makes it so that CodeView DI metadata is more similar to DWARF DI metadata. llvm-svn: 273009	2016-06-17 16:11:20 +00:00
Nirav Dave	fd91041ce1	Refactor and cleanup Assembly Parsing / Lexing Recommitting after fixing non-atomic insert to front of SmallVector in MCAsmLexer.h Add explicit Comment Token in Assembly Lexing for future support for outputting explicit comments from inline assembly. As part of this, CPPHash Directives are now explicitly distinguished from Hash line comments in Lexer. Line comments are recorded as EndOfStatement tokens, not Comment tokens to simplify compatibility with current TargetParsers. This slightly complicates comment output. This removes all lexing tasks from the parser, does minor cleanup to remove extraneous newlines in Asm output, and improves whitespace handling. Reviewers: rtrieu, dwmw2, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20009 llvm-svn: 273007	2016-06-17 16:06:17 +00:00
Simon Pilgrim	6a35e5ab97	[X86][SSE4A] Remove the GCCBuiltins from the movntsd/movntss intrinsic defs so we can emit native IR from clang. Clang-side sibling commit to follow. llvm-svn: 273002	2016-06-17 14:27:38 +00:00
Adam Nemet	e7709d92ba	[LLE] Don't hard-code the name of the preheader in test Turns out I didn't get this right because symbolic stride versioning changes the name. Relax the matching. llvm-svn: 272992	2016-06-17 09:13:15 +00:00
Matt Arsenault	ce56f7bbaa	Revert "InstCombine: Reduce trunc (shl x, K) width." This reverts commit r272987. This might be causing crashes on some bots. llvm-svn: 272990	2016-06-17 06:28:53 +00:00
Qin Zhao	bb4496f8c8	[esan\|cfrag] Add the struct field size array in StructInfo Summary: Adds the struct field size array in struct StructInfo. Updates test struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits Differential Revision: http://reviews.llvm.org/D21341 llvm-svn: 272989	2016-06-17 04:50:20 +00:00
Matt Arsenault	028fd50642	InstCombine: Reduce trunc (shl x, K) width. llvm-svn: 272987	2016-06-17 04:43:22 +00:00
Ranjeet Singh	39d2d097d6	[ARM] Add support for mrrc/mrrc2 intrinsics. Reapplying patch as it was reverted when it was first committed because of an assertion failure when the mrrc2 intrinsic was called in ARM mode. The failure was happening because the instruction was being built in ARMISelDAGToDAG.cpp and the tablegen description for mrrc2 instruction doesn't allow you to use a predicate. The ARM architecture manuals do say that mrrc2 in ARM mode can be predicated with AL in assembly but this has no effect on the encoding of the instruction as the top 4 bits will always be 1111 not 1110 which is the encoding for the condition AL. Differential Revision: http://reviews.llvm.org/D21408 llvm-svn: 272982	2016-06-17 00:52:41 +00:00
Evgeniy Stepanov	45fa0fd758	[safestack] Sink unsafe address computation to each use. This is a fix for PR27844. When replacing uses of unsafe allocas, emit the new location immediately after each use. Without this, the pointer stays live from the function entry to the last use, while it's usually cheaper to recalculate. llvm-svn: 272969	2016-06-16 22:34:04 +00:00
Evgeniy Stepanov	72d961a1da	[safestack] Fixup llvm.dbg.value when rewriting unsafe allocas. When moving unsafe allocas to the unsafe stack, dbg.declare intrinsics are updated to refer to the new location. This change does the same to dbg.value intrinsics. llvm-svn: 272968	2016-06-16 22:34:00 +00:00
David Majnemer	979cb88870	[CodeView] Implement support for enums MSVC handles enums differently from structs and classes: a forward declaration is not emitted unconditionally. MSVC does not emit an S_UDT record for the enum. Differential Revision: http://reviews.llvm.org/D21442 llvm-svn: 272960	2016-06-16 21:32:16 +00:00
Nirav Dave	280ecf6ff0	Revert "Refactor and cleanup Assembly Parsing / Lexing" Reverting for unexpected crashes on various platforms. This reverts commit r272953. llvm-svn: 272957	2016-06-16 21:19:23 +00:00
Sanjoy Das	07c6521aed	[EarlyCSE] Fold invariant loads Redundant invariant loads can be CSE'ed with very little extra effort over what early-cse already tracks, so it looks reasonable to make early-cse handle this case. llvm-svn: 272954	2016-06-16 20:47:57 +00:00
Nirav Dave	c19c3260df	Refactor and cleanup Assembly Parsing / Lexing Add explicit Comment Token in Assembly Lexing for future support for outputting explicit comments from inline assembly. As part of this, CPPHash Directives are now explicitly distinguished from Hash line comments in Lexer. Line comments are recorded as EndOfStatement tokens, not Comment tokens to simplify compatibility with current TargetParsers. This slightly complicates comment output. This removes all lexing tasks from the parser, does minor cleanup to remove extraneous newlines in Asm output, and improves whitespace handling. Reviewers: rtrieu, dwmw2, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20009 llvm-svn: 272953	2016-06-16 20:34:22 +00:00
Justin Lebar	b0bd07aff7	Fix strip-dead-debug-info test if path contains "bar". This test checks that the string 'bar' (no quotes) doesn't exist in the output after running opt. But opt embeds the absolute path to the filename, and on my machine, the filename contains the string 'jlebar', causing the test to fail. This patch changes the test to look for the string '"bar"' instead. llvm-svn: 272941	2016-06-16 19:39:55 +00:00
Paul Robinson	4ce6a93226	Make check lines not match themselves. Noticed during review of the recent FileCheck change. llvm-svn: 272940	2016-06-16 19:38:48 +00:00
Sanjay Patel	0e9afea3c8	[x86] autoupgrade and remove AVX2 integer min/max intrinsics This will (hopefully very temporarily) break clang. The clang side of this should be the next commit. llvm-svn: 272932	2016-06-16 18:44:20 +00:00
Rafael Espindola	5a07687a8e	dos2unix this test. NFC. llvm-svn: 272928	2016-06-16 18:21:11 +00:00
Davide Italiano	41315f7873	[PM] Revert the port of MergeLoadStoreMotion to the new pass manager. Daniel Berlin expressed some real concerns about the port and proposed an alternative approach. I'll revert this for now while working on a new patch, which I hope to put up for review shortly. Sorry for the churn. llvm-svn: 272925	2016-06-16 17:40:53 +00:00
Sanjay Patel	d09a21682f	remove old FileCheck lines that are no longer used llvm-svn: 272921	2016-06-16 17:04:16 +00:00
Sanjay Patel	f664f3a578	[DAG] Remove redundant FMUL in Newton-Raphson SQRT code When calculating a square root using Newton-Raphson with two constants, a naive implementation is to use five multiplications (four muls to calculate reciprocal square root and another one to calculate the square root itself). However, after some reassociation and CSE the same result can be obtained with only four multiplications. Unfortunately, there's no reliable way to do such a reassociation in the back-end. So, the patch modifies NR code itself so that it directly builds optimal code for SQRT and doesn't rely on any further reassociation. Patch by Nikolai Bozhenov! Differential Revision: http://reviews.llvm.org/D21127 llvm-svn: 272920	2016-06-16 16:58:54 +00:00
Adam Nemet	776346848a	[LLE] New test to check that no versioning for symbolic strides occurs. NFC This is currently only performed in the Vectorizer. I will change this as symbolic stride collection is moved to LAA. This test will track when the actual functional change occurs. llvm-svn: 272918	2016-06-16 16:45:29 +00:00
Igor Laevsky	87f0d0e185	Revert r272891 "[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo" It was causing failures in Profile-i386 and Profile-x86_64 tests. llvm-svn: 272912	2016-06-16 16:25:53 +00:00
Reid Kleckner	0166a71386	[PATCH] Fix RuntimeDyldCOFFI386 to handle relocations with a non-zero addend This fixes IMAGE_REL_I386_DIR32, IMAGE_REL_I386_DIR32NB, IMAGE_REL_I386_SECREL, and IMAGE_REL_I386_REL32 relocations. Based on patch by Jon Turney <jon.turney@dronecode.org.uk> llvm-svn: 272911	2016-06-16 16:21:41 +00:00
Rafael Espindola	afade35003	Don't print (PLT) on arm. The R_ARM_PLT32 relocation is deprecated and is not produced by MC. This means that the code being deleted is dead from the .o point of view and was making the .s more confusing. llvm-svn: 272909	2016-06-16 16:09:53 +00:00
Sanjay Patel	51ab757941	[x86] autoupgrade and remove SSE2/SSE41 integer min/max intrinsics Follow-up to: http://reviews.llvm.org/rL272806 http://reviews.llvm.org/rL272807 llvm-svn: 272907	2016-06-16 15:48:30 +00:00
Daniel Sanders	8e3c74210f	Remove redundant -mattr options from llvm-objdump commands. The -mattr options in these four tests have no effect on the output of llvm-objdump. In the case of the two Mips tests, removing the -mattr option left duplicate RUN lines so the duplicates have been removed. llvm-svn: 272906	2016-06-16 15:47:19 +00:00
Igor Laevsky	c9179fd2c2	[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise we will get dangling pointer inside BranchProbabilityInfo cache. Differential Revision: http://reviews.llvm.org/D20957 llvm-svn: 272891	2016-06-16 13:28:25 +00:00
Patrik Hagglund	0acaefaf9d	PR27938: Don't remove valid DebugLoc in Scalarizer Added checks to make sure that Scalarizer::transferMetadata() doesn't remove valid debug locations from instructions. This is important as the verifier pass requires that e.g. inlinable callsites have a valid debug location. https://llvm.org/bugs/show_bug.cgi?id=27938 Patch by Karl-Johan Karlsson Reviewers: dblaikie Differential Revision: http://reviews.llvm.org/D20807 llvm-svn: 272884	2016-06-16 10:48:54 +00:00
Daniel Sanders	de7816b0cd	[mips][mips16] Fix machine verifier errors about incorrect register classes on load/stores. Summary: [ls][bh] and [ls][bh]u cannot use sp-relative addresses and must therefore lower frameindex nodes such that there is a copy to a CPU16Regs register. This is now done consistently using a separate addressing mode that does not permit frameindex nodes. As part of this I've had to remove an optimization that reduced the number of instructions needed to work around the lack of sp-relative addresses on [ls][bh] and [ls][bh]u. This optimization used one of the eight CPU16Regs registers as a copy of the stack pointer and its implementation was the root cause of many of the register vs register class mismatches. lw/sw can use sp-relative addresses but we ought to ensure that we use the correct version of lw/sw internally for things like IAS. This is not currently the case and this change does not fix this. However, this change does clean it up sufficiently well to fix the machine verifier failures. Also removed irrelevant functions from stchar.ll. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21062 llvm-svn: 272882	2016-06-16 10:20:59 +00:00
Daniel Sanders	1d14864bb3	[llvm-objdump] Support detection of feature bits from the object and implement this for Mips. Summary: The Mips implementation only covers the feature bits described by the ELF e_flags so far. Mips stores additional feature bits such as MSA in the .MIPS.abiflags section. Also fixed a small bug this revealed where microMIPS wouldn't add the EF_MIPS_MICROMIPS flag when using -filetype=obj. Reviewers: echristo, rafael Subscribers: rafael, mehdi_amini, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21125 llvm-svn: 272880	2016-06-16 09:17:03 +00:00
Hrvoje Varga	f1e0a03d08	[mips][micromips] Implement DCLO, DCLZ, DROTR, DROTR32 and DROTRV instructions Differential Revision: http://reviews.llvm.org/D16917 llvm-svn: 272876	2016-06-16 07:06:25 +00:00
Chuang-Yu Cheng	dbe00d51b4	SimplifyCFG is able to detect the pattern: (i == 5334 \|\| i == 5335) to: ((i & -2) == 5334) This transformation has some incorrect side conditions. Specifically, the transformation is only applied when the right-hand side constant (5334 in the example) is a power of two not equal and not equal to the negated mask. These side conditions were added in r258904 to fix PR26323. The correct side condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334]. It's a little bit hard to see why these transformations are correct and what the side conditions ought to be. Here is a CVC3 program to verify them for 64-bit values: ONE : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63); x : BITVECTOR(64); y : BITVECTOR(64); z : BITVECTOR(64); mask : BITVECTOR(64) = BVSHL(ONE, z); QUERY( (y & ~mask = y) => ((x & ~mask = y) <=> (x = y OR x = (y \| mask))) ); Please note that each pattern must be a dual implication (<--> or iff). One directional implication can create spurious matches. If the implication is only one-way, an unsatisfiable condition on the left side can imply a satisfiable condition on the right side. Dual implication ensures that satisfiable conditions are transformed to other satisfiable conditions and unsatisfiable conditions are transformed to other unsatisfiable conditions. Here is a concrete example of an unsatisfiable condition on the left implying a satisfiable condition on the right: mask = (1 << z) (x & ~mask) == y --> (x == y \|\| x == (y \| mask)) Substituting y = 3, z = 0 yields: (x & -2) == 3 --> (x == 3 \|\| x == 2) The version of this code before r258904 had no side-conditions and incorrectly justified itself in comments through one-directional implication. Thanks to Chandler for the suggestion! Author: Thomas Jablin (tjablin) Reviewers: chandlerc majnemer hfinkel cycheng http://reviews.llvm.org/D21417 llvm-svn: 272873	2016-06-16 04:44:25 +00:00
Eli Friedman	bd254a6f45	[InstCombine] Don't widen metadata on store-to-load forwarding The original check for load CSE or store-to-load forwarding is wrong when the forwarded stored value happened to be a load. Ref https://github.com/JuliaLang/julia/issues/16894 Differential Revision: http://reviews.llvm.org/D21271 Patch by Yichao Yu! llvm-svn: 272868	2016-06-16 02:33:42 +00:00
Tim Northover	daa1c018b0	AArch64: allow MOV (imm) alias to be printed The backend has been around for years, it's pretty ridiculous that we can't even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen can't handle the complex predicates when printing so it's a bunch of nasty C++. Oh well. llvm-svn: 272865	2016-06-16 01:42:25 +00:00
Reid Kleckner	95e8af7395	[codeview] Regenerate test case with unique identifiers Clang now emits these, and these match MSVC. Should allow more powerful merging of type records across TUs. llvm-svn: 272864	2016-06-16 01:33:59 +00:00
Matt Arsenault	191763026c	AMDGPU: Disable scheduling in some slow tests Disabling the pre-RA scheduler on large-work-group-registers causes it to be ~50% slower. llvm-svn: 272860	2016-06-16 00:56:47 +00:00
Xinliang David Li	1eaecefaf9	[PM] Port Add discriminator pass to new PM llvm-svn: 272847	2016-06-15 21:51:30 +00:00
Rui Ueyama	5dbea9db10	[Codeview] Add a class for LF_UDT_MOD_SRC_LINE. Differential Revision: http://reviews.llvm.org/D21406 llvm-svn: 272843	2016-06-15 21:25:29 +00:00
Sanjay Patel	74b40bdb53	[x86, SSE] update packed FP compare tests for direct translation from builtin to IR The clang side of this was r272840: http://reviews.llvm.org/rL272840 A follow-up step would be to auto-upgrade and remove these LLVM intrinsics completely. Differential Revision: http://reviews.llvm.org/D21269 llvm-svn: 272841	2016-06-15 21:22:15 +00:00
Kevin Enderby	d8a6e83dcf	Fix llvm-objdump when disassembling a stripped Mach-O binary with the -macho option. It was printing out nothing in this case. llvm-objdump tries to disassemble sections a symbol at a time. In the case of a fully stripped Mach-O executable the only symbol remaining in the (__TEXT,__text) section is the special linker defined symbol __mh_execute_header. This symbol is special in that while it is an N_SECT symbol in the (__TEXT,__text) section, its address is before the start of the (__TEXT,__text) section. Its address is the start of the __TEXT segment which is where the mach header is statically linked. So the code in DisassembleMachO() needs to deal with this case specially. rdar://26778273 llvm-svn: 272837	2016-06-15 21:14:01 +00:00
Sanjay Patel	0b526676ab	[x86] delete unnecessary function declarations Missed this in r272806, r272807. llvm-svn: 272834	2016-06-15 20:51:47 +00:00
Tim Northover	389a1e39ea	AArch64: stop trying to use 32-bit MOVZs when expanding patchpoints. Of course the assembly was right but because the opcode was MOVZWi it was encoded as "movz w16, #65535, lsl #32" which is an unallocated encoding and would go horribly wrong on a CPU. No idea how this bug survived this long. It seems nobody is using that aspect of patchpoints. llvm-svn: 272831	2016-06-15 20:33:36 +00:00
Sanjay Patel	1a4569df54	[x86] add folds for x86 vector compare nodes (PR27924) Ideally, we can get rid of most x86 LLVM intrinsics by transforming them to IR (and some of that happened with http://reviews.llvm.org/rL272807), but it doesn't cost much to have some simple folds in the backend too while we're working on that and as a backstop. This fixes: https://llvm.org/bugs/show_bug.cgi?id=27924 Differential Revision: http://reviews.llvm.org/D21356 llvm-svn: 272828	2016-06-15 20:26:58 +00:00
Matthias Braun	98ea88be42	Statistic: Add machine parseable json output - We lacked a short unique identifier for a statistic, so I renamed the current "Name" field that just contained the DEBUG_TYPE name of the current file to DebugType and added a new "Name" field that contains the C++ identifier of the statistic variable. - Add the -stats-json option which outputs statistics in json format. Differential Revision: http://reviews.llvm.org/D20995 llvm-svn: 272826	2016-06-15 20:19:16 +00:00
Kevin B. Smith	acbda9ef30	[X86]: Updated r272801 to promote 16 bit compares with immediate operand to 32 bits. This is in response to a comment by Eli Friedman. llvm-svn: 272814	2016-06-15 18:18:05 +00:00
David Majnemer	3128b10cdc	[CodeView] Add support for emitting S_UDT for typedefs Emit a S_UDT record for typedefs. We still need to do something for class types. Differential Revision: http://reviews.llvm.org/D21149 llvm-svn: 272813	2016-06-15 18:00:01 +00:00
Sanjay Patel	a6c6f09967	[x86, SSE] remove the GCCBuiltins from the integer min/max intrinsics This allows us to emit native IR in Clang (next commit). Also, update the intrinsic tests to show that codegen already knows how to handle the IR that Clang will soon produce. llvm-svn: 272806	2016-06-15 17:17:27 +00:00
David Majnemer	b62692e2e0	[TargetLibraryInfo] Teach isValidProtoForLibFunc about tan We would fail to validate the type of the tan function which would cause downstream users of isValidProtoForLibFunc to assert. This fixes PR28143. llvm-svn: 272802	2016-06-15 16:47:23 +00:00
Kevin B. Smith	54566a0e9a	[X86]: Quit promoting 8 and 16 bit compares to 32 bit. Differential Revision: http://reviews.llvm.org/D21144 llvm-svn: 272801	2016-06-15 16:37:46 +00:00
Nirav Dave	194cb55f37	Revert "Preserve DebugInfo when replacing values in DAGCombiner" Reverting due to assertion failure in lib/CodeGen/SelectionDAG/InstrEmitter.cpp This reverts commit r272792. llvm-svn: 272799	2016-06-15 16:08:50 +00:00
Kevin B. Smith	c3c82cdbd0	[X86]: Improve Liveness checking for X86FixupBWInsts.cpp Differential Revision: http://reviews.llvm.org/D21085 llvm-svn: 272797	2016-06-15 16:03:06 +00:00
Nirav Dave	a72e308403	Preserve DebugInfo when replacing values in DAGCombiner [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 272792	2016-06-15 14:50:08 +00:00
Ranjeet Singh	0db7be886e	Reverting r272778 because there's an assertion failure when running the test CodeGen/ARM/intrinsics-coprocessor.ll llvm-svn: 272791	2016-06-15 14:23:29 +00:00
Simon Dardis	7bdf183ac1	[mips] Missing test case Add missing testcase from r272666. llvm-svn: 272784	2016-06-15 13:49:58 +00:00
Ranjeet Singh	351364fe76	[ARM] Add support for mrrc/mrrc2 intrinsics. Differential Revision: http://reviews.llvm.org/D21178 llvm-svn: 272778	2016-06-15 11:32:24 +00:00
Daniel Sanders	df3185d2ea	[mips] Removed invalid test from o32_cc.ll MIPS32R1 cannot implement a 64-bit FPU because this was introduced in MIPS32R2. llvm-svn: 272769	2016-06-15 09:47:27 +00:00
Sean Silva	e0a9e66040	[PM] Port SLPVectorizer to the new PM This uses the "runImpl" approach to share code with the old PM. Porting to the new PM meant abandoning the anonymous namespace enclosing most of SLPVectorizer.cpp which is a bit of a bummer (but not a big deal compared to having to pull the pass class into a header which the new PM requires since it calls the constructor directly). llvm-svn: 272766	2016-06-15 08:43:40 +00:00
Daniel Sanders	d3bb20821d	[mips][msa] Fix register/register-class mismatches in emitINSERT_DF_VIDX(). Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21068 llvm-svn: 272765	2016-06-15 08:43:23 +00:00
Zlatko Buljan	d2ed9c6c2c	[mips][microMIPS] Add CodeGen support for AND, OR16, OR, XOR*, NOT16 and NOR instructions Differential Revision: http://reviews.llvm.org/D16719 llvm-svn: 272764	2016-06-15 07:46:24 +00:00
Igor Breger	64cfd3a442	[AVX512] Fix BLENDM lowering patterns. Operands should be swapped to match SELECT behavior. Use BLENDM instead of masked move instruction. Differential Revision: http://reviews.llvm.org/D21001 llvm-svn: 272763	2016-06-15 07:30:38 +00:00
Nicolai Haehnle	a609259832	AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsics Summary: This fixes two related bugs. First, the generic optimization passes unfortunately generate negative constant offsets but the hardware treats SOffset as an unsigned value. Second, there is a hardware bug on SI and CI, where address clamping in MUBUF instructions does not work correctly when SOffset is larger than the buffer size. This patch works around this bug by never using SOffset. An alternative workaround would be to do the clamping manually when SOffset is too large, but generating the required code sequence during instruction selection would be rather involved, and in any case the resulting code would probably be worse. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21326 llvm-svn: 272761	2016-06-15 07:13:05 +00:00
Sean Silva	a4c2d150d0	[PM] Port AlignmentFromAssumptions to the new PM. This uses the "runImpl" pattern to share code between the old and new PM. llvm-svn: 272757	2016-06-15 06:18:01 +00:00
Sanjoy Das	0272be206a	Don't force SP-relative addressing for statepoints Summary: ... when the offset is not statically known. Prioritize addresses relative to the stack pointer in the stackmap, but fall back gracefully to other modes of addressing if the offset to the stack pointer is not a known constant. Patch by Oscar Blumberg! Reviewers: sanjoy Subscribers: llvm-commits, majnemer, rnk, sanjoy, thanm Differential Revision: http://reviews.llvm.org/D21259 llvm-svn: 272756	2016-06-15 05:35:14 +00:00
Amaury Sechet	a65a237805	Add support for callsite in the new C API for attributes Summary: The second consumer of attributes. Reviewers: Wallbraker, whitequark, echristo, rafael, jyknight Subscribers: mehdi_amini Differential Revision: http://reviews.llvm.org/D21266 llvm-svn: 272754	2016-06-15 05:14:29 +00:00
Tom Stellard	82785e9fe7	AMDGPU/SI: Correctly encode constant expressions Summary: When we have an MCConstantExpr, we can encode it directly into the instruction instead of emitting fixups. Reviewers: artem.tamazov, vpykhtin, SamWot, nhaustov, arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21236 Change-Id: I88b3edf288d48e65c5d705fc4850d281f8e36948 llvm-svn: 272750	2016-06-15 03:09:39 +00:00
Tom Stellard	89049702ce	AMDGPU/AsmParser: Add support for parsing symbol operands Summary: We can now reference symbols directly in operands, like this: s_mov_b32 s0, global Reviewers: artem.tamazov, vpykhtin, SamWot, nhaustov Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21038 llvm-svn: 272748	2016-06-15 02:54:14 +00:00
Michael Kuperstein	3277a05fcf	Recommit [LV] Enable vectorization of loops where the IV has an external use r272715 broke libcxx because it did not correctly handle cases where the last iteration of one IV is the second-to-last iteration of another. Original commit message: Vectorizing loops with "escaping" IVs has been disabled since r190790, due to PR17179. This re-enables it, with support for external use of both "post-increment" (last iteration) and "pre-increment" (second-to-last iteration) IVs. llvm-svn: 272742	2016-06-15 00:35:26 +00:00
David Majnemer	4a697c312f	[LoopUnroll] Don't crash trying to unroll loop with EH pad exit We do not support splitting cleanuppad or catchswitches. This is problematic for passes which assume that a loop is in loop simplify form (the loop would have a dedicated exit block instead of sharing it). While it isn't great that we don't support this for cleanups, we still cannot make loop-simplify form an assertable precondition because indirectbr will also disable these sorts of CFG cleanups. This fixes PR28132. llvm-svn: 272739	2016-06-15 00:19:56 +00:00
David Majnemer	577be0fed3	[CodeView] Don't emit debuginfo for imported symbols Emitting symbol information requires us to have a definition for the symbol. A symbol reference is insufficient. This fixes PR28123. llvm-svn: 272738	2016-06-15 00:19:52 +00:00
David Majnemer	cbf614a93b	Remove the ScalarReplAggregates pass Nearly all the changes to this pass have been done while maintaining and updating other parts of LLVM. LLVM has had another pass, SROA, which has superseded ScalarReplAggregates for quite some time. Differential Revision: http://reviews.llvm.org/D21316 llvm-svn: 272737	2016-06-15 00:19:09 +00:00
Matt Arsenault	f42c69206d	AMDGPU: Run pointer optimization passes llvm-svn: 272736	2016-06-15 00:11:01 +00:00
Peter Collingbourne	6dbee00d67	Verifier: check that functions have at most a single !prof attachment. llvm-svn: 272734	2016-06-14 23:13:15 +00:00
Xinliang David Li	8052238ac0	Fix a test case to match its intention llvm-svn: 272733	2016-06-14 23:05:46 +00:00
Michael Kuperstein	d4bd3ab5fe	Reverting r272715 since it broke libcxx. llvm-svn: 272730	2016-06-14 22:30:41 +00:00
Dehao Chen	9f2bdfb40f	Set machine block placement hot prob threshold for both static and runtime profile. Summary: With a runtime profile, we have more confidence in branch probability, so during basic block layout we set a lower hot prob threshold so that blocks can be laid out optimally. Reviewers: djasper, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20991 llvm-svn: 272729	2016-06-14 22:27:17 +00:00
Davide Italiano	d737dd2ec6	[PM] Port WholeProgramDevirt to the new pass manager. llvm-svn: 272721	2016-06-14 21:44:19 +00:00
Michael Kuperstein	23b6d6adc9	[LV] Enable vectorization of loops where the IV has an external use Vectorizing loops with "escaping" IVs has been disabled since r190790, due to PR17179. This re-enables it, with support for external use of both "post-increment" (last iteration) and "pre-increment" (second-to-last iteration) IVs. Differential Revision: http://reviews.llvm.org/D21048 llvm-svn: 272715	2016-06-14 21:27:27 +00:00
Sanjay Patel	4c3cb8b6c0	[x86] add current codegen tests for PR27924 llvm-svn: 272714	2016-06-14 21:25:46 +00:00
Evgeniy Stepanov	0be3cf1d35	Add a missing test. This is a test for r272421: Disable MSan-hostile loop unswitching. llvm-svn: 272713	2016-06-14 21:24:13 +00:00
Peter Collingbourne	96efdd6107	IR: Introduce local_unnamed_addr attribute. If a local_unnamed_addr attribute is attached to a global, the address is known to be insignificant within the module. It is distinct from the existing unnamed_addr attribute in that it only describes a local property of the module rather than a global property of the symbol. This attribute is intended to be used by the code generator and LTO to allow the linker to decide whether the global needs to be in the symbol table. It is possible to exclude a global from the symbol table if three things are true: - This attribute is present on every instance of the global (which means that the normal rule that the global must have a unique address can be broken without being observable by the program by performing comparisons against the global's address) - The global has linkonce_odr linkage (which means that each linkage unit must have its own copy of the global if it requires one, and the copy in each linkage unit must be the same) - It is a constant or a function (which means that the program cannot observe that the unique-address rule has been broken by writing to the global) Although this attribute could in principle be computed from the module contents, LTO clients (i.e. linkers) will normally need to be able to compute this property as part of symbol resolution, and it would be inefficient to materialize every module just to compute it. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html for earlier discussion. Part of the fix for PR27553. Differential Revision: http://reviews.llvm.org/D20348 llvm-svn: 272709	2016-06-14 21:01:22 +00:00
Zachary Turner	1dc9fd3c4a	Resubmit "[pdb] Actually write a PDB to disk from YAML." Reviewed By: ruiu Differential Revision: http://reviews.llvm.org/D21220 llvm-svn: 272708	2016-06-14 20:48:36 +00:00
Sanjoy Das	d7e8206b58	[ValueTracking] Calls to @llvm.assume always return This change teaches llvm::isGuaranteedToTransferExecutionToSuccessor that calls to @llvm.assume always terminate. Most other relevant intrinsics should be covered by the "CS.onlyReadsMemory() || CS.onlyAccessesArgMemory()" bit but we were missing @llvm.assume because we state that it clobbers memory. Added an LICM test case, but this change is not specific to LICM. llvm-svn: 272703	2016-06-14 20:23:16 +00:00
Wei Mi	b799a625f9	[X86] Reduce the width of multiplication when its operands are extended from i8 or i16 For <N x i32> type mul, pmuludq will be used for targets without SSE41, which often introduces many extra pack and unpack instructions in the vectorized loop body because pmuludq generates an <N/2 x i64> type value. However, when the operands of an <N x i32> mul are extended from smaller values like i8 and i16, the type of the mul may be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which generates better code. For targets with SSE41, pmulld is supported so no shrinking is needed. Differential Revision: http://reviews.llvm.org/D20931 llvm-svn: 272694	2016-06-14 18:53:20 +00:00
Zachary Turner	07c229c9e7	Revert "[pdb] Actually write a PDB to disk from YAML." This reverts commit 879139e1c6577b09df52de56a6bab856a19ed185. This was committed accidentally when I blindly typed git svn dcommit instead of the command to generate a patch. llvm-svn: 272693	2016-06-14 18:51:35 +00:00
Zachary Turner	fe5bc02492	[pdb] Actually write a PDB to disk from YAML. llvm-svn: 272692	2016-06-14 18:49:36 +00:00
George Burgess IV	24eb0daf7c	[CFLAA] Tag arguments as escaped instead of unknown. This patch also includes some refactoring. Prior to this patch, we tagged all CFLAA attributes as unknown. This is suboptimal, since it meant that any Value used as an argument would be considered to alias any other Value that existed. Now that we have the machinery to tag sets below the set for an arbitrary value with attributes, it's okay to be less conservative with arguments. (Specifically, we still tag the set under an argument with unknown). Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21262 llvm-svn: 272690	2016-06-14 18:12:28 +00:00
Nirav Dave	f8d00d5cac	Fix BSS global handling in AsmPrinter Change EmitGlobalVariable to check final assembler section is in BSS before using .lcomm/.comm directive. This prevents globals from being put into .bss erroneously when -data-sections is used. This fixes PR26570. Reviewers: echristo, rafael Subscribers: llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21146 llvm-svn: 272674	2016-06-14 15:09:30 +00:00
Artem Tamazov	17091364d1	[AMDGPU][llvm-mc] Predefined symbols to access -mcpu from the assembly source (.option.machine_version...) The feature allows for conditional assembly etc. TODO: make those symbols read-only. Test added. Differential Revision: http://reviews.llvm.org/D21238 llvm-svn: 272673	2016-06-14 15:03:59 +00:00
Daniel Sanders	fd557cb01f	[FileCheck] Add --check-prefixes as a shorthand for multiple --check-prefix options. Summary: This new alias takes a comma separated list of prefixes which allows '--check-prefix=A --check-prefix=B --check-prefix=C' to be written as '--check-prefixes=A,B,C'. Reviewers: probinson Subscribers: probinson, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D21293 llvm-svn: 272670	2016-06-14 14:28:04 +00:00
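To illustrate the equivalence described in the entry above, here is a minimal sketch of turning a comma-separated prefix list into individual prefixes. `splitPrefixes` is a hypothetical helper written for this note, not FileCheck's own code, and the choice to skip empty entries is an assumption rather than documented FileCheck behavior.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of FileCheck): split "A,B,C" into
// {"A", "B", "C"}. Empty entries such as the one in "A,,B" are skipped;
// that policy is an assumption made for this sketch.
std::vector<std::string> splitPrefixes(const std::string &list) {
  std::vector<std::string> prefixes;
  std::string::size_type start = 0;
  while (start <= list.size()) {
    // Find the end of the current entry: the next comma or end of string.
    std::string::size_type comma = list.find(',', start);
    if (comma == std::string::npos)
      comma = list.size();
    if (comma > start)
      prefixes.push_back(list.substr(start, comma - start));
    start = comma + 1;
  }
  return prefixes;
}
```

Under this sketch, `splitPrefixes("A,B,C")` yields the same three prefixes that three separate `--check-prefix` options would supply.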
Simon Dardis	878c0b1b76	[mips] Optimize stack pointer adjustments. Instead of always using addu to adjust the stack pointer when the size is out of the range of an addiu instruction, use subu so that a smaller constant can be generated. This can give savings of ~3 instructions whenever a function has a stack frame whose size is out of range of an addiu instruction. This change may break some naive stack unwinders. Partially resolves PR/26291. Thanks to David Chisnall for reporting the issue. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D21321 llvm-svn: 272666	2016-06-14 13:39:43 +00:00
James Molloy	65b6be1d3a	[Thumb] Fix off-by-one error in r272007 We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256. Found by Oliver Stannard and csmith! llvm-svn: 272665	2016-06-14 13:33:07 +00:00
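The bound in the entry above follows from simple arithmetic, sketched below under the assumption that both the MOV and the ADD take an unsigned 8-bit immediate (0 to 255). `splitMovAdd` and `kMaxImm8` are hypothetical names used only for illustration; they are not the backend's code.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Assumption for this sketch: MOV and ADD each accept an immediate in 0..255.
constexpr uint32_t kMaxImm8 = 255;

// Hypothetical helper: return {mov immediate, add immediate} whose sum is
// `value`, or std::nullopt when no such pair of 8-bit immediates exists.
inline std::optional<std::pair<uint32_t, uint32_t>> splitMovAdd(uint32_t value) {
  if (value > 2 * kMaxImm8) // 255 + 255 = 510 is the largest reachable sum
    return std::nullopt;
  uint32_t mov = value < kMaxImm8 ? value : kMaxImm8;
  return std::make_pair(mov, value - mov);
}
```

With these assumptions 510 splits as 255 + 255, while 511 would need an add of 256, which does not fit in eight bits.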
Simon Dardis	4fbf76f7c3	[mips][atomics] Fix atomic instruction descriptions and uses. PR27458 highlights that the MIPS backend does not have well formed MIR for atomic operations (among other errors). This patch expands and corrects the LL/SC descriptions and uses for MIPS(64). Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D19719 llvm-svn: 272655	2016-06-14 11:29:28 +00:00
Daniel Sanders	e858136d91	[mips][ias] Implement one N32 case (of two) for .cpsetup. This patch implements the N32 case where -mno-shared is in effect. The case where -mshared is in effect will be added later since doing that now requires additional changes to how we handle %hi(%neg(%gp_rel(foo))) expressions to emit the three relocations as three relocations (currently only one of the three would be emitted) which then requires further changes to our MCFixup handling. While we could fix both cases together, fixing the -mno-shared case allows us to fix the ELFCLASS bug (where N32 incorrectly uses ELFCLASS64 instead of ELFCLASS32) in a way that allows cpsetup.s to check for a correct output instead of another incorrect output. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D21131 llvm-svn: 272652	2016-06-14 10:13:47 +00:00
Simon Pilgrim	cf1165b86e	[X86][SSE4A] Added patterns for nontemporal stores of scalar float/doubles using MOVNTSD/MOVNTSS llvm-svn: 272651	2016-06-14 09:43:38 +00:00
Adam Nemet	73a26957fc	[LoopVer] Update all existing PHIs in the exit block We only used to add the edge from the cloned loop to PHIs that corresponded to values defined by the loop. We need to do this for all PHIs obviously since we need a PHI operand for each incoming edge. This includes things like PHIs with a constant value or with values defined before the original loop (see the testcases). After the patch the PHIs are added to the exit block in two passes. In the first pass we ensure there is a single-operand (LCSSA) PHI for each value defined by the loop. In the second pass we loop through each (single-operand) PHI and add the value for the edge from the cloned loop. If the value is defined in the loop we'll use the cloned instruction from the cloned loop. Fixes PR28037 llvm-svn: 272649	2016-06-14 09:38:54 +00:00
Simon Dardis	e661e528db	[mips] MIPS32/64 itineraries Itineraries for some pre MIPSR6 and EVA instructions. Some pseudo expanded instructions are marked as having no scheduling info. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D20418 llvm-svn: 272648	2016-06-14 09:35:29 +00:00
Daniel Sanders	435a653437	[mips][dsp] Fix use without def on DSPCtrl registers read by rddsp intrinsic. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21063 llvm-svn: 272647	2016-06-14 09:29:46 +00:00
Daniel Sanders	d2a49ec3ab	[mips][msa] copyPhysReg() should not set RegState::Define on result of CTCMSA. Summary: The machine verifier reports 'Explicit operand marked as def' when it is manually specified even though it agrees with the operand info. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21065 llvm-svn: 272646	2016-06-14 09:11:33 +00:00
Diana Picus	bae1d89e45	[SelectionDAG] Remove exit-on-error flag from test (PR27765) The exit-on-error flag in the ARM test is necessary in order to avoid an unreachable in the DAGTypeLegalizer, when trying to expand a physical register. We can also avoid this situation by introducing a bitcast early on, where the invalid scalar-to-vector conversion is detected. We also add a test for PowerPC, which goes through a similar code path in the SelectionDAGBuilder. Fixes PR27765. Differential Revision: http://reviews.llvm.org/D21061 llvm-svn: 272644	2016-06-14 07:30:20 +00:00
Igor Breger	484bace21b	re-generate the tests using the update_llc_test_checks.py script llvm-svn: 272643	2016-06-14 07:05:10 +00:00
Davide Italiano	cccf4f01ad	[PM] Port Mem2Reg to the new pass manager. llvm-svn: 272630	2016-06-14 03:22:22 +00:00
Craig Topper	99e30e6a66	[AVX512] Use MOVZX32 instead of MOVZ16 for loading single v8/v4/v2/v1 masks when KMOVB is not available. This has better behavior with respect to partial register stalls since it won't need to preserve the upper 16-bits of the GPR. llvm-svn: 272626	2016-06-14 03:13:00 +00:00
Craig Topper	ddab395397	[AVX512] Add patterns for zero-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX. llvm-svn: 272625	2016-06-14 03:12:54 +00:00
Craig Topper	cbe54a4bd9	[AVX512] Add tests for zero extending masks that show an unnecessary movzx instruction. A followup patch will remove that instruction, but adding the tests first makes the change more obvious. llvm-svn: 272624	2016-06-14 03:12:48 +00:00
Sean Silva	6347df0f81	[PM] Port MemCpyOpt to the new PM. The need for all these Lookup* functions is just because of calls to getAnalysis inside methods (i.e. not at the top level) of the runOnFunction method. They should be straightforward to clean up when the old PM is gone. llvm-svn: 272615	2016-06-14 02:44:55 +00:00
Davide Italiano	5669ef1efe	Placate bots fixing a typo in AA-pipeline description. Sorry. llvm-svn: 272608	2016-06-14 01:11:12 +00:00
Sean Silva	46590d556a	Bring back "[PM] Port JumpThreading to the new PM" with a fix This reverts commit r272603 and adds a fix. Big thanks to Davide for pointing me at r216244 which gives some insight into how to fix this VS2013 issue. VS2013 can't synthesize a move constructor. So the fix here is to add one explicitly to the JumpThreadingPass class. llvm-svn: 272607	2016-06-14 00:51:09 +00:00
Davide Italiano	89ab89d6cd	[PM] Port MergedLoadStoreMotion to the new pass manager. llvm-svn: 272606	2016-06-14 00:49:23 +00:00
Sean Silva	7d5a57cbfc	Revert "[PM] Port JumpThreading to the new PM" This reverts commit r272597. Will investigate issue with VS2013 compilation and then recommit. llvm-svn: 272603	2016-06-14 00:26:31 +00:00
Sean Silva	f81328d0b4	[PM] Port JumpThreading to the new PM This follows the approach in r263208 (for GVN) pretty closely: - move the bulk of the body of the function to the new PM class. - expose a runImpl method on the new-PM class that takes the IRUnitT and pointers/references to any analyses and use that to implement the old-PM class. - use a private namespace in the header for stuff that used to be file scope llvm-svn: 272597	2016-06-13 22:52:52 +00:00
Kevin Enderby	d2d2ce9b9f	Update the AArch64ExternalSymbolizer to print literal strings as escaped strings so it is the same as the MCExternalSymbolizer. rdar://17349181 llvm-svn: 272588	2016-06-13 21:08:57 +00:00
Sanjoy Das	98ac278b86	Move previously added test case to the right location In rL272580 I accidentally added a test case to test/CodeGen when test/Transforms/DeadStoreElimination/ is a better place for it. llvm-svn: 272581	2016-06-13 20:12:07 +00:00
Sanjoy Das	d0bdf3e02b	Fix AAResults::callCapturesBefore for operand bundles Summary: AAResults::callCapturesBefore would previously ignore operand bundles. It was possible for a later instruction to miss its memory dependency on a call site that would only access the pointer through a bundle. Patch by Oscar Blumberg! Reviewers: sanjoy Differential Revision: http://reviews.llvm.org/D21286 llvm-svn: 272580	2016-06-13 19:55:04 +00:00
Simon Pilgrim	582b9ce36e	[X86][SSE] Added extract to scalar nontemporal store tests llvm-svn: 272577	2016-06-13 19:08:28 +00:00
David Majnemer	248190ba69	[X86] Remove llvm.x86.bit.scan.{forward,reverse}.32 The need for these intrinsics has been obviated by r272564 which reimplements their functionality using generic IR. llvm-svn: 272566	2016-06-13 17:33:13 +00:00
Rafael Espindola	426ae7c72d	Add triple to input file. Patch by H.J. Lu. llvm-svn: 272563	2016-06-13 17:08:15 +00:00
Marek Olsak	e93f6d6923	AMDGPU/SI: Set INDEX_STRIDE for scratch coalescing Summary: Mesa and other users must set this to enable coalescing: - STRIDE = 0 - SWIZZLE_ENABLE = 1 This makes one particular compute shader 8x faster. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21136 llvm-svn: 272556	2016-06-13 16:05:57 +00:00
Ulrich Weigand	daae87aa21	[SystemZ] Enable index register memory constraints for inline ASM This enables use of the 'R' and 'T' memory constraints for inline ASM operands on SystemZ, which allow an index register as well as an immediate displacement. This patch includes corresponding documentation and test case updates. As with the last patch of this kind, I moved the 'm' constraint to the most general case, which is now 'T' (base + 20-bit signed displacement + index register). Author: colpell Differential Revision: http://reviews.llvm.org/D21239 llvm-svn: 272547	2016-06-13 14:24:05 +00:00
Ranjeet Singh	933e1aa39f	[ARM] Reverting r272544 because clang patch needs to go in as soon as llvm patch has gone in because tests will start breaking in Clang. llvm-svn: 272546	2016-06-13 10:58:24 +00:00
Ranjeet Singh	8feacb330d	[ARM] Add mrrc/mrrc2 co-processor intrinsics The MRRC/MRRC2 instructions write to two registers. The intrinsic definition returns a single uint64_t to represent the write, which is a compact way of representing a write to two 32 bit registers. The alternative might have been to return a struct of 2 uint32_t's, but this isn't as nice. Differential Revision: llvm-svn: 272544	2016-06-13 10:43:50 +00:00
Strahinja Petrovic	f0980e4dc0	This patch fixes handling long double type when it is constant in soft float mode on PowerPC 32 architecture. llvm-svn: 272543	2016-06-13 10:29:29 +00:00
Simon Pilgrim	377bc2ea43	[X86][SSE4A] Renamed tests to correspond with the instruction being tested llvm-svn: 272542	2016-06-13 10:14:42 +00:00
Craig Topper	13cf7cac07	[AVX512] Remove masked pshufd, pshuflw, and pshufhw intrinsics and autoupgrade them to selects and shufflevector. llvm-svn: 272527	2016-06-13 02:36:48 +00:00
Sanjay Patel	977530a8c9	[x86, SSE] change patterns for CMPP to float types to allow matching with SSE1 (PR28044) This patch is intended to solve: https://llvm.org/bugs/show_bug.cgi?id=28044 By changing the definition of X86ISD::CMPP to use float types, we allow it to be created and pass legalization for an SSE1-only target where v4i32 is not legal. The motivational trail for this change includes: https://llvm.org/bugs/show_bug.cgi?id=28001 and eventually makes this trigger: http://reviews.llvm.org/D21190 Ie, after this step, we should be free to have Clang generate FP compare IR instead of x86 intrinsics for SSE C packed compare intrinsics. (We can auto-upgrade and remove the LLVM sse.cmp intrinsics as a follow-up step.) Once we're generating vector IR instead of x86 intrinsics, a big pile of generic optimizations can trigger. Differential Revision: http://reviews.llvm.org/D21235 llvm-svn: 272511	2016-06-12 15:03:25 +00:00
Craig Topper	1067986c5b	[X86] Remove sse2 pshufd/pshuflw/pshufhw intrinsics and upgrade them to shufflevector. llvm-svn: 272510	2016-06-12 14:11:32 +00:00
Simon Pilgrim	9d8bed1796	[X86][BMI] Added fast-isel tests for BMI1 intrinsics A lot of the codegen is pretty awful for these as they are mostly implemented as generic bit twiddling ops llvm-svn: 272508	2016-06-12 09:56:05 +00:00
Sean Silva	e3bb457423	[PM] Port DeadArgumentElimination to the new PM The approach taken here follows r267631. deadarghaX0r should be easy to port when the time comes to add new-PM support to bugpoint. llvm-svn: 272507	2016-06-12 09:16:39 +00:00
Sean Silva	f5080194fd	[PM] Port ReversePostOrderFunctionAttrs to the new PM Below are my super rough notes when porting. They can probably serve as a basic guide for porting other passes to the new PM. As I port more passes I'll expand and generalize this and make a proper docs/HowToPortToNewPassManager.rst document. There is also missing documentation for general concepts and API's in the new PM which will require some documentation. Once there is proper documentation in place we can put up a list of passes that have to be ported and game-ify/crowdsource the rest of the porting (at least of the middle end; the backend is still unclear). I will however be taking personal responsibility for ensuring that the LLD/ELF LTO pipeline is ported in a timely fashion. The remaining passes to be ported are (do something like `git grep "<the string in the bullet point below>"` to find the pass): General Scalar: [ ] Simplify the CFG [ ] Jump Threading [ ] MemCpy Optimization [ ] Promote Memory to Register [ ] MergedLoadStoreMotion [ ] Lazy Value Information Analysis General IPO: [ ] Dead Argument Elimination [ ] Deduce function attributes in RPO Loop stuff / vectorization stuff: [ ] Alignment from assumptions [ ] Canonicalize natural loops [ ] Delete dead loops [ ] Loop Access Analysis [ ] Loop Invariant Code Motion [ ] Loop Vectorization [ ] SLP Vectorizer [ ] Unroll loops Devirtualization / CFI: [ ] Cross-DSO CFI [ ] Whole program devirtualization [ ] Lower bitset metadata CGSCC passes: [ ] Function Integration/Inlining [ ] Remove unused exception handling info [ ] Promote 'by reference' arguments to scalars Please let me know if you are interested in working on any of the passes in the above list (e.g. reply to the post-commit thread for this patch). I'll probably be tackling "General Scalar" and "General IPO" first FWIW. 
Steps as I port "Deduce function attributes in RPO" --------------------------------------------------- (note: if you are doing any work based on these notes, please leave a note in the post-commit review thread for this commit with any improvements / suggestions / incompleteness you ran into!) Note: "Deduce function attributes in RPO" is a module pass. 1. Do preparatory refactoring. Do preparatory factoring. In this case all I had to do was to pull out a static helper (r272503). (TODO: give more advice here e.g. if pass holds state or something) 2. Rename the old pass class. llvm/lib/Transforms/IPO/FunctionAttrs.cpp Rename class ReversePostOrderFunctionAttrs -> ReversePostOrderFunctionAttrsLegacyPass in preparation for adding a class ReversePostOrderFunctionAttrs as the pass in the new PM. (edit: actually wait what? The new class name will be ReversePostOrderFunctionAttrsPass, so it doesn't conflict. So this step is sort of useless churn). llvm/include/llvm/InitializePasses.h llvm/lib/LTO/LTOCodeGenerator.cpp llvm/lib/Transforms/IPO/IPO.cpp llvm/lib/Transforms/IPO/FunctionAttrs.cpp Rename initializeReversePostOrderFunctionAttrsPass -> initializeReversePostOrderFunctionAttrsLegacyPassPass (note that the "PassPass" thing falls out of `s/ReversePostOrderFunctionAttrs/ReversePostOrderFunctionAttrsLegacyPass/`) Note that the INITIALIZE_PASS macro is what creates this identifier name, so renaming the class requires this renaming too. Note that createReversePostOrderFunctionAttrsPass does not need to be renamed since its name is not generated from the class name. 3. Add the new PM pass class. In the new PM all passes need to have their declaration in a header somewhere, so you will often need to add a header. In this case llvm/include/llvm/Transforms/IPO/FunctionAttrs.h is already there because PostOrderFunctionAttrsPass was already ported. The file-level comment from the .cpp file can be used as the file-level comment for the new header. 
You may want to tweak the wording slightly from "this file implements" to "this file provides" or similar. Add declaration for the new PM pass in this header: class ReversePostOrderFunctionAttrsPass : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> { public: PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM); }; Its name should end with `Pass` for consistency (note that this doesn't collide with the names of most old PM passes). E.g. call it `<name of the old PM pass>Pass`. Also, move the doxygen comment from the old PM pass to the declaration of this class in the header. Also, include the declaration for the new PM class `llvm/Transforms/IPO/FunctionAttrs.h` at the top of the file (in this case, it was already done when the other pass in this file was ported). Now define the `run` method for the new class. The main things here are: a) Use AM.getResult<...>(M) to get results instead of `getAnalysis<...>()` b) If the old PM pass would have returned "false" (i.e. `Changed == false`), then you should return PreservedAnalyses::all(); c) In the old PM getAnalysisUsage method, observe the calls `AU.addPreserved<...>();`. In the case `Changed == true`, for each preserved analysis you should call `PA.preserve<...>()` on a PreservedAnalyses object and return it. E.g.: PreservedAnalyses PA; PA.preserve<CallGraphAnalysis>(); return PA; Note that calls to skipModule/skipFunction are not supported in the new PM currently, so optnone and optimization bisect support do not work. You can just drop those calls for now. 4. Add the pass to the new PM pass registry to make it available in opt. In llvm/lib/Passes/PassBuilder.cpp add a #include for your header. `#include "llvm/Transforms/IPO/FunctionAttrs.h"` In this case there is already an include (from when PostOrderFunctionAttrsPass was ported). 
Add your pass to llvm/lib/Passes/PassRegistry.def In this case, I added `MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())` The string is from the `INITIALIZE_PASS*` macros used in the old pass manager. Then choose a test that uses the pass and use the new PM `-passes=...` to run it. E.g. in this case there is a test that does: ; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S | FileCheck %s I have added the line: ; RUN: opt < %s -aa-pipeline=basic-aa -passes='require<targetlibinfo>,cgscc(function-attrs),rpo-functionattrs' -S | FileCheck %s The `-aa-pipeline=basic-aa` and `require<targetlibinfo>,cgscc(function-attrs)` are what is needed to run functionattrs in the new PM (note that in the new PM "functionattrs" becomes "function-attrs" for some reason). This is just pulled from `readattrs.ll` which contains the change from when functionattrs was ported to the new PM. Adding rpo-functionattrs causes the pass that was just ported to run. llvm-svn: 272505	2016-06-12 07:48:51 +00:00
Amaury Sechet	5db224e1f0	Make sure we have an Add/Remove/Has function for the various things that can have attributes. Summary: This also deprecates the get attribute function family. Reviewers: Wallbraker, whitequark, joker.eph, echristo, rafael, jyknight Subscribers: axw, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19181 llvm-svn: 272504	2016-06-12 06:17:24 +00:00
Eli Friedman	9f8031c2da	[MergedLoadStoreMotion] Use correct helper for load hoist safety. It isn't legal to hoist a load past a call which might not return; even if it doesn't throw, it could, for example, call exit(). Fixes http://llvm.org/PR27953. llvm-svn: 272495	2016-06-12 02:11:20 +00:00
Craig Topper	b7713e413b	[X86] Move tests for llvm.x86.avx.vpermil.* intrinsics to a -upgrade test since they are autoupgraded to shufflevector. llvm-svn: 272494	2016-06-12 01:41:06 +00:00
Eli Friedman	f1da33e4d3	[LICM] Make isGuaranteedToExecute more accurate. Summary: Make isGuaranteedToExecute use the isGuaranteedToTransferExecutionToSuccessor helper, and make that helper a bit more accurate. There's a potential performance impact here from assuming that arbitrary calls might not return. This probably has little impact on loads and stores to a pointer because most things alias analysis can reason about are dereferenceable anyway. The other impacts, like less aggressive hoisting of sdiv by a variable and less aggressive hoisting around volatile memory operations, are unlikely to matter for real code. This also impacts SCEV, which uses the same helper. It's a minor improvement there because we can tell that, for example, memcpy always returns normally. Strictly speaking, it's also introducing a bug, but it's not any worse than everywhere else we assume readonly functions terminate. Fixes http://llvm.org/PR27857. Reviewers: hfinkel, reames, chandlerc, sanjoy Subscribers: broune, llvm-commits Differential Revision: http://reviews.llvm.org/D21167 llvm-svn: 272489	2016-06-11 21:48:25 +00:00
Simon Pilgrim	2b7c02a04f	[X86] Updated test checks script to generalise LCPI symbol refs The script now replaces '.LCPI888_8' style asm symbols with the {{\.LCPI.*}} re pattern. This helps stop hardcoded symbols in 32-bit x86 tests changing with every edit of the file. Refreshed some tests to demonstrate the new check llvm-svn: 272488	2016-06-11 20:39:21 +00:00
Simon Pilgrim	3fc09f7be6	[CostModel][X86][SSE] Updated costs for vector BITREVERSE ops on SSSE3+ targets To account for the fast PSHUFB implementation now available llvm-svn: 272484	2016-06-11 19:23:02 +00:00
Simon Pilgrim	5b9bade8dd	[X86][SSSE3] Added PSHUFB LUT implementation of BITREVERSE PSHUFB can speed up BITREVERSE of byte vectors by performing LUT on the low/high nibbles separately and ORing the results. Wider integer vector types are already BSWAP'd beforehand so also make use of this approach. llvm-svn: 272477	2016-06-11 15:44:13 +00:00
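The nibble lookup described in the entry above can be sketched in scalar C++ for a single byte. This is an illustration of the idea only, not the X86 lowering code: PSHUFB would perform the same 16-entry table lookup across every byte of a vector at once. The names `kNibbleRev` and `reverseByte` are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Bit-reversed value of each 4-bit nibble, e.g. 0b0001 -> 0b1000.
constexpr std::array<uint8_t, 16> kNibbleRev = {
    0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
    0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF};

// Reverse the bits of one byte with two table lookups and an OR.
constexpr uint8_t reverseByte(uint8_t b) {
  // The reversed low nibble becomes the new high nibble.
  uint8_t hi = static_cast<uint8_t>(kNibbleRev[b & 0x0F] << 4);
  // The reversed high nibble becomes the new low nibble.
  uint8_t lo = kNibbleRev[b >> 4];
  return static_cast<uint8_t>(hi | lo);
}
```

For wider elements, the commit notes that the value is byte-swapped first, so reversing the bits within each byte completes the reversal of the whole element.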
Craig Topper	46f49fb407	[AVX512] Re-generate v8i64 shuffle test now that we use pshufd for some cases. llvm-svn: 272474	2016-06-11 13:57:08 +00:00
Craig Topper	504fba5c8a	[AVX512] Lower v8i64 and v16i32 to pshufd when possible. llvm-svn: 272473	2016-06-11 13:43:21 +00:00
Simon Pilgrim	6800a45790	[X86][SSE] Added PSLLDQ/PSRLDQ as a target shuffle type Ensure that PALIGNR/PSLLDQ/PSRLDQ are byte vectors so that they can be correctly decoded for target shuffle combining llvm-svn: 272471	2016-06-11 13:38:28 +00:00
Simon Pilgrim	8dd73e3ffa	[X86][AVX2] Added PSLLDQ/PSRLDQ shuffle combining tests llvm-svn: 272469	2016-06-11 13:18:21 +00:00
Craig Topper	40abd1cc61	[AVX512] Add support for lowering v32i16 shuffles with repeated lanes. This allows us to create 512-bit PSHUFLW/PSHUFHW. llvm-svn: 272450	2016-06-11 03:27:42 +00:00
Qin Zhao	bc8fbeacf3	[esan|cfrag] Handle complex GEP instr in the cfrag tool Summary: Iterates all (except the first and the last) operands within each GEP instruction for instrumentation. Adds test struct_field_gep.ll. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits Differential Revision: http://reviews.llvm.org/D21242 llvm-svn: 272442	2016-06-10 22:28:55 +00:00
Quentin Colombet	f2a1909bb5	[IRTranslator] Support the translation of or. Now or instructions get translated into G_OR. llvm-svn: 272433	2016-06-10 20:50:35 +00:00
Sanjay Patel	b114fd65fc	[x86] enable bitcasted fabs/fneg transforms The vector cases don't change because we already have folds in X86ISelLowering to look through and remove bitcasts. llvm-svn: 272427	2016-06-10 20:33:50 +00:00
Zhan Jun Liau	ab42cbce98	[SystemZ] Support Compare and Traps Support and generate Compare and Traps like CRT, CIT, etc. Support Trap as legal DAG opcodes and generate "j .+2" for them by default. Add support for Conditional Traps and use the If Converter to convert them into the corresponding compare and trap opcodes. Differential Revision: http://reviews.llvm.org/D21155 llvm-svn: 272419	2016-06-10 19:58:10 +00:00
Tom Stellard	f3af841462	AMDGPU/SI: Don't use fixup_si_rodata for scratch rsrc relocations Summary: We need to set the fixup type to FK_Data_4 for the SCRATCH_RSRC_DWORD[01] symbols, since these require absolute relocations, and fixup_si_rodata is for relative relocations. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21153 llvm-svn: 272417	2016-06-10 19:26:38 +00:00
Mehdi Amini	cbd68ecf04	Move CodeGen test from Generic to X86 specific directory llvm-svn: 272416	2016-06-10 19:14:01 +00:00
Mehdi Amini	1d396832d3	Interprocedural Register Allocation (IPRA): add a Transformation Pass Adds a MachineFunctionPass that scans the body to find calls, and update the register mask with the one saved by the RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo. Patch by Vivek Pandya <vivekvpandya@gmail.com> Differential Revision: http://reviews.llvm.org/D21180 llvm-svn: 272414	2016-06-10 18:37:21 +00:00
Sanjay Patel	d558bdadd2	[x86] add test for PR28044 llvm-svn: 272411	2016-06-10 18:05:55 +00:00
Saleem Abdulrasool	6d0d228d2a	test: split test into two files Split up the test cases into two inputs as per post-commit review comments from Renato. NFC. llvm-svn: 272408	2016-06-10 17:33:28 +00:00
Michael Kuperstein	9a0542a792	[X86] Add costs for SSE zext/sext to v4i64 to TTI The costs are somewhat hand-wavy, but should be much closer to the truth than what we get from BasicTTI. Differential Revision: http://reviews.llvm.org/D21156 llvm-svn: 272406	2016-06-10 17:01:05 +00:00
Mehdi Amini	bbacddfe92	Interprocedural Register Allocation (IPRA) Analysis Add an option to enable the analysis of MachineFunction register usage to extract the list of clobbered registers. When enabled, the CodeGen order is changed to be bottom up on the Call Graph. The analysis is split in two parts: RegUsageInfoCollector is the MachineFunction Pass that runs post-RA and collects the list of clobbered registers to produce a register mask. An immutable pass, RegisterUsageInfo, stores the RegMasks produced by RegUsageInfoCollector and keeps them available. A future transformation pass will use this information to update every call site after instruction selection. Patch by Vivek Pandya <vivekvpandya@gmail.com> Differential Revision: http://reviews.llvm.org/D20769 llvm-svn: 272403	2016-06-10 16:19:46 +00:00
Sanjay Patel	27f06ae7a5	[x86] fix test attributes and autogenerate checks llvm-svn: 272398	2016-06-10 15:30:52 +00:00
Sanjay Patel	cccccd9ab5	[x86] add missing tests for fcmp ueq/one Somehow, the codegen logic for these sequences has gone completely untested until now (note the 2 compare instructions generated per test). There's also an Intel AVX optimization opportunity exposed in these cases and the existing tests. Intel's (but not AMD's) AVX spec shows that extra FP predicates were added, so a single comparison should always be sufficient, and operand commutation should never be necessary. llvm-svn: 272397	2016-06-10 15:17:54 +00:00
Sanjay Patel	330a359fb3	[x86] regenerate checks llvm-svn: 272396	2016-06-10 14:48:50 +00:00
Matthew Simpson	12b9c5ba98	Reapply "[TTI] Refine default cost for interleaved load groups with gaps" This reapplies commit r272385 with a fix. The build was failing when compiled with gcc, but not with clang. With the fix, we now get the data layout from the current TTI implementation, which will hopefully solve the issue. llvm-svn: 272395	2016-06-10 14:33:30 +00:00
Simon Pilgrim	2fa2690bca	[X86][SSE] Added target shuffle combine tests for byte shift/rotates (PSLLDQ/PSRLDQ/PALIGNR) llvm-svn: 272392	2016-06-10 13:03:22 +00:00
Matthew Simpson	65c7b74de4	Revert "[TTI] Refine default cost for interleaved load groups with gaps" This reverts commit r272385. This commit broke the build. I'm temporarily reverting to investigate. llvm-svn: 272391	2016-06-10 12:41:33 +00:00
Matthew Simpson	b16907f17a	[TTI] Refine default cost for interleaved load groups with gaps This patch refines the default cost for interleaved load groups having gaps. If a load group has gaps, the legalized instructions corresponding to the unused elements will be dead. Thus, we don't need to account for them in the cost model. Instead, we only need to account for the fraction of legalized loads that will actually be used. Differential Revision: http://reviews.llvm.org/D20873 llvm-svn: 272385	2016-06-10 11:27:51 +00:00
Sam Kolton	945231ada5	[AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning in AMDGPUOperand. Summary: sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported. Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier. Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers. Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...). Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20968 llvm-svn: 272384	2016-06-10 09:57:59 +00:00
Simon Pilgrim	34263ad995	[X86][AVX512] Added VPSLLDQ/VPSRLDQ memory fold tests Memory operand is new for AVX512 (SSE/AVX2 didn't support it). Also dropped the 'mask' from the tests (VPSLLDQ/VPSRLDQ don't support masked operations). Regenerated VPALIGNR test now that the shuffle comments work llvm-svn: 272383	2016-06-10 09:56:20 +00:00
Craig Topper	200d237e57	[AVX512] Add shuffle comment printing for masked VPERMPD/VPERMQ. llvm-svn: 272371	2016-06-10 05:12:40 +00:00
Craig Topper	89c1761474	[AVX512] Fix shuffle comment printing to handle the masked versions of some shuffles. Previously we were printing the mask operands as the register names. llvm-svn: 272367	2016-06-10 04:48:05 +00:00
Qin Zhao	0b96aa7190	[esan|cfrag] Add the struct field offset array in StructInfo Summary: Adds the struct field offset array in struct StructInfo. Updates test struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D21192 llvm-svn: 272362	2016-06-10 02:10:06 +00:00
Quentin Colombet	3198649199	[LiveRangeEdit] Add a test case for r272314. The test case is not great, especially because it is still cumbersome to run the regalloc pass with run-pass. (We miss a bunch of initializers to be properly implemented.) Related to llvm.org/PR27983 llvm-svn: 272360	2016-06-10 01:57:48 +00:00
Quentin Colombet	129458a7ed	[llc] Add support for several run-pass options. Previously we could run only one machine pass with the run-pass option. With that patch, we can now specify several passes with several run-pass options (or just one option with a list of comma separated passes) and llc will build the related pipeline. This is great to test the interaction of two passes that are not necessarily next to each other in the pipeline, or play with pass ordering. Now, we should be at parity with opt for the flexibility of running passes. Note: I also moved the run pass option from CommandFlags.h to llc.cpp because, really, this is needed only there! llvm-svn: 272356	2016-06-10 00:52:10 +00:00
Qin Zhao	d677d88867	[esan|cfrag] Disable load/store instrumentation for cfrag Summary: Adds ClInstrumentFastpath option to control fastpath instrumentation. Avoids the load/store instrumentation for the cache fragmentation tool. Renames cache_frag_basic.ll to working_set_slow.ll for slowpath instrumentation test. Adds the __esan_init check in struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D21079 llvm-svn: 272355	2016-06-10 00:48:53 +00:00
Matt Arsenault	58ddad5bd6	AMDGPU: v_cndmask_b32 does not def vcc Fixes verifier errors after SIShrinkInstructions. llvm-svn: 272351	2016-06-10 00:18:41 +00:00
Tom Stellard	26a2ab7477	AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_permute Summary: This fixes a bug with ds_permute instructions where if it was passed a constant address, then the offset operand would get assigned a register operand instead of an immediate. Reviewers: scchan, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19994 llvm-svn: 272349	2016-06-10 00:01:04 +00:00
Matt Arsenault	7757c59e48	AMDGPU: Fix flat atomics The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses. llvm-svn: 272345	2016-06-09 23:42:54 +00:00
Matt Arsenault	887018179a	AMDGPU: Fix i64 global cmpxchg This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. llvm-svn: 272344	2016-06-09 23:42:48 +00:00
Matt Arsenault	25363d37fc	AMDGPU: Fix missing and broken check lines in atomic tests llvm-svn: 272343	2016-06-09 23:42:44 +00:00
Vitaly Buka	b451f1bdf6	Make sure that not interesting allocas are not instrumented. Summary: We failed to unpoison uninteresting allocas on return as unpoisoning is part of main instrumentation which skips such allocas. Added check -asan-instrument-allocas for dynamic allocas. If instrumentation of dynamic allocas is disabled it will not be unpoisoned. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21207 llvm-svn: 272341	2016-06-09 23:31:59 +00:00
Eric Christopher	1dbb23e162	Add aliases for mfvrsave/mtvrsave. Update a test as we're now going to emit it for easier reading of generated assembly as well. llvm-svn: 272339	2016-06-09 23:27:48 +00:00
George Burgess IV	652ec4f595	[CFLAA] Handle global/arg attrs more sanely. Prior to this patch, we used argument/global stratified attributes in order to note that a value could have come from either dereferencing a global/arg, or from the assignment from a global/arg. Now, AttrUnknown is placed on sets when we see a dereference, instead of the global/arg attributes. This allows us to be more aggressive in the future when we see global/arg attributes without AttrUnknown. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21110 llvm-svn: 272335	2016-06-09 23:15:04 +00:00
Vitaly Buka	79b75d3d11	Unpoison stack memory in use-after-return + use-after-scope mode Summary: We still want to unpoison full stack even in use-after-return as it can be disabled at runtime. PR27453 Reviewers: eugenis, kcc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21202 llvm-svn: 272334	2016-06-09 23:05:35 +00:00
Easwaran Raman	71069cf67d	Use ProfileSummaryInfo in inline cost analysis. Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo. Differential revision: http://reviews.llvm.org/D21045 llvm-svn: 272321	2016-06-09 22:23:21 +00:00
Simon Pilgrim	643734c565	[X86][AVX512] Added avx512 VPSLLDQ/VPSRLDQ instruction comments llvm-svn: 272319	2016-06-09 22:03:15 +00:00
Simon Pilgrim	f718682eb9	[X86][AVX512] Dropped avx512 VPSLLDQ/VPSRLDQ intrinsics Auto-upgrade to generic shuffles like sse/avx2 implementations now that we can lower to VPSLLDQ/VPSRLDQ llvm-svn: 272308	2016-06-09 21:09:03 +00:00
Simon Pilgrim	47c76e201a	[X86][AVX512] Fixed issue with v16i32 shuffles lowering to VPALIGNR llvm-svn: 272307	2016-06-09 20:53:12 +00:00
Simon Pilgrim	0ab9d3026a	[X86][AVX512] Added support for lowering 512-bit vector shuffles to bit/byte shifts 512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets so lowerVectorShuffleAsShift had to be adjusted to include the subtarget llvm-svn: 272300	2016-06-09 20:13:58 +00:00
Justin Lebar	ed2c282d4b	[NVPTX] Add intrinsics for shfl instructions. Summary: Currently clang emits these instructions via inline (volatile) asm in the CUDA headers. Switching to intrinsics will let the optimizer reason across calls to these intrinsics. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D21160 llvm-svn: 272298	2016-06-09 20:04:08 +00:00
Easwaran Raman	e12c487b8c	[PM] Port LCSSA to the new PM. Differential Revision: http://reviews.llvm.org/D21090 llvm-svn: 272294	2016-06-09 19:44:46 +00:00
Wei Ding	ed0f97fad2	AMDGPU/SI: Fix 32-bit fdiv lowering We were using the fast fdiv lowering for all division, implementation of IEEE754 fdiv is added. http://reviews.llvm.org/D20557 llvm-svn: 272292	2016-06-09 19:17:15 +00:00
Michael Kuperstein	c5edcdeb0e	[LV] Use vector phis for some secondary induction variables Previously, we materialized secondary vector IVs from the primary scalar IV, by offsetting the primary to match the correct start value, and then broadcasting it - inside the loop body. Instead, we can use a real vector IV, like we do for the primary. This enables using vector IVs for secondary integer IVs whose type matches the type of the primary. Differential Revision: http://reviews.llvm.org/D20932 llvm-svn: 272283	2016-06-09 18:03:15 +00:00
Davide Italiano	1a7e32cc48	Also fix a typo. Need more coffee today. llvm-svn: 272278	2016-06-09 17:06:01 +00:00
Davide Italiano	f326b30a15	Improve r272262, check that __stack_chk_guard is used. Thanks to Rafael for the suggestion. llvm-svn: 272277	2016-06-09 17:04:38 +00:00
Jan Vesely	2da0cba5fb	SelectionDAG: Implement expansion of {S,U}MIN/MAX in integer legalization Fixes {u,}long_{min,max,clamp} opencl piglit regressions on EG. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D17898 llvm-svn: 272272	2016-06-09 16:04:00 +00:00
Haicheng Wu	5b458cc1f6	Reapply "[MBP] Reduce code size by running tail merging in MBP."" This reapplies commit r271930, r271915, r271923. They hit a bug in Thumb which is fixed in r272258 now. The original message: The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything. llvm-svn: 272267	2016-06-09 15:24:29 +00:00
Ulrich Weigand	79564611d9	[SystemZ] Enable long displacement constraints for inline ASM operands This enables use of the 'S' constraint for inline ASM operands on SystemZ, which allows for a memory reference with a signed 20-bit immediate displacement. This patch includes corresponding documentation and test case updates. I've changed the 'T' constraint to match the new behavior for 'S', as 'T' also uses a long displacement (though index constraints are still not implemented). I also changed 'm' to match the behavior for 'S' as this will allow for a wider range of displacements for 'm', though correct me if that's not the right decision. Author: colpell Differential Revision: http://reviews.llvm.org/D21097 llvm-svn: 272266	2016-06-09 15:19:16 +00:00
Davide Italiano	24f1f62dca	Move stackguard test to X86/ directory as it's not generic. llvm-svn: 272264	2016-06-09 15:16:58 +00:00
Davide Italiano	bd4243c519	[CodeGen] Change getSDagStackGuard to get an internal sym. Fixes a crash in the backend during an LTO build of rtld(1) in FreeBSD. llvm-svn: 272262	2016-06-09 14:23:38 +00:00
Hrvoje Varga	c962c4936e	[mips][microMIPS] Implement BOVC, BNVC, EXT, INS and JALRC instructions Differential Revision: http://reviews.llvm.org/D11798 llvm-svn: 272259	2016-06-09 12:57:23 +00:00
Igor Breger	f635367e2b	[AVX512] Remove masked_move/blendm intrinsic from back-end. This is complement patch to D21060. Differential Revision: http://reviews.llvm.org/D21174 llvm-svn: 272257	2016-06-09 11:46:55 +00:00
Zlatko Buljan	cd242c1655	[mips][microMIPS] Add CodeGen support for SEL.*, SELEQZ, SELNEZ, SELEQZ.*, SELNEZ.* and CMP.condn.fmt instructions Differential Revision: http://reviews.llvm.org/D20862 llvm-svn: 272256	2016-06-09 11:15:53 +00:00
Sam Kolton	c9bdcb75c4	[AMDGPU] Disassembler: Support for sdwa instructions Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21129 llvm-svn: 272255	2016-06-09 11:04:45 +00:00
Diana Picus	db2aff0ab4	[llc] Remove exit-on-error flag from MIR tests (PR27770) This is made possible by removing an assert in llc that assumed MIRParser::parseLLVMModule would exit on error. MIRParser's documentation states that it returns null if a parsing error occurs, so there's no reason to assert. We can instead just fall through to where the check for a module is performed and exit if it is null. This commit is part of the clean-up after r269655. Fixes PR27770 Differential Revision: http://reviews.llvm.org/D20371 llvm-svn: 272254	2016-06-09 10:31:05 +00:00
Craig Topper	6f7288dc44	[AVX512] Fix shuffle decode printing for several instructions with write masks. There are still more bugs here with UNPCK and PALIGN for sure. But these were the easiest ones to fix. llvm-svn: 272252	2016-06-09 07:49:08 +00:00
James Molloy	feb9f4243b	[Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 272251	2016-06-09 07:39:08 +00:00
Craig Topper	8537c11ff3	[X86] Fix a test I failed to re-generate in r272249. llvm-svn: 272250	2016-06-09 07:10:34 +00:00
Craig Topper	7a2993093e	[X86] Bring consistent naming to the SSE/AVX and AVX512 PALIGNR instructions. Then add shuffle decode printing for the EVEX forms which is made easier by having the naming structure more similar to other instructions. llvm-svn: 272249	2016-06-09 07:06:38 +00:00
Saleem Abdulrasool	d3568e3ba3	test: fix typo llvm-svn: 272242	2016-06-09 03:14:32 +00:00
Saleem Abdulrasool	6c19ffc8bc	AArch64: support the `.arch` directive in the IAS Add support to the AArch64 IAS for the `.arch` directive. This allows the assembly input to use architectural functionality in part of a file. This is used in existing code like BoringSSL. Resolves PR26016! llvm-svn: 272241	2016-06-09 02:56:40 +00:00
Teresa Johnson	7ab1f69272	[ThinLTO/gold] Enable summary-based internalization Summary: Enable existing summary-based importing support in the gold-plugin. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21080 llvm-svn: 272239	2016-06-09 01:14:13 +00:00
Sanjoy Das	c7f69b921f	Be wary of abnormal exits from loop when exploiting UB We can safely rely on a NoWrap add recurrence causing UB down the road only if we know the loop does not have an exit expressed in a way that is opaque to ScalarEvolution (e.g. by a function call that conditionally calls exit(0)). I believe with this change PR28012 is fixed. Note: I had to change some llvm-lit tests in LoopReroll, since it looks like they were depending on this incorrect behavior. llvm-svn: 272237	2016-06-09 01:13:59 +00:00
Reid Kleckner	6d1d27542f	[codeview] Skip DIGlobalVariables with no variable They have probably been discarded during optimization. llvm-svn: 272231	2016-06-09 00:29:00 +00:00
Quentin Colombet	2c6469687d	[MIR] Check that generic virtual registers get a size. Without that check it was possible to write test cases where the size was not specified and we ended up with weird asserts down the road, because the default value (1) would not make sense. llvm-svn: 272226	2016-06-08 23:27:46 +00:00
Michael Zolotukhin	8e7e76729d	[LoopSimplify] Preserve LCSSA when merging exit blocks. Summary: This fixes PR26682. Also add LCSSA as a preserved pass to LoopSimplify, that looks correct to me and allows to write a test for the issue. Reviewers: chandlerc, bogner, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21112 llvm-svn: 272224	2016-06-08 23:13:21 +00:00
Michael Zolotukhin	987ab631fa	[SLPVectorizer] Handle GEP with differing constant index types Summary: This fixes PR27617. Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64. The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert. Reviewers: nadav, mzolotukhin Subscribers: JesperAntonsson, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20685 Patch by Jesper Antonsson! llvm-svn: 272206	2016-06-08 21:55:16 +00:00
Dehao Chen	769219b11a	Revive http://reviews.llvm.org/D12778 to handle forward-hot-prob and backward-hot-prob consistently. Summary: Consider the following diamond CFG: A / \ B C \/ D Suppose A->B and A->C have probabilities 81% and 19%. In block-placement, A->B is called a hot edge and the final placement should be ABDC. However, the current implementation outputs ABCD. This is because when choosing the next block of B, it checks if Freq(C->D) > Freq(B->D) * 20%, which is true (if Freq(A) = 100, then Freq(B->D) = 81, Freq(C->D) = 19, and 19 > 81*20%=16.2). Actually, we should use 25% instead of 20% as the probability here, so that we have 19 < 81*25%=20.25, and the desired ABDC layout will be generated. Reviewers: djasper, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20989 llvm-svn: 272203	2016-06-08 21:30:12 +00:00
Reid Kleckner	de3d8b500f	[DebugInfo] Add calling convention support for DWARF and CodeView Summary: Now DISubroutineType has a 'cc' field which should be a DW_CC_ enum. If it is present and non-zero, the backend will emit it as a DW_AT_calling_convention attribute. On the CodeView side, we translate it to the appropriate enum for the LF_PROCEDURE record. I added a new LLVM vendor specific enum to the list of DWARF calling conventions. DWARF does not appear to attempt to standardize these, so I assume it's OK to do this until we coordinate with GCC on how to emit vectorcall convention functions. Reviewers: dexonsmith, majnemer, aaboud, amccarth Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D21114 llvm-svn: 272197	2016-06-08 20:34:29 +00:00
Evgeny Stupachenko	3e2f389a7e	The patch sets the unroll disable pragma when unrolling with a user-specified count has been applied. Summary: Previously SetLoopAlreadyUnrolled() set the disable pragma only if there was some loop metadata. Now it sets the pragma in all cases. This helps to prevent multiple unrolling when -unroll-count=N is given. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D20765 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 272195	2016-06-08 20:21:24 +00:00
Tim Shen	7aa0ad65ce	[MemCpyOpt] Do not exchange llvm.lifetime.start and llvm.memcpy Reviewers: iteratee Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21087 llvm-svn: 272192	2016-06-08 19:42:32 +00:00
Adrian McCarthy	f3c3c13206	Generate codeview for array type metadata. Differential Revision: http://reviews.llvm.org/D21107 llvm-svn: 272187	2016-06-08 18:22:59 +00:00
Reid Kleckner	ee641c20ca	[codeview] Avoid emitting an empty file checksum table Again, the Microsoft linker does not like empty substreams. We still emit an empty string table if CodeView is enabled, but that doesn't cause problems because it always contains at least one null byte. llvm-svn: 272183	2016-06-08 17:50:29 +00:00
Sanjoy Das	8598412e24	[SCEV] Track no-abnormal-exits instead of no-throw calls Absence of may-unwind calls is not enough to guarantee that a UB-generating use of an add-rec poison in the loop latch will actually cause UB. We also need to guard against calls that terminate the thread or infinite loop themselves. This partially addresses PR28012. llvm-svn: 272181	2016-06-08 17:48:42 +00:00
Sanjoy Das	9a65cd214d	Teach isGuarantdToTransferExecToSuccessor about debug info intrinsics Calls to `@llvm.dbg.*` can be assumed to terminate. llvm-svn: 272180	2016-06-08 17:48:36 +00:00
Sanjoy Das	a19edc4d15	Fix a bug in SCEV's poison value propagation The worklist algorithm introduced in rL271151 didn't check to see if the direct users of the post-inc add recurrence propagates poison. This change fixes the problem and makes the code structure more obvious. Note for release managers: correctness wise, this bug wasn't a regression introduced by rL271151 -- the behavior of SCEV around post-inc add recurrences was strictly improved (in terms of correctness) in rL271151. llvm-svn: 272179	2016-06-08 17:48:31 +00:00
Quentin Colombet	d1cd30b218	[AArch64][RegisterBankInfo] G_OR are fine on either GPR or FPR. Teach AArch64RegisterBankInfo that G_OR can be mapped on either GPR or FPR for 64-bit or 32-bit values. Add test cases demonstrating how this information is used to coalesce a computation on a single register bank. llvm-svn: 272170	2016-06-08 16:53:32 +00:00
Quentin Colombet	ea4d848be3	[Target] Introduce a generic opcode for bitwise OR: G_OR. This G_OR is used in GlobalISel to represent bitwise OR. llvm-svn: 272160	2016-06-08 16:12:19 +00:00
Oliver Stannard	b3378e2f3c	[ARM] MSR instructions implicitly set CPSR The MSR instructions can write to the CPSR, but we did not model this fact, so we could emit them in the middle of IT blocks, changing the condition flags for later instructions in the block. The tests use two calls to llvm.write_register.i32 because it is valid to use these instructions at the end of an IT block, which if conversion does do in some cases. With two calls, the first clobbers the flags, so a branch has to be used to make the second one conditional. Differential Revision: http://reviews.llvm.org/D21139 llvm-svn: 272154	2016-06-08 15:26:34 +00:00
Matthias Braun	3ef7df9cdf	MIR: Fix parsing of stack object references in MachineMemOperands The MachineMemOperand parser lacked the code to handle %stack.X references (%fixed-stack.X was working). llvm-svn: 272082	2016-06-08 00:47:07 +00:00
Rui Ueyama	f14a74c102	[pdbdump] Print out # of hash buckets. In the reference code, the field name is `cHashBuckets`. llvm-svn: 272075	2016-06-07 23:53:43 +00:00
Rui Ueyama	d833917f98	[pdbdump] Print out TPI hash key size. llvm-svn: 272073	2016-06-07 23:44:27 +00:00
Vedant Kumar	cef4360ac4	Retry^4 "[llvm-profdata] Add option to ingest filepaths from a file" Changes since the initial commit: - Use echo instead of printf. This should side-step the character escaping issues on Windows. Differential Revision: http://reviews.llvm.org/D20980 llvm-svn: 272068	2016-06-07 22:47:31 +00:00
Easwaran Raman	f894b5e89c	Use FileCheck instead of grepping for patterns. NFC. llvm-svn: 272065	2016-06-07 21:46:14 +00:00
Nicolai Haehnle	c00e03b8f5	AMDGPU: Add amdgpu-ps-wqm-outputs function attributes Summary: The presence of this attribute indicates that VGPR outputs should be computed in whole quad mode. This will be used by Mesa for prolog pixel shaders, so that derivatives can be taken of shader inputs computed by the prolog, fixing a bug. The generated code could certainly be improved: if a prolog pixel shader is used (which isn't common in modern OpenGL - they're used for gl_Color, polygon stipples, and forcing per-sample interpolation), Mesa will use this attribute unconditionally, because it has to be conservative. So WQM may be used in the prolog when it isn't really needed, and furthermore a silly back-and-forth switch is likely to happen at the boundary between prolog and main shader parts. Fixing this is a bit involved: we'd first have to add a mechanism by which LLVM writes the WQM-related input requirements to the main shader part binary, and then Mesa specializes the prolog part accordingly. At that point, we may as well just compile a monolithic shader... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D20839 llvm-svn: 272063	2016-06-07 21:37:17 +00:00
Simon Pilgrim	536434e80f	[X86][SSE4A] Regenerated SSE4A intrinsics tests There are no VEX encoded versions of SSE4A instructions, make sure that AVX targets give the same output llvm-svn: 272060	2016-06-07 21:15:45 +00:00
Eric Christopher	538d09d0dd	Revert "Differential Revision: http://reviews.llvm.org/D20557 " Author: Wei Ding <wei.ding2@amd.com> Date: Tue Jun 7 19:04:44 2016 +0000 Differential Revision: http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8 as it was breaking the bots. This reverts commit r272044. llvm-svn: 272056	2016-06-07 20:27:12 +00:00
Etienne Bergeron	22bfa83208	[stack-protection] Add support for MSVC buffer security check Summary: This patch is adding support for the MSVC buffer security check implementation The buffer security check is turned on with the '/GS' compiler switch. * https://msdn.microsoft.com/en-us/library/8dbf701c.aspx * To be added to clang here: http://reviews.llvm.org/D20347 Some overview of buffer security check feature and implementation: * https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx * http://www.ksyash.com/2011/01/buffer-overflow-protection-3/ * http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html For the following example: ``` int example(int offset, int index) { char buffer[10]; memset(buffer, 0xCC, index); return buffer[index]; } ``` The MSVC compiler is adding these instructions to perform stack integrity check: ``` push ebp mov ebp,esp sub esp,50h [1] mov eax,dword ptr [__security_cookie (01068024h)] [2] xor eax,ebp [3] mov dword ptr [ebp-4],eax push ebx push esi push edi mov eax,dword ptr [index] push eax push 0CCh lea ecx,[buffer] push ecx call _memset (010610B9h) add esp,0Ch mov eax,dword ptr [index] movsx eax,byte ptr buffer[eax] pop edi pop esi pop ebx [4] mov ecx,dword ptr [ebp-4] [5] xor ecx,ebp [6] call @__security_check_cookie@4 (01061276h) mov esp,ebp pop ebp ret ``` The instrumentation above is: * [1] is loading the global security canary, * [3] is storing the local computed ([2]) canary to the guard slot, * [4] is loading the guard slot and ([5]) re-compute the global canary, * [6] is validating the resulting canary with the '__security_check_cookie' and performs error handling. Overview of the current stack-protection implementation: * lib/CodeGen/StackProtector.cpp * There is a default stack-protection implementation applied on intermediate representation. * The target can overload 'getIRStackGuard' method if it has a standard location for the stack protector cookie. 
* An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast). * Basic Blocks are added to every instrumented function to receive the code for handling stack guard validation and errors handling. * Guard manipulation and comparison are added directly to the intermediate representation. * lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp * lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp * There is an implementation that adds instrumentation during instruction selection (for better handling of sibling calls). * see long comment above 'class StackProtectorDescriptor' declaration. * The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation. (note: getIRStackGuard MUST be nullptr). * 'getSDagStackGuard' returns the appropriate stack guard (security cookie) * The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'. * include/llvm/Target/TargetLowering.h * Contains function to retrieve the default Guard 'Value'; should be overridden by each target to select which implementation is used and provide Guard 'Value'. * lib/Target/X86/X86ISelLowering.cpp * Contains the x86 specialisation; Guard 'Value' used by the SelectionDAG algorithm. Function-based Instrumentation: * The MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instruction. * To support function-based instrumentation, this patch is * adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h), * If provided, the stack protection instrumentation won't be inlined and a call to that function will be added to the prologue. 
* modifying (SelectionDAGISel.cpp) to avoid producing basic blocks used for inline instrumentation, * generating the function-based instrumentation during the ISEL pass (SelectionDAGBuilder.cpp), * if FastISEL (not SelectionDAG), using the fallback which relies on the same function-based instrumentation implemented over intermediate representation (StackProtector.cpp). Modifications * adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp) * adding support for function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h) Results * IR generated instrumentation: ``` clang-cl /GS test.cc /Od /c -mllvm -print-isel-input ``` ``` * Final LLVM Code input to ISel * ; Function Attrs: nounwind sspstrong define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 { entry: %StackGuardSlot = alloca i8* <<<-- Allocated guard slot %0 = call i8* @llvm.stackguard() <<<-- Loading Stack Guard value call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot) <<<-- Prologue intrinsic call (store to Guard slot) %index.addr = alloca i32, align 4 %offset.addr = alloca i32, align 4 %buffer = alloca [10 x i8], align 1 store i32 %index, i32* %index.addr, align 4 store i32 %offset, i32* %offset.addr, align 4 %arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0 %1 = load i32, i32* %index.addr, align 4 call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false) %2 = load i32, i32* %index.addr, align 4 %arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2 %3 = load i8, i8* %arrayidx, align 1 %conv = sext i8 %3 to i32 %4 = load volatile i8, i8* %StackGuardSlot <<<-- Loading Guard slot call void @__security_check_cookie(i8* %4) <<<-- Epilogue function-based check ret i32 %conv } ``` * SelectionDAG generated instrumentation: ``` clang-cl /GS test.cc /O1 /c /FA ``` ``` "?example@@YAHHH@Z": # @"\01?example@@YAHHH@Z" # BB#0: # %entry pushl %esi subl $16, %esp movl ___security_cookie, %eax <<<-- Loading Stack Guard 
value movl 28(%esp), %esi movl %eax, 12(%esp) <<<-- Store to Guard slot leal 2(%esp), %eax pushl %esi pushl $204 pushl %eax calll _memset addl $12, %esp movsbl 2(%esp,%esi), %esi movl 12(%esp), %ecx <<<-- Loading Guard slot calll @__security_check_cookie@4 <<<-- Epilogue function-based check movl %esi, %eax addl $16, %esp popl %esi retl ``` Reviewers: kcc, pcc, eugenis, rnk Subscribers: majnemer, llvm-commits, hans, thakis, rnk Differential Revision: http://reviews.llvm.org/D20346 llvm-svn: 272053	2016-06-07 20:15:35 +00:00
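The MSVC sequence annotated [1] to [6] in the commit above can be modelled in a few lines. This is a rough Python sketch, not LLVM or MSVC code: the names `SECURITY_COOKIE`, `prologue`, and `epilogue_check` are hypothetical, and the cookie value is an arbitrary placeholder. It models the MSVC listing, which XORs the cookie with the frame pointer; the LLVM output shown in the same commit stores the cookie without that XOR.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical stand-in for the process-wide __security_cookie value.
SECURITY_COOKIE = 0xBB40E64E


def prologue(frame_pointer: int) -> int:
    """Steps [1]-[3]: load the global cookie, XOR it with the frame
    pointer, and return the value stored in the guard slot."""
    return (SECURITY_COOKIE ^ frame_pointer) & MASK32


def epilogue_check(guard_slot: int, frame_pointer: int) -> bool:
    """Steps [4]-[6]: reload the guard slot, XOR it with the frame
    pointer again, and compare the result with the global cookie."""
    return ((guard_slot ^ frame_pointer) & MASK32) == SECURITY_COOKIE


if __name__ == "__main__":
    ebp = 0x0019FF40  # arbitrary example frame pointer
    slot = prologue(ebp)
    print(epilogue_check(slot, ebp))               # intact guard slot
    print(epilogue_check(slot ^ 0x41414141, ebp))  # overwritten guard slot
```

An overflow that overwrites the guard slot changes the recomputed value, so the comparison fails; in the real sequence that is the point at which '__security_check_cookie' performs error handling.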
Wei Ding	a70216f1b3	Differential Revision: http://reviews.llvm.org/D20557 llvm-svn: 272044	2016-06-07 19:04:44 +00:00
George Burgess IV	a1f9a2daeb	[CFLAA] Add AttrEscaped, remove bit twiddling functions. This patch does a few things: - Unifies AttrAll and AttrUnknown (since they were used for more or less the same purpose anyway). - Introduces AttrEscaped, an attribute that notes that a value escapes our analysis for a given set, but not that an unknown value flows into said set. - Removes functions that take bit indices, since we also had functions that took bitsets, and the use of both (with similar names) was unclear and bug-prone. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21000 llvm-svn: 272040	2016-06-07 18:35:37 +00:00
Geoff Berry	486f49cc63	Reapply [AArch64] Fix isLegalAddImmediate() to return true for valid negative values. Originally reviewed here: http://reviews.llvm.org/D17463 llvm-svn: 272023	2016-06-07 16:48:43 +00:00
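The check named in the commit above can be illustrated with a small model. This is a Python sketch under stated assumptions, not the LLVM source: it assumes the usual AArch64 ADD/SUB immediate form (a 12-bit unsigned value, optionally shifted left by 12) and that a negative immediate is legal because the same magnitude can be encoded with the opposite instruction. The function name `is_legal_add_immediate` is hypothetical.

```python
def is_legal_add_immediate(imm: int) -> bool:
    """Return True if imm fits an ADD/SUB immediate under the assumed
    encoding: 12 bits, optionally shifted left by 12."""
    # Assumption: ADD and SUB share the encoding, so only the magnitude matters.
    magnitude = abs(imm)
    fits_unshifted = (magnitude >> 12) == 0
    fits_shifted = (magnitude & 0xFFF) == 0 and (magnitude >> 24) == 0
    return fits_unshifted or fits_shifted


if __name__ == "__main__":
    for value in (-1, 4095, 4096, 4097, -0xFFF000, 0x1000000):
        print(value, is_legal_add_immediate(value))
```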
Andrey Turetskiy	94c2179550	Quick fix for the test from rL272014 "[LAA] Improve non-wrapping pointer detection by handling loop-invariant case" (a couple of buildbots failed). Patch by Roman Shirokiy. llvm-svn: 272019	2016-06-07 15:52:35 +00:00
Haicheng Wu	4fa9f3ae45	Revert "[MBP] Reduce code size by running tail merging in MBP." This reverts commit r271930, r271915, r271923. They break a thumb selfhosting bot. llvm-svn: 272017	2016-06-07 15:17:21 +00:00
Simon Pilgrim	15c6ab5fac	[X86][AVX512] Added 512-bit integer vector non-temporal load tests llvm-svn: 272016	2016-06-07 15:12:47 +00:00
Oliver Stannard	8de5f24d10	[ARM] Accept conditional versions of BXNS and BLXNS These instructions end in "S" but are not flag-setting, so they need including in the list of special cases in the assembly parser. Differential Revision: http://reviews.llvm.org/D21077 llvm-svn: 272015	2016-06-07 14:58:48 +00:00
Andrey Turetskiy	9f02c58670	[LAA] Improve non-wrapping pointer detection by handling loop-invariant case. This fixes PR26314. This patch adds new helper “isNoWrap” with detection of loop-invariant pointer case. Patch by Roman Shirokiy. Ref: https://llvm.org/bugs/show_bug.cgi?id=26314 Differential Revision: http://reviews.llvm.org/D17268 llvm-svn: 272014	2016-06-07 14:55:27 +00:00
Simon Pilgrim	9a89623b57	[X86][SSE] Add general lowering of nontemporal vector loads Currently the only way to use the (V)MOVNTDQA nontemporal vector loads instructions is through the int_x86_sse41_movntdqa style builtins. This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch. We currently still fold nontemporal loads into suitable instructions, we should probably look at removing this (and nontemporal stores as well) or at least make the target's folding implementation aware that it's dealing with a nontemporal memory transaction. There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware - so currently a normal ymm load is still used on AVX1 targets. Differential Review: http://reviews.llvm.org/D20965 llvm-svn: 272010	2016-06-07 13:34:24 +00:00
James Molloy	b101383fb5	[Thumb-1] Add optimized constant materialization for integers [256..512) We can materialize these integers using a MOV; ADDi8 pair. llvm-svn: 272007	2016-06-07 13:10:14 +00:00
Igor Breger	61e628591f	[AVX512] Fix load opcode for fast isel. Differential Revision: http://reviews.llvm.org/D21067 llvm-svn: 272006	2016-06-07 13:08:45 +00:00
Ulrich Weigand	6b0634b304	[PowerPC] Support multiple return values with fast isel Using an LLVM IR aggregate return value type containing three or more integer values causes an abort in the fast isel pass. This patch adds two more registers to RetCC_PPC64_ELF_FIS to allow returning up to four integers with fast isel, just the same as is currently supported with regular isel (RetCC_PPC). This is needed for Swift and (possibly) other non-clang frontends. Fixes PR26190. llvm-svn: 272005	2016-06-07 12:48:22 +00:00
Simon Pilgrim	ca1da1bf07	[X86][SSE] Improved blend+zero target shuffle combining to use combined shuffle mask directly We currently only combine to blend+zero if the target value type has 8 elements or less, but this was missing a lot of cases where the combined mask had been widened. This change makes it so we use the combined mask to determine the blend value type, allowing us to catch more widened cases. llvm-svn: 272003	2016-06-07 12:20:14 +00:00
James Molloy	53298a1808	[ARM] Shrink post-indexed LDR and STR to LDM/STM A Thumb-2 post-indexed LDR instruction such as: ldr.w r0, [r1], #4 Can be rewritten as: ldm.n r1!, {r0} LDMs can be more expensive than LDRs on some cores, so this has been enabled only in minsize mode. llvm-svn: 272002	2016-06-07 12:13:34 +00:00
James Molloy	75afc95112	[ARM] Transform LDMs into writeback form to save code size If we have an LDM that uses only low registers and doesn't write to its base register: ldm.w r0, {r1, r2, r3} And that base register is dead after the LDM, then we can convert it to writeback form and use a narrow encoding: ldm.n r0!, {r1, r2, r3} Obviously, this introduces a new register write and so can cause WAW hazards, so I've enabled it only in minsize mode. This is a code size trick that ARM Compiler 5 ("armcc") does that we don't. llvm-svn: 272000	2016-06-07 11:47:24 +00:00
George Rimar	cd36e182d2	[llvm-readobj] - Teach llvm-readobj to dump .gnu.version_r sections SHT_GNU_verneed (.gnu.version_r) is a version dependency section. It was the last symbol versioning relative section that was not dumped, now it is. Differential revision: http://reviews.llvm.org/D21024 llvm-svn: 271998	2016-06-07 11:04:49 +00:00
Peter Smith	353a2286e2	[ARM] Incorrect relocation type for Thumb2 B<cond>.w The Thumb2 conditional branch B<cond>.W has a different encoding (T3) to the unconditional branch B.W (T4) as it needs to record <cond>. As the encoding is different the B<cond>.W is given a different relocation type. ELF for the ARM Architecture 4.6.1.6 (Table-13) states that R_ARM_THM_JUMP19 should be used for B<cond>.W. At present the MC layer is using the R_ARM_THM_JUMP24 from B.W. This change makes B<cond>.W use R_ARM_THM_JUMP19 and alters the existing test that checks for R_ARM_THM_JUMP24 to expect R_ARM_THM_JUMP19. llvm-svn: 271997	2016-06-07 10:34:33 +00:00
Simon Pilgrim	db9893fb90	[InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit). If the shift amount is constant we can sometimes convert these instructions to native shifts: 1 - if all shift amounts are in range then the conversion is trivial. 2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion. 3 - logical shifts just return zero if all elements have out of range shift amounts. In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts. Differential Revision: http://reviews.llvm.org/D19675 llvm-svn: 271996	2016-06-07 10:27:15 +00:00
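The per-lane behaviour the commit above relies on can be sketched as follows. This is an illustrative Python model of one lane, not InstCombine code; the function names are hypothetical and the shift amount is treated as unsigned. It shows why out-of-range logical shifts fold to zero and why an out-of-range arithmetic shift can be clamped to (bitwidth - 1).

```python
def _to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def vpsrlv_lane(value: int, amount: int, bits: int = 32) -> int:
    """Logical right shift of one lane: out-of-range amounts give zero."""
    mask = (1 << bits) - 1
    return 0 if amount >= bits else (value & mask) >> amount


def vpsllv_lane(value: int, amount: int, bits: int = 32) -> int:
    """Left shift of one lane: out-of-range amounts give zero."""
    mask = (1 << bits) - 1
    return 0 if amount >= bits else (value << amount) & mask


def vpsrav_lane(value: int, amount: int, bits: int = 32) -> int:
    """Arithmetic right shift of one lane: out-of-range amounts splat the
    sign bit, which equals shifting by (bits - 1)."""
    mask = (1 << bits) - 1
    clamped = min(amount, bits - 1)
    return (_to_signed(value, bits) >> clamped) & mask


if __name__ == "__main__":
    print(hex(vpsrlv_lane(0x80000000, 40)))
    print(hex(vpsrav_lane(0x80000000, 40)))
    print(hex(vpsrav_lane(0x80000000, 31)))
```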
Simon Pilgrim	91e3ac8293	[InstCombine][SSE] Add MOVMSK constant folding (PR27982) This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions. The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs. Differential Revision: http://reviews.llvm.org/D20998 llvm-svn: 271990	2016-06-07 08:18:35 +00:00
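For readers unfamiliar with MOVMSK, the operation being folded in the commit above gathers the sign bit of every vector element into the low bits of an integer. The following Python sketch models that on constant inputs; the function name `movmsk` and the list-of-lanes representation are assumptions made for illustration, not the InstCombine implementation.

```python
def movmsk(lanes, lane_bits):
    """Collect the sign bit of each lane into an integer mask, with lane 0
    mapped to bit 0."""
    result = 0
    for index, lane in enumerate(lanes):
        sign = (lane >> (lane_bits - 1)) & 1
        result |= sign << index
    return result


if __name__ == "__main__":
    # A zero vector has no sign bits set, so the result is zero.
    print(movmsk([0, 0, 0, 0], 32))
    # Lanes 0 and 2 have their sign bits set.
    print(bin(movmsk([0x80000000, 0, 0xFFFFFFFF, 0x7FFFFFFF], 32)))
```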
Saleem Abdulrasool	532dcbc2c5	ARM: correct TLS access on WoA TLS access requires an offset from the TLS index. The index itself is the section-relative distance of the symbol. For ARM, the relevant relocation (IMAGE_REL_ARM_SECREL) is applied as a constant. This means that the value may not be an immediate and must be lowered into a constant pool. This offset will not be base relocated. We were previously emitting the actual address of the symbol which would be base relocated and would therefore be the value offset by the ImageBase + TLS Offset. llvm-svn: 271974	2016-06-07 03:15:07 +00:00
Reid Kleckner	6f3406df67	Re-land "[codeview] Emit information about global variables" This reverts commit r271962 and reinstates r271957. MSVC's linker doesn't appear to like it if you have an empty symbol substream, so only open a symbol substream if we're going to emit something about globals into it. Makes check-asan pass. llvm-svn: 271965	2016-06-07 00:02:03 +00:00
Vedant Kumar	8d0e861e9b	Revert "Retry^2 "[llvm-profdata] Add option to ingest filepaths from a file"" This reverts commit r271953. It's still breaking on Windows, though the list initialization issue is fixed: http://bb.pgr.jp/builders/ninja-clang-i686-msc19-R/builds/3751 llvm-svn: 271963	2016-06-06 23:43:56 +00:00
Reid Kleckner	e8a236fc2e	Revert "[codeview] Emit information about global variables" This reverts commit r271957, it broke check-asan on Windows. llvm-svn: 271962	2016-06-06 23:41:38 +00:00
Michael Kuperstein	a0c6ae02a5	[InstCombine] scalarizePHI should not assume the code it sees has been CSE'd scalarizePHI only looked for phis that have exactly two uses - the "latch" use, and an extract. Unfortunately, we can not assume all equivalent extracts are CSE'd, since InstCombine itself may create an extract which is a duplicate of an existing one. This extends it to handle several distinct extracts from the same index. This should fix at least some of the performance regressions from PR27988. Differential Revision: http://reviews.llvm.org/D20983 llvm-svn: 271961	2016-06-06 23:38:33 +00:00
Rui Ueyama	4bc848047b	Fix CRLF -> LF. llvm-svn: 271960	2016-06-06 23:35:52 +00:00
Reid Kleckner	87eddf723d	[codeview] Emit information about global variables This currently emits everything as S_GDATA32, which isn't right for things like thread locals, but it's a start. llvm-svn: 271957	2016-06-06 23:23:47 +00:00
Vedant Kumar	f051269a7f	Retry^2 "[llvm-profdata] Add option to ingest filepaths from a file" Changes since the initial commit: - Normalize file paths read from the file to prevent Windows path separators from escaping parts of the path. - Since we need to store the normalized file paths in WeightedFile, don't do tricky things to keep the source MemoryBuffer alive. - Don't use list-initialization for a std::string in WeightedFile. Differential Revision: http://reviews.llvm.org/D20980 llvm-svn: 271953	2016-06-06 23:17:22 +00:00
Vedant Kumar	87886425bd	Revert "Retry "[llvm-profdata] Add option to ingest filepaths from a file"" This reverts commit r271949. It breaks the Windows build: http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/12796 llvm-svn: 271952	2016-06-06 23:01:42 +00:00
Vedant Kumar	d8ee75b8f5	Retry "[llvm-profdata] Add option to ingest filepaths from a file" Changes since the initial commit: - Normalize file paths read from the file to prevent Windows path separators from escaping parts of the path. - Since we need to store the normalized file paths in WeightedFile, don't do tricky things to keep the source MemoryBuffer alive. Differential Revision: http://reviews.llvm.org/D20980 llvm-svn: 271949	2016-06-06 22:39:22 +00:00
Rui Ueyama	2c5384ae4c	[pdbdump] Print section header flags. llvm-svn: 271943	2016-06-06 21:34:55 +00:00
Zachary Turner	7120a478fa	[llvm-pdbdump] Dump MSF headers to YAML. This is the simplest possible patch to get some kind of YAML output. All it dumps is the MSF header fields so that in theory an empty MSF file could be reconstructed. Reviewed By: ruiu, majnemer Differential Revision: http://reviews.llvm.org/D20971 llvm-svn: 271939	2016-06-06 20:37:05 +00:00
Matt Arsenault	3b2e2a59e8	AMDGPU: Fix constantexpr addrspacecasts If we had a constant group address space cast the queue pointer wasn't enabled for the function, resulting in a crash on noreg later. llvm-svn: 271935	2016-06-06 20:03:31 +00:00
Michael Zolotukhin	19edbadfc5	[LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost. In some cases, when simplifying with SCEV, we might consider pointer values as just usual integer values. Thus, we might get a different type from what we had originally in the map of simplified values, and hence we need to check types before operating on the values. This fixes PR28015. llvm-svn: 271931	2016-06-06 19:21:40 +00:00
Haicheng Wu	9ed77af89d	Fix a test case. NFC. llvm-svn: 271930	2016-06-06 19:11:53 +00:00
Geoff Berry	43e5160d0e	Reapply [LSR] Create fewer redundant instructions. Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Originally reviewed in http://reviews.llvm.org/D18001 Reviewers: atrick Subscribers: llvm-commits, mzolotukhin, mcrosier Differential Revision: http://reviews.llvm.org/D18480 llvm-svn: 271929	2016-06-06 19:10:46 +00:00
Rui Ueyama	ef2b488482	[pdbdump] Print out New FPO stream contents. The data structure in the new FPO stream is described in the PE/COFF spec. There is one record per function if frame pointer is omitted. Differential Revision: http://reviews.llvm.org/D20999 llvm-svn: 271926	2016-06-06 18:39:21 +00:00
Haicheng Wu	77ea344786	[MBP] Reduce code size by running tail merging in MBP. The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything. Differential Revision: http://reviews.llvm.org/D20276 llvm-svn: 271925	2016-06-06 18:36:07 +00:00
Sanjay Patel	6a333c3ed9	[InstCombine] limit icmp transform to ConstantInt (PR28011) In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check above this to work for any Constant rather than ConstantInt. AFAICT, that part makes sense if we can determine that the shrunken/extended constant remained equal. But it doesn't make sense for this later transform where we assume that the constant DID change. This could assert for a ConstantExpr: https://llvm.org/bugs/show_bug.cgi?id=28011 And it could be wrong for a vector as shown in the added regression test. llvm-svn: 271908	2016-06-06 16:56:57 +00:00
Sanjay Patel	027c469158	regenerate checks llvm-svn: 271904	2016-06-06 16:03:06 +00:00
Sanjay Patel	70aa568c4e	regenerate checks llvm-svn: 271903	2016-06-06 15:55:00 +00:00
Artem Tamazov	135487767b	[AMDGPU][llvm-mc] v_cndmask_b32: src2 is mandatory; do not enforce VOP2 when src2 == VCC. Another step toward unifying the llvm assembler/disassembler with sp3. Besides, CodeGen output is a bit improved, thus changes in CodeGen tests. Assembler/Disassembler tests updated/added. Differential Revision: http://reviews.llvm.org/D20796 llvm-svn: 271900	2016-06-06 15:23:43 +00:00
Igor Breger	edafb0595e	[KNL] Fix UMULO lowering. Differential Revision: http://reviews.llvm.org/D21013 llvm-svn: 271891	2016-06-06 12:24:52 +00:00
Craig Topper	33350cc406	[AVX512] Remove masked palignr intrinsics and auto-upgrade them to native IR of vector shuffle and select. llvm-svn: 271872	2016-06-06 06:12:54 +00:00
Craig Topper	143446d5c1	[AVX512] Add PALIGNR shuffle lowering for v32i16 and v16i32. llvm-svn: 271870	2016-06-06 05:39:10 +00:00
Craig Topper	ccad6d57c1	[AVX512] Update tests to show shuffle decoding for vpshuflw/vpshufhw. llvm-svn: 271869	2016-06-06 05:39:07 +00:00
Eli Friedman	ee89505799	LICM: Don't sink stores out of loops that may throw. Summary: This hasn't been caught before because it requires noalias or similarly strong alias analysis to actually reproduce. Fixes http://llvm.org/PR27952 . Reviewers: hfinkel, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20944 llvm-svn: 271858	2016-06-05 22:13:52 +00:00
Sanjoy Das	b7e861a488	Add safety check to InstCombiner::commonIRemTransforms Since FoldOpIntoPhi speculates the binary operation to potentially each of the predecessors of the PHI node (pulling it out of arbitrary control dependence in the process), we can FoldOpIntoPhi only if we know the operation doesn't have UB. This also brings up an interesting profitability question -- the way it is written today, commonIRemTransforms will hoist out work from dynamically dead code into code that will execute at runtime. Perhaps that isn't the best canonicalization? Fixes PR27968. llvm-svn: 271857	2016-06-05 21:17:04 +00:00
Sanjoy Das	0dcd1d859c	Add test case for InstCombiner::commonIRemTransforms; NFC The PHI case in commonIRemTransforms was untested; add a trivial test case. llvm-svn: 271856	2016-06-05 21:17:00 +00:00
Davide Italiano	a1cbc3f8cc	[Internalize] Test that __stack_chk_{guard, fail} are not internalized. r154645 introduced this feature without test. This should have better coverage now. llvm-svn: 271853	2016-06-05 19:08:54 +00:00
Filipe Cabecinhas	6328f8e9e6	[BitCode] Make sure atomicrmw's argument is an actual PointerType llvm-svn: 271851	2016-06-05 18:43:40 +00:00
Filipe Cabecinhas	036e73c8bf	[BitCode] Make sure storeatomic's argument is an actual PointerType llvm-svn: 271850	2016-06-05 18:43:33 +00:00
Filipe Cabecinhas	fc2a3c98e9	[BitCode] Diagnose GEPs with no indices llvm-svn: 271849	2016-06-05 18:43:26 +00:00
Filipe Cabecinhas	2849b48fea	[BitCode] Don't allow constants of void type. llvm-svn: 271848	2016-06-05 18:43:17 +00:00
Sanjoy Das	4d4339d1e8	[PM] Port IndVarSimplify to the new pass manager Summary: There are some rough corners, since the new pass manager doesn't have (as far as I can tell) LoopSimplify and LCSSA, so I've updated the tests to run them separately in the old pass manager in the lit tests. We also don't have an equivalent for AU.setPreservesCFG() in the new pass manager, so I've left a FIXME. Reviewers: bogner, chandlerc, davide Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20783 llvm-svn: 271846	2016-06-05 18:01:19 +00:00
Sanjoy Das	f90e28d6fd	[IndVars] Remove -liv-reduce It is an off-by-default option that no one seems to use[0], and given that SCEV directly understands the overflow intrinsics there is no real need for it anymore. [0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html llvm-svn: 271845	2016-06-05 18:01:12 +00:00
Sanjay Patel	0fab306eb5	fix checks update_test_checks.py got confused matching the variable names. llvm-svn: 271844	2016-06-05 17:54:56 +00:00
Sanjay Patel	a6fbc82392	[InstCombine] allow vector icmp bool transforms llvm-svn: 271843	2016-06-05 17:49:45 +00:00
Sanjay Patel	54d7010627	add tests to show missing vector transforms llvm-svn: 271842	2016-06-05 17:32:58 +00:00
Sanjay Patel	51dc83c052	regenerate checks llvm-svn: 271841	2016-06-05 17:29:45 +00:00
Sanjay Patel	009c3da65f	update test to use FileCheck llvm-svn: 271840	2016-06-05 17:13:09 +00:00
Sanjay Patel	f48b909f28	update test to use FileCheck llvm-svn: 271838	2016-06-05 16:41:20 +00:00
Sanjay Patel	8a3b6d0d8b	update test to FileCheck llvm-svn: 271837	2016-06-05 16:29:15 +00:00
Simon Pilgrim	64c6de4525	[X86][XOP] Added VPERMIL2PD/VPERMIL2PS raw mask decoding for target shuffle combines llvm-svn: 271834	2016-06-05 15:21:30 +00:00
Simon Pilgrim	478295dadd	[X86][XOP] Added VPERMIL2PD/VPERMIL2PS as a target shuffle type llvm-svn: 271831	2016-06-05 15:01:45 +00:00
Craig Topper	8eeda57a40	[AVX512] Add support for lowering PALIGNR for v64i8. Could do this for other types too, but this is what's needed to replace the intrinsic with native IR in clang. llvm-svn: 271828	2016-06-05 06:29:12 +00:00
Craig Topper	5a315d4613	[AVX512] Split command lines and regenerate a test to prepare for a future commit. llvm-svn: 271827	2016-06-05 06:29:08 +00:00
Craig Topper	9f51c9ef15	[AVX512] Fix PANDN combining for v4i32/v8i32 when VLX is enabled. v4i32/v8i32 ANDs aren't promoted to v2i64/v4i64 when VLX is enabled. llvm-svn: 271826	2016-06-05 05:35:11 +00:00
Xinliang David Li	64dbb295b6	[PM] Port GCOVProfiler pass to the new pass manager llvm-svn: 271823	2016-06-05 05:12:23 +00:00
David Majnemer	2482e1c017	[SimplifyCFG] Don't kill empty cleanuppads with multiple uses A basic block could contain: %cp = cleanuppad [] cleanupret from %cp unwind to caller This basic block is empty and is thus a candidate for removal. However, there can be other uses of %cp outside of this basic block. This is only possible in unreachable blocks. Make our transform more correct by checking that the pad has a single user before removing the BB. This fixes PR28005. llvm-svn: 271816	2016-06-04 23:50:03 +00:00
Sanjay Patel	ea8a211169	[InstCombine] allow vector constants for cast+icmp fold This is step 1 of unknown towards fixing PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001 llvm-svn: 271810	2016-06-04 22:04:05 +00:00
Simon Pilgrim	2ead861d07	[X86][XOP] Added VPERMIL2PD/VPERMIL2PS shuffle mask comment decoding llvm-svn: 271809	2016-06-04 21:44:28 +00:00
Sanjay Patel	8e63999bee	[InstCombine] add test for missing vector optimization llvm-svn: 271808	2016-06-04 21:41:25 +00:00
Sanjay Patel	4c42211c6f	[InstCombine] add test for missing vector optimization llvm-svn: 271806	2016-06-04 21:20:03 +00:00
Sanjay Patel	58a92a327d	[InstCombine] minimize test case and use FileCheck llvm-svn: 271805	2016-06-04 21:04:59 +00:00
Simon Pilgrim	ba319ded5e	[Analysis] Enabled BITREVERSE as a vectorizable intrinsic Allows XOP to vectorize BITREVERSE - other targets will follow as their costmodels improve. llvm-svn: 271803	2016-06-04 20:21:07 +00:00
Saleem Abdulrasool	1fcdc23a6e	X86: enable TLS on Windows itanium Windows itanium is nearly identical to windows-msvc (MS ABI for C, itanium for C++). Enable the TLS support for the target similar to the MSVC model. llvm-svn: 271797	2016-06-04 18:27:22 +00:00
Simon Pilgrim	fd2eda4f64	[X86][AVX2] Fix v16i16 SHL lowering (PR27730) The AVX2 v16i16 shift lowering works by unpacking to 2 x v8i32, performing the shift and then truncating the result. The unpacking is used to place the values in the upper 16-bits so that we can correctly sign-extend for SRA shifts. Unfortunately we weren't ensuring that the lower 16-bits were zero to ensure that SHL correctly shifts in zero bits. llvm-svn: 271796	2016-06-04 16:45:33 +00:00
Simon Pilgrim	fda22d66fc	[InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614 Requires a minor tweak as llvm.x86.mmx.pmovmskb takes an x86_mmx argument - so we have to be explicit about the implied v8i8 vector type. llvm-svn: 271789	2016-06-04 13:42:46 +00:00
Chandler Carruth	57cd7ef1c9	[sancov] Revert r271695 which broke all of the PPC bots. Original commit message: [sancov] Run sancov tests on more platforms The only tests that need to be run on Linux are the ones that use C++ demangling. I'm assuming they will fail on Mac, since __cxa_demangle there won't handle the non-double-underscore prefixed mangled names. llvm-svn: 271763	2016-06-04 03:28:27 +00:00
Chandler Carruth	0c30f89cca	[llvm-profdata] Revert r271709 and the 3 subsequent commits - the code and/or tests aren't working on Windows currently. There seems to be some problem with quoting the file paths. I don't understand the test structure here or the code well enough to try to come up with a way to correctly handle paths with back slashes in them, and this has caused the Windows builds to be failing for 7 hours now, so I'm reverting the whole thing to bring them back to life. Sorry for the disruption, but a couple of these were bug fixes anyways that can be folded into a fresh commit. Reverts the following patches: r271756: Clean up the way we create the input filenames buffer (NFC) r271748: Fix use-after-free from discarded MemoryBuffer (NFC) r271710: Fix option description (NFC) r271709: Add option to ingest filepaths from a file llvm-svn: 271760	2016-06-04 03:08:01 +00:00
Adrian Prantl	2905d54af3	Testcase cleanup: Remove a redundant test input. llvm-svn: 271753	2016-06-04 00:10:17 +00:00
Matthias Braun	c25c9ccbcb	MIR: Support MachineMemOperands without associated value This is allowed (though used rarely) and useful to keep your tests short. llvm-svn: 271752	2016-06-04 00:06:31 +00:00
Easwaran Raman	019e0bf592	Reapply r271728 after adding move constructor for ProfileSummaryInfo llvm-svn: 271745	2016-06-03 22:54:26 +00:00
Derek Bruening	9ef5772154	[esan|wset] Optionally assume intra-cache-line accesses Summary: Adds an option -esan-assume-intra-cache-line which causes esan to assume that a single memory access touches just one cache line, even if it is not aligned, for better performance at a potential accuracy cost. Experiments show that the performance difference can be 2x or more, and accuracy loss is typically negligible, so we turn this on by default. This currently applies just to the working set tool. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20978 llvm-svn: 271743	2016-06-03 22:29:52 +00:00
Easwaran Raman	94edaaaefb	Revert r271728 as it breaks Windows build llvm-svn: 271738	2016-06-03 21:14:26 +00:00
Rui Ueyama	fd97bf1f76	pdbdump: print out TPI hashes. Differential Revision: http://reviews.llvm.org/D20945 llvm-svn: 271736	2016-06-03 20:48:51 +00:00
Easwaran Raman	d142050f3a	Analysis pass to access profile summary info Differential Revision: http://reviews.llvm.org/D20648 llvm-svn: 271728	2016-06-03 20:37:19 +00:00
Reid Kleckner	f27f3f8491	[Symbolize] Check if the PE file has a PDB and emit an error if we can't load it Summary: Previously we would try to load PDBs for every PE executable we tried to symbolize. If that failed, we would fall back to DWARF. If there wasn't any DWARF, we'd print mostly useless symbol information using the export table. With this change, we only try to load PDBs for executables that claim to have them. If that fails, we can now print an error rather than falling back silently. This should make it a lot easier to diagnose and fix common symbolization issues, such as not having DIA or not having a PDB. Reviewers: zturner, eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20982 llvm-svn: 271725	2016-06-03 20:25:09 +00:00
Chad Rosier	9faa5bcf13	[AArch64] Move tests from r271677 to a more appropriately named file. NFC. llvm-svn: 271718	2016-06-03 20:11:09 +00:00
Chad Rosier	be879ea751	[AArch64] Spot SBFX-compatible code expressed with sign_extend. This is very similar to r271677, but for extracts from i32 with the SIGN_EXTEND acting on an arithmetic shift. llvm-svn: 271717	2016-06-03 20:05:49 +00:00
Vedant Kumar	5c276d0e5d	[llvm-profdata] Add option to ingest filepaths from a file Differential Revision: http://reviews.llvm.org/D20980 llvm-svn: 271709	2016-06-03 19:05:20 +00:00
Derek Schuff	5859a9ed80	[WebAssembly] Emit type signatures for declared functions Under emscripten, C code can take the address of a function implemented in Javascript (which is exposed via an import in wasm). Because imports do not have linear memory address in wasm, we need to generate a thunk to be the target of the indirect call; it calls the import directly. To make this possible, LLVM needs to emit the type signatures for these functions, because they may not be called directly or referred to other than where the address is taken. This uses a new .s directive (.functype) which specifies the signature. Differential Revision: http://reviews.llvm.org/D20891 Re-apply r271599 but instead of bailing with an error when a declared function has multiple returns, replace it with a pointer argument. Also add the test case I forgot to 'git add' last time around. llvm-svn: 271703	2016-06-03 18:34:36 +00:00
Reid Kleckner	98df480c8e	[sancov] Disable these tests if there is no X86 backend Copied from test/CodeGen/X86 llvm-svn: 271698	2016-06-03 18:07:32 +00:00
Reid Kleckner	e2bef1f143	[sancov] Run sancov tests on more platforms The only tests that need to be run on Linux are the ones that use C++ demangling. I'm assuming they will fail on Mac, since __cxa_demangle there won't handle the non-double-underscore prefixed mangled names. llvm-svn: 271695	2016-06-03 17:51:42 +00:00
Chris Bieneman	4c423773d8	[yaml2obj] Sort MachO LinkEdit write operations based on offset This re-applies r271611, and hopefully the bots won't break this time. Although ld64 always outputs linkedit data in the same order, it isn't actually required to. This change makes yaml2obj resilient if the offsets are in arbitrary order. llvm-svn: 271687	2016-06-03 16:58:05 +00:00
Reid Kleckner	a8d5740757	[codeview] Add basic record type translation This only translates data members for now. Translating overloaded methods is complicated, so I stopped short of doing that. Reviewers: aaboud Differential Revision: http://reviews.llvm.org/D20924 llvm-svn: 271680	2016-06-03 15:58:20 +00:00
Sjoerd Meijer	9bc93f6298	Code size optimisation: do not inline memcpy if this expansion results in more instructions than the library call. Differential Revision: http://reviews.llvm.org/D20958 llvm-svn: 271678	2016-06-03 15:38:55 +00:00
Chad Rosier	2d658703e1	[AArch64] Spot SBFX-compatible code expressed with sign_extend_inreg. We were assuming all SBFX-like operations would have the shl/asr form, but often when the field being extracted is an i8 or i16, we end up with a SIGN_EXTEND_INREG acting on a shift instead. This is a port of r213754 from ARM to AArch64. llvm-svn: 271677	2016-06-03 15:00:09 +00:00
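The two forms mentioned in the commit above compute the same signed bitfield extract. The Python sketch below models both on a 32-bit register to make the equivalence concrete; the function names are hypothetical and the code illustrates the arithmetic only, not the AArch64 instruction selection logic.

```python
REG_BITS = 32
MASK = (1 << REG_BITS) - 1


def _sign_extend(value: int, width: int) -> int:
    """Sign-extend the low `width` bits of value to a REG_BITS-wide result."""
    value &= (1 << width) - 1
    if value >> (width - 1):
        value -= 1 << width
    return value & MASK


def shl_asr_form(x: int, lsb: int, width: int) -> int:
    """Shift the field to the top of the register, then arithmetic-shift
    it back down so the sign bit is replicated."""
    shifted = (x << (REG_BITS - lsb - width)) & MASK
    signed = shifted - (1 << REG_BITS) if shifted >> (REG_BITS - 1) else shifted
    return (signed >> (REG_BITS - width)) & MASK


def sext_inreg_form(x: int, lsb: int, width: int) -> int:
    """Logical shift right by lsb, then sign-extend in register from `width` bits."""
    return _sign_extend((x & MASK) >> lsb, width)


if __name__ == "__main__":
    # Extract the i8 field at bits [15:8] of 0x0000F600.
    print(hex(shl_asr_form(0x0000F600, 8, 8)))
    print(hex(sext_inreg_form(0x0000F600, 8, 8)))
```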
Sanjay Patel	6cf18af1c5	[InstCombine] look through bitcasts to find selects There was concern that creating bitcasts for the simpler potential select pattern: define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) { %a2 = add <2 x i64> %a, %a %sext = sext <4 x i1> %cmp to <4 x i32> %bc = bitcast <4 x i32> %sext to <2 x i64> %and = and <2 x i64> %a2, %bc ret <2 x i64> %and } might lead to worse code for some targets, so this patch is matching the larger patterns seen in the test cases. The motivating example for this patch is this IR produced via SSE intrinsics in C: define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) { %t0 = bitcast <2 x i64> %a to <4 x i32> %t1 = bitcast <2 x i64> %b to <4 x i32> %cmp = icmp sgt <4 x i32> %t0, %t1 %sext = sext <4 x i1> %cmp to <4 x i32> %t2 = bitcast <4 x i32> %sext to <2 x i64> %and = and <2 x i64> %t2, %a %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1> %neg2 = bitcast <4 x i32> %neg to <2 x i64> %and2 = and <2 x i64> %neg2, %b %or = or <2 x i64> %and, %and2 ret <2 x i64> %or } For an AVX target, this is currently: vpcmpgtd %xmm1, %xmm0, %xmm2 vpand %xmm0, %xmm2, %xmm0 vpandn %xmm1, %xmm2, %xmm1 vpor %xmm1, %xmm0, %xmm0 retq With this patch, it becomes: vpmaxsd %xmm1, %xmm0, %xmm0 Differential Revision: http://reviews.llvm.org/D20774 llvm-svn: 271676	2016-06-03 14:42:07 +00:00
Artem Tamazov	f88397c84c	[test/AMDGPU] Square-braced-syntax for registers: add macro test/example. Test added as per discussion in http://reviews.llvm.org/D20588. The macro is just a demonstration, useless in practice. Coding style fixes. Differential Revision: http://reviews.llvm.org/D20797 llvm-svn: 271675	2016-06-03 14:41:17 +00:00
Zachary Turner	71f725735c	[pdb] Add string table offsets to check output. llvm-svn: 271674	2016-06-03 14:22:46 +00:00
Simon Pilgrim	ff35eecd90	[X86][AVX512] Fixed 512-bit vector nontemporal load alignment llvm-svn: 271673	2016-06-03 14:12:43 +00:00
Sjoerd Meijer	d906bf1369	RAS extensions are part of ARMv8.2-A. This change enables them by introducing a new instruction to ARM and AArch64 targets and several system registers. Patch by: Roger Ferrer Ibanez and Oliver Stannard Differential Revision: http://reviews.llvm.org/D20282 llvm-svn: 271670	2016-06-03 14:03:27 +00:00
Simon Pilgrim	f92d175a78	[X86][AVX512] Added 512-bit vector nontemporal load tests llvm-svn: 271668	2016-06-03 13:42:49 +00:00
Sam Kolton	a4a99ad1bc	[AMDGPU] Assembler: More tests for SDWA instructions. Fix for SDWA float modifiers. Summary: Depends on D20625 Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20674 llvm-svn: 271662	2016-06-03 11:43:09 +00:00
Simon Pilgrim	a6022c9a63	[X86][SSE] Added nontemporal load tests These currently all lower to regular loads, generic nontemporal load support will be added in a future patch llvm-svn: 271659	2016-06-03 11:00:55 +00:00
Simon Pilgrim	960ca812ed	[X86] Added nontemporal scalar store tests llvm-svn: 271656	2016-06-03 10:30:54 +00:00
Sam Kolton	05ef1c940f	[AMDGPU] Assembler: Custom converters for SDWA instructions. Support for _dpp and _sdwa suffixes in mnemonics. Summary: Added custom converters for SDWA instruction to support optional operands and modifiers. Support for _dpp and _sdwa suffixes that allows to force DPP or SDWA encoding for instructions. Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20625 llvm-svn: 271655	2016-06-03 10:27:37 +00:00
Simon Pilgrim	02284541b2	[X86][SSE] Regenerated nontemporal vector store tests and added extra target types llvm-svn: 271654	2016-06-03 10:24:24 +00:00
Daniel Sanders	7b09493bff	[mips] Remove CPU-only triples from llvm-objdump commands. Summary: They aren't necessary since llvm-objdump can auto-detect the architecture. Reviewers: sdardis Subscribers: jfb, dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D20904 llvm-svn: 271653	2016-06-03 10:22:22 +00:00
Simon Pilgrim	38b4661b1b	[X86] Regenerated nontemporal store tests and added tests for all 128-bit vector types llvm-svn: 271651	2016-06-03 10:15:36 +00:00
Simon Pilgrim	205f65f62f	[X86][AVX2] Relaxed alignment on nontemporal store tests llvm-svn: 271646	2016-06-03 10:06:59 +00:00
Simon Pilgrim	8ea8940677	[X86][AVX2] Regenerated nontemporal store tests and added tests for all 256-bit vector types llvm-svn: 271645	2016-06-03 09:56:24 +00:00
Daniel Sanders	6ba3dd6b71	[mips] Implement 'la' macro in PIC mode for O32. Summary: N32 support will follow in a later patch since the symbol version of 'la' incorrectly believes N32 to have 64-bit pointers and rejects it early. This fixes the three incorrectly expanded 'la' macros found in bionic. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D20820 llvm-svn: 271644	2016-06-03 09:53:06 +00:00
Simon Pilgrim	e85506b6e0	[X86][XOP] Support for VPERMIL2PD/VPERMIL2PS 2-input shuffle instructions This patch begins adding support for lowering to the XOP VPERMIL2PD/VPERMIL2PS shuffle instructions - adding the X86ISD::VPERMIL2 opcode and cleaning up the usage. The internal llvm intrinsics were assuming the shuffle mask operand was the same type as the float/double input operands (I guess to simplify the intrinsic definitions in X86InstrXOP.td to a single value type). These needed changing to integer types (matching the clang builtin and the AMD intrinsics definitions), an auto upgrade path is added to convert old calls. Mask decoding/target shuffle support will be added in future patches. Differential Revision: http://reviews.llvm.org/D20049 llvm-svn: 271633	2016-06-03 08:06:03 +00:00
Zachary Turner	3df1bfaaec	[pdb] Print out file names instead of file offsets. When printing line information and file checksums, we were printing the file offset field from the struct header. This teaches llvm-pdbdump how to turn those numbers into the filename. In the case of file checksums, this is done by looking in the global string table. In the case of line contributions, this is done by indexing into the file names buffer of the DBI stream. Why they use a different technique I don't know. llvm-svn: 271630	2016-06-03 05:52:57 +00:00
Craig Topper	e7ae106147	[AVX512] Ensure EVEX vpshufd, vpshuflw, and vpshufhw have isel priority over the VEX encoded ones. llvm-svn: 271629	2016-06-03 05:31:04 +00:00
Craig Topper	01f53b1773	[AVX512] Fix shuffle comment printing for EVEX encoded PSHUFD, PSHUFHW, and PSHUFLW. llvm-svn: 271628	2016-06-03 05:31:00 +00:00
Zachary Turner	d0563f29f9	[pdb] Dump file checksums from pdb codeview line info. llvm-svn: 271622	2016-06-03 04:01:48 +00:00
Zachary Turner	a96cce64a5	[codeview] Dump line number and column information. To facilitate this, a couple of changes had to be made: 1. `ModuleSubstream` got moved from `DebugInfo/PDB` to `DebugInfo/CodeView`, and various codeview related types are defined there. It turns out `DebugInfo/CodeView/Line.h` already defines many of these structures, but this is really old code that is not endian aware, doesn't interact well with `StreamInterface` and not very helpful for getting stuff out of a PDB. Eventually we should migrate the old readobj `COFFDumper` code to these new structures, or at least merge their functionality somehow. 2. A `ModuleSubstream` visitor is introduced. Depending on where your module substream array comes from, different subsets of record types can be expected. We are already hand parsing these substream arrays in many places especially in `COFFDumper.cpp`. In the future we can migrate these paths to the visitor as well, which should reduce a lot of code in `COFFDumper.cpp`. Differential Revision: http://reviews.llvm.org/D20936 Reviewed By: ruiu, majnemer llvm-svn: 271621	2016-06-03 03:25:59 +00:00
Qin Zhao	c14c249343	[esan|cfrag] Instrument GEP instr for struct field access. Summary: Instrument GEP instruction for counting the number of struct field address calculation to approximate the number of struct field accesses. Adds test struct_field_count_basic.ll to test the struct field instrumentation. Reviewers: bruening, aizatsky Subscribers: junbuml, zhaoqin, llvm-commits, eugenis, vitalybuka, kcc, bruening Differential Revision: http://reviews.llvm.org/D20892 llvm-svn: 271619	2016-06-03 02:33:04 +00:00
Adrian Prantl	61aa70fb42	Revert "Testcase cleanup: Remove an unused input file." This reverts commit r271612. I somehow managed to remove the wrong file m-( llvm-svn: 271616	2016-06-03 00:24:38 +00:00
Adrian Prantl	8d2763bd02	Testcase cleanup: remove an unused RUN line in an input file. llvm-svn: 271614	2016-06-02 23:58:52 +00:00
Chris Bieneman	24df841234	Revert "[yaml2obj] Sort MachO LinkEdit write operations based on offset" This reverts commit r271611 because it broke a bot: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/38184 I don't currently have a handle on what went wrong, so I'll revert while I investigate. llvm-svn: 271613	2016-06-02 23:58:13 +00:00
Adrian Prantl	03412067e7	Testcase cleanup: Remove an unused input file. llvm-svn: 271612	2016-06-02 23:57:42 +00:00
Chris Bieneman	d369cd9a92	[yaml2obj] Sort MachO LinkEdit write operations based on offset Although ld64 always outputs linkedit data in the same order, it isn't actually required to. This change makes yaml2obj resilient if the offsets are in arbitrary order. llvm-svn: 271611	2016-06-02 23:52:08 +00:00
Derek Schuff	f5bae9c1ce	Revert "[WebAssembly] Emit type signatures for declared functions" This reverts r271599, it broke the integration tests. More places than I expected had nontrivial return types in imports, or else the check was wrong. llvm-svn: 271606	2016-06-02 23:02:44 +00:00
Chris Bieneman	07bb3c84a2	[obj2yaml] [yaml2obj] Support for MachO nlist and string table This commit adds round tripping for MachO symbol data. Symbols are entries in the name list, that contain offsets into the string table which is at the end of the __LINKEDIT segment. llvm-svn: 271604	2016-06-02 22:54:06 +00:00
Sanjay Patel	172bf6edd1	[InstCombine] change tests to show a more obvious transform possibility The original tests were intended to show a missing transform that would be solved by D20774: http://reviews.llvm.org/D20774 But it's not clear that the transform for the simpler tests is a win for all targets. Make the tests show a larger pattern that should be a win regardless of the cost of bitcast instructions. llvm-svn: 271603	2016-06-02 22:45:49 +00:00
Manuel Jacob	a485984c0c	[PM] Schedule InstSimplify after late LICM run, to clean up LCSSA nodes. Summary: The module pass pipeline includes a late LICM run after loop unrolling. LCSSA is implicitly run as a pass dependency of LICM. However no cleanup pass was run after this, so the LCSSA nodes ended in the optimized output. Reviewers: hfinkel, mehdi_amini Subscribers: majnemer, bruno, mzolotukhin, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D20606 llvm-svn: 271602	2016-06-02 22:14:26 +00:00
Derek Schuff	23b7d65fe5	[WebAssembly] Emit type signatures for declared functions Under emscripten, C code can take the address of a function implemented in Javascript (which is exposed via an import in wasm). Because imports do not have a linear memory address in wasm, we need to generate a thunk to be the target of the indirect call; it calls the import directly. To make this possible, LLVM needs to emit the type signatures for these functions, because they may not be called directly or referred to other than where the address is taken. This uses a new .s directive (.functype) which specifies the signature. Differential Revision: http://reviews.llvm.org/D20891 llvm-svn: 271599	2016-06-02 21:34:18 +00:00
Zachary Turner	7eb6d358af	[llvm-pdbdump] Dump CodeView line information. This first pass only splits apart the records and dumps the line info kinds and binary data. Subsequent patches will parse out the binary data into more useful information and dump it in detail. llvm-svn: 271576	2016-06-02 20:11:22 +00:00
Sanjay Patel	dba8b4c04d	transform obscured FP sign bit ops into a fabs/fneg using TLI hook This is effectively a revert of: http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886) and: http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions. This is intended to resolve the objections raised on the dev list: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html and: https://llvm.org/bugs/show_bug.cgi?id=24886#c4 In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later. Differential Revision: http://reviews.llvm.org/D19391 llvm-svn: 271573	2016-06-02 20:01:37 +00:00
Matt Arsenault	d1097a38e2	AMDGPU: Cleanup load tests There are a lot of different kinds of loads to test for, and these were scattered around inconsistently with some redundancy. Try to comprehensively test all loads in a consistent way. llvm-svn: 271571	2016-06-02 19:54:26 +00:00
Matt Arsenault	52dec8d36a	AMDGPU: Temporary fix for broken store combine llvm-svn: 271567	2016-06-02 19:00:55 +00:00
Matt Arsenault	8e00194be8	AMDGPU: Fix crashes on unknown processor name If the processor name failed to parse for amdgcn, the resulting output would have R600 ISA in it. If the processor name was missing or invalid for R600, the wavefront size would not be set and there would be crashes from missing itinerary data. Fixes crashes in future commit caused by dividing by the unset/0 wavefront size. llvm-svn: 271561	2016-06-02 18:37:16 +00:00
Rui Ueyama	90db78816b	pdbdump: print out COFF section headers. Unlike other sections that can grow to any size, the COFF section header stream has maximum length because each record is fixed size and the COFF file format limits the maximum number of sections. So I decided to not create a specific stream class for it. Instead, I added a member function to DbiStream class which returns a vector of COFF headers. Differential Revision: http://reviews.llvm.org/D20717 llvm-svn: 271557	2016-06-02 18:20:20 +00:00
Geoff Berry	c932f533e1	[PowerPC] Run reg2mem on tests to simplify them. Summary: Also convert test/CodeGen/PowerPC/vsx-ldst-builtin-le.ll to use FileCheck instead of two grep and count runs. This change is needed to avoid spurious diffs in these tests when EarlyCSE is improved to use MemorySSA and can do more load elimination. Reviewers: hfinkel Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20238 llvm-svn: 271553	2016-06-02 18:02:50 +00:00
Simon Pilgrim	ab95b2fe26	[X86][SSE] Added SSE41/AVX2 non-temporal tests Useful for when we add MOVNTDQA support llvm-svn: 271552	2016-06-02 18:01:21 +00:00
Reid Kleckner	b9c80fd8b5	[codeview] Fix crash when handling qualified void types The DIType* for void is the null pointer. A null DIType can never be a qualified type, so we can just exit the loop at this point and go to getTypeIndex(BaseTy). Fixes PR27984 llvm-svn: 271550	2016-06-02 17:40:51 +00:00
Dimitry Andric	6a482a73d6	Only attempt to detect AVG if SSE2 is available Summary: In PR29973 Sanjay Patel reported an assertion failure when a certain loop was optimized, for a target without SSE2 support. It turned out this was because of the AVG pattern detection introduced in rL253952. Prevent the assertion failure by bailing out early in `detectAVGPattern()`, if the target does not support SSE2. Also add a minimized test case. Reviewers: congh, eli.friedman, spatel Subscribers: emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D20905 llvm-svn: 271548	2016-06-02 17:30:49 +00:00
Nirav Dave	1180e689b6	Ignore Lexing errors in macro body definitions Do not issue lexing errors found during the parsing of macro body definitions and parseIdentifier function in AsmParser. This changes the Parser to not issue a lexing error when we reach an error, but rather when it is consumed, allowing us time to examine and recover from an error. As a result, we stop issuing both a lexing error and a parsing error in the floating-literals test. Minor tweak to parseDirectiveRealValue to favor more meaningful lexing error over less helpful parse error. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20535 llvm-svn: 271542	2016-06-02 17:15:05 +00:00
David Majnemer	75c3ebfa02	[CodeView] Implement function-type indices We still need to do something about member functions and calling conventions. Differential Revision: http://reviews.llvm.org/D20900 llvm-svn: 271541	2016-06-02 17:13:53 +00:00
Reid Kleckner	2da433ea99	[COFF] Expose the PE debug data directory and dump it This directory is used to find if there is a PDB associated with an executable. I plan to use this functionality to teach llvm-symbolizer whether it should use DIA or DWARF to symbolize a given DLL. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D20885 llvm-svn: 271539	2016-06-02 17:10:43 +00:00
Xinliang David Li	7008ce3f98	[profile] value profiling bug fix -- missing icall targets in profile-use Inline virtual functions have linkonce_odr linkage (emitted in comdat on supporting targets). If the vtable for the class is not emitted in the defining module, the function won't be address taken, thus its address is not recorded. At the mercy of the linker, if the per-func prf_data from this module (in comdat) is picked at link time, we will lose the mapping from function address to its hash val. This leads to missing icall promotion. The second test case (currently disabled) in compiler_rt (r271528): instrprof-icall-prom.test demonstrates the bug. The first profile-use subtest is fine due to linker order difference. With this change, no missing icall targets are found in instrumented clang's raw profile. llvm-svn: 271532	2016-06-02 16:33:41 +00:00
Pavel Labath	ec1c01e8d4	[cmake] Fix builds with LLVM_ENABLE_PIC=0 Summary: When this flag is specified, the target llvm-lto is not built, but is still used as a dependency of the test targets. cmake 2.8 silently ignored this situation, but with cmake_minimum_required(3.4) it becomes an error. Fix this by avoiding the inclusion of the target as a dependency. Reviewers: beanz Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20882 llvm-svn: 271530	2016-06-02 16:29:07 +00:00
Geoff Berry	66f6b65fed	[PEI, AArch64] Use empty spaces in stack area for local stack slot allocation. Summary: If the target requests it, use empty spaces in the fixed and callee-save stack area to allocate local stack objects. AArch64: Change last callee-save reg stack object alignment instead of size to leave a gap to take advantage of above change. Reviewers: t.p.northover, qcolombet, MatzeB Subscribers: rengolin, mcrosier, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D20220 llvm-svn: 271527	2016-06-02 16:22:07 +00:00
Sanjay Patel	f509d85a6d	[DAG] use getBitcast() to reduce code Although this was intended to be NFC, the test case wiggle shows a change in code scheduling/RA caused by a difference in the SDLoc() generation. Depending on how you look at it, this is the (dis)advantage of exact checking in regression tests. llvm-svn: 271526	2016-06-02 16:01:15 +00:00
Simon Pilgrim	ebdc397c86	[X86][SSE] Added non-temporal load tests for vector types These currently lower to regular loads instead of MOVNTDQA llvm-svn: 271516	2016-06-02 13:51:50 +00:00
Simon Pilgrim	0afd5a4d80	[X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) f32/f64 to i32 with generic IR (llvm) This patch removes the llvm intrinsics (V)CVTTPS2DQ and VCVTTPD2DQ truncation (round to zero) conversions and auto-upgrades to FP_TO_SINT calls instead. Note: I looked at updating CVTTPD2DQ as well but this still requires a lot more work to correctly lower. Differential Revision: http://reviews.llvm.org/D20860 llvm-svn: 271510	2016-06-02 10:55:21 +00:00
Sjoerd Meijer	0b7bb16e5b	This adds support for Cortex-A73 as an available target. Differential Revision: http://reviews.llvm.org/D20865 llvm-svn: 271508	2016-06-02 10:48:52 +00:00
David Majnemer	1c2cb1ddd7	[CodeView] Use the right type index for long long We used T_INT8 instead of T_QUAD. llvm-svn: 271497	2016-06-02 07:02:32 +00:00
David Majnemer	d065e23dac	[codeview] Return type indices for typedefs Use the type index of the underlying type unless we have a typedef from long to HRESULT; HRESULT typedefs are translated to T_HRESULT. llvm-svn: 271494	2016-06-02 06:21:37 +00:00
Zachary Turner	93839cb4ac	[pdb] Parse and dump section map and section contribs Differential Revision: http://reviews.llvm.org/D20876 Reviewed By: rnk, ruiu llvm-svn: 271488	2016-06-02 05:07:49 +00:00
Craig Topper	ca9c0801e1	[X86] Add AVX 256-bit load and stores to fast isel. I'm not sure why this was missing for so long. This also exposed that we were picking floating point 256-bit VMOVNTPS for some integer types in normal isel for AVX1 even though VMOVNTDQ is available. In practice it doesn't matter due to the execution dependency fix pass, but it required extra isel patterns. Fixing that in a follow up commit. llvm-svn: 271481	2016-06-02 04:19:45 +00:00
Craig Topper	f10fbfa738	[AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead. The intrinsics will be autoupgraded to the same generic masked loads. llvm-svn: 271478	2016-06-02 04:19:36 +00:00
Xinliang David Li	0b29330612	make icall pass name consistent /NFC llvm-svn: 271467	2016-06-02 01:52:05 +00:00
Rafael Espindola	41410cc812	Avoid a load for local functions. llvm-svn: 271437	2016-06-01 21:57:11 +00:00
Sanjay Patel	b4a4357ecb	[x86, AVX2] regenerate checks llvm-svn: 271434	2016-06-01 21:32:56 +00:00
Geoff Berry	b96d3b2dd8	[MemorySSA] Port to new pass manager Add support for the new pass manager to MemorySSA pass. Change MemorySSA to be computed eagerly upon construction. Change MemorySSAWalker to be owned by the MemorySSA object that creates it. Reviewers: dberlin, george.burgess.iv Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19664 llvm-svn: 271432	2016-06-01 21:30:40 +00:00
Michael Kuperstein	738ae45ce8	[DAG] Improve legalization of INSERT_SUBVECTOR When the index is known to be constant 0, insert directly into the low half, instead of spilling, performing the insert in-memory, and reloading. Differential Revision: http://reviews.llvm.org/D20763 llvm-svn: 271428	2016-06-01 20:49:35 +00:00
Keno Fischer	5573483c5b	[PPC64] Fix SUBFC8 Defs list Fix PR27943 "Bad machine code: Using an undefined physical register". SUBFC8 implicitly defines the CR0 register, but this was omitted in the instruction definition. Patch by Jameson Nash <jameson@juliacomputing.com> Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D20802 llvm-svn: 271425	2016-06-01 20:31:07 +00:00
Daniel Berlin	73694bb92b	Revert "Claim NoAlias if two GEPs index different fields of the same struct" This reverts commit 2d5d6493f43eb68493a3852b8c226ac9fafdc7eb. llvm-svn: 271422	2016-06-01 18:55:32 +00:00
George Burgess IV	18b83fe6cf	[CFLAA] Recognize builtin allocation functions. This patch extends CFLAA to recognize allocation functions such as malloc, free, etc, so we can treat them more aggressively. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D20776 llvm-svn: 271421	2016-06-01 18:39:54 +00:00
Daniel Berlin	e846c9dc52	Claim NoAlias if two GEPs index different fields of the same struct Patch by Taewook Oh Summary: Patch for Bug 27478. Make BasicAliasAnalysis claims NoAlias if two GEPs index different fields of the same structure. Reviewers: hfinkel, dberlin Subscribers: dberlin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20665 llvm-svn: 271415	2016-06-01 18:12:01 +00:00
Than McIntosh	4ef761aa35	Better fix for PR27903. Summary: Re-enable lifetime-start-on-first-use for stack coloring, but explicitly disable it for slots with more than one start or end lifetime marker. Bug: 27903 Reviewers: wmi, tejohnson, qcolombet, gbiv Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20739 llvm-svn: 271412	2016-06-01 17:55:10 +00:00
Michael Kuperstein	3a3c64d23e	[LV] For some IVs, use vector phis instead of widening in the loop body Previously, whenever we needed a vector IV, we would create it on the fly, by splatting the scalar IV and adding a step vector. Instead, we can create a real vector IV. This tends to save a couple of instructions per iteration. This only changes the behavior for the most basic case - integer primary IVs with a constant step. Differential Revision: http://reviews.llvm.org/D20315 llvm-svn: 271410	2016-06-01 17:16:46 +00:00
Reid Kleckner	5acacbb04f	[codeview] Translate basic DITypes to CV type records Summary: This is meant to be the tiniest step towards DIType to CV type index translation that I could come up with. Whenever translation fails, we use type index zero, which is the unknown type. Reviewers: aaboud, zturner Subscribers: llvm-commits, amccarth Differential Revision: http://reviews.llvm.org/D20840 llvm-svn: 271408	2016-06-01 17:05:51 +00:00
Sanjoy Das	10df497a1f	Reduce dependence on pointee types when deducing dereferenceability Summary: Change some of the internal interfaces in Loads.cpp to keep track of the number of bytes we're trying to prove dereferenceable using an explicit `Size` parameter. Before this, the `Size` parameter was implicitly inferred from the pointee type of the pointer whose dereferenceability we were trying to prove, causing us to be conservative around bitcasts. This was unfortunate since bitcast instructions are no-ops and should never break optimizations. With an explicit `Size` parameter, we're more precise (as shown in the test cases), and the code is simpler. We should eventually move towards a `DerefQuery` struct that groups together a base pointer, an offset, a size and an alignment; but this patch is a first step. Reviewers: apilipenko, dblaikie, hfinkel, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20764 llvm-svn: 271406	2016-06-01 16:47:45 +00:00
Sanjoy Das	c2cf6ef8e2	[IR] Disallow loading and storing unsized types Summary: It isn't clear what the operational meaning of loading or storing an unsized type is, since it cannot be lowered into something meaningful. Since there does not seem to be any practical need for it either, make such loads and stores illegal IR. Reviewers: majnemer, chandlerc Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20846 llvm-svn: 271402	2016-06-01 16:13:10 +00:00
Simon Pilgrim	1cd61b82bd	[X86][SSE] Added non-temporal store tests for all 512-bit vector types llvm-svn: 271393	2016-06-01 13:58:00 +00:00
Simon Pilgrim	288be8bab6	[X86][SSE] Added non-temporal store tests for all 256-bit vector types Also added KNL AVX-512 checks llvm-svn: 271391	2016-06-01 13:20:25 +00:00
Simon Pilgrim	80f5335969	[X86][SSE] Added non-temporal store tests for all 128-bit integer vector types llvm-svn: 271389	2016-06-01 13:05:00 +00:00
Michael Zuckerman	6a894956fc	Adding back-end support to two bit scanning intrinsics Adding LLVM back-end support to two intrinsics dealing with bit scan: _bit_scan_forward and _bit_scan_reverse. Their functionality is as described in Intel intrinsics guide: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_bit_scan_forward&expand=371,370 https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_bit_scan_reverse&expand=371,370 Commit on behalf of Omer Paparo Bivas Differential Revision: http://reviews.llvm.org/D19915 llvm-svn: 271386	2016-06-01 12:02:37 +00:00
Oliver Stannard	92ca83cccd	[ARM] Add additional matching for UBFX instructions This adds an additional matcher to select UBFX(..) from SRL(AND(..)) in ARMISelDAGToDAG to help with code size. Patch by David Green. Differential Revision: http://reviews.llvm.org/D20667 llvm-svn: 271384	2016-06-01 12:01:01 +00:00
Chris Dewhurst	53bde954db	[Sparc] Allow passing of empty structs. Passing an empty struct as a function call argument is now supported. unit tests for various scenarios added. llvm-svn: 271374	2016-06-01 08:48:56 +00:00
Craig Topper	4f2d5a68d3	Revert r271362 "[AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead." Looks like something isn't quite right still. Also forgot to move the test cases to an autoupgrade test. llvm-svn: 271363	2016-06-01 05:57:55 +00:00
Craig Topper	dacd9d2bac	[AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead. The intrinsics will be autoupgraded to the same generic masked loads. llvm-svn: 271362	2016-06-01 05:35:16 +00:00
Peter Collingbourne	382d81cacf	IR: Allow multiple global metadata attachments with the same type. This will be necessary to allow the global merge pass to attach multiple debug info metadata nodes to global variables once we reverse the edge from DIGlobalVariable to GlobalVariable. Differential Revision: http://reviews.llvm.org/D20414 llvm-svn: 271358	2016-06-01 01:17:57 +00:00
Peter Collingbourne	cceae7feda	Add support for metadata attachments for global variables. This patch adds an IR, assembly and bitcode representation for metadata attachments for globals. Future patches will port existing features to use these new attachments. Differential Revision: http://reviews.llvm.org/D20074 llvm-svn: 271348	2016-05-31 23:01:54 +00:00
Matthias Braun	f9acacaa92	CodeGen: Refactor renameDisconnectedComponents() as a pass Refactor LiveIntervals::renameDisconnectedComponents() to be a pass. Also change the name to "RenameIndependentSubregs": - renameDisconnectedComponents() worked on a MachineFunction at a time so it is a natural candidate for a machine function pass. - The algorithm is testable with a .mir test now. - This also fixes a problem where the lazy renaming as part of the MachineScheduler introduced IMPLICIT_DEF instructions after the number of a nodes in a region were counted leading to a mismatch. Differential Revision: http://reviews.llvm.org/D20507 llvm-svn: 271345	2016-05-31 22:38:06 +00:00
Kevin B. Smith	ed0b620a65	[X86]: Add a pattern that uses GR16_ABCD rather than GR32_ABCD to avoid falsely marking whole 32 bit register as live. Differential Revision: http://reviews.llvm.org/D20649 llvm-svn: 271341	2016-05-31 22:00:12 +00:00
Matthias Braun	ce0bcb78e6	ARM: Improve/fix comment in recently added test. llvm-svn: 271340	2016-05-31 21:59:59 +00:00
Matthias Braun	fe725c9241	ARM: Do not attempt to modify register class of physregs. Physregs have no associated register class, do not attempt to modify it in Thumb2InstrInfo::storeRegToStackSlot()/loadFromStackSlot(). llvm-svn: 271339	2016-05-31 21:39:12 +00:00
Guozhi Wei	b994f4cdbc	[SLP] Pass in correct alignment when query memory access cost This patch fixes bug https://llvm.org/bugs/show_bug.cgi?id=27897. When querying memory access cost, the current SLP always passes in an alignment value of 1 (unaligned), so it gets a very high cost of scalar memory access, and wrongly vectorizes memory loads in the test case. It can be fixed by simply giving the correct alignment. llvm-svn: 271333	2016-05-31 20:41:19 +00:00
Kevin Enderby	9acb109930	Change llvm-objdump, llvm-nm and llvm-size when reporting an object file error when the object is from a slice of a Mach-O Universal Binary use something like "foo.o (for architecture i386)" as part of the error message when expected. Also fixed places in these tools that were ignoring object file errors from MachOUniversalBinary::getAsObjectFile() when the code moved on to see if the slice was an archive. To do this MachOUniversalBinary::getAsObjectFile() and MachOUniversalBinary::getObjectForArch() were changed from returning ErrorOr<...> to Expected<...> then that was threaded up to its users. Converting these interfaces to Expected<> from ErrorOr<> does involve touching a number of places. To contain the changes for now the use of errorToErrorCode() is still used in two places yet to be fully converted. llvm-svn: 271332	2016-05-31 20:35:34 +00:00
George Burgess IV	a880146925	[CFLAA] Don't link GEP pointers to GEP indices. Code like the following is considered broken, and doesn't need to be supported by our AA magicks: void getFoo(int *P) { int *PAlias = (int *)((char *)NULL + (uintptr_t)P); } This patch makes CFLAA drop support for code like this. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D20775 llvm-svn: 271322	2016-05-31 19:55:05 +00:00
Ahmed Bougacha	96ef87e910	[CodeGen] Promote FMINNAN/FMAXNAN like other binops. We think it's OK to generate half fminnan because it's legal for the transform-to type (f32; r245196). However, PromoteFloatRes was missing the case; simply promote like the other binops, including minnum. llvm-svn: 271317	2016-05-31 18:50:25 +00:00
Reid Kleckner	fbdbe9e22b	[codeview] Improve readability of type record assembly Adds the method MCStreamer::EmitBinaryData, which is usually an alias for EmitBytes. In the MCAsmStreamer case, it is overridden to emit hex dump output like this: .byte 0x0e, 0x00, 0x08, 0x10 .byte 0x03, 0x00, 0x00, 0x00 .byte 0x00, 0x00, 0x00, 0x00 .byte 0x00, 0x10, 0x00, 0x00 Also, when verbose asm comments are enabled, this patch prints the dump output for each comment before its record, like this: # ArgList (0x1000) { # TypeLeafKind: LF_ARGLIST (0x1201) # NumArgs: 0 # Arguments [ # ] # } .byte 0x06, 0x00, 0x01, 0x12 .byte 0x00, 0x00, 0x00, 0x00 This should make debugging easier and testing more convenient. Reviewers: aaboud Subscribers: majnemer, zturner, amccarth, aaboud, llvm-commits Differential Revision: http://reviews.llvm.org/D20711 llvm-svn: 271313	2016-05-31 18:45:36 +00:00
Rafael Espindola	4d29099f7f	Delete AArch64II::MO_CONSTPOOL. A constant pool holding the address of a variable is equivalent to a got entry. It produces exactly the same instruction sequence as a got use and unlike a got use this is not uniqued by the linker. llvm-svn: 271311	2016-05-31 18:31:14 +00:00
Simon Dardis	6896d3ec5e	[mips] Remove tests which should have been deleted. The two xfail tests for mips32r6 & mips64r6 were supposed to be removed in r271301. llvm-svn: 271306	2016-05-31 17:52:29 +00:00
Simon Dardis	b60833c0ca	[mips] Enforce compact branch register restrictions Enforce compact branch register restrictions such as the use of the zero register, both operands being the same register. Emit clear error in such cases as the issue is subtle. For bovc and bnvc, silently fixup such cases when emitting objects directly, like LLVM started doing in rL269899. Reviewers: vkalintiris, dsanders Differential Review: http://reviews.llvm.org/D20475 llvm-svn: 271301	2016-05-31 17:34:42 +00:00
Chris Bieneman	6852775414	[obj2yaml][yaml2obj] Support for reading and dumping the MachO export trie The MachO export trie is a serially encoded trie keyed by symbol name. This code parses the trie and preserves the structure so that it can be dumped again. llvm-svn: 271300	2016-05-31 17:26:36 +00:00
Erik Eckstein	0c48dd8ca5	Fix a crash in MergeFunctions related to ordering of weak/strong functions The assumption, made in insert() that weak functions are always inserted after strong functions, is only true in the first round of adding functions. In subsequent rounds this is no longer guaranteed, because we might remove a strong function from the tree (because it's modified) and add it later, where an equivalent weak function already exists in the tree. This change removes the assert in insert() and explicitly enforces a weak->strong order. This also removes the need of two separate loops in runOnModule(). llvm-svn: 271299	2016-05-31 17:20:23 +00:00
Qin Zhao	1762eef572	[esan|cfrag] Create the skeleton of cfrag variable for the runtime Summary: Creates a global variable containing preliminary information for the cache-fragmentation tool runtime. Passes a pointer to the variable (null if no variable is created) to the compilation unit init and exit routines in the runtime. Reviewers: aizatsky, bruening Subscribers: filcab, kubabrecka, bruening, kcc, vitalybuka, eugenis, llvm-commits, zhaoqin Differential Revision: http://reviews.llvm.org/D20541 llvm-svn: 271298	2016-05-31 17:14:02 +00:00
Rafael Espindola	7ad97b2fe4	Add a use of shouldAssumeDSOLocal to ARM. Now this code path knows about position independent executables. llvm-svn: 271290	2016-05-31 15:31:55 +00:00
Ranjeet Singh	16c24f4d6e	[ARM] Add backend support for load/store intrinsics. Added support to map intrinsics __builtin_arm_{ldc,ldcl,ldc2,ldc2l,stc,stcl,stc2,stc2l} to their ARM instructions. Differential Revision: http://reviews.llvm.org/D20564 llvm-svn: 271271	2016-05-31 12:39:30 +00:00
Simon Pilgrim	e05dc45897	[X86][SSE] Add load-folding patterns for (V)CVTDQ2PD (PR27291) Added patterns for (V)CVTDQ2PD -> 2f64 loading from a 64-bit source. llvm-svn: 271269	2016-05-31 12:04:35 +00:00
Simon Dardis	03676dc969	[mips] bnec/beqc register constraint fix beqc and bnec cannot have $rs == $rt. Inhibit compact branch creation if that would occur. Reviewers: vkalintiris, dsanders Differential Revision: http://reviews.llvm.org/D20624 llvm-svn: 271260	2016-05-31 09:54:55 +00:00
Igor Breger	73ee8ba9b0	[AVX512] Fix intrinsic vcvtps2ph lowering. Differential Revision: http://reviews.llvm.org/D20788 llvm-svn: 271255	2016-05-31 08:04:21 +00:00
Igor Breger	52bd1d5fcc	Fix intrinsic vbroadcast{i32|f32}x2 lowering. Differential Revision: http://reviews.llvm.org/D20780 llvm-svn: 271254	2016-05-31 07:43:39 +00:00
Craig Topper	50f85c22c5	[AVX512] Remove masked store intrinsics. Clang now emits generic masked store intrinsics instead. The intrinsics will be autoupgraded to the same generic masked stores. llvm-svn: 271245	2016-05-31 01:50:02 +00:00
Saleem Abdulrasool	d2f705ddf9	X86: permit using SjLj EH on x86 targets as an option This adds support to the backend to actually support SjLj EH as an exception model. This is NOT the default model, and requires explicitly opting into it from the frontend. GCC supports this model and for MinGW can still be enabled via the `--using-sjlj-exceptions` option. Addresses PR27749! llvm-svn: 271244	2016-05-31 01:48:07 +00:00
Craig Topper	8287fd8abd	[X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions. llvm-svn: 271236	2016-05-30 23:15:56 +00:00
Craig Topper	39716f8358	[X86] Use update_llc_test_checks.py to re-generate a test in preparation for an upcoming commit. NFC llvm-svn: 271234	2016-05-30 22:54:14 +00:00
Rafael Espindola	4f1062adb8	Fix a crash when producing COFF. llvm-svn: 271229	2016-05-30 20:18:53 +00:00
Simon Pilgrim	d788c9d83d	[X86][XOP] Split off auto-upgraded xop intrinsics llvm-svn: 271228	2016-05-30 19:50:56 +00:00
Simon Pilgrim	582d75b0eb	[X86][SSE] Renamed pmovxrm tests These aren't intrinsics anymore - as discussed on D20686 llvm-svn: 271226	2016-05-30 19:14:37 +00:00
Simon Pilgrim	24da61058a	[X86][AVX2] Regenerated AVX2 extension tests llvm-svn: 271224	2016-05-30 18:49:57 +00:00
Simon Pilgrim	d64af65f6d	[X86][SSE] Updated storeu fast-isel tests to match clang builtin tests Since rL271214 the headers have no longer used the storeu intrinsic llvm-svn: 271222	2016-05-30 18:42:51 +00:00
Simon Pilgrim	4ed0e07b23	[X86][SSE2] Updated _mm_store_pd1/_mm_store1_pd fast-isel tests to match D20617 llvm-svn: 271220	2016-05-30 18:18:44 +00:00
Diana Picus	f353a5e06d	[BPF] Remove exit-on-error from tests (PR27768, PR27769) The exit-on-error flag is necessary to avoid some assertions/unreachables. We can get past them by creating a few dummy nodes. Fixes PR27768, PR27769. Differential Revision: http://reviews.llvm.org/D20726 llvm-svn: 271200	2016-05-30 08:28:34 +00:00
Craig Topper	f565d37607	[X86] Remove some unnecessary declarations for old intrinsics from a test. llvm-svn: 271175	2016-05-29 06:37:39 +00:00
Sanjoy Das	ae09b3cd4c	[IndVars] Eliminate op.with.overflow when possible (re-apply) Summary: If we can prove that an op.with.overflow intrinsic does not overflow, we can get rid of the intrinsic, and replace it with non-wrapping arithmetic. This was first checked in at r265913 but reverted in r265950 because it exposed some issues around how SCEV handled post-inc add recurrences. Those issues have now been fixed. Reviewers: atrick, regehr Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18685 llvm-svn: 271153	2016-05-29 00:36:25 +00:00
Sanjoy Das	f49ca52b9d	[SCEV] See through op.with.overflow intrinsics (re-apply) Summary: This change teaches SCEV to reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). This was first checked in at r265912 but reverted in r265950 because it exposed some issues around how SCEV handled post-inc add recurrences. Those issues have now been fixed. Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 llvm-svn: 271152	2016-05-29 00:34:42 +00:00
Sanjoy Das	7e4a64167d	[SCEV] Don't always add no-wrap flags to post-inc add recs Fixes PR27315. The post-inc version of an add recurrence needs to "follow the same rules" as a normal add or subtract expression. Otherwise we miscompile programs like ``` int main() { int a = 0; unsigned a_u = 0; volatile long last_value; do { a_u += 3; last_value = (long) ((int) a_u); if (will_add_overflow(a, 3)) { // Leave, and don't actually do the increment, so no UB. printf("last_value = %ld\n", last_value); exit(0); } a += 3; } while (a != 46); return 0; } ``` This patch changes SCEV to put no-wrap flags on post-inc add recurrences only when the poison from a potential overflow will go ahead to cause undefined behavior. To avoid regressing performance too much, I've assumed infinite loops without side effects is undefined behavior to prove poison<->UB equivalence in more cases. This isn't ideal, but is not new to LLVM as a whole, and far better than the situation I'm trying to fix. llvm-svn: 271151	2016-05-29 00:32:17 +00:00
Sanjoy Das	70c2bbd29c	[ValueTracking] ICmp instructions propagate poison This is a stripped down version of D19211, leaving out the questionable "branching in poison is UB" bit. llvm-svn: 271150	2016-05-29 00:31:18 +00:00
David Majnemer	8d106d5ea4	Update test to deal with non-zero exit codes llvm-svn: 271135	2016-05-28 19:02:12 +00:00
Simon Pilgrim	9602d678cb	[X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) This patch removes the llvm VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX some time ago, so much of that implementation can be reused. Reapplied now that the companion patch (D20684), which removes/auto-upgrades the clang intrinsics, has been committed. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 271131	2016-05-28 18:03:41 +00:00
Mehdi Amini	bcc47419d9	ValueMapper: fix assertion when null-mapping a constant for linking metadata Summary: When RF_NullMapMissingGlobalValues is set, mapValue can return null for GlobalValue. When mapping the operands of a constant that is referenced from metadata, we need to handle this case and actually return null instead of mapping this constant. Reviewers: dexonsmith, rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20713 llvm-svn: 271129	2016-05-28 17:26:03 +00:00
Sanjay Patel	395eca8d26	[InstCombine] add tests to show bitcast interference llvm-svn: 271125	2016-05-28 16:10:37 +00:00
Rafael Espindola	52bd330500	Fix production of R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX. We were producing R_X86_64_GOTPCRELX for invalid instructions and sometimes producing R_X86_64_GOTPCRELX instead of R_X86_64_REX_GOTPCRELX. llvm-svn: 271118	2016-05-28 15:51:38 +00:00
Sanjay Patel	f49c7b1570	regenerate checks llvm-svn: 271117	2016-05-28 15:44:28 +00:00
Sanjay Patel	cbc4aa6bdd	join RUN lines; NFC llvm-svn: 271115	2016-05-28 15:34:05 +00:00
Sanjay Patel	97c2c108fd	[x86] avoid printing unnecessary sign bits of hex immediates in asm comments (PR20347) It would be better to check the valid/expected size of the immediate operand, but this is generally better than what we print right now. Differential Revision: http://reviews.llvm.org/D20385 llvm-svn: 271114	2016-05-28 14:58:37 +00:00
Ahmed Bougacha	a3dc1ba142	[X86] Try to zero elts when lowering 256-bit shuffle with PSHUFB. Otherwise we fallback to a blend of PSHUFBs later on. Differential Revision: http://reviews.llvm.org/D19661 llvm-svn: 271113	2016-05-28 14:38:04 +00:00
Rafael Espindola	fe796dca90	Fix default reloc model on ARM. llvm-svn: 271111	2016-05-28 10:41:15 +00:00
Petr Hosek	67a94a795d	[MC] Support symbolic expressions in assembly directives This matches the behavior of GNU assembler which supports symbolic expressions in absolute expressions used in assembly directives. Differential Revision: http://reviews.llvm.org/D20752 llvm-svn: 271102	2016-05-28 05:57:48 +00:00
Renato Golin	9be88629d5	Revert "Revert "Map DynamicNoPIC to Static on non-darwin."" This reverts commit r271096, as reverting it broke even more buildbots! But that also means I'll break on ARM again... :( llvm-svn: 271099	2016-05-28 04:47:13 +00:00
Renato Golin	4f22c51b09	Revert "Map DynamicNoPIC to Static on non-darwin." This reverts commit r271052, as it broke some ARM buildbots. llvm-svn: 271096	2016-05-28 04:24:26 +00:00
Sean Silva	02b9d892c5	Bring back r271090 in a way that doesn't depend on r271089. llvm-svn: 271092	2016-05-28 04:05:36 +00:00
Sean Silva	9dd4b5c51d	Revert r271089 and r271090. It was triggering an msan bot. Revert "[IRPGO] Set the function entry count metadata." This reverts commit r271090. Revert "[IRPGO] Centralize the function attribute inliner hint logic. NFC." This reverts commit r271089. llvm-svn: 271091	2016-05-28 03:56:25 +00:00
Sean Silva	7884633c5b	[IRPGO] Set the function entry count metadata. llvm-svn: 271090	2016-05-28 03:02:54 +00:00
Matt Arsenault	1ff389a7bf	AMDGPU: Cleanup vector insert/extract tests This mostly makes sure that 3-vector dynamic inserts and extracts are covered. llvm-svn: 271082	2016-05-28 00:51:06 +00:00
Matt Arsenault	7401516985	AMDGPU: Add fract intrinsic Remove broken patterns matching it. This was matching the unsafe math pattern and expanding the fix for the buggy instruction from the pattern. The problems are also on CI. Remove the workarounds and only use fract with unsafe math or from the intrinsic. llvm-svn: 271078	2016-05-28 00:19:52 +00:00
Xinliang David Li	d24c383ec0	Fix windows build bot failure llvm-svn: 271075	2016-05-28 00:03:35 +00:00
Xinliang David Li	d38392ecd6	[PM] Port the Sample FDO to new PM (part-2) llvm-svn: 271072	2016-05-27 23:20:16 +00:00
Evgeny Stupachenko	ea2aef4a1d	The patch refactors unroll pass. Summary: Unroll factor (Count) calculations moved to a new function. Early exits on pragma and "-unroll-count" defined factor added. New type of unrolling "Force" introduced (previously used implicitly). New unroll preference "AllowRemainder" introduced and set "true" by default (should be set to false for architectures that suffer from it). Reviewers: hfinkel, mzolotukhin, zzheng Differential Revision: http://reviews.llvm.org/D19553 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 271071	2016-05-27 23:15:06 +00:00
Andrew Kaylor	04f8e06696	Update the stack coloring pass to remove lifetime intrinsics in the optnone/opt-bisect skip case. Differential Revision: http://reviews.llvm.org/D20453 llvm-svn: 271068	2016-05-27 22:56:49 +00:00
Rafael Espindola	f9bda6805b	Map DynamicNoPIC to Static on non-darwin. DynamicNoPIC was only ever used on darwin. This maps it to static on ELF. It matches what is done on X86. llvm-svn: 271052	2016-05-27 21:44:18 +00:00
Xinliang David Li	2bd4f8b66b	FileCheck: dump command line context with empty input Differential Revision: http://reviews.llvm.org/D20716 llvm-svn: 271047	2016-05-27 21:23:25 +00:00
Petr Hosek	97859ccd51	Revert "[MC] Support symbolic expressions in assembly directives" This reverts commit r271028, it causes the directive_fill.s to fail. llvm-svn: 271038	2016-05-27 19:58:05 +00:00
Sanjoy Das	6fff9dc932	[GVN] Preserve !range metadata when PRE'ing loads Reviewers: dberlin, reames, george.burgess.iv Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20743 llvm-svn: 271034	2016-05-27 19:03:10 +00:00
Michael Kuperstein	a75c77b127	[X86] Detect SAD patterns and emit psadbw instructions. This recommits r267649 with a fix for PR27539. Differential Revision: http://reviews.llvm.org/D20598 llvm-svn: 271033	2016-05-27 18:53:22 +00:00
Petr Hosek	ec73d8b383	[MC] Support symbolic expressions in assembly directives This matches the behavior of GNU assembler which supports symbolic expressions in absolute expressions used in assembly directives. Differential Revision: http://reviews.llvm.org/D20656 llvm-svn: 271028	2016-05-27 18:49:44 +00:00
Tim Northover	32b4d15e0a	Move test to X86 directory: I think it depends on X86 TTI. llvm-svn: 271019	2016-05-27 16:56:54 +00:00
Tim Northover	10a1e8b1fe	Vectorizer: track non-fast FP instructions through phis when finding reductions. When we traced through a phi node looking for floating-point reductions, we forgot whether we'd ever seen an instruction without fast-math flags (that would block vectorization). This propagates it through to the end. llvm-svn: 271015	2016-05-27 16:40:27 +00:00
Dehao Chen	80b16d4135	Remove sample profile dependency to instcombine, which is not an analysis pass. Summary: This patch removes the dependency of the sample profile pass on the instcombine pass. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20501 llvm-svn: 271009	2016-05-27 16:14:15 +00:00
Simon Pilgrim	7e67a22298	[X86][AVX] Removed some remains of old (pre-regeneration) filechecks llvm-svn: 271007	2016-05-27 15:56:19 +00:00
Than McIntosh	4daf7f13b6	Disable lifetime-start-on-first-use analysis. Summary: Turn off lifetime-start-on-first-use enhancement for the moment pending a fix for bug 27903. Bug: 27903 Reviewers: tejohnson, wmi, qcolombet, gbiv Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20731 llvm-svn: 271003	2016-05-27 15:27:51 +00:00
Simon Dardis	4ccda502d5	[mips] Weaken asm predicate for memory offsets The isMemWithSimmOffset predicate rejects relocations which is incorrect behaviour. Linkers and other tools should handle|warn|error when the field overflows. Reviewers: dsanders, vkalintiris Differential Revision: http://reviews.llvm.org/D20727 llvm-svn: 270995	2016-05-27 13:56:36 +00:00
Igor Laevsky	df9db45c94	[RewriteStatepointsForGC] All constants should have null base pointer Currently we consider each constant to have itself as a base value, i.e. "base(const) = const". This introduces a couple of problems when we are trying to avoid reporting constants in statepoint live sets: 1. When querying "base( phi(const1, const2) )" we will get "phi(const1, const2)" as a base pointer. Since it's not a constant we will record it in a stack map. However, in practice we don't want this to happen (constants are never relocated). 2. base( phi(const, gc ptr) ) = phi( const, base(gc ptr) ). This particular case poses a challenge for our runtime - we don't expect to see constant base pointers other than null. These problems can be avoided by treating all constants as if they were derived from a null pointer base. I.e. in the first case we will not include the constant pointer in a stack map at all. In the second case we will get "phi(null, base(gc ptr))" as a base pointer, which is a lot more convenient. Differential Revision: http://reviews.llvm.org/D20584 llvm-svn: 270993	2016-05-27 13:13:59 +00:00
George Rimar	7951fbf1a8	Attempt to fix build bot after r270987 It was: "Recommit 270977 - [llvm-mc] - Teach llvm-mc to generate zlib styled compression sections." Fix: since the test requires that zlib is not available, and r270987 changed the llvm-mc compression flag so that the compression style must be specified, just add the 2 available styles to this test. llvm-svn: 270992	2016-05-27 12:52:30 +00:00
Artem Tamazov	7da9b82e02	[AMDGPU][llvm-mc] Square-braced-syntax for registers - make ":expr2" optional. Register numbers may be specified as assembly-time expressions. This feature can be useful in macros and the like. However, expressions are supported within square braces only. Square braces were initially intended to support specifying multiple (pairs/quads...) registers. Syntax like v[8:8], which specifies a single register, is also supported. That allows expressions but looks a bit unnatural. This change supports syntax REG[EXPR]. Tests added. Differential Revision: http://reviews.llvm.org/D20588 llvm-svn: 270990	2016-05-27 12:50:13 +00:00
George Rimar	c91e38c5eb	Recommit 270977 - [llvm-mc] - Teach llvm-mc to generate zlib styled compression sections. Fix: updated clang code which was not updated by mistake. Original commit message: [llvm-mc] - Teach llvm-mc to generate zlib styled compression sections. This patch is strongly based on the previously reverted D20331 (reverted because gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style). The difference is that this patch supports both zlib and zlib-gnu styles. The -compress-debug-sections option now supports the following values: -compress-debug-sections=zlib-gnu -compress-debug-sections=zlib -compress-debug-sections=none Previously, specifying -compress-debug-sections enabled zlib-gnu compression, so anyone can pass "-compress-debug-sections=zlib-gnu" to restore the behavior from before this patch for the case when compression was enabled. Differential revision: http://reviews.llvm.org/D20676 llvm-svn: 270987	2016-05-27 12:27:32 +00:00
George Rimar	e79fc3efca	Revert r270977 ([llvm-mc] - Teach llvm-mc to generate zlib styled compression sections.) It broke buildbot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/13585/steps/build/logs/stdio Initial commit message: [llvm-mc] - Teach llvm-mc to generate zlib styled compression sections. This patch is strongly based on the previously reverted D20331 (reverted because gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style). The difference is that this patch supports both zlib and zlib-gnu styles. The -compress-debug-sections option now supports the following values: -compress-debug-sections=zlib-gnu -compress-debug-sections=zlib -compress-debug-sections=none Previously, specifying -compress-debug-sections enabled zlib-gnu compression, so anyone can pass "-compress-debug-sections=zlib-gnu" to restore the behavior from before this patch for the case when compression was enabled. Differential revision: http://reviews.llvm.org/D20676 llvm-svn: 270978	2016-05-27 10:06:16 +00:00
George Rimar	48dcd2b806	[llvm-mc] - Teach llvm-mc to generate zlib styled compression sections. This patch is strongly based on the previously reverted D20331 (reverted because gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style). The difference is that this patch supports both zlib and zlib-gnu styles. The -compress-debug-sections option now supports the following values: -compress-debug-sections=zlib-gnu -compress-debug-sections=zlib -compress-debug-sections=none Previously, specifying -compress-debug-sections enabled zlib-gnu compression, so anyone can pass "-compress-debug-sections=zlib-gnu" to restore the behavior from before this patch for the case when compression was enabled. Differential revision: http://reviews.llvm.org/D20676 llvm-svn: 270977	2016-05-27 09:58:08 +00:00
Simon Pilgrim	4642a57fbf	Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) llvm-svn: 270976	2016-05-27 09:02:25 +00:00
Simon Pilgrim	c013e5737b	[X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) This patch removes the llvm VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX some time ago, so much of that implementation can be reused. A companion patch (D20684) removes/auto-upgrades the clang intrinsics. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 270973	2016-05-27 08:49:15 +00:00
Peter Collingbourne	1eaa97f439	Linker: teach the IR mover to return llvm::Error. This will be needed in order to consistently return an Error to clients of the API being developed in D20268. Differential Revision: http://reviews.llvm.org/D20550 llvm-svn: 270967	2016-05-27 05:21:35 +00:00
Pete Cooper	1929b5539a	Form objc_storeStrong in the presence of bitcasts. objc_storeStrong can be formed from a sequence such as %0 = tail call i8* @objc_retain(i8* %p) nounwind %tmp = load i8*, i8** @x, align 8 store i8* %0, i8** @x, align 8 tail call void @objc_release(i8* %tmp) nounwind The code was already looking through bitcasts for most of the values involved, but had missed one case where the pointer operand for the store was a bitcast. Ultimately the pointer for the load and store have to be the same value, after stripping casts. llvm-svn: 270955	2016-05-27 02:13:53 +00:00
Michael Zolotukhin	15e745133e	[LoopUnrollAnalyzer] Bail out instead of dying with assert when facing huge index. This fixes PR27902. llvm-svn: 270946	2016-05-27 00:55:16 +00:00
Rui Ueyama	6816367a27	pdbdump: print out the name of the stream 0. Differential Revision: http://reviews.llvm.org/D20712 llvm-svn: 270943	2016-05-27 00:32:07 +00:00
Rui Ueyama	9dc034dba1	pdbdump: Add -raw-all to enable all -raw-* flags. Differential Revision: http://reviews.llvm.org/D20707 llvm-svn: 270937	2016-05-26 23:26:55 +00:00
Mitch Bodart	05aeeb5cf1	[CodeGen] Fix problem with X86 byte registers in CriticalAntiDepBreaker CriticalAntiDepBreaker was not correctly tracking defs of the high X86 byte registers, leading to incorrect use of a busy register to break an antidependence. Fixes pr27681, and its duplicates pr27580, pr27804. Differential Revision: http://reviews.llvm.org/D20456 llvm-svn: 270935	2016-05-26 23:08:52 +00:00
Easwaran Raman	5fe04a1d8e	Attach profile summary in IR based instrumentation pass. Differential revision: http://reviews.llvm.org/D20655 llvm-svn: 270933	2016-05-26 22:57:11 +00:00
Michael Zolotukhin	1ecdedad8d	[LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost. Condition might be simplified to a Constant, but it doesn't have to be ConstantInt, so we should dyn_cast, instead of cast. This fixes PR27886. llvm-svn: 270924	2016-05-26 21:42:51 +00:00
Chris Bieneman	44474c48ac	[obj2yaml][yaml2obj] Support for MachO lazy bindings This adds support for YAML round tripping dyld info lazy bindings. The storage and format of these is the same as regular bind opcodes, they are just interpreted differently by dyld, and can have DONE opcodes in the middle of the opcode lists. llvm-svn: 270920	2016-05-26 21:29:39 +00:00
Chris Bieneman	659b35a5d8	[obj2yaml][yaml2obj] Support for MachO weak bindings This adds support for YAML round tripping dyld info weak bindings. The storage and format of these is the same as regular bind opcodes, they are just interpreted differently by dyld. llvm-svn: 270911	2016-05-26 20:50:05 +00:00
Rafael Espindola	732eeaf2a9	coff: fix weak alias to local. We were creating a weak external that tried to reference a static symbol. That would always fail to link with link.exe. We now create an external symbol in the same position as the local and refer to that. This works with link.exe and matches what gas does. llvm-svn: 270906	2016-05-26 20:31:00 +00:00
Chris Bieneman	524243d61e	[obj2yaml][yaml2obj] Support for MachO bind opcodes This adds support for YAML round tripping dyld info bind opcodes. Bind opcodes can have signed or unsigned LEB128 data, and they can have symbols associated with them. llvm-svn: 270901	2016-05-26 20:06:14 +00:00
Krzysztof Parzyszek	da0b9a959e	[Hexagon] Enable the post-RA scheduler The aggressive anti-dependency breaker can rename the restored callee-saved registers. To prevent this, mark these registers as live on all paths to the return/tail-call instructions, and add implicit use operands for them to these instructions. llvm-svn: 270898	2016-05-26 19:44:28 +00:00
Chad Rosier	14aa2ad1f4	[AArch64] Generate rev16/rev32 from bswap + srl when upper bits are known zero. Canonicalize (srl (bswap i32 x), 16) to (rotr (bswap i32 x), 16), if the high 16-bits of x are zero. Similarly, canonicalize (srl (bswap i64 x), 32) to (rotr (bswap i64 x), 32), if the high 32-bits of x are zero. test_rev_w_srl16: test_rev_w_srl16: and w8, w0, #0xffff and w8, w0, #0xffff rev w8, w8 ---> rev16 w0, w8 lsr w0, w8, #16 test_rev_x_srl32: test_rev_x_srl32: rev x8, x8 ---> rev32 x0, x8 lsr x0, x8, #32 llvm-svn: 270896	2016-05-26 19:41:33 +00:00
Changpeng Fang	71369b3a39	AMDGPU/SI: Enable load-store-opt by default. Summary: Enable load-store-opt by default, and update LIT tests. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D20694 llvm-svn: 270894	2016-05-26 19:35:29 +00:00
Michael Kuperstein	ae21491819	[BasicAA] Extend inbound GEP negative offset logic to GlobalVariables r270777 improved the precision of alloca vs. inbounds GEP alias queries: if we have (a) an inbounds GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points to would have a negative offset with respect to the alloca, then the GEP can not alias pointer (b). This makes the same logic fire when (b) is based on a GlobalVariable instead of an alloca. Differential Revision: http://reviews.llvm.org/D20652 llvm-svn: 270893	2016-05-26 19:30:49 +00:00
David Majnemer	d99068d26d	[MemCpyOpt] Don't perform callslot optimization across may-throw calls An exception could prevent a store from occurring but MemCpyOpt's callslot optimization would fire anyway, causing the store to occur. This fixes PR27849. llvm-svn: 270892	2016-05-26 19:24:24 +00:00
Rafael Espindola	30c080a085	coff: fix the section of weak symbols. llvm-svn: 270889	2016-05-26 18:48:23 +00:00
Michael Kuperstein	9a81b62a01	[BBVectorize] Don't vectorize selects with a scalar condition and vector operands. This fixes PR27879. Differential Revision: http://reviews.llvm.org/D20659 llvm-svn: 270888	2016-05-26 18:43:57 +00:00
Krzysztof Parzyszek	729e7ad31f	Add test/CodeGen/MIR/Hexagon/lit.local.cfg Require that Hexagon is a registered target. llvm-svn: 270887	2016-05-26 18:35:45 +00:00
Krzysztof Parzyszek	143f684a79	Do not rename registers that do not start an independent live range llvm-svn: 270885	2016-05-26 18:22:53 +00:00
Rafael Espindola	6ddf5f4437	coff: fix the value of weak definitions. It looks like this doesn't get a lot of use. llvm-svn: 270883	2016-05-26 18:04:53 +00:00
David Majnemer	7f32420ed5	[CaptureTracking] Volatile operations capture their memory location The memory location that corresponds to a volatile operation is very special. It is observed by the machine in ways which we cannot reason about. Differential Revision: http://reviews.llvm.org/D20555 llvm-svn: 270879	2016-05-26 17:36:22 +00:00
Artem Belevich	49e9a81236	[NVPTX] Added NVVMIntrRange pass NVVMIntrRange adds !range metadata to calls of NVVM intrinsics that return values within known limited range. This allows LLVM to generate optimal code for indexing arrays based on tid/ctaid which is a frequently used pattern in CUDA code. Differential Revision: http://reviews.llvm.org/D20644 llvm-svn: 270872	2016-05-26 17:02:56 +00:00
Artem Tamazov	6edc135d0f	[AMDGPU][llvm-mc] s_getreg/setreg* - hwreg - factor out strings/literals etc. Hwreg(...) syntax implementation unified with sendmsg(...). Common strings moved to Utils. MathExtras.h functionality utilized. Added missing build dependency in Disassembler. Differential Revision: http://reviews.llvm.org/D20381 llvm-svn: 270871	2016-05-26 17:00:33 +00:00
Simon Pilgrim	cf340bd9c1	[X86][SSE] When lowering a 256-bit shuffle as PMOVZX, reduce the input vector to the lower 128-bit subvector. More often than not this is what it started out as; the extraction is zero-cost on AVX, and the PMOVZX/PMOVSX folding logic is based around 128-bit loads. llvm-svn: 270858	2016-05-26 15:40:36 +00:00
Diana Picus	81bc3170e8	[AMDGPU] Remove exit-on-error flag from test (PR27762) Similar to r269948, but for argument lowering. Fixes PR27762 Differential Revision: http://reviews.llvm.org/D20430 llvm-svn: 270856	2016-05-26 15:24:55 +00:00
Diana Picus	20a8d8e97e	[BPF] Remove exit-on-error flag in test (PR27767) The exit-on-error flag is needed to avoid an assert where llvm::SelectionDAGISel::LowerArguments doesn't create enough arguments. Fill up with zeroes to reach the right number of args. Fixes PR27767. Differential Revision: http://reviews.llvm.org/D20571 llvm-svn: 270855	2016-05-26 15:23:50 +00:00
Chad Rosier	e5819e2732	[InstCombine] Catch more bswap cases missed due to zext and truncs. Fixes PR27824. Differential Revision: http://reviews.llvm.org/D20591. llvm-svn: 270853	2016-05-26 14:58:51 +00:00
Simon Pilgrim	50c37ceb3b	[X86][SSE] Added load_zext_16i8_to_8i32 test Odd issue with input vector not being folded into pmovzx on AVX2+ targets llvm-svn: 270852	2016-05-26 14:45:30 +00:00
Teresa Johnson	28c03b56ec	[ThinLTO] Resolve LinkOnceAny Summary: Ensure we keep prevailing copy of LinkOnceAny by converting it to WeakAny. Rename odr_resolution test to the now more appropriate weak_resolution (weak in the linker sense includes linkonce). Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D20634 llvm-svn: 270850	2016-05-26 14:16:52 +00:00
Chad Rosier	816a67da49	[AArch64] Generate a BFI/BFXIL from 'or (and X, MaskImm), OrImm'. If and only if the value being inserted sets only known zero bits. This combine transforms things like and w8, w0, #0xfffffff0 movz w9, #5 orr w0, w8, w9 into movz w8, #5 bfxil w0, w8, #0, #4 The combine is tuned to make sure we always reduce the number of instructions. We avoid churning code for what is expected to be performance neutral changes (e.g., converted AND+OR to OR+BFI). Differential Revision: http://reviews.llvm.org/D20387 llvm-svn: 270846	2016-05-26 13:27:56 +00:00
Rafael Espindola	a224de06bc	Use shouldAssumeDSOLocal on AArch64. This reduces code duplication and now AArch64 also handles PIE. llvm-svn: 270844	2016-05-26 12:42:55 +00:00
Igor Breger	8437bb70fd	[AVX512] Fix intrinsic cmp{sd|ss} lowering. Differential Revision: http://reviews.llvm.org/D20615 llvm-svn: 270843	2016-05-26 12:42:25 +00:00
Simon Pilgrim	ab3809193c	[X86][F16C] Added F16C fast-isel tests to match clang/test/CodeGen/f16c-builtins.c llvm-svn: 270837	2016-05-26 10:26:56 +00:00
Simon Pilgrim	0e4fdc0842	[X86][AVX2] Added gather fast-isel tests to match clang/test/CodeGen/avx2-builtins.c llvm-svn: 270835	2016-05-26 10:07:05 +00:00
David Majnemer	474512576e	[MergedLoadStoreMotion] Don't transform across may-throw calls It is unsafe to hoist a load before a function call which may throw, the throw might prevent a pointer dereference. Likewise, it is unsafe to sink a store after a call which may throw. The caller might be able to observe the difference. This fixes PR27858. llvm-svn: 270828	2016-05-26 07:11:09 +00:00
Adam Nemet	c68534bd13	[ConstantFold] Fix incorrect index rewrites for GEPs Summary: If an index for a vector or array type is out-of-range, GEP constant folding tries to factor it into preceding dimensions. The code however does not consider addressing of structure field padding, which should not qualify as an out-of-range index. As demonstrated by the testcase, this can occur if the indexing is performed on a vector type and the preceding index is an array type. SROA generates GEPs, for example, involving padding bytes as it slices an alloca. My fix disables this folding if the element type is a vector type. I believe that this is the only way we can end up with padding. (We have no access to DataLayout so I am not sure if there is a robust way of actually checking the presence of padding.) Reviewers: majnemer Subscribers: llvm-commits, Gerolf Differential Revision: http://reviews.llvm.org/D20663 llvm-svn: 270826	2016-05-26 07:08:05 +00:00
Peter Collingbourne	b9aa1f4a03	MemorySSA: Revert r269678 and r268068; replace with special casing in MemorySSA. It turns out that too many passes are relying on alias analysis results for control dependencies. Until we fix that by introducing a more accurate modelling of control dependencies, special case assume in MemorySSA instead. Also introduce tests to ensure we don't regress the FunctionAttrs or LICM passes. Differential Revision: http://reviews.llvm.org/D20658 llvm-svn: 270823	2016-05-26 04:58:46 +00:00
Teresa Johnson	683abe79b2	[ThinLTO/gold] Handle bitcode archives Summary: Several changes were required for ThinLTO links involving bitcode archive static libraries. With this patch clang/llvm bootstraps with ThinLTO and gold. The first is that the gold callbacks get_input_file and release_input_file can normally be used to get file information for each constituent bitcode file within an archive. However, these interfaces lock the underlying file and can't be used for each archive constituent for ThinLTO backends where we get all the input files up front and don't release any until after the backend threads complete. However, it is sufficient to only get and release once per file, and then each constituent bitcode file can be accessed via get_view. This required saving some information to identify which file handle is the "leader" for each claimed file sharing the same file descriptor, and other information so that get_input_file isn't necessary later when processing the backends. Second, the module paths in the index need to distinguish between different constituent bitcode files within the same archive file, otherwise they will all end up with the same archive file path. Do this by appending the offset within the archive for the start of the bitcode file, returned by get_input_file when we claim each bitcode file, and saving that along with the file handle. Third, rather than have the function importer try to load a file based on the module path identifier (which now contains a suffix to distinguish different bitcode files within an archive), use a custom module loader. This is the same approach taken in libLTO, and I am using the support refactored into the new LTO.h header in r270509. The module loader parses the bitcode files out of the memory buffers returned from gold via the get_view callback and saved in a map.
This also means that we call the function importer directly, rather than add it to the pass pipeline (which was in the plan to do already for other reasons). Reviewers: pcc, joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D20559 llvm-svn: 270814	2016-05-26 01:46:41 +00:00
Saleem Abdulrasool	fbf920f9b4	llvm-objdump: support dumping AUX records for weak externals This is a supported COFF feature. Ensure that we can display the weak externals auxiliary symbol. It contains useful information (such as the default binding and how to resolve the symbol). This reapplies the previous patch with a modification which hopefully should fix the endianness issues. The variadic call would promote the ulittle32_t to a uint32_t which would lose the byte-swapping behaviour desired. llvm-svn: 270813	2016-05-26 01:45:12 +00:00
David Blaikie	2274808153	PR11740: Disable assembly debug info when assembly already contains line directives If there is already debug info in the assembly file and the user wants to use the -g option when compiling, we think we should not directly report an error. The GNU assembler simply reuses the debug info in the assembly file and turns off the DEBUG_TYPE option so that no new debug info is emitted by the assembler. This fix does the same as the GNU assembler. The concern is the situation where there are two .text sections in the assembly file, one with debug info and the other without. With this fix, the assembler will no longer generate any debug info for the second .text section. This is exactly what the GNU assembler does in this situation, so I think this still makes some sense. Patch by Zhizhou Yang! Differential Revision: http://reviews.llvm.org/D20002 llvm-svn: 270806	2016-05-26 00:22:26 +00:00
Sanjoy Das	a099268e85	[IRCE] Optimize conjunctions of range checks After this change, we do the expected thing for cases like ``` Check0Passed = /* range check IRCE can optimize */ Check1Passed = /* range check IRCE can optimize */ if (!(Check0Passed && Check1Passed)) throw_Exception(); ``` llvm-svn: 270804	2016-05-26 00:09:02 +00:00
Davide Italiano	1021c68e92	[PM] Port PartiallyInlineLibCalls to the new pass manager. llvm-svn: 270798	2016-05-25 23:38:53 +00:00
Reid Kleckner	63d3d6df7d	Revert "[MC] Support symbolic expressions in assembly directives" This reverts commit r270786, it causes the directive_fill.s to fail. llvm-svn: 270795	2016-05-25 23:29:08 +00:00
Reid Kleckner	5d122f872d	[codeview] Use comdats for debug info describing comdat functions Summary: This allows the linker to discard unused symbol information for comdat functions that were discarded during the link. Before this change, searching for the name of an inline function in the debugger would return multiple results, one per symbol subsection in the object file. After this change, there is only one result, the result for the function chosen by the linker. Reviewers: zturner, majnemer Subscribers: aaboud, amccarth, llvm-commits Differential Revision: http://reviews.llvm.org/D20642 llvm-svn: 270792	2016-05-25 23:16:12 +00:00
Manman Ren	b5d7ff4fa3	Objective-C Class Properties: Autoupgrade "Class Properties" module flag. When we have "Image Info Version" module flag but don't have "Class Properties" module flag, set "Class Properties" module flag to 0, so we can correctly emit errors when one module has the flag set and another module does not. rdar://26469641 llvm-svn: 270791	2016-05-25 23:14:48 +00:00
Petr Hosek	e25837528b	[MC] Support symbolic expressions in assembly directives This matches the behavior of GNU assembler which supports symbolic expressions in absolute expressions used in assembly directives. Differential Revision: http://reviews.llvm.org/D20337 llvm-svn: 270786	2016-05-25 22:47:51 +00:00
Michael Kuperstein	82069c44ca	[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries If we have (a) a GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points to would have a negative offset with respect to the alloca, then the GEP cannot alias pointer (b). For example, consider code like: struct { int f0, int f1, ...} foo; ... foo alloca; foo random = bar(alloca); int f0 = &alloca.f0 int f1 = &random->f1; Which is lowered, approximately, to: %alloca = alloca %struct.foo %random = call %struct.foo @random(%struct.foo* %alloca) %f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0 %f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1 Assume %f1 and %f0 alias. Then %f1 would point into the object allocated by %alloca. Since the %f1 GEP is inbounds, that means %random must also point into the same object. But since %f0 points to the beginning of %alloca, the highest %f1 can be is (%alloca + 3). This means %random cannot be higher than (%alloca - 1), and so is not inbounds, a contradiction. Differential Revision: http://reviews.llvm.org/D20495 llvm-svn: 270777	2016-05-25 22:23:08 +00:00
Adrian Prantl	6ee02c7fce	PR26055: Speed up LiveDebugValues by replacing lists with bitvectors. This patch modifies the LiveDebugValues pass to use more efficient set data structures as outlined in PR26055. Both VarLocSet and VarLocList are now SparseBitVectors which allows us to perform much faster bitvector arithmetic on them. The speedup can be in the order of minutes especially on ASANified code. The change is not NFC in the assembler output because the inserted DBG_VALUEs are now sorted by variable and location. Many thanks to Daniel Berlin for helping design the improved algorithm and reviewing the patch. https://llvm.org/bugs/show_bug.cgi?id=26055 http://reviews.llvm.org/D20178 rdar://problem/24091200 llvm-svn: 270776	2016-05-25 22:21:12 +00:00
Hal Finkel	2f6886844e	Look for a loop's starting location in the llvm.loop metadata Getting accurate locations for loops is important, because those locations are used by the frontend to generate optimization remarks. Currently, optimization remarks for loops often appear on the wrong line, often the first line of the loop body instead of the loop itself. This is confusing because that line might itself be another loop, or might be somewhere else completely if the body was an inlined function call. This happens because of the way we find the loop's starting location. First, we look for a preheader, and if we find one, and its terminator has a debug location, then we use that. Otherwise, we look for a location on an instruction in the loop header. The fallback heuristic is not bad, but will almost always find the beginning of the body, and not the loop statement itself. The preheader location search often fails because there's often not a preheader, and even when there is a preheader, depending on how it was formed, it sometimes carries the location of some preceding code. I don't see any good theoretical way to fix this problem. On the other hand, this seems like a straightforward solution: Put the debug location in the loop's llvm.loop metadata. A companion Clang patch will cause Clang to insert llvm.loop metadata with appropriate locations when generating debugging information. With these changes, our loop remarks have much more accurate locations. Differential Revision: http://reviews.llvm.org/D19738 llvm-svn: 270771	2016-05-25 21:42:37 +00:00
Simon Pilgrim	d6469e3467	[X86][SSE41] Removed pblendw intrinsics tests - they are auto-upgraded Equivalent tests included in sse41-intrinsics-x86-upgrade.ll - the i8/i32 immediate diff doesn't matter anymore llvm-svn: 270767	2016-05-25 21:27:58 +00:00
Simon Pilgrim	fa814259ad	[X86][SSE41] Regenerated intrinsics tests llvm-svn: 270764	2016-05-25 21:21:51 +00:00
Ahmed Bougacha	201b97f550	[TLI] Also cover Linux 64 libfunc (stat64, ...) prototype checking. My script missed those in r270750. llvm-svn: 270763	2016-05-25 21:16:33 +00:00
Simon Pilgrim	1bed207f88	[X86][SSE41] Removed blendpd/blendps intrinsics tests - they are auto-upgraded Equivalent tests included in sse41-intrinsics-x86-upgrade.ll llvm-svn: 270761	2016-05-25 21:06:36 +00:00
Mehdi Amini	3d4f3a0da9	IRLinker: fix double scheduling of mapping a global value because of an alias This test was hitting an assertion in the value mapper because the IRLinker was trying to map two times @A while materializing the initializer for @C. Fix http://llvm.org/PR27850 Differential Revision: http://reviews.llvm.org/D20586 llvm-svn: 270757	2016-05-25 21:00:44 +00:00
Simon Pilgrim	971abe8256	[X86][AVX2] Regenerate avx2 vector shift tests llvm-svn: 270756	2016-05-25 21:00:40 +00:00
Ahmed Bougacha	1fe3f1ca50	[TLI] Fix NumParams==0 prototype checking typo. There was a typo in r267758. It caused invalid accesses when given something like "void @free(...)", as NumParams == 0, and we then try to look at the 0th parameter. Turns out, most of these were untested; add both attribute and missing-prototype checks for all libc libfuncs. Differential Revision: http://reviews.llvm.org/D20543 llvm-svn: 270750	2016-05-25 20:22:45 +00:00
Rafael Espindola	84f0562064	Fix shouldAssumeDSOLocal for private linkage. llvm-svn: 270746	2016-05-25 19:55:16 +00:00
Reid Kleckner	c0a0363d5c	[IR] Copy comdats in GlobalObject::copyAttributesFrom This is probably correct for all uses except cross-module IR linking, where we need to move the comdat from the source module to the destination module. Fixes PR27870. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D20631 llvm-svn: 270743	2016-05-25 18:36:22 +00:00
Matt Arsenault	e57206d81b	AMDGPU: Fix v2i64/v2f64 bitcasts These operations tend to get promoted away to v4i32 so this doesn't happen often. llvm-svn: 270740	2016-05-25 18:07:36 +00:00
Matt Arsenault	d89c99c26a	AMDGPU: Fix missing br_cc i1 test coverage Also un xfail a test. llvm-svn: 270739	2016-05-25 17:58:27 +00:00
Chad Rosier	e5314a94eb	[SelectionDAG] Add smarts for BSWAP in computeKnownBits. llvm-svn: 270738	2016-05-25 17:52:38 +00:00
Matt Arsenault	4578d6a9e1	AMDGPU: Make vectorization defeating test changes Simplifies test updates in the future. llvm-svn: 270736	2016-05-25 17:42:39 +00:00
Matt Arsenault	1cc4991412	AMDGPU: Fix inconsistent lowering of select of vectors f32 vectors would use a sequence of BFI instructions instead of unrolled cmp + select. This was better in the case of a VALU select with SGPR inputs, but we don't have a way of dealing with that in the DAG. llvm-svn: 270731	2016-05-25 17:34:58 +00:00
Sanjay Patel	aedc347b29	[x86] avoid code explosion from LoopVectorizer for gather loop (PR27826) By making pointer extraction from a vector more expensive in the cost model, we avoid the vectorization of a loop that is very likely to be memory-bound: https://llvm.org/bugs/show_bug.cgi?id=27826 There are still bugs related to this, so we may need a more general solution to avoid vectorizing obviously memory-bound loops when we don't have HW gather support. Differential Revision: http://reviews.llvm.org/D20601 llvm-svn: 270729	2016-05-25 17:27:54 +00:00
Chris Bieneman	e8e7555b10	[obj2yaml] [yaml2obj] MachO support for rebase opcodes This is the first bit of support for MachO __LINKEDIT segment data. llvm-svn: 270724	2016-05-25 17:09:07 +00:00
Tim Shen	fa57367ae5	Move and add comments to the top for tailcall-string-rvo.ll Differential Revision: http://reviews.llvm.org/D20311 llvm-svn: 270722	2016-05-25 17:01:09 +00:00
Hal Finkel	6f3387f434	[SDAG] Add a fallback multiplication expansion LegalizeIntegerTypes does not have a way to expand multiplications for large integer types (i.e. larger than twice the native bit width). There's no standard runtime call to use in that case, and so we'd just assert. Unfortunately, as it turns out, it is possible to hit this case from standard-ish C code in rare cases. A particular case a user ran into yesterday involved an __int128 induction variable and a loop with a quadratic (not linear) recurrence which triggered some backend logic using SCEVExpander. In this case, the BinomialCoefficient code in SCEV generates some i129 variables, which get widened to i256. At a high level, this is not actually good (i.e. the underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop in question for performance reasons), but regardless, the backend shouldn't crash because of cost-modeling issues in the optimizer. This is a straightforward implementation of the multiplication expansion, based on the algorithm in Hacker's Delight. I validated it against the code for the mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using random inputs. There should be no functional change for previously-working code (the new expansion code only replaces an assert). Fixes PR19797. llvm-svn: 270720	2016-05-25 16:50:22 +00:00
Teresa Johnson	bef0eb001b	[ThinLTO] Fix test check prefix so that intended prefix tested There aren't any checks with prefix PROMOTE, should be PROMOTE_MOD1 which wasn't being tested (but works as expected). llvm-svn: 270719	2016-05-25 16:45:08 +00:00
Sanjay Patel	3955360b24	[x86, AVX] allow explicit calls to VZERO* to modify state in VZeroUpperInserter pass (PR27823) As noted in the review, there are still problems, so this doesn't fix the bug completely. Differential Revision: http://reviews.llvm.org/D20529 llvm-svn: 270718	2016-05-25 16:39:47 +00:00
Simon Pilgrim	11081c98a3	[X86][AVX] Sync with clang/test/CodeGen/avx2-builtins.c Only tests for the gather intrinsic are still to be added llvm-svn: 270710	2016-05-25 15:30:08 +00:00
Oleg Ranevskyy	eb4eccae5c	[SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}" Summary: Description This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing ineffective code. When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call. `getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`. Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr. This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction. Minimal reproducer: ``` int foo(int a, int b, int c); int baz(); void bar() { int arr[20]; int i = 0; for (i = 0; i < 4; ++i) arr[i] = baz(); for (; i < 20; ++i) arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]); } ``` Clang command line: ``` clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir ``` Expected result: The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated. Reviewers: sanjoy Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D20058 llvm-svn: 270695	2016-05-25 13:01:33 +00:00
Simon Pilgrim	1bcf9847a4	[X86][AVX2] Added more fast-isel tests to match clang/test/CodeGen/avx2-builtins.c llvm-svn: 270685	2016-05-25 10:56:23 +00:00
Simon Pilgrim	c7dcbdc08a	[X86][AVX2] Begun adding fast-isel tests to match clang/test/CodeGen/avx2-builtins.c llvm-svn: 270683	2016-05-25 10:15:06 +00:00
Simon Pilgrim	4d1e258097	[X86][SSE2] Use storeu intrinsics for _mm_storeu_pd/_mm_storeu_pd tests Also fixed name of _mm_store1_pd test llvm-svn: 270681	2016-05-25 09:42:29 +00:00
Simon Pilgrim	f0ba364fb9	[X86][SSE] Use storeu intrinsics for _mm_storeu_ps test llvm-svn: 270680	2016-05-25 09:28:06 +00:00
Simon Pilgrim	4298d06d0f	[X86][SSE] Replace (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) lossless conversion intrinsics with generic IR Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead. Differential Revision: http://reviews.llvm.org/D20568 llvm-svn: 270678	2016-05-25 08:59:18 +00:00
Craig Topper	12e322a8cf	[X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time. llvm-svn: 270677	2016-05-25 06:56:32 +00:00
David Majnemer	124bdb7497	[FunctionAttrs] Volatile loads should disable readonly A volatile load has side effects beyond what callers expect readonly to signify. For example, it is not safe to reorder two function calls which each perform a volatile load to the same memory location. llvm-svn: 270671	2016-05-25 05:53:04 +00:00
Zachary Turner	d3076ab36f	[llvm-pdbdump] Decipher the remaining PDB streams. We now at least know the meaning of every stream of the PDB file. Yay! llvm-svn: 270669	2016-05-25 05:49:48 +00:00
Saleem Abdulrasool	546e64affd	Revert "llvm-objdump: support dumping AUX records for weak externals" Revert it until we can figure out the endianness issue. llvm-svn: 270667	2016-05-25 05:45:02 +00:00
Zachary Turner	c9972c64f5	[llvm-pdbdump] Dump the IPI stream and all records. llvm-svn: 270661	2016-05-25 04:35:22 +00:00
Rui Ueyama	b12b158f20	pdbdump: fix bug in name hash table. name_ids() did not return all IDs but only the first NameCount items. The number of non-zero entries in IDs vector is NameCount, but it does not mean that all non-zero entries are at the beginning of IDs vector. Differential Revision: http://reviews.llvm.org/D20611 llvm-svn: 270656	2016-05-25 04:07:17 +00:00
Zachary Turner	c59261ca37	[llvm-pdbdump] Stream 0 isn't actually the MSF superblock. Oddly enough, I realized we don't actually know what stream 0 is (if anything). llvm-svn: 270655	2016-05-25 03:53:16 +00:00
Saleem Abdulrasool	82dd8bae71	test: use a binary file instead Generate the obj rather than use yaml2obj. Hopefully, this fixes the PPC64 test failures. llvm-svn: 270654	2016-05-25 03:48:07 +00:00
Zachary Turner	85ed80b9e6	[llvm-pdbdump] Dump stream summary list. Try to figure out what each stream is, and dump its name. This gives us a better picture of what streams we still don't understand. llvm-svn: 270653	2016-05-25 03:43:17 +00:00
Saleem Abdulrasool	e7e467aaa9	llvm-objdump: support dumping AUX records for weak externals This is a supported COFF feature. Ensure that we can display the weak externals auxiliary symbol. It contains useful information (such as the default binding and how to resolve the symbol). llvm-svn: 270648	2016-05-25 01:59:32 +00:00
Davide Italiano	655a145e83	[PM] Port BDCE to the new pass manager. llvm-svn: 270647	2016-05-25 01:57:04 +00:00
Derek Bruening	5662b93985	[esan|wset] EfficiencySanitizer working set tool fastpath Summary: Adds fastpath instrumentation for esan's working set tool. The instrumentation for an intra-cache-line load or store consists of an inlined write to shadow memory bits for the corresponding cache line. Adds a basic test for this instrumentation. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20483 llvm-svn: 270640	2016-05-25 00:17:24 +00:00
Richard Smith	b910e56604	Revert r270569 (teach llvm-mc to generate compressed debug sections in zlib style). It appears that current ELF linkers are not ready for this. llvm-svn: 270638	2016-05-25 00:14:12 +00:00
Dan Gohman	d530f68d45	[WebAssembly] Put __stack_pointer in the offset field of loads and stores. Instead of this: i32.const $push10=, __stack_pointer i32.load $push11=, 0($pop10) Emit this: i32.const $push10=, 0 i32.load $push11=, __stack_pointer($pop10) It's not currently clear which is better, though there's a chance the second form may be better at overall compression. We can revisit this when we have more data; for now it makes sense to make PEI consistent with isel. Differential Revision: http://reviews.llvm.org/D20411 llvm-svn: 270635	2016-05-24 23:47:41 +00:00
Michael Zolotukhin	8f7a242c7b	Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time. This reverts commit r270577. llvm-svn: 270630	2016-05-24 23:00:05 +00:00
Michael Zolotukhin	7216dd4668	[LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst. This fixes PR27847. Now for real. llvm-svn: 270629	2016-05-24 22:59:58 +00:00
Zachary Turner	4caa1bf0bd	[codeview] Add support for new type records. This adds support for parsing and dumping the following symbol types: S_LPROCREF S_ENVBLOCK S_COMPILE2 S_REGISTER S_COFFGROUP S_SECTION S_THUNK32 S_TRAMPOLINE As of this patch, the test PDB files no longer have any unknown symbol types. llvm-svn: 270628	2016-05-24 22:58:46 +00:00
Derek Bruening	0b872d9399	[esan] Add calls from the ctor/dtor to the runtime library Summary: Adds createEsanInitToolGV for creating a tool-specific variable passed to the runtime library. Adds dtor "esan.module_dtor" and inserts calls from the dtor to "__esan_exit" in the runtime library. Updates the EfficiencySanitizer test. Patch by Qin Zhao. Reviewers: aizatsky Subscribers: bruening, kcc, vitalybuka, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20488 llvm-svn: 270627	2016-05-24 22:48:24 +00:00
David Blaikie	c53e18d93a	DWARF: Omit DW_AT_APPLE attributes (except ObjC ones) when not targeting LLDB These attributes aren't used by other debuggers (& may be confused with other DWARF extensions) so they just waste space (about 1.5% on .dwo file size on a random large program I tested). We could remove the ObjC property ones too, but I figured they were probably more necessary when trying to understand ObjC (I could be wrong though) & so any debugger interested in working with ObjC would use them, perhaps? (also, there are some legacy tests in Clang that test for them - making it one of those annoying cross-project commits and/or cleanup to refactor those tests) llvm-svn: 270613	2016-05-24 21:19:28 +00:00
Zachary Turner	96e60f7573	[llvm-pdbdump] Rework command line options. When dumping huge PDB files, too many of the options were grouped together so you would get neverending spew of output. This patch introduces more granular display options so you can only dump the fields you actually care about. llvm-svn: 270607	2016-05-24 20:31:48 +00:00
Zachary Turner	9e33e6f89b	[codeview, pdb] Dump symbol records in publics stream Differential Revision: http://reviews.llvm.org/D20580 Reviewed By: ruiu llvm-svn: 270597	2016-05-24 18:55:14 +00:00
Xinliang David Li	f4edae6076	[profile] Fix runtime hook linkage bug for COFF Patch by: Johan Engelen. The user hook has linkonceODR linkage and needs to be in a comdatAny group. llvm-svn: 270596	2016-05-24 18:47:38 +00:00
Konstantin Zhuravlyov	29ddd2b2f2	[AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs Differential Revision: http://reviews.llvm.org/D20081 llvm-svn: 270594	2016-05-24 18:37:18 +00:00
Chad Rosier	47f0148c98	[InstCombine] Clean up and FileCheckize test case. llvm-svn: 270586	2016-05-24 17:35:49 +00:00
Zachary Turner	cac29ae038	Dump symbol record details in llvm-pdbdump This makes use of the newly introduced `CVSymbolVisitor` to dump details of each type of symbol record in the symbol streams. Future patches will bring this visitor based dumping to the publics stream, as well as creating a `SymbolDumpDelegate` to print more information about relocations etc. Differential Revision: http://reviews.llvm.org/D20545 Reviewed By: ruiu llvm-svn: 270585	2016-05-24 17:30:25 +00:00
Hans Wennborg	b64e4390a3	Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling analysis by default". Chromium builds are still hitting the assert in PR27874. llvm-svn: 270577	2016-05-24 16:10:12 +00:00
George Rimar	68003e0fbf	Recommit r270070 ([llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style.) Now, after landing r270560, r270557, r270320 it is a proper time. Original commit message: [llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style. Before this patch llvm-mc generated zlib-gnu styled sections. That means no SHF_COMPRESSED flag was set, magic 'zlib' signature was used in combination with full size field. Sections were renamed to ".z". This patch reimplements the compression style to zlib one as zlib-gnu looks to be deprecated everywhere. Differential revision: http://reviews.llvm.org/D20331 llvm-svn: 270569	2016-05-24 15:19:35 +00:00
Sanjay Patel	23019d1006	[ValueTracking, InstSimplify] extend isKnownNonZero() to handle vector constants Similar in spirit to D20497 : If all elements of a constant vector are known non-zero, then we can say that the whole vector is known non-zero. It seems like we could extend this to FP scalar/vector too, but isKnownNonZero() says it only works for integers and pointers for now. Differential Revision: http://reviews.llvm.org/D20544 llvm-svn: 270562	2016-05-24 14:18:49 +00:00
Simon Pilgrim	0295fbe1bb	[InstCombine][X86][SSE41] The SSE41 PMOVSX intrinsics are auto upgraded now and aren't handled by InstCombine any more llvm-svn: 270561	2016-05-24 13:52:44 +00:00
George Rimar	d92694eed7	[MC/ELF] - Fixed insufficient compression.s test The main problem was that the .debug_info section was used to check that llvm-dwarfdump is able to decompress data that was compressed with the llvm-mc tool. This section was not actually compressed, because it consumes more space in compressed form. I changed the testcase to use the .debug_str section, which really is compressed. So the test now does what it was probably expected to do: it checks that "data"->llvm-mc->llvm-dwarfdump->dumps back the initial "data". Differential revision: http://reviews.llvm.org/D20466 llvm-svn: 270560	2016-05-24 13:45:29 +00:00
Than McIntosh	879ad8fa99	Rework/enhance stack coloring data flow analysis. Replace bidirectional flow analysis to compute liveness with forward analysis pass. Treat lifetimes as starting when there is a first reference to the stack slot, as opposed to starting at the point of the lifetime.start intrinsic, so as to increase the number of stack variables we can overlap. Reviewers: gbiv, qcolumbet, wmi Differential Revision: http://reviews.llvm.org/D18827 Bug: 25776 llvm-svn: 270559	2016-05-24 13:23:44 +00:00
Simon Pilgrim	caf0d9d92c	[X86][SSE] Added vector sitofp/uitofp folded load tests llvm-svn: 270558	2016-05-24 13:07:23 +00:00
George Rimar	401e4e570e	Recommit r270547 ([llvm-dwarfdump] - Teach dwarfdump to decompress debug sections in zlib style.) Fix was: 1) Had to regenerate dwarfdump-test-zlib.elf-x86-64, dwarfdump-test-zlib-gnu.elf-x86-64 (because llvm-symbolizer-zlib.test uses those inputs for its purposes and failed). 2) Updated llvm-symbolizer-zlib.test (updated used call function address to match new files + added one more check for newly created dwarfdump-test-zlib-gnu.elf-x86-64 binary input). 3) Updated comment in dwarfdump-test-zlib.cc. Original commit message: [llvm-dwarfdump] - Teach dwarfdump to decompress debug sections in zlib style. Before this llvm-dwarfdump only recognized zlib-gnu compression style of headers, this patch adds support for zlib style. It looks reasonable to support both styles for dumping, even if we are not going to support generating the deprecated gnu one. Differential revision: http://reviews.llvm.org/D20470 llvm-svn: 270557	2016-05-24 12:48:46 +00:00
Sam Kolton	11de370cca	[AMDGPU] Assembler: rework parsing of optional operands. Summary: Change process of parsing of optional operands. All optional operands use same parsing method - parseOptionalOperand(). No default values are added to OperandsVector. Get rid of WORKAROUND_USE_DUMMY_OPERANDS_INSTEAD_MUTIPLE_DEFAULT_OPERANDS. Reviewers: tstellarAMD, vpykhtin, artem.tamazov, nhaustov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20527 llvm-svn: 270556	2016-05-24 12:38:33 +00:00
Artem Tamazov	212a251c8d	[AMDGPU][llvm-mc] Disassembler: support for TTMP/TBA/TMA registers. Differential Revision: http://reviews.llvm.org/D20476 llvm-svn: 270552	2016-05-24 12:05:16 +00:00
Igor Breger	23c2090606	[llvm][AVX512][intrinsics] Fix vperm{b|w|d|q|ps|pd} intrinsics. Index is the second argument to the builtin function but it is the first instruction operand. Differential Revision: http://reviews.llvm.org/D20515 llvm-svn: 270548	2016-05-24 11:06:22 +00:00
George Rimar	f059dd4f76	Revert r270543 ("Recommit r270540") Failed build bot in another test. I am sorry for noise. http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/23679/testReport/junit/LLVM/DebugInfo/llvm_symbolizer_zlib_test/ llvm-svn: 270547	2016-05-24 11:03:10 +00:00
George Rimar	e9b2e19109	Recommit r270540 fix: forgot to commit the updated dwarfdump-test-zlib.elf-x86-64 Original commit message: [llvm-dwarfdump] - Teach dwarfdump to decompress debug sections in zlib style. Before this llvm-dwarfdump only recognized zlib-gnu compression style of headers, this patch adds support for zlib style. It looks reasonable to support both styles for dumping, even if we are not going to support generating the deprecated gnu one. Differential revision: http://reviews.llvm.org/D20470 llvm-svn: 270543	2016-05-24 10:46:43 +00:00
Sagar Thakur	672c710de4	[MIPS][LLVM-MC] Fix Disassemble of Negative Offset Patch by Nitesh Jain. Summary: The type of Imm in MipsDisassembler.cpp was incorrect since SignExtend64 returns int64_t. As per the MIPSr6 doc, the offset is added to the address of the instruction following the branch (not the branch itself) to form a PC-relative effective target address, hence “4” is added to the offset. The offsets of some test cases are updated to reflect the changes due to the “+ 4” offset, and new test cases for negative offsets are added. Reviewers: dsanders, vkalintiris Differential Revision: http://reviews.llvm.org/D17540 llvm-svn: 270542	2016-05-24 09:57:10 +00:00
George Rimar	6a6185fd78	Revert r270540 "[llvm-dwarfdump] - Teach dwarfdump to decompress debug sections in zlib style." It broke a bot: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/5036 llvm-svn: 270541	2016-05-24 09:44:44 +00:00
George Rimar	6bcbf4c572	[llvm-dwarfdump] - Teach dwarfdump to decompress debug sections in zlib style. Before this llvm-dwarfdump only recognized zlib-gnu compression style of headers, this patch adds support for zlib style. It looks reasonable to support both styles for dumping, even if we are not going to support generating the deprecated gnu one. Differential revision: http://reviews.llvm.org/D20470 llvm-svn: 270540	2016-05-24 09:28:36 +00:00
Simon Pilgrim	14000b3cea	[CostModel][X86][XOP] Added XOP costmodel for BITREVERSE Now that we have a nice fast VPPERM solution. Added framework for future intrinsic costs as well. llvm-svn: 270537	2016-05-24 08:17:50 +00:00
Michael Zolotukhin	96c150d154	Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."" This reverts commit r270512 and reapplies r270478. Originally it caused PR27847, but it was fixed in r270517. llvm-svn: 270518	2016-05-24 01:22:20 +00:00
Michael Zolotukhin	3898b2b587	[LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst. This fixes PR27847. llvm-svn: 270517	2016-05-24 00:51:01 +00:00
Evgeniy Stepanov	b62303081b	[msan] Add a test for vector compare x86 intrinsics. This was actually meant to go in with r267966, but I forgot to git add the file. Better late than never. llvm-svn: 270515	2016-05-24 00:04:23 +00:00
Hans Wennborg	6951028b61	Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default." This caused PR27847. llvm-svn: 270512	2016-05-23 23:42:35 +00:00
Justin Bogner	a5690b00af	test: Be consistent with clang's sanitizer lit config The logic that sets up lit features for sanitizers is largely copied between here and clang, except clang's was fixed some time ago to handle multiple sanitizers (ie, Asan + Ubsan). This just makes the code in LLVM consistent with how it's done in clang to avoid any gotchas by users of this. llvm-svn: 270510	2016-05-23 23:02:11 +00:00
Simon Pilgrim	8a5ff3c59a	[X86][SSE] Updated (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) fast-isel codegen to match D20528 llvm-svn: 270501	2016-05-23 22:17:36 +00:00
Sanjoy Das	aa83c47bab	[IRCE] Optimize "uses" not branches; NFCI This changes IRCE to optimize uses, and not branches. This change is NFCI since the uses we do inspect are in practice only ever going to be the condition use in conditional branches; but this flexibility will later allow us to analyze more complex expressions than just a direct branch on a range check. llvm-svn: 270500	2016-05-23 22:16:45 +00:00
Sanjay Patel	adcaef7238	[InstSimplify] add vector tests for isKnownNonZero llvm-svn: 270498	2016-05-23 22:09:04 +00:00
Simon Pilgrim	8cfcf586bb	[X86][SSE] Added cvtdq2pd/cvtps2pd generic IR tests Added D20528 implementations as well as existing x86 intrinsics versions llvm-svn: 270494	2016-05-23 21:45:02 +00:00
Kevin Enderby	9873e2c467	Add printing of the Mach-O (__LLVM,__bundle) xar archive file section "verbosely" to llvm-objdump. This section is created with the -fembed-bitcode option. This requires the use of libxar, and the CMake and lit support were crafted by Chris Bieneman! rdar://26202242 llvm-svn: 270491	2016-05-23 21:34:12 +00:00
Simon Pilgrim	f615191fa6	[X86][SSE] Use shuffle/sext instead of deprecated (+ auto-upgraded) pmovsxwd intrinsic call llvm-svn: 270489	2016-05-23 21:21:38 +00:00
James Y Knight	fdcc727da6	[SPARC] Fix 8 and 16-bit atomic load and store. They were accidentally using the 32-bit load/store instruction for 8/16-bit operations, due to incorrect patterns (8/16-bit cmpxchg and atomicrmw will be fixed in subsequent changes) llvm-svn: 270486	2016-05-23 20:33:00 +00:00
Reid Kleckner	2280f9325e	Modify emitTypeInformation to use MemoryTypeTableBuilder, take 2 This effectively reverts commit r270389 and re-lands r270106, but it's almost a rewrite. The behavior change in r270106 was that we could no longer assume that each LF_FUNC_ID record got its own type index. This patch adds a map from DINode* to TypeIndex, so we can stop making that assumption. This change also emits padding bytes between type records similar to the way MSVC does. The size of the type record includes the padding bytes. llvm-svn: 270485	2016-05-23 20:23:46 +00:00
Gerolf Hoflehner	00e7092f68	[InstCombine] Fix assertion when bitcast is converted to gep When an aggregate contains an opaque type its size cannot be determined. This triggers an "Invalid GetElementPtrInst indices for type" assert in function checkGEPType. The fix suppresses the conversion in this case. http://reviews.llvm.org/D20319 llvm-svn: 270479	2016-05-23 19:23:17 +00:00
Michael Zolotukhin	be080fc51d	[LoopUnroll] Enable advanced unrolling analysis by default. Summary: This patch turns on LoopUnrollAnalyzer by default. To mitigate compile time regressions, I chose very conservative thresholds for now. Later we can make them more aggressive, but it might require being smarter in which loops we're optimizing. E.g. currently the biggest issue is that with more aggressive thresholds we unroll many cold loops, which increases compile time for no performance benefit (performance of those loops is improved, but it doesn't matter since they are cold). Test results for compile time (using 4 samples to reduce noise): ``` MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19% SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect 4.19% MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow 3.39% MultiSource/Applications/JM/lencod/lencod 1.47% MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06% ``` I didn't see any performance changes in the testsuite, but it improves some internal tests. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20482 llvm-svn: 270478	2016-05-23 19:10:19 +00:00
David Blaikie	05f84cd31d	llvm-dwp: More error handling around invalid compressed sections llvm-svn: 270466	2016-05-23 17:59:17 +00:00
Xinliang David Li	872362c457	[profile] show more statistics Add value profile statistics with the 'show' command. llvm-svn: 270450	2016-05-23 16:36:11 +00:00
Diana Picus	b2da61196e	[BPF] Remove exit-on-error flag in test (PR27766) The exit-on-error flag on the many_args1.ll test is needed to avoid an unreachable in BPFTargetLowering::LowerCall. We can also avoid it by ignoring any superfluous arguments to the call (i.e. any arguments after the first 5). Fixes PR27766. Differential Revision: http://reviews.llvm.org/D20471 v2 of r270419 llvm-svn: 270440	2016-05-23 14:57:19 +00:00
Asaf Badouh	d32e4c9f0d	[X86][RTM] _xabort() should not have "noreturn" attribute Differential Revision: http://reviews.llvm.org/D20518 llvm-svn: 270437	2016-05-23 14:04:17 +00:00
Simon Pilgrim	7cc9814aaf	[X86][AVX] Added tests that access ymm registers before and after explicit vzeroupper/vzeroall calls llvm-svn: 270434	2016-05-23 13:03:45 +00:00
Renato Golin	2546b5ac5f	Reverts "[BPF] Remove exit-on-error flag in test (PR27766)" This patch reverts r270419 because it broke a lot of buildbots, mostly Windows. We'd like help in investigating the issues, but for now, it should stay out. llvm-svn: 270433	2016-05-23 13:02:11 +00:00
Simon Pilgrim	4adbf23e1f	[X86][SSE] Regenerated scalar load folding tests llvm-svn: 270431	2016-05-23 12:53:09 +00:00
Simon Pilgrim	07002e86e3	[X86][SSE] Regenerated partial register update tests llvm-svn: 270430	2016-05-23 12:49:37 +00:00
Simon Pilgrim	e699370f3b	[X86][SSE] Updated sse/avx cvtsi2sd tests to use non-constant value llvm-svn: 270425	2016-05-23 12:41:51 +00:00
Simon Pilgrim	e6f4d28d6a	[X86][SSE2] Regenerated sse2 upgraded intrinsics tests llvm-svn: 270423	2016-05-23 12:40:11 +00:00
Simon Pilgrim	b24542c588	[X86][AVX] Regenerated avx upgraded intrinsics tests llvm-svn: 270422	2016-05-23 12:39:06 +00:00
Diana Picus	eaf34cf67e	[BPF] Remove exit-on-error flag in test (PR27766) The exit-on-error flag on the many_args1.ll test is needed to avoid an unreachable in BPFTargetLowering::LowerCall. We can also avoid it by ignoring any superfluous arguments to the call (i.e. any arguments after the first 5). Fixes PR27766 llvm-svn: 270419	2016-05-23 12:33:34 +00:00
Chris Dewhurst	4f7cac3674	[Sparc][LEON] LEON Erratum fix. Insert NOP after LD or LDF instruction. Due to an erratum in some versions of LEON, we must insert a NOP after any LD or LDF instruction to ensure the processor has time to load the value correctly before using it. This pass will implement that erratum fix. The code will have no effect for other, non-LEON Sparc processors. Differential Review: http://reviews.llvm.org/D20353 llvm-svn: 270417	2016-05-23 10:56:36 +00:00
Sam Kolton	1bdcef7697	[AMDGPU] Assembler: refactor parsing of modifiers and immediates. Allow modifiers for imms. Reviewers: nhaustov, tstellarAMD Subscribers: kzhuravl, arsenm Differential Revision: http://reviews.llvm.org/D20166 llvm-svn: 270415	2016-05-23 09:59:02 +00:00
Craig Topper	95bdabd338	[AVX512] Add patterns to implement stores of extracts of least significant subvectors using XMM or YMM stores instead of the vector extract instructions. Similar is already done for AVX and we had lost it going to AVX512VL. llvm-svn: 270383	2016-05-22 23:44:33 +00:00
Simon Pilgrim	1ced2a6390	[X86][SSE] Added extra i8 extract element test llvm-svn: 270379	2016-05-22 20:35:42 +00:00
Sanjay Patel	2959ff4a88	[x86, AVX] don't add a vzeroupper if that's what the code is already doing (PR27823) This isn't the complete fix, but it handles the trivial examples of duplicate vzero* ops in PR27823: https://llvm.org/bugs/show_bug.cgi?id=27823 ...and amusingly, the bogus cases already exist as regression tests, so let's take this baby step. We'll need to do more in the general case where there's legitimate AVX usage in the function + there's already a vzero in the code. Differential Revision: http://reviews.llvm.org/D20477 llvm-svn: 270378	2016-05-22 20:22:47 +00:00
Sanjay Patel	f71fc95173	[x86, AVX] add test file to show vzeroupper pass excesses llvm-svn: 270375	2016-05-22 19:55:48 +00:00
Sanjay Patel	e2e89ef936	[ValueTracking, InstCombine] extend isKnownToBeAPowerOfTwo() to handle vector splat constants We could try harder to handle non-splat vector constants too, but that seems much rarer to me. Note that the div test isn't resolved because there's a check for isIntegerTy() guarding that transform. Differential Revision: http://reviews.llvm.org/D20497 llvm-svn: 270369	2016-05-22 15:41:53 +00:00
Igor Breger	2ba64ab9ae	[AVX512] Implement missing patterns for any_extend load lowering. Differential Revision: http://reviews.llvm.org/D20513 llvm-svn: 270357	2016-05-22 10:21:04 +00:00
Craig Topper	a1041ff001	[AVX512] Add an AddedComplexity line to the 512-bit insert_subvector undef index 0 patterns. This gives them higher priority than the memory patterns. This matches AVX1/2. llvm-svn: 270355	2016-05-22 07:40:40 +00:00
Craig Topper	dca03f8596	[X86] Add a common check-prefix to both run lines on a test so identical checks appear just once. llvm-svn: 270345	2016-05-22 00:39:33 +00:00
Craig Topper	33c550cb95	[AVX512] Add a couple patterns to fix some cases where two vector mask inversions could appear in a row. llvm-svn: 270344	2016-05-22 00:39:30 +00:00
Xinliang David Li	b628dd3568	[profile] Static counter allocation for value profiling (part-1) Differential Revision: http://reviews.llvm.org/D20459 llvm-svn: 270336	2016-05-21 22:55:34 +00:00
Craig Topper	db960eddfa	[AVX512] Add patterns for extracting subvectors and storing to memory. llvm-svn: 270334	2016-05-21 22:50:14 +00:00
Michael Zuckerman	a63a129749	[Clang][AVX512][intrinsics] Fix rcp and sqrt intrinsics. Differential Revision: http://reviews.llvm.org/D20438 llvm-svn: 270322	2016-05-21 14:44:18 +00:00
Michael Zuckerman	11b55b29d1	[Clang][AVX512][intrinsics] Fix vscalef intrinsics. Differential Revision: http://reviews.llvm.org/D20324 llvm-svn: 270321	2016-05-21 11:09:53 +00:00
George Rimar	c13c59afa7	[llvm-readobj] - Teach readobj to recognize SHF_COMPRESSED flag. Main problem here was that SHF_COMPRESSED has the same value as XCORE_SHF_CP_SECTION, which was included as a standard (common) flag. As far as I understand, xCore is a family of controllers, which means its constant should be processed separately, only if e_machine == EM_XCORE; otherwise llvm-readobj would output different constants twice for a compressed section: Flags [ .. SHF_COMPRESSED (0x800) .. XCORE_SHF_CP_SECTION (0x800) .. ] which probably does not make sense if you're not working with an xcore file. Differential revision: http://reviews.llvm.org/D20273 llvm-svn: 270320	2016-05-21 10:16:58 +00:00
Craig Topper	02626c076b	[AVX512] Add patterns for VEXTRACT v16i16->v8i16 and v32i8->v16i8. Disable AVX2 versions of vector extract when AVX512VL is enabled. llvm-svn: 270318	2016-05-21 07:08:56 +00:00
Craig Topper	22ae353207	[AVX512] Disable AVX2 VPERMD, VPERMQ, VPERMPS, and VPERMPD patterns when AVX512VL is enabled. Also add shuffle comment printing for AVX512VL VPERMPD/VPERMQ to keep some tests that now use these instructions instead of the AVX2 ones. llvm-svn: 270317	2016-05-21 06:07:18 +00:00
Craig Topper	6be70deda3	[AVX512] Disable AVX/AVX2 VBROADCASTSS/VBROADCASTSD patterns when AVX512VL is enabled. llvm-svn: 270316	2016-05-21 05:47:25 +00:00
Craig Topper	1a23a521bb	[AVX512] Use update_llc_test_checks to update some tests so we can see all the instruction encodings and ensure everything is with EVEX. llvm-svn: 270315	2016-05-21 05:46:58 +00:00
David Majnemer	9f92f4c497	[SimplifyCFG] Remove cleanuppads which are empty except for calls to lifetime.end A cleanuppad is not cheap, they turn into many instructions and result in additional spills and fills. It is not worth keeping a cleanuppad around if all it does is hold a lifetime.end instruction. N.B. We first try to merge the cleanuppad with another cleanuppad to avoid dropping the lifetime and debug info markers. llvm-svn: 270314	2016-05-21 05:12:32 +00:00
Craig Topper	73f48f4662	[AVX512] Fix test cases I missed in r270311. llvm-svn: 270313	2016-05-21 03:59:55 +00:00
Matt Arsenault	7f9eabd2c2	AMDGPU: Define priorities for register classes Allocating larger register classes first should give better allocation results (and more importantly for myself, make the lit tests more stable with respect to scheduler changes). Patch by Matthias Braun llvm-svn: 270312	2016-05-21 03:55:07 +00:00
Matt Arsenault	71e6676169	AMDGPU: Cleanup lowering actions These are kind of a mess and hard to follow, particularly for loads and stores. Fix various redundant, unnecessary and dead settings. llvm-svn: 270307	2016-05-21 02:27:49 +00:00
Sanjoy Das	be6c7a12cb	[GuardWidening] Fix incorrect use of remove_if I had used `std::remove_if` under the assumption that it moves the predicate matching elements to the end, but actually the elements remaining towards the end (after the iterator returned by `std::remove_if`) are indeterminate. Fix the bug (and make the code more straightforward) by using a temporary SmallVector, and add a test case demonstrating the issue. llvm-svn: 270306	2016-05-21 02:24:44 +00:00
Matt Arsenault	81a709503d	AMDGPU: Fix high bits after division optimization This is essentially doing a 24-bit signed division with FP. We need to truncate to the N bit result. llvm-svn: 270305	2016-05-21 01:53:33 +00:00
Matt Arsenault	b6e1cc2a92	AMDGPU: Fix verifier error when spilling SGPRs The current SGPR spilling test does not stress this because it is using s_buffer_load instructions to increase SGPR pressure and spill, but their output operands have the same SReg_32_XM0 constraint. This fixes an error when the SReg_32 output from most instructions is spilled. llvm-svn: 270301	2016-05-21 00:53:42 +00:00
Matt Arsenault	4945905f5f	AMDGPU: Handle cbranch vccz/vccnz llvm-svn: 270297	2016-05-21 00:29:40 +00:00
Matt Arsenault	72fcd5f597	AMDGPU: Implement ReverseBranchCondition llvm-svn: 270296	2016-05-21 00:29:34 +00:00
Matt Arsenault	6d09380532	AMDGPU: Implement AnalyzeBranch Original patch by Tom Stellard llvm-svn: 270295	2016-05-21 00:29:27 +00:00
Dan Gohman	b7c2400fa7	[WebAssembly] Optimize away return instructions using fallthroughs. This saves a small amount of code size, and is a first small step toward passing values on the stack across block boundaries. Differential Review: http://reviews.llvm.org/D20450 llvm-svn: 270294	2016-05-21 00:21:56 +00:00
Matt Arsenault	2907e51246	Fix constant folding of addrspacecast of null This should not be making assumptions on the value of the casted pointer. llvm-svn: 270293	2016-05-21 00:14:04 +00:00
Matthias Braun	71f9564e7f	LiveIntervalAnalysis: Rework constructMainRangeFromSubranges() We now use LiveRangeCalc::extendToUses() instead of a specially designed algorithm in constructMainRangeFromSubranges(): - The original motivation for constructMainRangeFromSubranges() were differences between the main liverange and subranges because of hidden dead definitions. This case however cannot happen anymore with the DetectDeadLaneMasks pass in place. - It simplifies the code. - This fixes a longstanding bug where we did not properly create new SSA values on merging control flow (the MachineVerifier missed most of these cases). - Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and LiveRangeCalc to better match the implementation/available helper functions. This re-applies r269016. The fixes from r270290 and r270259 should avoid the machine verifier problems this time. llvm-svn: 270291	2016-05-20 23:14:56 +00:00
Matthias Braun	e29b7689bd	MachineVerifier: subregs do not require defs/valnos on every path It is fine for subregister ranges to be undefined on some CFG paths as we may have a "vregX:other_subreg<read-undef> =" def on that path. We do not (and should not) have live segments for the subregister ranges. The MachineVerifier should not complain about this. This is a slight variant of http://llvm.org/PR27705 llvm-svn: 270290	2016-05-20 23:02:13 +00:00
Tim Shen	95e84c5123	[PowerPC] Add a testcase for TCO on string rvo function Differential Revision: http://reviews.llvm.org/D20311 llvm-svn: 270287	2016-05-20 22:42:01 +00:00
Sanjay Patel	ec4d91a4d7	add test vector sdiv llvm-svn: 270285	2016-05-20 22:08:40 +00:00
Sanjay Patel	312c9afd90	add test for vector shift llvm-svn: 270284	2016-05-20 22:08:16 +00:00
Jacques Pienaar	5ffdef55f0	[lanai] Change reloc to use PIC_ by default and cleanup. * Change reloc to PIC_; * Cleanup (clang-format & modify test); llvm-svn: 270282	2016-05-20 21:41:53 +00:00
Sanjay Patel	54acedf88f	add tests for vector urem llvm-svn: 270271	2016-05-20 20:55:17 +00:00
Adrian Prantl	9e7e8839b2	dsymutil/modules: Reword the warning for static libraries without module caches In addition to clarifying the warning message this contains a minor functional change in that it now warns if the immediate parent directory in which the missing PCM is expected to be isn't found. This patch also includes a more comprehensive testcase. rdar://problem/25860711 llvm-svn: 270269	2016-05-20 20:36:06 +00:00
Sanjay Patel	3eded68bef	use FileCheck instead of grep for exact checking llvm-svn: 270265	2016-05-20 20:07:18 +00:00
Rui Ueyama	0fcd82605e	pdbdump: print out symbol names referred by publics stream. DBI stream contains a stream number of the symbol record stream. Symbol record streams is an array of length-type-value members. Each member represents one symbol. Publics stream contains offsets to the symbol record stream. This patch is to print out all symbols that are referenced by the publics stream. Note that even with this patch, llvm-pdbdump cannot dump all the information in a publics stream since it contains more information than symbol names. I'll improve it in followup patches. Differential Revision: http://reviews.llvm.org/D20480 llvm-svn: 270262	2016-05-20 19:55:17 +00:00
Matthias Braun	858d1df246	LiveIntervalAnalysis: Fix missing defs in renameDisconnectedComponents(). Fix renameDisconnectedComponents() creating vreg uses that can be reached from function begin without having a definition (or explicit live-in). Fix this by inserting IMPLICIT_DEF instructions before control-flow joins as necessary. Removes an assert from MachineScheduler because we may now get additional IMPLICIT_DEF when preparing the scheduling policy. This fixes the underlying problem of http://llvm.org/PR27705 llvm-svn: 270259	2016-05-20 19:46:13 +00:00
Jun Bum Lim	b21d4e17a2	[AArch64] Disable narrow load merge by default Summary: As this optimization converts two loads into one load with two shift instructions, it could potentially hurt performance if a loop is arithmetic operation intensive. Reviewers: t.p.northover, mcrosier, jmolloy Subscribers: evandro, jmolloy, aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20172 llvm-svn: 270251	2016-05-20 18:45:49 +00:00
Mark Lacey	9b5fcf65ec	Functions with differing phis should not be merged. Check that the incoming blocks of phi nodes are identical, and block function merging if they are not. rdar://problem/26255167 Differential Revision: http://reviews.llvm.org/D20462 llvm-svn: 270250	2016-05-20 18:39:11 +00:00
Chris Bieneman	64c9e1227e	[MachOYAML] Removing duplicated field from LC_UUID YAML The uuid_command was duplicating the load_command.cmdsize field. This removes the duplicate from the YAML mapping and from the test cases. llvm-svn: 270248	2016-05-20 18:36:52 +00:00
Chris Bieneman	be70933d3c	[obj2yaml][yaml2obj] Adding enumFallback for MachO load commands This adds support for handling unknown load commands, and a bogus_load_command tests. Unknown or unsupported load commands can be specified in YAML by their hex value. llvm-svn: 270239	2016-05-20 17:20:42 +00:00
Simon Pilgrim	55ef3da27b	[X86][AVX] Generalized matching for target shuffle combines This patch is a first step towards a more extendible method of matching combined target shuffle masks. Initially this just pulls out the existing basic mask matches and adds support for some 256/512 bit equivalents. Future patterns will require a number of features to be added but I wanted to keep this patch simple. I hope we can avoid duplication between shuffle lowering and combining and share more complex pattern match functions in future commits. Differential Revision: http://reviews.llvm.org/D19198 llvm-svn: 270230	2016-05-20 16:19:30 +00:00
Simon Pilgrim	acb71db577	[X86][AVX] Sync with clang/test/CodeGen/avx-builtins.c llvm-svn: 270229	2016-05-20 16:05:55 +00:00
Sanjay Patel	75892a1543	[SimplifyCFG] eliminate switch cases based on known range of switch condition This was noted in PR24766: https://llvm.org/bugs/show_bug.cgi?id=24766#c2 We may not know whether the sign bit(s) are zero or one, but we can still optimize based on knowing that the sign bit is repeated. Differential Revision: http://reviews.llvm.org/D20275 llvm-svn: 270222	2016-05-20 14:53:09 +00:00
Sanjay Patel	42bbe77009	[MCExpr] avoid UB via negation of INT_MIN I accidentally exposed a bug in MCExpr::evaluateAsRelocatableImpl() with the test file added in: http://reviews.llvm.org/rL269977 Differential Revision: http://reviews.llvm.org/D20434 llvm-svn: 270218	2016-05-20 14:09:41 +00:00
Krzysztof Parzyszek	a053101c95	[Hexagon] Use pipe instead of temporary files in tests llvm-svn: 270217	2016-05-20 14:01:34 +00:00
Rafael Espindola	c7e9813228	Refactor X86 symbol access classification. This refactors the logic in X86 to avoid code duplication. It also splits it in two steps: it first decides if a symbol is local to the DSO and then uses that information to decide how to access it. The first part is implemented by shouldAssumeDSOLocal. It is not in any way specific to X86. In a followup patch I intend to move it to somewhere common and reuse it in other backends. llvm-svn: 270209	2016-05-20 12:20:10 +00:00
Rafael Espindola	8571aa3d5d	Simplify handling of hidden stubs on PowerPC. We now handle them just like non hidden ones. This was already the case on x86 (r207518) and arm (r207517). llvm-svn: 270205	2016-05-20 12:00:52 +00:00
Igor Kudrin	ac40e81987	[Coverage] Fix an issue where improper coverage mapping data could be loaded for an inline function. If an inline function is observed but unused in a translation unit, dummy coverage mapping data with zero hash is stored for this function. If such a coverage mapping section came earlier than real one, the latter was ignored. As a result, llvm-cov was unable to show coverage information for those functions. Differential Revision: http://reviews.llvm.org/D20286 llvm-svn: 270194	2016-05-20 09:14:24 +00:00
Chris Dewhurst	0dfa6bc004	[Sparc] Enable more inline assembly constraints. Note: This is specifically to allow GCC's test pr44707 to pass. Trivial change, not put for differential revision. Test included. llvm-svn: 270192	2016-05-20 09:03:01 +00:00
Craig Topper	25363178bb	[X86] Run the AVX/AVX2 intrinsic tests in AVX512VL mode too just to make sure we don't break any older intrinsics. llvm-svn: 270183	2016-05-20 05:10:32 +00:00
Craig Topper	565463fbba	Revert accidental commit of a test command line addition. llvm-svn: 270175	2016-05-20 02:01:51 +00:00
Craig Topper	0a7a8dee2b	[X86] Fix some AVX patterns to only be disabled if VLX and BWI are supported. Without this we get isel failures on the avx-intrinsics-x86.ll test in AVX512VL. llvm-svn: 270174	2016-05-20 02:00:08 +00:00
Chris Bieneman	db373bed66	[obj2yaml] [yaml2obj] Adding a test for r270124 This test covers strings after load command structs and zero fill bytes. llvm-svn: 270159	2016-05-19 23:26:39 +00:00
Chris Bieneman	1abf005fe6	[yaml2obj] Removing debug code that scribbled 0xDEADBEEF Now that MachO load command fields are fully covered we can fill unaccounted for bytes with 0. That allows us to sparsely specify YAML to simplify tests. Simplifying load_commands test accordingly. llvm-svn: 270158	2016-05-19 23:26:31 +00:00
Lang Hames	45bd7ca7fc	[RuntimeDyld][MachO] Add support for SUBTRACTOR relocations between anonymous symbols on x86-64. llvm-svn: 270157	2016-05-19 23:26:05 +00:00
Easwaran Raman	bb578ef0dd	Allow -inline-threshold to override default threshold. Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior. Differential Revision: http://reviews.llvm.org/D20452 llvm-svn: 270153	2016-05-19 23:02:09 +00:00
Sanjoy Das	f5f0331a3b	[GuardWidening] Introduce range check merging Sequences of range checks expressed using guards, like guard((I - 2) u< L) guard((I - 1) u< L) guard((I + 0) u< L) guard((I + 1) u< L) guard((I + 2) u< L) can sometimes be combined into a smaller sequence: guard((I - 2) u< L AND (I + 2) u< L) if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of the checks expressed in the previous sequence. This change teaches GuardWidening to do this kind of merging when feasible. llvm-svn: 270151	2016-05-19 22:55:46 +00:00
Matthew Simpson	476c0afc01	[ARM, AArch64] Match additional patterns to ldN instructions When matching an interleaved load to an ldN pattern, the interleaved access pass checks that all users of the load are shuffles. If the load is used by an instruction other than a shuffle, the pass gives up and an ldN is not generated. This patch considers users of the load that are extractelement instructions. It attempts to modify the extracts to use one of the available shuffles rather than the load. After the transformation, the load is only used by shuffles and will then be matched with an ldN pattern. Differential Revision: http://reviews.llvm.org/D20250 llvm-svn: 270142	2016-05-19 21:39:00 +00:00
Guozhi Wei	b1d37199cc	[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703. If there is a sequence of one or more load instructions in which each loaded value is used as the address of a later load instruction, the bitcast is necessary to change the value type, so don't optimize it away. llvm-svn: 270135	2016-05-19 21:07:01 +00:00
Sanjay Patel	cfe75fa72e	comment out line that is causing UBSAN bot failures Patch is awaiting review here: http://reviews.llvm.org/D20434 llvm-svn: 270128	2016-05-19 21:00:02 +00:00
Wei Mi	0456d9dd18	Recommit r255691 since PR26509 has been fixed. llvm-svn: 270113	2016-05-19 20:38:03 +00:00
Hans Wennborg	172eee9cfc	X86: Don't reset the stack after calls that don't return (PR27117) Since the calls don't return, the instruction afterwards will never run, and is just taking up unnecessary space in the binary. Differential Revision: http://reviews.llvm.org/D20406 llvm-svn: 270109	2016-05-19 20:15:33 +00:00
Sanjay Patel	c48a879ef8	[x86] add tests for urem lowering llvm-svn: 270096	2016-05-19 18:57:54 +00:00
Rui Ueyama	0376b1a2d7	pdbdump: Rename NumberOfSymbols -> SymbolRecordStreamIndex. Differential Revision: http://reviews.llvm.org/D20441 llvm-svn: 270088	2016-05-19 18:05:58 +00:00
Simon Pilgrim	7a8dcf2556	[X86][SSE] Added fast-isel tests to sync with clang/test/CodeGen/sse-builtins.c llvm-svn: 270081	2016-05-19 16:55:52 +00:00
Simon Pilgrim	b1ff2dd145	[X86][SSE2] Fixed shuffle of results in _mm_cmpnge_sd/_mm_cmpngt_sd tests llvm-svn: 270080	2016-05-19 16:49:53 +00:00
George Rimar	cf2bf9d015	Temporarily revert r270070 It broke buildbot: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/4817/steps/ninja%20check%201/logs/stdio Actually it is just because D20273 is not yet committed, but these 2 were crossing with each other, and I'd better find a way to land them separately soon. Initial commit message: [llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style. Before this patch llvm-mc generated zlib-gnu styled sections. That means no SHF_COMPRESSED flag was set, magic 'zlib' signature was used in combination with full size field. Sections were renamed to ".z". This patch reimplements the compression style to zlib one as zlib-gnu looks to be deprecated everywhere. Differential revision: http://reviews.llvm.org/D20331 llvm-svn: 270075	2016-05-19 15:58:05 +00:00
Matthew Simpson	6feebe9847	[LAA] Check independence of strided accesses before forward case This patch changes the order in which we attempt to prove the independence of strided accesses. We previously did this after we knew the dependence distance was positive. With this change, we check for independence before handling the negative distance case. The patch prevents LAA from reporting forward dependences for independent strided accesses. This change was requested in the review of D19984. llvm-svn: 270072	2016-05-19 15:37:19 +00:00
George Rimar	99c901fc47	[llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style. Before this patch llvm-mc generated zlib-gnu styled sections. That means no SHF_COMPRESSED flag was set, magic 'zlib' signature was used in combination with full size field. Sections were renamed to ".z". This patch reimplements the compression style to zlib one as zlib-gnu looks to be deprecated everywhere. Differential revision: http://reviews.llvm.org/D20331 llvm-svn: 270070	2016-05-19 15:08:31 +00:00
Chad Rosier	02f25a9565	[AArch64] Generate a BFXIL from 'or (and X, Mask0Imm),(and Y, Mask1Imm)'. Mask0Imm and ~Mask1Imm must be equivalent and one of the MaskImms is a shifted mask (e.g., 0x000ffff0). Both 'and's must have a single use. This changes code like: and w8, w0, #0xffff000f and w9, w1, #0x0000fff0 orr w0, w9, w8 into lsr w8, w1, #4 bfi w0, w8, #4, #12 llvm-svn: 270063	2016-05-19 14:19:47 +00:00
Ranjeet Singh	dbbbef5401	[ARM] Add cdp intrinsic tests. - Renamed intrinsics.ll to intrinsics-coprocessor.ll as all the tests were testing coprocessor instructions, also made the test checks match the full instruction. Differential Revision: http://reviews.llvm.org/D20393 llvm-svn: 270057	2016-05-19 12:59:17 +00:00
Artem Tamazov	8ce1f7177b	[AMDGPU][llvm-mc] Fixes to support buffer atomics. Fixes for MUBUF_Atomic instructions to make operand list valid: - For RTN insns, make a copy of $vdata_in operand as $vdata. - Do not add operand for GLC, it is hardcoded and comes as a token. Workaround to avoid adding multiple default optional operands. Tests added. Differential Revision: http://reviews.llvm.org/D20257 llvm-svn: 270049	2016-05-19 12:22:39 +00:00
Zoran Jovanovic	5f94cedeb5	[mips][microMIPS] Add R_MICROMIPS_PC21_S1 relocation Differential Revision: http://reviews.llvm.org/D15526 llvm-svn: 270048	2016-05-19 12:20:40 +00:00
Simon Pilgrim	47825fad71	[X86][SSE2] Added _mm_move_* tests llvm-svn: 270046	2016-05-19 11:59:57 +00:00
Simon Pilgrim	01809e0506	[X86][SSE2] Added _mm_cast* and _mm_set* tests llvm-svn: 270041	2016-05-19 10:58:54 +00:00
Daniel Sanders	2f2ab5102c	[mips][mips16] Fix ZERO is not a CPU16Regs register error from the machine verifier. Summary: Partially fixes PR27458 Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D20330 llvm-svn: 270037	2016-05-19 10:42:14 +00:00
Andrey Turetskiy	45b22a4aff	[X86] Enable RRL part of the LEA optimization pass for -O2. Enable "Remove Redundant LEAs" part of the LEA optimization pass for -O2. This gives a 6.4% performance improvement on Broadwell on the nnet benchmark from Coremark-pro. There is no significant effect on other benchmarks (Geekbench, Spec2000, Spec2006). Differential Revision: http://reviews.llvm.org/D19659 llvm-svn: 270036	2016-05-19 10:18:29 +00:00
Zlatko Buljan	e663e34e79	[mips][microMIPS] Implement BC1EQZC, BC1NEZC, BC2EQZC and BC2NEZC instructions Differential Revision: http://reviews.llvm.org/D18352 llvm-svn: 270030	2016-05-19 07:31:28 +00:00
Peter Collingbourne	fe12d0e3e5	CodeGen: Make the global-merge pass independently testable, and add a test. llvm-svn: 270023	2016-05-19 04:38:56 +00:00
Sanjoy Das	b784ed36c0	[GuardWidening] Use getEquivalentICmp to fold constant compares `ConstantRange::getEquivalentICmp` is more general, and better factored. llvm-svn: 270019	2016-05-19 03:53:17 +00:00
Dan Gohman	537bc9b9f5	[WebAssembly] Make several CHECK lines less fragile using regexes and CHECK-DAG. llvm-svn: 270011	2016-05-19 01:52:56 +00:00
Matt Arsenault	c438ef574d	AMDGPU: Fix promote alloca for pointer loads If the load has a pointer type, we don't want to change its type. llvm-svn: 270000	2016-05-18 23:20:24 +00:00
Sanjoy Das	083f38939b	New pass: guard widening Summary: Implement guard widening in LLVM. Description from GuardWidening.cpp: The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM transform it so that it fails more often than it did before the transform. This optimization is called "widening" and can be used to hoist and common runtime checks in situations like these: ``` %cmp0 = 7 u< Length call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ] call @unknown_side_effects() %cmp1 = 9 u< Length call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ] ... ``` to ``` %cmp0 = 9 u< Length call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ] call @unknown_side_effects() ... ``` If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back to a generic implementation of the same function, which will have the correct semantics from that point onward. It is always _legal_ to deoptimize (so replacing `%cmp0` with false is "correct"), though it may not always be profitable to do so. NB! This pass is a work in progress. It hasn't been tuned to be "production ready" yet. It is known to have quadratic running time and will not scale to large numbers of guards Reviewers: reames, atrick, bogner, apilipenko, nlewycky Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20143 llvm-svn: 269997	2016-05-18 22:55:34 +00:00
Dehao Chen	f16376b505	Follow-up patch of http://reviews.llvm.org/D19948 to handle missing profiles when simplifying CFG. Summary: Set default branch weight to 1:1 if one of the branches has its profile missing when simplifying CFG. Reviewers: spatel, davidxl Subscribers: danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D20307 llvm-svn: 269995	2016-05-18 22:41:03 +00:00
Rafael Espindola	8c34dd8257	Delete Reloc::Default. Having an enum member named Default is quite confusing: Is it distinct from the others? This patch removes that member and instead uses Optional<Reloc> in places where we have a user input that still hasn't been mapped to the default value, which now clearly has to be one of the remaining 3 options. llvm-svn: 269988	2016-05-18 22:04:49 +00:00
Michael Zolotukhin	d2268a73bc	[LoopUnrollAnalyzer] Take into account cost of instructions controlling branches, along with their operands. Previously, we didn't add their cost or their operands' cost, which could've resulted in unrolling loops for no actual benefit. llvm-svn: 269985	2016-05-18 21:20:12 +00:00
Sanjay Patel	fbb9a5e91f	[x86] add test for immediate comment formatting llvm-svn: 269977	2016-05-18 20:26:32 +00:00
Chris Bieneman	6f6f9f0bb9	Fixing test failure on Windows bot http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc19-DA/builds/553/steps/test-llvm/logs/LLVM%20%3A%3A%20ObjectYAML__MachO__load_commands.yaml llvm-svn: 269975	2016-05-18 20:01:48 +00:00
Krzysztof Parzyszek	14a1c18448	When looking for a spill slot in reg scavenger, find one that matches RC When looking for an available spill slot, the register scavenger would stop after finding the first one with no register assigned to it. That slot may have size and alignment that do not meet the requirements of the register that is to be spilled. Instead, find an available slot that is the closest in size and alignment to one that is needed to spill a register from RC. Differential Revision: http://reviews.llvm.org/D20295 llvm-svn: 269969	2016-05-18 18:16:00 +00:00
Simon Pilgrim	5a0d728181	[X86][SSE2] Added fast-isel tests to sync with clang/test/CodeGen/sse2-builtins.c llvm-svn: 269966	2016-05-18 18:00:43 +00:00
Rui Ueyama	350b29862f	pdbdump: Print out section offsets in the publics stream. llvm-svn: 269955	2016-05-18 16:24:16 +00:00
Chris Bieneman	2de17d49dd	Re-apply: [obj2yaml] [yaml2obj] Support MachO section and section_64 This re-applies r269845, r269846, and r269850 with an included fix for a crash reported by zturner. llvm-svn: 269953	2016-05-18 16:17:23 +00:00
Matt Arsenault	1735da460b	AMDGPU: Other sizes of popcnt are fast We can chain bcnt instructions together, so any width popcnt is pretty fast. llvm-svn: 269950	2016-05-18 16:10:19 +00:00
Hans Wennborg	8eb336c14e	Re-commit r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions" with an additional fix to make RegAllocFast ignore undef physreg uses. It would previously get confused about the "push %eax" instruction's use of eax. That method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate as well, but since that runs after register-allocation, we didn't run into the RegAllocFast issue before. llvm-svn: 269949	2016-05-18 16:10:17 +00:00
Matt Arsenault	9430b9113a	AMDGPU: Fix assert when erroring on a call For some reason an assert is now hit when a valid chain is not returned, so return the entry chain. llvm-svn: 269948	2016-05-18 16:10:11 +00:00
Matt Arsenault	891fccc0c1	AMDGPU: Handle alloca promoting with null operands If the second pointer in a multi-pointer instruction is a constant, we can replace the type. llvm-svn: 269945	2016-05-18 15:57:21 +00:00
Matt Arsenault	71fa1f375e	AMDGPU: Fix a few slightly broken tests Fix minor bugs and uses of undef which break when pointer related optimization passes are run. llvm-svn: 269944	2016-05-18 15:48:44 +00:00
Davide Italiano	98f7e0e790	[PM] Port per-function SCCP to the new pass manager. llvm-svn: 269937	2016-05-18 15:18:25 +00:00
Krzysztof Parzyszek	ca3b532e2c	[Hexagon] Recognize "q" and "v" in inline-asm as register constraints llvm-svn: 269933	2016-05-18 14:34:51 +00:00
Dan Gohman	b4c3c38276	[WebAssembly] Don't expand divisions by constants. Don't expand divisions by constants if it would require multiple instructions. The current assumption is that engines will perform the desired optimizations. llvm-svn: 269930	2016-05-18 14:29:42 +00:00
Simon Pilgrim	9829df5d56	[X86][SSE42] Added fast-isel tests to sync with clang/test/CodeGen/sse42-builtins.c llvm-svn: 269929	2016-05-18 14:28:54 +00:00
Simon Pilgrim	3b93835f5d	[X86][SSE41] Sync with clang/test/CodeGen/sse41-builtins.c llvm-svn: 269925	2016-05-18 13:46:10 +00:00
Bryan Chan	e656f61d1e	[SystemZ] Fix register ordering for BinaryRRF instructions Summary: The ordering of registers in BinaryRRF instructions is wrong and affects the copysign instruction (CPSDR). This results in the wrong magnitude and sign being set. Author: zhanjunl Reviewers: kbarton, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20308 llvm-svn: 269922	2016-05-18 13:24:57 +00:00
Simon Pilgrim	0b0a583151	[X86][SSE3] Sync with clang/test/CodeGen/sse3-builtins.c llvm-svn: 269920	2016-05-18 13:16:31 +00:00
Ashutosh Nema	348af9cc6b	Add new flag and intrinsic support for MWAITX and MONITORX instructions Summary: MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT pair while adding a timer function, such that another termination of the MWAITX instruction occurs when the timer expires. The presence of the MONITORX and MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29. The MONITORX and MWAITX instructions are intercepted by the same bits that intercept MONITOR and MWAIT. The MONITORX instruction establishes a range to be monitored. The MWAITX instruction causes the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is "0F 01 FB". This opcode information is used in adding tests for the disassembler. These instructions are enabled for AMD's bdver4 architecture. Patch by Ganesh Gopalasubramanian! Reviewers: echristo, craig.topper, RKSimon Subscribers: RKSimon, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19795 llvm-svn: 269911	2016-05-18 11:59:12 +00:00
Rafael Espindola	699281cce7	Don't pass a Reloc::Model to MC. MC only needs to know if the output is PIC or not. It never has to decide about creating GOTs and PLTs for example. The only thing that MC itself uses this information for is expanding "macros" in sparc and mips. The rest I am pretty sure could be moved to CodeGen. This is a cleanup and isolates the code from future changes to Reloc::Model. llvm-svn: 269909	2016-05-18 11:58:50 +00:00
Simon Pilgrim	324d9200d6	[X86][SSSE3] Sync with clang/test/CodeGen/ssse3-builtins.c llvm-svn: 269903	2016-05-18 11:19:17 +00:00
Simon Pilgrim	102627405c	[X86][SSE4A] Sync with clang/test/CodeGen/sse4a-builtins.c llvm-svn: 269902	2016-05-18 11:14:58 +00:00
Simon Dardis	1549a2f46a	[mips] Restrict the creation of compact branches Restrict the creation of compact branches so that they meet the ISA encoding requirements. Notably do not permit $zero to be used as an operand for compact branches and ensure that some other branches fulfil the requirement that rs != rt. Fixup cases where $rs > $rt for bnec and beqc. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D20284 llvm-svn: 269893	2016-05-18 09:21:44 +00:00
Chris Dewhurst	68388a0a99	[Sparc] Add Soft Float support This change adds support for software floating point operations for Sparc targets. This is the first in a set of patches to enable software floating point on Sparc. The next patch will enable the option to be used with Clang. Differential Revision: http://reviews.llvm.org/D19265 llvm-svn: 269892	2016-05-18 09:14:13 +00:00
Igor Kudrin	eb10307347	[Coverage] Ensure that coverage mapping data has an expected alignment in 'covmapping' files. Coverage mapping data is organized in a sequence of blocks, each of which is expected to be aligned by 8 bytes. This feature is used when reading those blocks, see VersionedCovMapFuncRecordReader::readFunctionRecords(). If misaligned coverage mapping data has more than one block, it causes llvm-cov to fail. Differential Revision: http://reviews.llvm.org/D20285 llvm-svn: 269887	2016-05-18 07:43:27 +00:00
Zlatko Buljan	6afea51a58	[mips][microMIPS] Implement LH, LHE, LHU and LHUE instructions and add CodeGen support Differential Revision: http://reviews.llvm.org/D15418 llvm-svn: 269883	2016-05-18 06:54:59 +00:00
Rafael Espindola	cdb2a15d9d	Don't pass relocation-model= to tests that don't need it. Very few things in MC itself use the option. Most of the code that uses it could be moved to CodeGen. llvm-svn: 269871	2016-05-18 00:27:17 +00:00
Zachary Turner	b18921b565	Revert "[obj2yaml] [yaml2obj] Support MachO section and section_64 structs" This reverts commits r269845, r269846, and r269850 as they introduce a crash in obj2yaml when trying to do a roundtrip. llvm-svn: 269865	2016-05-17 23:38:22 +00:00
Dan Gohman	7100809080	[WebAssembly] Rename $discard to $drop in the assembly output. llvm-svn: 269862	2016-05-17 23:19:03 +00:00
Rui Ueyama	8dc18c5f45	pdbdump: Print out more structures. I don't yet fully understand the meaning of these data structures, but at least it seems that their sizes and types are correct. With this change, we can read publics streams till end. Differential Revision: http://reviews.llvm.org/D20343 llvm-svn: 269861	2016-05-17 23:07:48 +00:00
Dan Gohman	1054570a29	[WebAssembly] Model the stack evaluation order more precisely. We currently don't represent get_local and set_local explicitly; they are just implied by virtual register use and def. This avoids a lot of clutter, but it does complicate stackifying: get_locals read their operands at their position in the stack evaluation order, rather than at their parent instruction. This patch adds code to walk the stack to determine the precise ordering, when needed. llvm-svn: 269854	2016-05-17 22:24:18 +00:00
David Blaikie	8bef4125f2	llvm-dwp: Add error handling for multiple type sections in a dwp file. llvm-svn: 269851	2016-05-17 22:00:57 +00:00
Chris Bieneman	dd3c6b42c5	Fixing a test case that I broke by fixing r269846 This should fix the bots. llvm-svn: 269850	2016-05-17 21:55:45 +00:00
Justin Bogner	594e07bd78	[PM] Port DSE to the new pass manager Patch by JakeVanAdrighem. Thanks! llvm-svn: 269847	2016-05-17 21:38:13 +00:00
Chris Bieneman	7b504b7531	[obj2yaml] [yaml2obj] Support MachO section and section_64 structs This patch adds round trip support for MachO section structs. llvm-svn: 269845	2016-05-17 21:31:02 +00:00
Dan Gohman	d08cd15f33	[WebAssembly] Don't stackify calls past stack pointer modifications. llvm-svn: 269843	2016-05-17 21:14:26 +00:00
Hans Wennborg	759af30109	Revert r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions" Seems to have broken the Windows ASan bot. Reverting while investigating. llvm-svn: 269833	2016-05-17 20:38:56 +00:00
Sanjay Patel	22b01febd4	[InstCombine] add another test for wrong icmp constant (PR27792) It doesn't matter if the comparison is unsigned; the inc/dec is always signed. llvm-svn: 269831	2016-05-17 20:20:40 +00:00
Dan Gohman	12de0b91ac	[WebAssembly] Stackify induction variable increment instructions. This handles instructions where the defined register is also used, as in "x = x + 1". llvm-svn: 269830	2016-05-17 20:19:47 +00:00
Hans Wennborg	c3fb51171e	X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions This patch moves the expansion of WIN_ALLOCA pseudo-instructions into a separate pass that walks the CFG and lowers the instructions based on a conservative estimate of the offset between the stack pointer and the lowest accessed stack address. The goal is to reduce binary size and run-time costs by removing calls to _chkstk. While it doesn't fix all the code quality problems with inalloca calls, it's an incremental improvement for PR27076. Differential Revision: http://reviews.llvm.org/D20263 llvm-svn: 269828	2016-05-17 20:13:29 +00:00
Adrian Prantl	f0a41089ff	Debug Info: Don't emit bitfields in the DWARF4 format when tuning for GDB. As discovered in PR27758, GDB does not fully support the DWARF 4 format. This patch ensures we always emit bitfields in the DWARF 2 when tuning for GDB. llvm-svn: 269827	2016-05-17 20:12:08 +00:00
Renato Golin	38ed8021c7	Fix an assert in SelectionDAGBuilder when processing inline asm When processing inline asm that contains errors, make sure we can recover gracefully by creating an UNDEF SDValue for the inline asm statement before returning from SelectionDAGBuilder::visitInlineAsm. This is necessary for consumers that don't exit on the first error that is emitted (e.g. clang) and that would assert later on. Fixes PR24071. Patch by Diana Picus. llvm-svn: 269811	2016-05-17 19:52:01 +00:00
Chris Bieneman	3f2eb8369e	Reapply r269782 "[obj2yaml] [yaml2obj] Support for MachO load command structures" This adds support for all the MachO *_command structures. The load_command payloads still are not represented, but that will come next. llvm-svn: 269808	2016-05-17 19:44:06 +00:00
Sanjay Patel	de96f39392	[InstCombine] add test for wrong icmp constant (PR27792) The code fix for this was checked in at r269797. llvm-svn: 269803	2016-05-17 19:25:55 +00:00
Reid Kleckner	8e96c3e9dd	[ThinLTO] Use semicolon to separate path prefix replacement Summary: Colons can appear in Windows paths after drive letters. Both colon and semicolon are valid characters in filenames, but neither are very common. Semicolon seems just as good, and makes the test pass on Windows. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D20332 llvm-svn: 269798	2016-05-17 18:43:22 +00:00
Sanjoy Das	fd67038c8b	[Guards] Add branch metadata when lowering Guards are expected to basically never fail. Reflect this in the branch probabilities in their lowered form. llvm-svn: 269791	2016-05-17 17:51:19 +00:00
Sanjoy Das	f5d40d5350	[SCEV] Be more aggressive in proving NUW ... for AddRec's in loops for which SCEV is unable to compute a max tripcount. This is the NUW variant of r269211 and fixes PR27691. (Note: PR27691 is not a correctness or stability bug, it was created to track a pending task). llvm-svn: 269790	2016-05-17 17:51:14 +00:00
Chris Bieneman	1c0f0b242d	Revert "[obj2yaml] [yaml2obj] Support for MachO load command structures" This reverts commit r269782 because it broke bots with -fpermissive. llvm-svn: 269785	2016-05-17 17:13:50 +00:00
Kevin Enderby	ac9e15551d	Change llvm-objdump, llvm-nm and llvm-size when reporting an object file error when the object is in an archive to use something like libx.a(foo.o) as part of the error message. Also changed llvm-objdump and llvm-size to be like llvm-nm and ignore non-object files in archives and not produce any error message. To do this Archive::Child::getAsBinary() was changed from ErrorOr<...> to Expected<...> then that was threaded up to its users. Converting this interface to Expected<> from ErrorOr<> does involve touching a number of places. To contain the changes for now, errorToErrorCode() is still used in one place yet to be fully converted. Again, there were some bugs in the existing code that did not deal with the old ErrorOr<> return values. So now with Expected<>, since they must be checked and the error handled, I added a TODO and comments for those. llvm-svn: 269784	2016-05-17 17:10:12 +00:00
Chris Bieneman	3552c426e9	[obj2yaml] [yaml2obj] Support for MachO load command structures This adds support for all the MachO *_command structures. The load_command payloads still are not represented, but that will come next. llvm-svn: 269782	2016-05-17 17:03:28 +00:00
Reid Kleckner	fcc5550544	[codeview] Test serialization of all known type records This just checks that we emit all type records once, and then after merging the type stream with no other type streams, we still emit every kind of type record. We could test the dumper output more closely, but that would make the test very brittle. Currently we're just getting coverage. llvm-svn: 269778	2016-05-17 16:20:35 +00:00
Teresa Johnson	ad66eaec2c	[ThinLTO] Force disable test on Windows via REQUIRES shell The "XFAIL: win32" was not enough to get the test to XFAIL on the bot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/5478 For now, use "REQUIRES: shell" to suppress test on Windows while we investigate. llvm-svn: 269777	2016-05-17 16:06:16 +00:00
Rafael Espindola	712f957cae	Simplify handling of hidden stub. Since r207518 they are printed exactly like non-hidden stubs on x86 and since r207517 on ARM. This means we can use a single set for all stubs in those platforms. llvm-svn: 269776	2016-05-17 16:01:32 +00:00
Teresa Johnson	1e7d4ab9a3	[ThinLTO] XFAIL path manipulation test on Windows This test is creating and checking paths using '/'. XFAIL it on Windows to unbreak bot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/5478 llvm-svn: 269775	2016-05-17 15:26:13 +00:00
Teresa Johnson	bbd10b4579	[ThinLTO] Option to control path of distributed backend files Summary: Add support to control where files for a distributed backend (the individual index files and optional imports files) are created. This is invoked with a new thinlto-prefix-replace option in the gold plugin and llvm-lto. If specified, expects a string of the form "oldprefix:newprefix", and instead of generating these files in the same directory path as the corresponding bitcode file, will use a path formed by replacing the bitcode file's path prefix matching oldprefix with newprefix. Also add a new replace_path_prefix helper to Path.h in libSupport. Depends on D19636. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19644 llvm-svn: 269771	2016-05-17 14:45:30 +00:00
Simon Pilgrim	2ea513847c	[CostModel][X86] Tidied up checks llvm-svn: 269770	2016-05-17 14:43:41 +00:00
Rafael Espindola	9210be5431	Add a test showing how hidden stubs are handled on ppc. llvm-svn: 269766	2016-05-17 14:24:33 +00:00
Igor Laevsky	953f2d2a54	[RewriteStatepointsForGC] Remove obsolete assertion This assertion is no longer necessary since we never record constants in the live set anyway. (They are never recorded in the initial live set, and constant bases are removed near line 2119) Differential Revision: http://reviews.llvm.org/D20293 llvm-svn: 269764	2016-05-17 13:54:10 +00:00
Renato Golin	57bfb69aa4	[ARM] ARM mov InstAlias for MOVW lacks HasV6T2 The movw instruction is only available in ARM state for V6T2 and above. The MOVi16 instruction has requirement HasV6T2 but the InstAlias for mov rd, imm where the operand is imm0_65535_expr:$imm does not. This means that movw can incorrectly be used in ARMv4 and ARMv5 by writing mov rd, 0x1234. The simple fix is to add the requirement HasV6T2 to the InstAlias. Tests added to not-armv4.s. Patch by Peter Smith. llvm-svn: 269761	2016-05-17 13:05:28 +00:00
David L Kreitzer	e7c583e06f	Fix for PR27750. Correctly handle the case where the fallthrough block and target block are the same in getFallThroughMBB. Differential Revision: http://reviews.llvm.org/D20288 llvm-svn: 269760	2016-05-17 12:47:46 +00:00
Benjamin Kramer	ca9a0fe2b9	[InstCombine] Don't crash when trying to take an element of a ConstantExpr. Fixes PR27786. llvm-svn: 269757	2016-05-17 12:08:55 +00:00
Zoran Jovanovic	84e4d59e47	[mips][microMIPS] Implement BEQZC and BNEZC instructions Differential Revision: http://reviews.llvm.org/D15417 llvm-svn: 269755	2016-05-17 11:10:15 +00:00
Simon Dardis	8d8f2f8b8d	[mips] Compact branch policy control for MIPSR6 This patch adds the commandline option -mips-compact-branches={never,optimal,always}, which controls how LLVM generates compact branches for MIPS targets. By default, the compact branch policy is 'optimal' where LLVM will (hopefully) pick the optimal branch for any situation. The 'never' policy will disable the generation of compact branches and 'always' will generate compact branches wherever possible. Reviewers: dsanders Differential Review: http://reviews.llvm.org/D20167 llvm-svn: 269753	2016-05-17 10:21:43 +00:00
Zlatko Buljan	e9abe8816c	[mips][microMIPS][DSP] Implement BALIGN, BITREV, BPOSGE32, CMP, CMPGDU, CMPGU* and CMPU* instructions Differential Revision: http://reviews.llvm.org/D16182 llvm-svn: 269752	2016-05-17 09:32:58 +00:00
Dan Gohman	2644d74bc2	[WebAssembly] Improve the precision of memory and side effect dependence tracking. MachineInstr::isSafeToMove is more conservative than is needed here; use a more explicit check, and incorporate knowledge of some WebAssembly-specific opcodes. llvm-svn: 269736	2016-05-17 04:05:31 +00:00
Adrian Prantl	7aa34c8cbb	Debug Info: Don't emit a DW_AT_data_member_location for DWARF bitfields. The DWARF spec states that a member entry may have either a DW_AT_data_member_location or a DW_AT_data_bit_offset, but not both. This fixes a bug found in PR 27758. llvm-svn: 269731	2016-05-17 02:37:53 +00:00
Sanjay Patel	e9b2c32e7f	[InstCombine] check vector elements before trying to transform LE/GE vector icmp (PR27756) Fix a bug introduced with rL269426 : [InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819) We were assuming that a ConstantDataVector / ConstantVector / ConstantAggregateZero operand of an ICMP was composed of ConstantInt elements, but it might have ConstantExpr or UndefValue elements. Handle those appropriately. Also, refactor this function to join the scalar and vector paths and eliminate the switches. Differential Revision: http://reviews.llvm.org/D20289 llvm-svn: 269728	2016-05-17 00:57:57 +00:00
David Blaikie	4940f87bcc	llvm-dwp: Provide error handling for invalid string field forms This diagnostic could be improved by adding the name of the input file containing the invalid data and/or some information about how to identify the specific offending attribute/tag in the input. But that's not an immediate priority as these corner cases of invalid input shouldn't come up too often. llvm-svn: 269727	2016-05-17 00:07:10 +00:00
Easwaran Raman	01d98ba0b2	Remove .hot and .unlikely prefixes from function section names. This code currently relies on static methods in ProfileSummary to determine whether a function is hot or unlikely. I am refactoring the ProfileSummary code and these methods will be removed. As discussed offline, the right way to re-introduce this is to add a pass to annotate functions with unlikely/hot hints and use the hints to determine the prefix here. llvm-svn: 269726	2016-05-16 23:59:04 +00:00
Jan Vesely	687ca8df18	AMDGPU/R600: Use correct number of vector elements when lowering private loads Reviewer: tstellardAMD, arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D20032 llvm-svn: 269725	2016-05-16 23:56:32 +00:00
David Blaikie	7bb62ef1e9	llvm-dwp: Add error handling for invalid (non-CU) top level tag in debug_info.dwo The diagnostic could be improved a bit to include information about which input file had the mistake (& which unit (counted, since the name of the unit won't be accessible) within the input). llvm-svn: 269723	2016-05-16 23:26:29 +00:00
Adrian Prantl	e7d833defb	Debug info: Don't emit a DW_AT_byte_size when emitting a DWARF4 bit field. The DWARF spec clearly states that a bit field member should have either a DW_AT_byte_size or a DW_AT_bit_size, but not both. Also the DW_AT_byte_size is redundant with the size of the type of the member. This fixes a bug found in PR 27758. llvm-svn: 269714	2016-05-16 22:45:10 +00:00
Matt Arsenault	14a4d319dd	AMDGPU: Add some private element size tests llvm-svn: 269712	2016-05-16 22:17:27 +00:00
Matt Arsenault	8a028bf4d7	AMDGPU: Fix promote alloca pass creating huge arrays This was assuming it could use all memory before, which is a bad decision because it restricts occupancy. By default, only try to use enough space that could reduce occupancy to 7, an arbitrarily chosen limit. Based on the existing LDS usage, try to round up to the limit in the current tier instead of further hurting occupancy. This isn't ideal, because it doesn't accurately know how much space is going to be used for alignment padding. llvm-svn: 269708	2016-05-16 21:19:59 +00:00
Rafael Espindola	e64619ce6e	Fail early on unknown appending linkage variables. In practice only a few well known appending linkage variables work. Currently if codegen sees an unknown appending linkage variable it will just print it as a regular global. That is wrong as the symbol in the produced object file has different semantics from the one provided by the appending linkage. This just errors early instead of producing a broken .o. llvm-svn: 269706	2016-05-16 21:14:24 +00:00
David Blaikie	01704bb1db	llvm-dwp: Add .test files missing from r269339 llvm-svn: 269705	2016-05-16 21:13:15 +00:00
Matt Arsenault	c31a9d0671	SelectionDAG: Select min/max when both are used Allow two users of the condition if the other user is also a min/max select. i.e. %c = icmp slt i32 %x, %y %min = select i1 %c, i32 %x, i32 %y %max = select i1 %c, i32 %y, i32 %x llvm-svn: 269699	2016-05-16 20:58:23 +00:00
David Blaikie	d1f7ab3396	llvm-dwp: Streamline duplicate DWO ID diagnostic handling Actually use the error return path rather than printing the duplicate information then a separate error. But also just tidy up/deduplicate some of the code for generating the diagnostic text. llvm-svn: 269692	2016-05-16 20:42:27 +00:00
Bryan Chan	28b759c4c8	[SystemZ] Support LRVH and STRVH opcodes Summary: On Linux, /usr/include/bits/byteswap-16.h defines __byteswap_16(x) as an inlined LRVH (Load Reversed Half-word) instruction. The SystemZ back-end did not support this opcode and the inlined assembly would cause a fatal error. Reviewers: bryanpkc, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18732 llvm-svn: 269688	2016-05-16 20:32:22 +00:00
Dan Gohman	4817a7577c	[WebAssembly] Mark COPY_LOCAL and TEE_LOCAL instructions as having no side effects. llvm-svn: 269683	2016-05-16 19:16:32 +00:00
Dan Gohman	804749c942	[WebAssembly] Use eqz to negate branch conditions. llvm-svn: 269681	2016-05-16 18:59:34 +00:00
Michael Kuperstein	ac2088d122	[X86] Remove transformVSELECTtoBlendVECTOR_SHUFFLE The new X86 shuffle lowering can do just fine without transforming vselects into vector_shuffles. It looks like the only thing this code does right now is cause trouble - in particular, it can lead to combine/legalization infinite loops. Note that it's not completely NFC, since some of the shuffle masks get inverted, which may cause slight differences further down the line. We may want to find a way to invert those masks, but that's orthogonal to this commit. This fixes the hang in PR27689. llvm-svn: 269676	2016-05-16 18:27:00 +00:00
Matthew Simpson	37ec5f914e	[LAA] Rename forwarding conflict detection option (NFC) This patch renames the option enabling the store-to-load forwarding conflict detection optimization. This change was requested in the review of D20241. llvm-svn: 269668	2016-05-16 17:00:56 +00:00
Krzysztof Parzyszek	0a04ac2153	[Hexagon] Simplify HexagonInstrInfo::isPredicable Remove all the checks for constant extenders from isPredicable. The users of it should be the ones checking cost/profitability. llvm-svn: 269664	2016-05-16 16:56:10 +00:00
Xinliang David Li	f3c7a35238	[PM] Port indirect call promotion pass to new pass manager llvm-svn: 269660	2016-05-16 16:31:07 +00:00
Matthew Simpson	e43198dc4b	[LV] Ensure safe VF for loops with interleaved accesses The selection of the vectorization factor currently doesn't consider interleaved accesses. The vectorization factor is based on the maximum safe dependence distance computed by LAA. However, for loops with interleaved groups, we should instead base the vectorization factor on the maximum safe dependence distance divided by the maximum interleave factor of all the interleaved groups. Interleaved accesses not in a group will be scalarized. Differential Revision: http://reviews.llvm.org/D20241 llvm-svn: 269659	2016-05-16 15:08:20 +00:00
Renato Golin	4b9c0d4dcf	[llc] New diagnostic handler Without a diagnostic handler installed, llc's behaviour is to exit on the first error that it encounters. This is very different from the behaviour of clang and other front ends, which try to gather as many errors as possible before exiting. This commit adds a diagnostic handler to llc, allowing it to find and report more than one error. The old behaviour is preserved under a flag (-exit-on-error). Some of the tests fail with the new diagnostic handler, so they have to use the new flag in order to run under the previous behaviour. Some of these are known bugs, others need further investigation. Ideally, we should fix the tests and remove the flag at some point in the future. Reapplied after fixing the LLDB build that was broken due to the new DiagnosticSeverity in LLVMContext.h, and fixed an UB in the new change. Patch by Diana Picus. llvm-svn: 269655	2016-05-16 14:28:02 +00:00
Simon Pilgrim	265995ef53	[X86][SSSE3] Lower vector CTLZ with PSHUFB lookups This patch uses PSHUFB to lower vector CTLZ and avoid (slower) scalarizations. The leading zero count of each 4-bit nibble of the vector is determined by using a PSHUFB lookup. Pairs of results are then repeatedly combined up to the original element width. Differential Revision: http://reviews.llvm.org/D20016 llvm-svn: 269646	2016-05-16 11:19:11 +00:00
Chris Dewhurst	7d8412ff05	[Sparc][LEON] Add LEON-specific CASA instruction. Differential Revision: http://reviews.llvm.org/D20098 llvm-svn: 269644	2016-05-16 11:02:00 +00:00
Daniel Sanders	a2bde88e62	[mips][ias] Fix R_MICROMIPS_GOT16 evaluation and eliminate symbol for R_MICROMIPS_(GOT\|HI\|LO)16 Summary: The failure r269410 worked around turned out to be caused by an incorrect evaluation of R_MICROMIPS_GOT16 which then caused the GOT entries to be incorrect. This patch fixes the evaluation and reverts r269410. Reviewers: sdardis, vkalintiris, rafael Subscribers: rafael, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D20242 llvm-svn: 269641	2016-05-16 09:33:59 +00:00
Daniel Sanders	cda908a0b6	[mips][ias] EF_MIPS_MICROMIPS should be set iff microMIPS code was emitted. Summary: This fixes PR27682. Additionally, '.set micromips' by itself is not sufficient to raise the EF_MIPS_MICROMIPS flag. It is also necessary to emit a microMIPS instruction. This has also been fixed. Reviewers: sdardis, vkalintiris, rafael Subscribers: rafael, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D20214 llvm-svn: 269639	2016-05-16 09:10:13 +00:00
Zoran Jovanovic	973405bec5	[mips] Addition of a third operand to the instructions [d]div, [d]divu Author: obucina Reviewers: dsanders Adds support for third operand for [D]DIV[U] instructions. Additional test for case when destination reg is zero register Differential Revision: http://reviews.llvm.org/D16888 llvm-svn: 269636	2016-05-16 08:57:59 +00:00
Craig Topper	726cb506ff	[AVX512] Fix mask argument type for insertf32x4/inserti32x4. llvm-svn: 269616	2016-05-15 21:24:45 +00:00
Simon Pilgrim	26fbc75e93	[X86][SSE] Added constant index tests for 128-bit integer vector types llvm-svn: 269604	2016-05-15 19:27:28 +00:00
Simon Pilgrim	73b496e35c	[X86][SSE] Added variable index tests for 128-bit integer vector types llvm-svn: 269603	2016-05-15 19:12:39 +00:00
Simon Pilgrim	91c2839864	Fixed typo in test llvm-svn: 269602	2016-05-15 18:50:22 +00:00
Sanjay Patel	399780f088	add test to show missing optimization llvm-svn: 269601	2016-05-15 18:41:18 +00:00
Simon Pilgrim	5b68a0df04	[X86][SSE] Added extra extractelement tests Added constant index tests for all 256-bit integer vector types (touching lower / upper 128-bits) Added variable index tests for all 256-bit integer vector types Added out-of-range index tests for all 256-bit integer vector types llvm-svn: 269600	2016-05-15 18:22:21 +00:00
Sanjay Patel	ecdd13d788	regenerate checks llvm-svn: 269596	2016-05-15 18:05:10 +00:00
Simon Pilgrim	8fe1b1fd4f	[X86][SSE] Regenerate extractelement tests Added SSE2/AVX2 target tests llvm-svn: 269595	2016-05-15 18:02:39 +00:00
Simon Pilgrim	6e9898f362	[CostModel][X86] Added scalar bitreverse tests llvm-svn: 269594	2016-05-15 17:40:48 +00:00
Elena Demikhovsky	ee004bc0a2	Vector GEP - fixed a crash on InstSimplify Pass. Vector GEP with mixed (vector and scalar) indices failed on the InstSimplify Pass when all indices are constants. Differential revision http://reviews.llvm.org/D20149 llvm-svn: 269590	2016-05-15 12:30:25 +00:00
Craig Topper	258f874bb9	[AVX512] Make the permd intrinsics take a 32-bit immediate to match the software spec. llvm-svn: 269579	2016-05-14 21:13:20 +00:00
Saleem Abdulrasool	8df2f49889	ARM: support export directives for Windows It seems that cl will emit the export directives for Windows ARM targets. The fact that it did this had originally been missed and this functionality was never implemented. This makes it possible to rely solely on the source code for indicating what the exported interfaces are and brings us more compatibility with cl. llvm-svn: 269574	2016-05-14 18:58:34 +00:00
Elena Demikhovsky	e79b716daf	Fixed lowering of _comi_ intrinsics from all sets - SSE/SSE2/AVX/AVX-512 Differential revision http://reviews.llvm.org/D19261 llvm-svn: 269569	2016-05-14 15:06:09 +00:00
Renato Golin	f4917d35c9	Revert "[llc] New diagnostic handler" This reverts commit r269563. Even though now it passes all LLDB bots after a local fix, there's a new buildbot it fails with tests that we hadn't seen locally: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/15647 Adding those tests to the list to investigate. llvm-svn: 269568	2016-05-14 14:37:11 +00:00
NAKAMURA Takumi	2c8500996d	Re-enable llvm/test/ThinLTO/X86/cache.ll. This reverts; r269548, "XFAIL ThinLTO Caching test on Windows." r269561, "Rework r269548, "XFAIL ThinLTO Caching test on Windows.", not to use XFAIL, for now." llvm-svn: 269567	2016-05-14 14:28:17 +00:00
Dima Stepanov	590d7b2e4a	Revert changes after test commit. llvm-svn: 269564	2016-05-14 13:29:52 +00:00
Renato Golin	c001e67baf	[llc] New diagnostic handler Without a diagnostic handler installed, llc's behaviour is to exit on the first error that it encounters. This is very different from the behaviour of clang and other front ends, which try to gather as many errors as possible before exiting. This commit adds a diagnostic handler to llc, allowing it to find and report more than one error. The old behaviour is preserved under a flag (-exit-on-error). Some of the tests fail with the new diagnostic handler, so they have to use the new flag in order to run under the previous behaviour. Some of these are known bugs, others need further investigation. Ideally, we should fix the tests and remove the flag at some point in the future. Reapplied after fixing the LLDB build that was broken due to the new DiagnosticSeverity in LLVMContext.h. Patch by Diana Picus. llvm-svn: 269563	2016-05-14 13:15:22 +00:00
NAKAMURA Takumi	c50a2a93ae	Rework r269548, "XFAIL ThinLTO Caching test on Windows.", not to use XFAIL, for now. It was passing (and is XPASSing) with --host=linux --target=win32. llvm-svn: 269561	2016-05-14 12:47:40 +00:00
Daniel Sanders	e160f83f71	[mips] Enable IAS by default for 32-bit MIPS targets (O32). Summary: The MIPS IAS can now pass 'ninja check-all', recurse, build a bootable linux kernel, and pass a variety of LNT testing. Unfortunately we can't enable it by default for 64-bit targets yet since the N32 ABI is still very buggy and this also means we can't enable it for N64 either because we can't distinguish between N32 and N64 in the relevant code. Reviewers: vkalintiris Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18759 Differential Revision: http://reviews.llvm.org/D18761 llvm-svn: 269560	2016-05-14 12:43:08 +00:00
Dima Stepanov	435072d3e1	Test commit: remove a blank line. llvm-svn: 269558	2016-05-14 10:30:54 +00:00
Mehdi Amini	66862c2797	XFAIL ThinLTO Caching test on Windows. I have no idea what's going on on Windows here. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 269548	2016-05-14 05:38:58 +00:00
Mehdi Amini	ab4a8b6ca3	Add testing in llvm-lto for ThinLTO caching. Trying to improve code coverage for `make check` From: mehdi_amini <mehdi_amini@91177308-0d34-0410-b5e6-96231b3b80d8> llvm-svn: 269545	2016-05-14 05:16:41 +00:00
Mehdi Amini	34b0241b81	Revert "Add testing in llvm-lto for ThinLTO caching." This reverts commit r269538 and r269542. "rename()" is expected to fail across filesystems, will handle this. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 269543	2016-05-14 05:07:44 +00:00
Mehdi Amini	e19c794741	Increase verbosity in the test output to help debugging windows issues From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 269542	2016-05-14 05:01:36 +00:00
Mehdi Amini	dec0e54d58	Add testing in llvm-lto for ThinLTO caching. Trying to improve code coverage for `make check` From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 269538	2016-05-14 04:41:26 +00:00
Dan Gohman	a01e8bde57	[WebAssembly] Fix legalization of i128 shifts. compiler-rt/libgcc shift routines expect the shift count to be an i32, so use i32 as the shift count for shifts that are legalized to libcalls. This also reverts r268991, now that the signatures are correct. llvm-svn: 269531	2016-05-14 02:15:47 +00:00
Craig Topper	d8a9c0d120	[AVX512] Fix types for pshufd intrinsics. The immediate is the second argument and the mask is the 4th argument. Also move the 128/256 tests to the right test file. Prior to this the immediate was a strange 16-bits and the 512-bit intrinsic couldn't receive the full 16 mask bits it needs. llvm-svn: 269526	2016-05-14 00:47:18 +00:00
Reid Kleckner	0b269748a6	[codeview] Add type stream merging prototype Summary: This code is intended to be used as part of LLD's PDB writing. Until that exists, this is exposed via llvm-readobj for testing purposes. Type stream merging uses the following algorithm: - Begin with a new empty stream, and a new empty hash table that maps from type record contents to new type index. - For each new type stream, maintain a map from source type index to destination type index. - For each record, copy it and rewrite its type indices to be valid in the destination type stream. - If the new type record is not already present in the destination stream hash table, append it to the destination type stream, assign it the next type index, and update the two hash tables. - If the type record already exists in the destination stream, discard it and update the type index map to forward the source type index to the existing destination type index. Reviewers: zturner, ruiu Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20122 llvm-svn: 269521	2016-05-14 00:02:53 +00:00
Marcin Koscielnicki	a4fcd3681f	[MSan] [PowerPC] Implement PowerPC64 vararg helper. Differential Revision: http://reviews.llvm.org/D20000 llvm-svn: 269518	2016-05-13 23:55:33 +00:00
Davide Italiano	9922344178	[PM] Port LowerAtomic to the new pass manager. llvm-svn: 269511	2016-05-13 22:52:35 +00:00
Adam Nemet	c62e554e9a	[LAA] Include MaxSafeDepDistBytes in the analysis print-out llvm-svn: 269508	2016-05-13 22:49:13 +00:00
Michael Zolotukhin	963a6d9c69	Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."" This reverts commit r269395. Try to reapply with a fix from chapuni. llvm-svn: 269486	2016-05-13 21:23:25 +00:00
Rui Ueyama	1f6b6e2c53	pdbdump: Print "Publics" stream. Publics stream seems to contain information as to public symbols. It actually contains a serialized hash table along with fixed-sized headers. This patch is not complete. It scans only till the end of the stream and dump the header information. I'll write code to de-serialize the hash table later. Reviewers: zturner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20256 llvm-svn: 269484	2016-05-13 21:21:53 +00:00
Jan Vesely	1680039a7a	AMDGPU/R600: Fold global address operand Reviewers: tstellard Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D19793 llvm-svn: 269480	2016-05-13 20:39:31 +00:00
Jan Vesely	f97de00745	AMDGPU/R600: Implement memory loads from constant AS Reviewers: tstellard Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D19792 llvm-svn: 269479	2016-05-13 20:39:29 +00:00
Steven Wu	25a7330cc8	Disable test from r269436 on unsupported platforms Fixing bots failure. test/ExecutionEngine/RuntimeDyld/SystemZ/cfi-relo-pc64.s requires SystemZ backend. Mark the test as unsupported if the backend is not available. llvm-svn: 269470	2016-05-13 20:10:51 +00:00
Reid Kleckner	4525fbe22a	[codeview] Align class and print names of types Summary: This way we can get rid of one of the fields in the .def file. Reviewers: llvm-commits Subscribers: zturner Differential Revision: http://reviews.llvm.org/D20251 llvm-svn: 269461	2016-05-13 19:37:07 +00:00
Tim Northover	f8b0a7af52	ARM: use callee-saved list in the order they're actually saved. When setting the frame pointer, the offset from SP is calculated based on the stack slot it gets allocated, but this slot is in turn based on the order of the CSR list so that list should match the order we actually save the registers in. Mostly it did, but in the edge-case of MachO AAPCS targets it was wrong. llvm-svn: 269459	2016-05-13 19:16:14 +00:00
Krzysztof Parzyszek	0f791f44c7	[Hexagon] Remove dead nodes from SelectionDAG to avoid cycles Recent changes to the instruction selection code exposed a problem where a dead node was not removed on time. This node had both input and output chains, which led to an apparent cycle. llvm-svn: 269458	2016-05-13 18:48:15 +00:00
Konstantin Zhuravlyov	e3d322af57	[AMDGPU] Update nop insertion for debugger usage - Insert one nop for each high level statement instead of two - Do not insert nop before prologue Differential Revision: http://reviews.llvm.org/D20215 llvm-svn: 269452	2016-05-13 18:21:28 +00:00
Renato Golin	1d1b82cbeb	Revert "[ARM,AArch64] NFC. Add extra test cases for bswap lowering." This reverts commit r269425, as it fails on Windows (Thumb only). llvm-svn: 269451	2016-05-13 18:19:42 +00:00
Sanjay Patel	23fa090738	regenerate checks and add a run to show missed shrinkage llvm-svn: 269449	2016-05-13 18:04:39 +00:00
Sanjay Patel	4e0cf49318	regenerate checks llvm-svn: 269447	2016-05-13 18:02:16 +00:00
Paul Osmialowski	4f5b3be7f1	add support for -print-imm-hex for AArch64 Most immediates are printed in Aarch64InstPrinter using 'formatImm' macro, but not all of them. Implementation contains following rules: - floating point immediates are always printed as decimal - signed integer immediates are printed depending on flag settings (for negative values 'formatImm' macro prints the value as e.g. -0x01, which may be convenient when imm is an address or offset) - logical immediates are always printed as hex - the 64-bit immediate for advSIMD, encoded in "a:b:c:d:e:f:g:h" is always printed as hex - the 64-bit immediate in exception generation instructions like: brk, dcps1, dcps2, dcps3, hlt, hvc, smc, svc is always printed as hex - the rest of immediates are printed depending on availability of -print-imm-hex Signed-off-by: Maciej Gabka <maciej.gabka@arm.com> Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com> Differential Revision: http://reviews.llvm.org/D16929 llvm-svn: 269446	2016-05-13 18:00:09 +00:00
Reid Kleckner	bab3fab806	[codeview] Dump the type index on the first line of each record This will make it easier to write FileCheck tests. llvm-svn: 269444	2016-05-13 17:48:24 +00:00
Chris Bieneman	8b5906ea7f	[obj2yaml] [yaml2obj] Basic support for MachO::load_command This patch adds basic support for MachO::load_command. Load command types and sizes are encoded in the YAML and expanded back into MachO. The YAML doesn't yet support load command structs, that is coming next. In the meantime as a temporary measure when writing MachO files the load commands are padded with zeros so that the generated binary is valid. llvm-svn: 269442	2016-05-13 17:41:41 +00:00
Sanjay Patel	0c8f3f9332	[InstCombine] handle zero constant vectors for LE/GE comparisons too Enhancement to: http://reviews.llvm.org/rL269426 With discussion in: http://reviews.llvm.org/D17859 This should complete the fixes for: PR26701, PR26819: https://llvm.org/bugs/show_bug.cgi?id=26701 https://llvm.org/bugs/show_bug.cgi?id=26819 llvm-svn: 269439	2016-05-13 17:28:12 +00:00
Bryan Chan	d1145ad253	[RuntimeDyld] Support R_390_PC64 relocation type Summary: When the MCJIT generates ELF code, some DWARF data requires 64-bit PC-relative relocation (R_390_PC64). This patch adds support for R_390_PC64 relocation to RuntimeDyld::resolveSystemZRelocation, to avoid an assertion failure. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20033 llvm-svn: 269436	2016-05-13 17:23:48 +00:00
Jun Bum Lim	f28beac419	[MemCpyOpt] Use MaxIntSize in byte instead of bit Summary: This change fixes the bug in isProfitableToUseMemset() where MaxIntSize should be in bytes, not bits. Reviewers: arsenm, joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20176 llvm-svn: 269433	2016-05-13 16:52:24 +00:00
Renato Golin	e9fa3585c5	Revert "[llc] New diagnostic handler" This reverts commit r269428, as it breaks the LLDB build. We need to understand how to change LLDB in the same way as LLC before landing this again. llvm-svn: 269432	2016-05-13 16:02:44 +00:00
Renato Golin	d7a64a5b23	[llc] New diagnostic handler Without a diagnostic handler installed, llc's behaviour is to exit on the first error that it encounters. This is very different from the behaviour of clang and other front ends, which try to gather as many errors as possible before exiting. This commit adds a diagnostic handler to llc, allowing it to find and report more than one error. The old behaviour is preserved under a flag (-exit-on-error). Some of the tests fail with the new diagnostic handler, so they have to use the new flag in order to run under the previous behaviour. Some of these are known bugs, others need further investigation. Ideally, we should fix the tests and remove the flag at some point in the future. Patch by Diana Picus. llvm-svn: 269428	2016-05-13 15:37:46 +00:00
Sanjay Patel	b79ab27853	[InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819) *We don't currently handle the edge case constants (min/max values), so it's not a complete canonicalization. To fully solve the motivating bugs, we need to enhance this to recognize a zero vector too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector or a ConstantDataVector. Differential Revision: http://reviews.llvm.org/D17859 llvm-svn: 269426	2016-05-13 15:10:46 +00:00
Renato Golin	8793c521bc	[ARM,AArch64] NFC. Add extra test cases for bswap lowering. These tests were sitting in Phab for many months. They're good tests and should be in. Patch by Charlie Turner. llvm-svn: 269425	2016-05-13 15:10:24 +00:00
Simon Pilgrim	217b886b10	[X86][AVX512] Moved CHECKs inside functions to stop update_llc_test_checks going haywire I'm not going to regenerate these anytime soon but do have some diffs to apply that I'd like to do with update_llc_test_checks llvm-svn: 269420	2016-05-13 14:47:55 +00:00
Amjad Aboud	78b1fb0146	Assure calling "cld" instruction in prologue of X86 interrupt handler function. Differential Revision: http://reviews.llvm.org/D18725 llvm-svn: 269413	2016-05-13 12:46:57 +00:00
Daniel Sanders	e91e52671a	[mips][ias] Work around yet another incorrect microMIPS relocation evaluation exposed by r268900. It's not entirely clear why R_MICROMIPS_(GOT\|HI16\|LO16) are evaluated incorrectly in a small number of the LNT tests at this point. However, it's not related to the STO_MIPS_MICROMIPS issue. At this point all the microMIPS-related changes of r268900 have been reverted. llvm-svn: 269410	2016-05-13 12:07:14 +00:00
Hrvoje Varga	6f09cdfd48	[mips][microMIPS] Implement APPEND, BPOSGE32C, MODSUB, MULSA.W.PH and MULSAQ_S.W.PH instructions Differential Revision: http://reviews.llvm.org/D14117 llvm-svn: 269408	2016-05-13 11:32:53 +00:00
Michael Zolotukhin	9be3b8b9bb	Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..." This reverts commit r269388. It caused some bots to fail, I'm reverting it until I investigate the issue. llvm-svn: 269395	2016-05-13 06:32:25 +00:00
Matt Arsenault	999f7dd84c	AMDGPU: Remove verifier check for scc live ins We only really need this to be true for SIFixSGPRCopies. I'm not sure there's any way this could happen before that point. Fixes a case where MachineCSE could introduce a cross block scc use. llvm-svn: 269391	2016-05-13 04:15:48 +00:00
Michael Zolotukhin	b7b8052982	[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... Summary: ...loop after the last iteration. This is really hard to do correctly. The core problem is that we need to model liveness through the induction PHIs from iteration to iteration in order to get the correct results, and we need to correctly de-duplicate the common subgraphs of instructions feeding some subset of the induction PHIs. All of this can be driven either from a side effect at some iteration or from the loop values used after the loop finishes. This patch implements this by storing the forward-propagating analysis of each instruction in a cache to recall whether it was free and whether it has become live and thus counted toward the total unroll cost. Then, at each sink for a value in the loop, we recursively walk back through every value that feeds the sink, including looping back through the iterations as needed, until we have marked the entire input graph as live. Because we cache this, we never visit instructions more than twice -- once when we analyze them and put them into the cache, and once when we count their cost towards the unrolled loop. Also, because the cache is only two bits and because we are dealing with relatively small iteration counts, we can store all of this very densely in memory to avoid this from becoming an excessively slow analysis. The code here is still pretty gross. I would appreciate suggestions about better ways to factor or split this up, I've stared too long at the algorithmic side to really have a good sense of what the design should probably look at. Also, it might seem like we should do all of this bottom-up, but I think that is a red herring. Specifically, the simplification power is much greater working top-down. 
We can forward propagate very effectively, even across strange and interesting recurrences around the backedge. Because we use data to propagate, this doesn't cause a state space explosion. Doing this level of constant folding, etc, would be very expensive to do bottom-up because it wouldn't be until the last moment that you could collapse everything. The current solution is essentially a top-down simplification with a bottom-up cost accounting which seems to get the best of both worlds. It makes the simplification incremental and powerful while leaving everything dead until we know it is needed. Finally, a core property of this approach is its monotonicity. At all times, the current UnrolledCost is a conservatively low estimate. This ensures that we will never early-exit from the analysis due to exceeding a threshold when if we had continued, the cost would have gone back below the threshold. These kinds of bugs can cause incredibly hard to track down random changes to behavior. We could use a technique similar (but much simpler) within the inliner as well to avoid considering speculated code in the inline cost. Reviewers: chandlerc Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D11758 llvm-svn: 269388	2016-05-13 01:42:39 +00:00
Michael Zolotukhin	a59a308e8d	[LoopUnrollAnalyzer] Don't treat gep-instructions with simplified offset as simplified. Summary: Currently we consider such instructions as simplified, which is incorrect, because if their user isn't simplified, we can't actually simplify them too. This biases our estimates of profitability: for instance the analyzer expects much more gains from unrolling memcpy loops than there actually are. Reviewers: hfinkel, chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17365 llvm-svn: 269387	2016-05-13 01:42:34 +00:00
Adrian Prantl	e1bc3e2027	dsymutil: Fix the DWOId mismatch check for cached modules. In verbose mode, we emit a warning if the DWOId of a skeleton CU mismatches the DWOId of the referenced module. This patch updates the cached DWOId after a module has been loaded to the DWOId of the module on disk (instead of storing the DWOId we expected to load). This allows us to correctly emit the mismatch warning for all subsequent object files that want to import the same module. This patch also ensures both warnings are only emitted in verbose mode. rdar://problem/26214027 llvm-svn: 269383	2016-05-13 00:17:58 +00:00
Reid Kleckner	0e85b97307	[codeview] Fix dumping VFTables, stop when we see LF_PAD* Also stop visiting type records when we encounter an error. llvm-svn: 269374	2016-05-12 22:46:41 +00:00
Renato Golin	608cb5def6	[ARM] Support and tests for transform of LDR rt, = to MOV This change implements the transformation in processInstruction() for the LDR rt, =expression to MOV rt, expression when the expression can be evaluated and can fit into the immediate field of the MOV or a MVN. Across the ARM and Thumb instruction sets there are several cases to consider, each with a different range of representable constants. In ARM we have: * Modified immediate (All ARM architectures) * MOVW (v6t2 and above) In Thumb we have: * Modified immediate (v6t2, v7m and v8m.mainline) * MOVW (v6t2, v7m, v8.mainline and v8m.baseline) * Narrow Thumb MOV that can be used in an IT block (non flag-setting) If the immediate fits any of the available alternatives then we make the transformation. Fixes 25722. Patch by Peter Smith. llvm-svn: 269354	2016-05-12 21:22:42 +00:00
Renato Golin	d5491ab1f9	[ARM] Fixup tests to take into account mov translation. NFC. Alter instances in the test-suite that use immediates that can be represented in the immediate field of a MOV. The reason for doing this is that when the LDR rt,=imm transformation to MOV rt, imm the existing tests do not need to be modified. Required by the patch that fixes PR25722. Patch by Peter Smith. llvm-svn: 269353	2016-05-12 21:22:37 +00:00
Tom Stellard	740af6f3b0	Revert "LiveIntervalAnalysis: Rework constructMainRangeFromSubranges()" This reverts commit r269016 and also the follow-up commit r269020. This patch caused PR27705. llvm-svn: 269344	2016-05-12 20:27:40 +00:00
David Blaikie	bc8397cdf0	llvm-dwp: Use llvm::Error to improve diagnostic quality/error handling in llvm-dwp llvm-svn: 269339	2016-05-12 19:59:54 +00:00
Amjad Aboud	f29608265d	Fixed the callee saved registers list for X86 AllRegs calling convention. 32-bit AllRegs: SSE: xmm0-xmm7 AVX: ymm0-ymm7 AVX512: zmm0-zmm7 + k0-k7 64-bit AllRegs: SSE: xmm0-xmm15 AVX: ymm0-ymm15 AVX512: zmm0-zmm31 + k0-k7 Differential Revision: http://reviews.llvm.org/D20142 llvm-svn: 269337	2016-05-12 19:58:32 +00:00
Krzysztof Parzyszek	4afed5521d	[Hexagon] Expand VSelect pseudo instructions llvm-svn: 269328	2016-05-12 19:16:02 +00:00
Chris Bieneman	fc8892771e	[yaml2macho] Handle mach_header_64 reserved field I've added the reserved field as an "optional" in YAML, but I've added asserts in the yaml2macho code to enforce that the field is present in mach_header_64, but not in mach_header. llvm-svn: 269320	2016-05-12 18:21:09 +00:00
Chris Bieneman	a23b26f466	[yaml2obj] Support for dumping mach_header from yaml With this change obj2yaml and yaml2obj can now round-trip mach_headers. This change also adds ObjectYAML/MachO tests. llvm-svn: 269314	2016-05-12 17:44:48 +00:00
Krzysztof Parzyszek	e60e5fee0a	[Hexagon] Properly handle instruction selection of vsplat intrinsics llvm-svn: 269312	2016-05-12 17:21:40 +00:00
Xinliang David Li	b61f01d0a5	minor test clean up /NFC llvm-svn: 269308	2016-05-12 16:41:27 +00:00
Daniel Sanders	241c67989b	[mips][ias] Fix O32 .cprestore directive when inside .set noat region and offset is in range. Summary: This expands on r269179 to fix an additional case that was not covered by our tests. The assembler temporary is not needed when the .cprestore offset fits inside a simm16 and it is not an error to use it inside a '.set noat' in this case. Reviewers: emaste, seanbruno, sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D20199 llvm-svn: 269295	2016-05-12 14:01:50 +00:00
Daniel Sanders	5fb391c893	[mips][ias] Work around another incorrect microMIPS relocation evaluation exposed by r268900 As explained in r269196, microMIPS has a special case that is not correctly implemented in LLVM. If we have a symbol 'foo' which is equivalent to '.text+0x10', the value of an R_MICROMIPS_LO16 relocation using 'foo' is 'foo+0x11' and not 'foo+0x10'. The in-place addend should therefore be 0x11. This commit reverts a little more of the effect of r268900 by keeping the symbol when the STO_MIPS_MICROMIPS flag is set for R_MIPS_GPREL32 relocations. This fixes SingleSource/UnitTests/2003-08-11-VaListArg, and SingleSource/UnitTests/2003-05-07-VarArgs for microMIPS. I believe there are additional relocations that have the same issue (e.g. R_MIPS_64, and R_MIPS_GPREL16) but for now I'm focusing on restoring our internal buildbots back to the green state we had in r268899. llvm-svn: 269294	2016-05-12 13:39:13 +00:00
Chad Rosier	39481ace40	[AArch64] Remove command-line option used for testing. The EXTR combine has been in tree for over 2 years without complaint, so go ahead and remove the option. llvm-svn: 269292	2016-05-12 13:27:24 +00:00
Simon Pilgrim	89b89650f3	[SelectionDAG] Attempt to split BITREVERSE vector legalization into BSWAP and BITREVERSE stages For BITREVERSE, bit shifting/masking every bit in a vector element is a very lengthy procedure. If the input vector type is a whole multiple of bytes wide then we can split this into a BSWAP shuffle stage (to reverse at the byte level) and then a BITREVERSE stage applied to each byte. Most vector capable targets can efficiently BSWAP using shuffles resulting in a considerable reduction in instructions. With this patch targets would only need to implement a target specific vXi8 BITREVERSE implementation to efficiently reverse most legal vector types. Differential Revision: http://reviews.llvm.org/D19978 llvm-svn: 269290	2016-05-12 13:09:49 +00:00
Hrvoje Varga	cf6a78192b	Revert "[mips][microMIPS] Implement CFC, CTC and LDC* instructions" This reverts commit r269176 as it caused test-suite failure. llvm-svn: 269287	2016-05-12 12:46:06 +00:00
Daniel Sanders	415c159e09	[mips][ias] Correct ELF eflags when Octeon is the target. Reviewers: sdardis Subscribers: petarj, mpf, dsanders, spetrovic, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D18899 llvm-svn: 269283	2016-05-12 11:31:19 +00:00
Daniel Sanders	55d383319f	[mips][ias] Handle N64 compound relocations and R_MIPS_SUB in needsRelocateWithSymbol() Summary: This eliminates the default case for N64 that was left out of r269047. The change to R_MIPS_SUB is needed in this patch to make this testable since %lo(%neg(%gp_rel(foo))) and %hi(%neg(%gp_rel(foo))) remain the only ways to get a compound relocation from the assembler. Reviewers: sdardis, rafael Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D20097 llvm-svn: 269280	2016-05-12 10:55:00 +00:00
Dan Gohman	33e694a807	[WebAssembly] Fast-isel support for calls, arguments, and selects. llvm-svn: 269273	2016-05-12 04:19:09 +00:00
Hal Finkel	1fb10e846a	[PowerPC] Fix a DAG replacement bug in PPCTargetLowering::DAGCombineExtBoolTrunc While promoting nodes in PPCTargetLowering::DAGCombineExtBoolTrunc, it is possible for one of the nodes to be replaced by another. To make sure we do not visit the deleted nodes, and to make sure we visit the replacement nodes, use a list of HandleSDNodes to track the to-be-promoted nodes during the promotion process. The same fix has been applied to the analogous code in PPCTargetLowering::DAGCombineTruncBoolExt. Fixes PR26985. llvm-svn: 269272	2016-05-12 04:00:56 +00:00
David Majnemer	96f0d383a7	[SCCP] Resolve shifts beyond the bitwidth to undef Shifts beyond the bitwidth are undef but SCCP resolved them to zero. Instead, DTRT and resolve them to undef. This reimplements the transform which caused PR27712. llvm-svn: 269269	2016-05-12 03:07:40 +00:00
Xinliang David Li	a94e383157	[Layout] Add a new test case for optimal rotation Enabled by -force-precise-rotation-cost option llvm-svn: 269267	2016-05-12 02:19:16 +00:00
Matt Arsenault	a61cb48dd2	AMDGPU: Fix breaking IR on instructions with multiple pointer operands The promote alloca pass would attempt to promote an alloca with a select, icmp, or phi user, even though the other operand was from a non-promotable source, producing a select on two different pointer types. Only do this if we know that both operands derive from the same alloca. In the future we should be able to relax this to an alloca which will also be promoted. llvm-svn: 269265	2016-05-12 01:58:58 +00:00
Chad Rosier	9926a5e31d	[AArch64] Add support for unscaled narrow stores in getUsefulBitsForUse. llvm-svn: 269263	2016-05-12 01:42:01 +00:00
Sanjoy Das	e0aa414acf	All llvm.deoptimize declarations must use the same calling convention This new verifier rule lets us unambiguously pick a calling convention when creating a new declaration for `@llvm.experimental.deoptimize.<ty>`. It is also congruent with our lowering strategy -- since all calls to `@llvm.experimental.deoptimize` are lowered to calls to `__llvm_deoptimize`, it is reasonable to enforce a unique calling convention. Some of the tests that were breaking this verifier rule have had to be split up into different .ll files. The inliner was violating this rule as well, and has been fixed to avoid producing invalid IR. llvm-svn: 269261	2016-05-12 01:17:38 +00:00
Davide Italiano	cd7c84bd8b	Revert "[SCCP] Partially propagate informations when the input is not fully defined." This reverts commit r269105 as it caused PR27712. llvm-svn: 269252	2016-05-11 23:06:10 +00:00
Wei Mi	8c4136b0d8	Fix a bug when hoist spill to a BB with landingpad successor. This is to fix the bug in https://llvm.org/bugs/show_bug.cgi?id=27612. When spill is hoisted to a BB with landingpad successor, and if the VNI of the spill reg lives into the landingpad successor, the spill should be inserted before the call which may throw exception. InsertPointAnalysis is used to compute the safe insert point. http://reviews.llvm.org/D20027 is a preparing patch for this patch. Differential Revision: http://reviews.llvm.org/D19884. llvm-svn: 269249	2016-05-11 22:37:43 +00:00
Sanjay Patel	810e329c88	regenerate checks llvm-svn: 269241	2016-05-11 21:51:28 +00:00
Chad Rosier	23a1a9a66d	[AArch64] Improve getUsefulBitsForUse for narrow stores. For narrow stores (e.g., strb, strh) we know the upper bits of the register are unused/not useful. In some cases we can use this information to eliminate unnecessary instructions. For example, without this patch we generate (from the 2nd test case): ldr w8, [x0] and w8, w8, #0xfff0 bfxil w8, w2, #16, #4 strh w8, [x1] and after the patch the 'and' is removed: ldr w8, [x0] bfxil w8, w2, #16, #4 strh w8, [x1] ret During the lowering of the bitfield insert instruction the 'and' is eliminated because we know the upper 16-bits that are masked off are unused and the lower 4-bits that are masked off are overwritten by the insert itself. Therefore, the 'and' is unnecessary. Differential Revision: http://reviews.llvm.org/D20175 llvm-svn: 269226	2016-05-11 20:19:54 +00:00
Simon Pilgrim	6ce35dd9ea	[X86][AVX512] Fixed VPERMILPD/VPERMILPS shuffle comments. Fixed incorrect operands indices used to access src registers llvm-svn: 269221	2016-05-11 18:53:44 +00:00
Sanjoy Das	4e8c80382f	[SCEVExpander] Fix a failed cast<> assertion SCEVExpander::replaceCongruentIVs assumes the backedge value of an SCEV-analysable PHI to always be an instruction, when this is not necessarily true. For now address this by bailing out of the optimization if the backedge value of the PHI is a non-Instruction. llvm-svn: 269213	2016-05-11 17:41:41 +00:00
Sanjoy Das	abb7b93eb9	[SCEVExpander] Don't break SSA in replaceCongruentIVs `SCEVExpander::replaceCongruentIVs` bypasses `hoistIVInc` if both the original and the isomorphic increments are PHI nodes. Doing this can break SSA if the isomorphic increment is not dominated by the original increment. Get rid of the bypass, and let `hoistIVInc` do the right thing. Fixes PR27232 (compile time crash/hang). llvm-svn: 269212	2016-05-11 17:41:34 +00:00
Sanjoy Das	787c2460c2	[SCEV] Be more aggressive around proving no-wrap ... for AddRec's in loops for which SCEV is unable to compute a max tripcount. This is not a problem for "normal" loops[0] that don't have guards or assumes, but helps in cases where we have guards or assumes in the loop that can be used to constrain incoming values over the backedge. This partially fixes PR27691 (we still don't handle the NUW case). [0]: for "normal" loops, in the cases where we'd be able to prove no-wrap via isKnownPredicate, we'd also be able to compute a max tripcount. llvm-svn: 269211	2016-05-11 17:41:26 +00:00
Jan Vesely	23dcd6e0ab	AMDGPU: Split private memory tests Reenable R600 testing reviewer: arsenm Differential Revision: http://reviews.llvm.org/D20031 llvm-svn: 269207	2016-05-11 17:24:45 +00:00
Dan Gohman	3a5ce733ce	[WebAssembly] Implement enough of fast-isel to run the comparison tests. llvm-svn: 269203	2016-05-11 16:32:42 +00:00
Vedant Kumar	ee20294af5	[BasicAA] Compare GEP indices based on value (Fix PR27418) Equivalent GEP indices with different types are treated as different indices altogether, leading to an incorrect AA result. Fix the issue by comparing indices based on their values. Thanks to Mikael Holmén for reporting the issue! Differential Revision: http://reviews.llvm.org/D19935 llvm-svn: 269197	2016-05-11 15:45:43 +00:00
Daniel Sanders	45533b4060	[mips][ias] Work around incorrect microMIPS relocation evaluation exposed by r268900 microMIPS has a special case that is not correctly implemented in LLVM. If we have a symbol 'foo' which is equivalent to '.text+0x10'. The value of an R_MICROMIPS_LO16 relocation using 'foo' is 'foo+0x11' and not 'foo+0x10'. The in-place addend should therefore be 0x11. Work around this by partially reverting the effect of r268900 by keeping the symbol when the STO_MIPS_MICROMIPS flag is set. This fixes SingleSource/Regression/C/PR640 for microMIPS. llvm-svn: 269196	2016-05-11 15:44:23 +00:00
Simon Pilgrim	87d05b9852	[X86][AVX512] Regenerate intrinsics test llvm-svn: 269193	2016-05-11 15:13:29 +00:00
Krzysztof Parzyszek	c2c7868591	[Hexagon] Use offsets relative to FP+8 in .cfi_offset instructions When generating .cfi_offset instructions, make sure that the offset is calculated with respect to the register used to define the CFA (which is currently always FP+8). llvm-svn: 269191	2016-05-11 14:53:07 +00:00
Simon Pilgrim	02699f3f3d	[X86] Regenerate shuffle test llvm-svn: 269186	2016-05-11 13:57:15 +00:00
Daniel Sanders	df8510d4fa	[mips][ias] Fix N32 and N64 .cprestore directive when inside .set noat region. Summary: r268058 unintentionally made the retrieval of the current assembler temporary unconditional. This was fine for the existing tests but it broke the cases where the assembler temporary is not needed (N32/N64 or not PIC) and is unavailable due to a '.set noat' directive. This fixes FreeBSD's libc. Reviewers: emaste, sdardis, seanbruno Subscribers: dsanders, emaste, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D20093 llvm-svn: 269179	2016-05-11 12:48:19 +00:00
Hrvoje Varga	52c9bed858	[mips][microMIPS] Implement CFC, CTC and LDC* instructions Differential Revision: http://reviews.llvm.org/D19713 llvm-svn: 269176	2016-05-11 12:12:24 +00:00
Hrvoje Varga	aeb1fe8f20	[mips][micromips] Implement DSBH, DSHD, DSLL, DSLL32, DSLLV, DSRA, DSRA32 and DSRAV instructions Differential Revision: http://reviews.llvm.org/D16800 llvm-svn: 269169	2016-05-11 11:17:04 +00:00
Weiming Zhao	095c271131	[AArch64] Fix DAG selection for cmps for fp16 type Summary: When emitting comparison for fp16, in addition to promote the LHS and RHS to fp32, we need to change the VT as well. Reviewers: t.p.northover Subscribers: t.p.northover, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D19922 llvm-svn: 269151	2016-05-11 01:26:32 +00:00
Matt Arsenault	e8ed8e59e5	AMDGPU: Change private_element_size to 4 llvm-svn: 269145	2016-05-11 00:28:54 +00:00
Xinliang David Li	864baf4abd	Add missing tests for new PM llvm-svn: 269139	2016-05-10 23:37:19 +00:00
Easwaran Raman	9b792923d0	Revert r269131 llvm-svn: 269138	2016-05-10 23:26:04 +00:00
Dehao Chen	b76e5d948a	Propagate branch metadata when some branch probability is missing. Summary: In sample profile, some branches may have profile missing due to profile inaccuracy. We want existing branch probability still valid after propagation. Reviewers: hfinkel, davidxl, spatel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19948 llvm-svn: 269137	2016-05-10 23:07:19 +00:00
Sanjay Patel	22c541438d	auto-generate checks llvm-svn: 269134	2016-05-10 22:33:26 +00:00
Tim Northover	9508a70adc	AArch64: allow vN to represent 64-bit registers in inline asm. Unlike xN/wN, the size of vN is genuinely ambiguous in the assembly, so we should try to infer what was intended from the type. But only down to 64-bits (vN can never represent sN, hN or bN). llvm-svn: 269132	2016-05-10 22:26:45 +00:00
Easwaran Raman	7eccf4ee0e	Reapply r266477 and r266488 llvm-svn: 269131	2016-05-10 22:03:23 +00:00
Xinliang David Li	da1955835d	[PM]: port IR based profUse pass to new pass manager llvm-svn: 269129	2016-05-10 21:59:52 +00:00
Sanjay Patel	2ef275d342	remove some comments and other cruft from checks llvm-svn: 269128	2016-05-10 21:52:15 +00:00
Tim Northover	3961735f03	Revert "MemCpyOpt: combine local load/store sequences into memcpy." This reverts commit r269125. It was in my tree when I ran "git svn dcommit". It's really still under review. llvm-svn: 269127	2016-05-10 21:49:40 +00:00
Tim Northover	56048d5c2c	ARM: report an error when attempting to target a misaligned BLX The CodeGen problem was fixed in r269101, but we still miscompiled assembly that tried the same thing. llvm-svn: 269126	2016-05-10 21:48:48 +00:00
Tim Northover	6c65c71639	MemCpyOpt: combine local load/store sequences into memcpy. Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block that really implements a memcpy operation, backends can benefit from seeing this. llvm-svn: 269125	2016-05-10 21:48:11 +00:00
Hans Wennborg	719b26ba54	Loop unroller: set thresholds for optsize and minsize functions to zero Before r268509, Clang would disable the loop unroll pass when optimizing for size. That commit enabled it to be able to support unroll pragmas in -Os builds. However, this regressed binary size in one of Chromium's DLLs with ~100 KB. This restores the original behaviour of no unrolling at -Os, but doing it in LLVM instead of Clang makes more sense, and also allows the pragmas to keep working. Differential revision: http://reviews.llvm.org/D20115 llvm-svn: 269124	2016-05-10 21:45:55 +00:00
Sanjay Patel	12de4aeeb3	update test to use FileCheck for tighter checking llvm-svn: 269123	2016-05-10 21:45:51 +00:00
Sanjay Patel	f68a8c4779	update test to use FileCheck for tighter checking llvm-svn: 269122	2016-05-10 21:42:09 +00:00
Lawrence Hu	e58a814c07	Enable loopreroll for sext of loop control only IV This patch extend loopreroll to allow the instruction chain of loop control only IV has sext. Differential Revision: http://reviews.llvm.org/D19820 llvm-svn: 269121	2016-05-10 21:16:49 +00:00
Lawrence Hu	fe7c87beac	Revert r269084: Enable loopreroll for sext of loop control only IV llvm-svn: 269119	2016-05-10 21:11:09 +00:00
Lawrence Hu	4c623d27b5	Revert r269093: Enable loopreroll for sext of loop control only IV llvm-svn: 269117	2016-05-10 21:04:28 +00:00
Quentin Colombet	220f7da488	[X86] Properly check that EAX is dead when copying EFLAGS. This fixes a bug introduced in r267623, where we got smarter and avoided to save EAX before using it. However, we failed to check if any of the subregister of EAX were alive and thus, missed cases where we have to save EAX before using it. The problem may happen on every X86/i386/... platform. This fixes llvm.org/PR27624 llvm-svn: 269115	2016-05-10 20:49:46 +00:00
Sanjay Patel	6786bc5390	[InstSimplify] use computeKnownBits on shift amount operands Do simplifications common to all shift instructions based on the amount shifted: 1. If the shift amount is known larger than the bitwidth, the result is undefined. 2. If the valid bits of the shift amount are all known to be 0, it's a shift by zero, so the shift operand is the result. Note that we could generalize the shift-by-zero transform into a shift-by-constant if all of the valid bits in the shift amount are known, but that would have to be done in InstCombine rather than here because it would mean we need to create a new shift instruction. Differential Revision: http://reviews.llvm.org/D19874 llvm-svn: 269114	2016-05-10 20:46:54 +00:00
Chad Rosier	4e6cda2db5	[InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1. This patch adds support for two optimizations: icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1) icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1 Differential Revision: http://reviews.llvm.org/D20123 llvm-svn: 269109	2016-05-10 20:22:09 +00:00
Kit Barton	02d455768e	[SystemZ] Add support for additional branch extended mnemonics Added support for extended mnemonics for the following branch instructions and load/store-on-condition opcodes: BR, LOCR, LOCGR, LOC, LOCG, STOC, STOCG Phabricator: http://reviews.llvm.org/D19729 Committing on behalf of Zhan Liau llvm-svn: 269106	2016-05-10 20:11:24 +00:00
Davide Italiano	7860c9bbf4	[SCCP] Partially propagate informations when the input is not fully defined. With this patch: %r1 = lshr i64 -1, 4294967296 -> undef Before this patch: %r1 = lshr i64 -1, 4294967296 -> 0 llvm-svn: 269105	2016-05-10 19:49:47 +00:00
Adrian Prantl	723ccd2790	Debug Info: Prevent DW_AT_abstract_origin from being emitted twice for the same subprogram. This fixes a bug where DW_AT_abstract_origin is being emitted twice for the same subprogram if a function is both inlined and emitted in the same translation unit, by restoring the pre-r266446 behavior. http://reviews.llvm.org/D20072 llvm-svn: 269103	2016-05-10 19:38:51 +00:00
Tim Northover	b5ece527a1	ARM: stop emitting blx instructions for most calls on MachO. I'm really not sure why we were in the first place, it's the linker's job to convert between BL/BLX as necessary. Even worse, using BLX left Thumb calls that could be locally resolved completely unencodable since all offsets to BLX are multiples of 4. rdar://26182344 llvm-svn: 269101	2016-05-10 19:17:47 +00:00
Justin Bogner	da0fe183c3	LPM: Drop require<loops> from these tests, it's redundant. NFC The LoopPassManager needs to calculate the loops analysis in order to iterate over the loops at all. Requiring it is redundant and just adds noise to the RUN lines here. llvm-svn: 269097	2016-05-10 18:28:10 +00:00
Rafael Espindola	32483a7641	Make "@name =" mandatory for globals in .ll files. An oddity of the .ll syntax is that the "@var = " in @var = global i32 42 is optional. Writing just global i32 42 is equivalent to @0 = global i32 42 This means that there is a pretty big First set at the top level. The current implementation maintains it manually. I was trying to refactor it, but then started wondering why keep it at all. I personally find the above syntax confusing. It looks like something is missing. This patch removes the feature and simplifies the parser. llvm-svn: 269096	2016-05-10 18:22:45 +00:00
Lawrence Hu	b68f16e007	Enable loopreroll for sext of loop control only IV This patch extend loopreroll to allow the instruction chain of loop control only IV has sext. Differential Revision: http://reviews.llvm.org/D19820 llvm-svn: 269093	2016-05-10 18:00:42 +00:00
Mandeep Singh Grang	e5a2f116d6	Fix PR26655: Bail out if all regs of an inst BUNDLE have the correct kill flag Summary: While setting kill flags on instructions inside a BUNDLE, we bail out as soon as we set kill flag on a register. But we are missing a check when all the registers already have the correct kill flag set. We need to bail out in that case as well. This patch refactors the old code and simply makes use of the addRegisterKilled function in MachineInstr.cpp in order to determine whether to set/remove kill on an instruction. Reviewers: apazos, t.p.northover, pete, MatzeB Subscribers: MatzeB, davide, llvm-commits Differential Revision: http://reviews.llvm.org/D17356 llvm-svn: 269092	2016-05-10 17:57:27 +00:00
Rong Xu	b6211a0b4f	[PGO] resubmit r268969 Put the test into a target specific directory. llvm-svn: 269090	2016-05-10 17:45:33 +00:00
Lawrence Hu	8cc3b37d2c	Enable loopreroll for sext of loop control only IV This patch extend loopreroll to allow the instruction chain of loop control only IV has sext. llvm-svn: 269084	2016-05-10 17:42:27 +00:00
Dan Gohman	2e64438ae4	[WebAssembly] Preliminary fast-isel support. llvm-svn: 269083	2016-05-10 17:39:48 +00:00
Simon Pilgrim	d86b2d6b1e	[X86][AVX512] Added another masked shuffle combine from load test llvm-svn: 269077	2016-05-10 16:55:20 +00:00
Krzysztof Parzyszek	a356bb7fa4	[ScheduleDAG] Make sure to process all def operands before any use operands An example from Hexagon where things went wrong: %R0<def> = L2_loadrigp <ga:@fp04> ; load function address J2_callr %R0<kill>, ..., %R0<imp-def> ; call *R0, return value in R0 ScheduleDAGInstrs::buildSchedGraph would visit all instructions going backwards, and in each instruction it would visit all operands in their order on the operand list. In the case of this call, it visited the use of R0 first, then removed it from the set Uses after it visited the def. This caused the DAG to be missing the data dependence edge on R0 between the load and the call. Differential Revision: http://reviews.llvm.org/D20102 llvm-svn: 269076	2016-05-10 16:50:30 +00:00
Marcin Koscielnicki	bbac890b53	[PR27599] [SystemZ] [SelectionDAG] Fix extension of atomic cmpxchg result. Currently, SelectionDAG assumes 8/16-bit cmpxchg returns either a sign extended result, or a zero extended result. SystemZ takes a third option by returning junk in the high bits (rotated contents of the other bytes in the memory word). In that case, don't use Assert*ext, and zero-extend the result ourselves if a comparison is needed. Differential Revision: http://reviews.llvm.org/D19800 llvm-svn: 269075	2016-05-10 16:49:04 +00:00
Simon Pilgrim	bfa05d169b	[X86][AVX] Added some shuffle combine from load tests As discussed on D19198 - we need to check what happens when we shuffle with different value type to the load llvm-svn: 269068	2016-05-10 16:08:24 +00:00
Teresa Johnson	8570fe47ef	[ThinLTO] Add option to emit imports files for distributed backends Summary: Add support for emission of plaintext lists of the imported files for each distributed backend compilation. Used for distributed build file staging. Invoked with new gold-plugin thinlto-emit-imports-files option, which is only valid with thinlto-index-only (i.e. for distributed builds), or from llvm-lto with new -thinlto-action=emitimports value. Depends on D19556. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19636 llvm-svn: 269067	2016-05-10 15:54:09 +00:00
Teresa Johnson	84174c3771	Restore "[ThinLTO] Emit individual index files for distributed backends" This restores commit r268627: Summary: When launching ThinLTO backends in a distributed build (currently supported in gold via the thinlto-index-only plugin option), emit an individual index file for each backend process as described here: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098272.html ... Differential Revision: http://reviews.llvm.org/D19556 Address msan failures by avoiding std::prev on map.end(), the theory is that this is causing issues due to some known UB problems in __tree. llvm-svn: 269059	2016-05-10 13:48:23 +00:00
James Molloy	aa1d638800	Revert "[VectorUtils] Query number of sign bits to allow more truncations" This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690. This reverts commit r268921. llvm-svn: 269051	2016-05-10 12:27:23 +00:00
Daniel Sanders	2225d9415f	[mips][ias] Make the default path unreachable in needsRelocateWithSymbol() (except for N64). Following post-commit comments on r268900 from Rafael Espindola: The missing relocations are now explicitly listed in the switch statement with appropriate FIXME comments and the default path is now unreachable. The temporary exception to this is that compound relocations for N64 still have a default path that returns true. This is because fixing that case ought to be a separate patch. Also make R_MIPS_NONE return false since it has no effect on the section data. llvm-svn: 269047	2016-05-10 12:17:04 +00:00
Jeroen Ketema	895774225a	[OCaml] Update core test and re-enable testing Differential Revision: http://reviews.llvm.org/D19828 llvm-svn: 269040	2016-05-10 11:19:20 +00:00
Simon Pilgrim	efc757dceb	[X86][AVX512] Added masked version of MOVDDUP test with 16f32 llvm-svn: 269038	2016-05-10 10:30:00 +00:00
Chuang-Yu Cheng	175741d5a7	Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation Loop rotation clones instructions from the old header into the preheader. If there were uses of values produced by these instructions that were outside the loop, we have to insert PHI nodes to merge the two values. If the values are used by DbgIntrinsics they will be used as a MetadataAsValue of a ValueAsMetadata of the original values, and iterating all of the uses of the original value will not update the DbgIntrinsics. The new code checks if the values are used by DbgIntrinsics and if so, updates them using essentially the same logic as the original code. The attached testcase demonstrates the issue. Without the fix, the DbgIntrinsic outside the loop uses values computed inside the loop, even though these values do not dominate the DbgIntrinsic. Author: Thomas Jablin (tjablin) Reviewers: dblaikie aprantl kbarton hfinkel cycheng http://reviews.llvm.org/D19564 llvm-svn: 269034	2016-05-10 09:45:44 +00:00
Arnaud A. de Grandmaison	333ef381b8	[InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges. When a va_start or va_copy is immediately followed by a va_end (ignoring debug information or other start/end in between), then it is safe to remove the pair. As this code shares some commonalities with the lifetime markers, this has been factored to helper functions. This InstCombine pattern kicks-in 3 times when running the LLVM test suite. llvm-svn: 269033	2016-05-10 09:24:49 +00:00
Chris Dewhurst	7bb1c04943	[Sparc][LEON] Itineraries unit test. Added test to check LeonItineraries are being applied by code checked-in two weeks ago in r267121. Phabricator Review: http://reviews.llvm.org/D19359 llvm-svn: 269032	2016-05-10 09:09:20 +00:00
Renato Golin	d876eecf02	Revert "[PGO] Fix __llvm_profile_raw_version linkage in MACHO IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support." This reverts commits r268969, r268979 and r268984. They had target specific test in generic directories without the correct specifiers and made it hard for us to come up with a good solution by rapidly committing untested changes. This test needs to be in a target specific directory or have the correct REQUIRED identifier. llvm-svn: 269027	2016-05-10 08:23:57 +00:00
Jonas Paulsson	8e5b0c65cc	[foldMemoryOperand()] Pass LiveIntervals to enable liveness check. SystemZ (and probably other targets as well) can fold a memory operand by changing the opcode into a new instruction that as a side-effect also clobbers the CC-reg. In order to do this, liveness of that reg must first be checked. When LIS is passed, getRegUnit() can be called on it and the right LiveRange is computed on demand. Reviewed by Matthias Braun. http://reviews.llvm.org/D19861 llvm-svn: 269026	2016-05-10 08:09:37 +00:00
Elena Demikhovsky	c434d091c5	[LoopVectorize] Handling induction variable with non-constant step. Allow vectorization when the step is a loop-invariant variable. This is the loop example that is getting vectorized after the patch: int int_inc; int bar(int init, int *restrict A, int N) { int x = init; for (int i=0;i<N;i++){ A[i] = x; x += int_inc; } return x; } "x" is an induction variable with loop-invariant step. But it is not a primary induction. Primary induction variable with non-constant step is not handled yet. Differential Revision: http://reviews.llvm.org/D19258 llvm-svn: 269023	2016-05-10 07:33:35 +00:00
Matthias Braun	11e87cc945	liveness.mir requires asserts to use -debug-only llvm-svn: 269020	2016-05-10 05:38:47 +00:00
Craig Topper	3fef1de785	[X86] Update X86_INTR calling convention to save ZMM registers instead of YMM registers when AVX512 is enabled. llvm-svn: 269017	2016-05-10 05:27:56 +00:00
Matthias Braun	8d6e57b216	LiveIntervalAnalysis: Rework constructMainRangeFromSubranges() We now use LiveRangeCalc::extendToUses() instead of a specially designed algorithm in constructMainRangeFromSubranges(): - The original motivation for constructMainRangeFromSubranges() were differences between the main liverange and subranges because of hidden dead definitions. This case however cannot happen anymore with the DetectDeadLaneMasks pass in place. - It simplifies the code. - This fixes a longstanding bug where we did not properly create new SSA values on merging control flow (the MachineVerifier missed most of these cases). - Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and LiveRangeCalc to better match the implementation/available helper functions. llvm-svn: 269016	2016-05-10 04:51:14 +00:00
Dan Gohman	0cfb5f852d	[WebAssembly] Move register stackification and coloring to a late phase. Move the register stackification and coloring passes to run very late, after PEI, tail duplication, and most other passes. This means that all code emitted and expanded by those passes is now exposed to these passes. This also eliminates the need for prologue/epilogue code to be manually stackified, which significantly simplifies the code. This does require running LiveIntervals a second time. It's useful to think of these late passes not as late optimization passes, but as a domain-specific compression algorithm based on knowledge of liveness information. It's used to compress the code after all conventional optimizations are complete, which is why it uses LiveIntervals at a phase when actual optimization passes don't typically need it. Differential Revision: http://reviews.llvm.org/D20075 llvm-svn: 269012	2016-05-10 04:24:02 +00:00
Sanjoy Das	12c91dc4c8	[ValueTracking] Use guards to prove non-nullness of a value Reviewers: apilipenko, majnemer, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20044 llvm-svn: 269008	2016-05-10 02:35:44 +00:00
Sanjoy Das	d47f42435a	[BasicAA] Guard intrinsics don't write to memory Summary: The idea is very close to what we do for assume intrinsics: we mark the guard intrinsics as writing to arbitrary memory to maintain control dependence, but under the covers we teach AA that they do not mod any particular memory location. Reviewers: chandlerc, hfinkel, gbiv, reames Subscribers: george.burgess.iv, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19575 llvm-svn: 269007	2016-05-10 02:35:41 +00:00
Matthias Braun	fb94d8d56a	llc: Rework -run-pass option We now construct a custom pass pipeline instead of injecting start-before/stop-after into the default pipeline construction. This allows specifying any pass known to the pass registry. Previously, indirectly added analysis passes or passes not added to the pipeline at all would not be added, and we would silently do nothing. This also restricts the -run-pass option to cases with .mir input. llvm-svn: 269003	2016-05-10 01:32:44 +00:00
Quentin Colombet	ee5f36bd54	[X86][AVX512] Use the proper load/store for AVX512 registers. When loading or storing AVX512 registers we were not using the AVX512 variant of the load and store for VR128 and VR256 like registers. Thus, we ended up with the wrong encoding and actually were dropping the high bits of the instruction. The result was that we load or store the wrong register. The effect is visible only when we emit the object file directly and disassemble it. Then, the output of the disassembler does not match the assembly input. This is related to llvm.org/PR27481. llvm-svn: 269001	2016-05-10 01:09:14 +00:00
Evgeniy Stepanov	6694ec7406	Don't inline functions with different SafeStack attributes. llvm-svn: 268999	2016-05-10 00:33:07 +00:00
Sanjoy Das	2512d0c837	[SCEV] Use guards to prove predicates We can use calls to @llvm.experimental.guard to prove predicates, relying on the fact that in all locations dominated by a call to @llvm.experimental.guard the predicate it is guarding is known to be true. llvm-svn: 268997	2016-05-10 00:31:49 +00:00
Adam Nemet	0a77dfad95	[LV] Hint at the new loop distribution pragma in optimization remark When we encounter unsafe memory dependencies, loop distribution could help. Even though the diagnostic is in LAA, it is currently only emitted in the vectorizer. llvm-svn: 268987	2016-05-09 23:03:44 +00:00
Rong Xu	2c9bdd0d3d	Fix buildbot failure from r268968. llvm-svn: 268984	2016-05-09 22:45:47 +00:00
Quentin Colombet	739614839f	[X86] Fix the AllRegs AVX calling convention. We used to list registers that were not in the AVX space. In other words, we were pushing registers that the ISA cannot encode (YMM16-YMM31). This is part of llvm.org/PR27481. llvm-svn: 268983	2016-05-09 22:37:05 +00:00
Sanjay Patel	0f153424a9	[Inliner] don't assume that a Constant alloca size is a ConstantInt (PR27277) Differential Revision: http://reviews.llvm.org/D20077 llvm-svn: 268980	2016-05-09 21:51:53 +00:00
Rong Xu	c5508046b8	Fix buildbot failure from r268968. llvm-svn: 268979	2016-05-09 21:51:50 +00:00
Simon Pilgrim	eec3a95f95	[X86][SSE] Improve cost model for i64 vector comparisons on pre-SSE42 targets As discussed on PR24888, until SSE42 we don't have access to PCMPGTQ for v2i64 comparisons, but the cost models don't reflect this, resulting in over-optimistic vectorization. This patch adds SSE2 'base level' costs that match what a typical target is capable of and only reduces the v2i64 costs at SSE42. Technically SSE41 provides a PCMPEQQ v2i64 equality test, but as getCmpSelInstrCost doesn't give us a way to discriminate between comparison test types we can't easily make use of this, otherwise we could split the cost of integer equality and greater-than tests to give better costings of each. Differential Revision: http://reviews.llvm.org/D20057 llvm-svn: 268972	2016-05-09 21:14:38 +00:00
Rong Xu	a12f6d3c7b	[PGO] Fix __llvm_profile_raw_version linkage in MACHO IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support. But LinkOnceODR linkage might have .weak_def_can_be_hidden assembly directive, while the weak variable in run-time has a .weak_definition directive. The linker will not merge these two symbols even though they have the same name. The end result is that IR profiles are not properly flagged in MACHO. This patch changes the linkage for __llvm_profile_raw_version in each module to LinkOnceAny so that it has the same .weak_definition directive as in the run-time. Differential Revision: http://reviews.llvm.org/D20078 llvm-svn: 268969	2016-05-09 21:03:06 +00:00
Marcin Koscielnicki	60b3cbe095	[MSan] [AArch64] Fix vararg helper for >1 or non-int fixed arguments. This fixes http://llvm.org/PR27646 on AArch64. There are three issues here: - The GR save area is 7 words in size, instead of 8. This is not enough if none of the fixed arguments is passed in GRs (they're all floats or aggregates). - The first argument is ignored (which counteracts the above if it's passed in GR). - Like x86_64, fixed arguments landing in the overflow area are wrongly counted towards the overflow offset. Differential Revision: http://reviews.llvm.org/D20023 llvm-svn: 268967	2016-05-09 20:57:36 +00:00
Adrian Prantl	fe7a382453	Allow the LTO code generator to strip invalid debug info from the input. This patch introduces a new option -lto-strip-invalid-debug-info, which drops malformed debug info from the input. The problem I'm trying to solve with this sequence of patches is that historically we've done a really bad job at verifying debug info. We want to be able to make the verifier stricter without having to worry about breaking bitcode compatibility with existing producers. For example, we don't necessarily want IR produced by an older version of clang to be rejected by an LTO link just because of malformed debug info, and rather provide an option to strip it. Note that merely outdated (but well-formed) debug info would continue to be auto-upgraded in this scenario. rdar://problem/25818489 http://reviews.llvm.org/D19987 This reapplies 268936 with a test case fix for Linux (-exported-symbol foo) llvm-svn: 268965	2016-05-09 19:57:15 +00:00
Chad Rosier	131a42ccdf	[InstCombine] Fold icmp eq/ne (udiv i32 A, B), 0 -> icmp ugt/ule B, A. Differential Revision: http://reviews.llvm.org/D20036 llvm-svn: 268960	2016-05-09 19:30:20 +00:00
Quentin Colombet	86098ab10b	Reapply [X86] Add a new LOW32_ADDR_ACCESS_RBP register class. This reapplies commit r268796, with a fix for the setting of the inline asm constraints. I.e., "mark" LOW32_ADDR_ACCESS_RBP as a GR variant, so that the regular processing of the GR operands (setting of the subregisters) happens. Original commit log: [X86] Add a new LOW32_ADDR_ACCESS_RBP register class. ABIs like NaCl uses 32-bit addresses but have 64-bit frame. The new register class reflects those constraints when choosing a register class for a address access. llvm-svn: 268955	2016-05-09 19:01:46 +00:00
Quentin Colombet	52d8e3bd4c	[X86] Update a regexp in a test case to resist register allocation changes. llvm-svn: 268954	2016-05-09 19:01:42 +00:00
Nemanja Ivanovic	6e29baf7f5	[Power9] Add support for -mcpu=pwr9 in the back end This patch corresponds to review: http://reviews.llvm.org/D19683 Simply adds the bits for being able to specify -mcpu=pwr9 to the back end. llvm-svn: 268950	2016-05-09 18:54:58 +00:00
Sanjay Patel	da7fe0c4a4	clean up; NFC llvm-svn: 268949	2016-05-09 18:54:14 +00:00
Krzysztof Parzyszek	7c7bb538cb	[Hexagon] Treat all conditional branches as predicted (not-taken by default) llvm-svn: 268946	2016-05-09 18:22:07 +00:00
Konstantin Zhuravlyov	e34ead8269	[AMDGPU] Clean up debugger tests llvm-svn: 268944	2016-05-09 18:05:42 +00:00
Zachary Turner	06c2b4be25	[pdb] Parse the module info stream for each module. Differential Revision: http://reviews.llvm.org/D20026 Reviewed By: rnk llvm-svn: 268942	2016-05-09 17:45:21 +00:00
Adrian Prantl	6d80100c6a	Revert "Allow the LTO code generator to strip invalid debug info from the input." This reverts commit 268936 while investigating buildbot breakage. llvm-svn: 268940	2016-05-09 17:43:30 +00:00
Adrian Prantl	4a9292b127	Allow the LTO code generator to strip invalid debug info from the input. This patch introduces a new option -lto-strip-invalid-debug-info, which drops malformed debug info from the input. The problem I'm trying to solve with this sequence of patches is that historically we've done a really bad job at verifying debug info. We want to be able to make the verifier stricter without having to worry about breaking bitcode compatibility with existing producers. For example, we don't necessarily want IR produced by an older version of clang to be rejected by an LTO link just because of malformed debug info, and rather provide an option to strip it. Note that merely outdated (but well-formed) debug info would continue to be auto-upgraded in this scenario. rdar://problem/25818489 http://reviews.llvm.org/D19987 llvm-svn: 268936	2016-05-09 17:37:33 +00:00
Sanjay Patel	c7b91e65d8	[CGP] avoid crashing from weightlessness It's possible that we have branch weights with 0 values. In that case, don't try to create an impossible BranchProbability. llvm-svn: 268935	2016-05-09 17:31:55 +00:00
Matt Arsenault	1af53a91c0	DivergenceAnalysis: Fix crash with no return blocks The post dominator tree does not have a root node in this case. llvm-svn: 268933	2016-05-09 16:57:08 +00:00
Matt Arsenault	a949dc619c	AMDGPU: Fold shift into cvt_f32_ubyteN llvm-svn: 268930	2016-05-09 16:29:50 +00:00
Frederic Riss	98f489ce82	[dsymutil] Prevent use-after-free The BinaryHolder would query the archive member MemoryBuffer name to check if the current open archive also contains the next requested objectfile. This comparison was using a StringRef to a temporary buffer. It only happened with fat archives. This commit adds long-lived storage along with the MemoryBuffers for the fat archive filename. The added test would fail during an ASAN build without the fix. llvm-svn: 268924	2016-05-09 14:44:14 +00:00
Joerg Sonnenberger	8ffe7ab7c2	Optimize a printf with a double percent to putchar. llvm-svn: 268922	2016-05-09 14:36:16 +00:00
James Molloy	5c20e27b7f	[VectorUtils] Query number of sign bits to allow more truncations When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations. llvm-svn: 268921	2016-05-09 14:32:30 +00:00
Daniel Sanders	e473dc937f	[mips][micromips] Make getPointerRegClass() result depend on the instruction. Summary: Previously, it returned the GPR16MMRegClass for all instructions which was incorrect for instructions like lwsp/lwgp and unnecessarily restricted the permitted registers for instructions like lw32. This fixes quite a few of the -verify-machineinstrs errors reported in PR27458. I've only added -verify-machineinstrs to one test in this change since I understand there is a plan to enable the verifier by default. Reviewers: hvarga, zbuljan, zoran.jovanovic, sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D19873 llvm-svn: 268918	2016-05-09 13:38:25 +00:00
Strahinja Petrovic	e682b80b8b	[PowerPC] fix register alignment for long double type This patch fixes register alignment for long double type in soft float mode. Before this patch alignment was 8 and this patch changes it to 4. Differential Revision: http://reviews.llvm.org/D18034 llvm-svn: 268909	2016-05-09 12:27:39 +00:00
Chris Dewhurst	e3b8645a1c	[Sparc][LEON] Add UMAC and SMAC instruction support for Sparc LEON subtargets This change adds SMAC (signed multiply-accumulate) and UMAC (unsigned multiply-accumulate) for LEON subtargets of the Sparc processor. The new files LeonFeatures.td and leon-instructions.ll will both be expanded in future, so I want to leave them separate as small files for this review, to be expanded in future check-ins. Note: The functions are provided only for inline-assembly provision. No DAG selection is provided. Differential Revision: http://reviews.llvm.org/D19911 llvm-svn: 268908	2016-05-09 11:55:15 +00:00
Silviu Baranga	f60be28ed8	[AArch64] Implement lowering of the X constraint on AArch64 Summary: This implements the lowering of the X constraint on AArch64. The default behaviour of the X constraint lowering is to restrict it to "f". This is a problem because the "f" constraint is not implemented on AArch64 and would be too restrictive anyway. Therefore, the AArch64 hook will lower this to "w" (if the operand is a floating point or vector) or "r" otherwise. The implementation is similar with the one added for ARM (r267411). This is the AArch64 side of the fix for http://llvm.org/PR26493 Reviewers: rengolin Subscribers: aemerson, rengolin, llvm-commits, t.p.northover Differential Revision: http://reviews.llvm.org/D19967 llvm-svn: 268907	2016-05-09 11:10:44 +00:00
Simon Pilgrim	bf3a4f552e	[X86][AVX512] Added masked version of combine tests llvm-svn: 268904	2016-05-09 10:43:13 +00:00
Daniel Sanders	3d00056515	[mips][ias] R_MIPS_(GOT\|HI\|LO\|PC)16 and R_MIPS_GPREL32 do not need symbols. Summary: In theory, care must be taken to ensure that pairs of R_MIPS_(GOT\|HI\|LO)16 make the same decision on both relocs in the reloc pair but in practice this isn't as hard as it sounds and only limits the complexity of the predicate used. We handle all three with the same code to ensure their decisions always agree with each other. Reviewers: sdardis Subscribers: rafael, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D19016 llvm-svn: 268900	2016-05-09 10:21:14 +00:00
Zlatko Buljan	ba553a6e0a	[mips][microMIPS] Implement LWP and SWP instructions Differential Revision: http://reviews.llvm.org/D10640 llvm-svn: 268896	2016-05-09 08:07:28 +00:00
Frederic Riss	5af2c005eb	[dsymutil] Fix -arch option for thumb variants. r267249 removed the dual ARM/Thumb interface from MachOObjectFile, simplifying llvm-dsymutil's code. This unfortunately also regressed llvm-dsymutil's ability to select thumb slices, because the simplified code was also dealing with the discrepancy between the slice arch (eg. armv7m) and the triple arch name (eg. thumbv7m). llvm-svn: 268894	2016-05-09 06:01:12 +00:00
Craig Topper	a58abd1cc6	[AVX512] Fix up types for arguments of int_x86_avx512_mask_cvtsd2ss_round and int_x86_avx512_mask_cvtss2sd_round. Only the argument being converted should be a different type. The other 2 arguments should have the same type as the result. llvm-svn: 268891	2016-05-09 05:34:12 +00:00
Craig Topper	707c89c00d	[AVX512] Add non-temporal store patterns for v16i32/v32i16/v64i8. llvm-svn: 268889	2016-05-08 23:43:17 +00:00
Craig Topper	c41320d700	[AVX512] Add missing patterns for non-temporal stores of 128/256-bit vXi8/vXi16/vXi32 when VLX is enabled. The equivalent AVX1/2 patterns are disabled by VLX. This caused regular stores to be emitted instead. llvm-svn: 268886	2016-05-08 23:08:45 +00:00
Craig Topper	e5ce84a33c	[AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used. llvm-svn: 268884	2016-05-08 21:33:53 +00:00
Craig Topper	298b6d7493	[X86] Re-generate tests using update_llc_test_checks.py to prepare for a future commit. NFC llvm-svn: 268883	2016-05-08 21:33:47 +00:00
Craig Topper	092794b82a	Remove Windows line endings in some tests to prepare for a future commit. NFC llvm-svn: 268882	2016-05-08 21:33:44 +00:00
Simon Pilgrim	4a9d32c5ba	[CostModel][X86] Extended comparison instruction cost model tests to include SSE2/SSE3/SSSE3/SSE41/SSE42 targets llvm-svn: 268877	2016-05-08 15:24:53 +00:00
Craig Topper	d681e23336	[X86] Lower 256-bit vector all-zero constants to v8i32 even with AVX1 only. Either way a 256-bit VXORPS will be used. llvm-svn: 268873	2016-05-08 07:10:54 +00:00
Craig Topper	3d6722910c	[X86] Add patterns for 256-bit non-temporal stores when only AVX1 is supported. While there, add a predicate to the SSE2 patterns to avoid an ordering dependency. llvm-svn: 268872	2016-05-08 07:10:50 +00:00
Craig Topper	d788498411	[X86] No need to avoid selecting AVX_SET0 for 256-bit integer types when only AVX1 is supported. AVX_SET0 just expands to 256-bit VXORPS which is legal in AVX1. llvm-svn: 268871	2016-05-08 07:10:47 +00:00
Weiming Zhao	5b5501e817	[ARM] Fix Scavenger assert due to underestimated stack size (re-apply r268810 as it exposed an uninitialized variable in ARM MFI. Patch 268868 should fix that.) Summary: Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure. Reviewers: rengolin Subscribers: vitalybuka, aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D19896 llvm-svn: 268869	2016-05-08 05:11:54 +00:00
Simon Pilgrim	b6f82c449a	[SelectionDAG] Added bitreverse(bitreverse(v)) --> v Added bitreverse creation testing llvm-svn: 268865	2016-05-07 20:12:36 +00:00
Simon Pilgrim	8ef046a8ca	[X86] Added BITREVERSE constant folding and identity tests Identity tests are currently failing - this will be fixed soon llvm-svn: 268862	2016-05-07 19:04:00 +00:00
Simon Pilgrim	420852e8d4	[CostModel][X86] Split BSWAP/BITREVERSE cost tests from CTPOP/CTLZ/CTTZ 'bit count' cost tests llvm-svn: 268859	2016-05-07 16:34:16 +00:00
Sanjay Patel	c2751e7050	[x86, BMI] add TLI hook for 'andn' and use it to simplify comparisons For the sake of minimalism, this patch is x86 only, but I think that at least PPC, ARM, AArch64, and Sparc probably want to do this too. We might want to generalize the hook and pattern recognition for a target like PPC that has a full assortment of negated logic ops (orc, nand). Note that http://reviews.llvm.org/D18842 will cause this transform to trigger more often. For reference, this relates to: https://llvm.org/bugs/show_bug.cgi?id=27105 https://llvm.org/bugs/show_bug.cgi?id=27202 https://llvm.org/bugs/show_bug.cgi?id=27203 https://llvm.org/bugs/show_bug.cgi?id=27328 Differential Revision: http://reviews.llvm.org/D19087 llvm-svn: 268858	2016-05-07 15:03:40 +00:00
Mehdi Amini	581f0e1451	Refactor stripDebugInfo(Function) to handle intrinsic This moves the code that handles stripping debug info intrinsics from StripDebugInfo(Module) to StripDebugInfo(Function). The latter is already walking every instruction, so it makes sense to do it at the same time. This also makes stripDebugInfo(Function) more useful as an API: it really drops all debug info in the Function. Finally, the existing code triggers an assertion when the Module is not fully materialized. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268847	2016-05-07 04:10:52 +00:00
Vitaly Buka	49bbbd8e7a	Revert r268832 "Refactor stripDebugInfo(Function) to handle intrinsic" It breaks many bots llvm-svn: 268837	2016-05-07 02:10:59 +00:00
Vitaly Buka	e81d96be6f	Revert r268810 because it breaks the msan bot. 16802==WARNING: MemorySanitizer: use-of-uninitialized-value lib/Target/ARM/ARMFrameLowering.cpp:1632 llvm-svn: 268833	2016-05-07 01:54:00 +00:00
Mehdi Amini	6eef08138e	Refactor stripDebugInfo(Function) to handle intrinsic This moves the code that handles stripping debug info intrinsics from StripDebugInfo(Module) to StripDebugInfo(Function). The latter is already walking every instruction, so it makes sense to do it at the same time. This also makes stripDebugInfo(Function) more useful as an API: it really drops all debug info in the Function. Finally, the existing code triggers an assertion when the Module is not fully materialized. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268832	2016-05-07 01:42:36 +00:00
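
The InstCombine entry in this list (131a42ccdf, r268960) folds `icmp eq/ne (udiv i32 A, B), 0` into `icmp ugt/ule B, A`. The sketch below is not taken from that patch; it is a small brute-force check of the underlying arithmetic identity on 8-bit unsigned values, written in Python for convenience. The function names are hypothetical, and `B == 0` is skipped on the assumption that division by zero is undefined and need not be preserved by the fold.

```python
def udiv_is_zero(a: int, b: int) -> bool:
    """Unfolded form: icmp eq (udiv a, b), 0 (b must be non-zero)."""
    return a // b == 0


def folded_form(a: int, b: int) -> bool:
    """Folded form: icmp ugt b, a."""
    return b > a


def fold_holds(bits: int = 8) -> bool:
    """Exhaustively compare both forms for every unsigned pair of the given width."""
    limit = 1 << bits
    for a in range(limit):
        for b in range(1, limit):  # b == 0 skipped: assumed undefined
            # eq form: (a udiv b) == 0  <=>  b >u a
            if udiv_is_zero(a, b) != folded_form(a, b):
                return False
            # ne form: (a udiv b) != 0  <=>  b <=u a
            if (a // b != 0) != (b <= a):
                return False
    return True
```

Calling `fold_holds()` walks all 8-bit pairs; the same reasoning (the unsigned quotient is zero exactly when the dividend is smaller than the divisor) carries over to wider types, though this check does not exercise them.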


39519 Commits