llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniil Fukalov	4c3322cc84	[SCEV] limit recursion depth of CompareSCEVComplexity Summary: CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled loop) and runs almost infinite time. Added cache of "equal" SCEV pairs to earlier cutoff of further estimation. Recursion depth limit was also introduced as a parameter. Reviewers: sanjoy Subscribers: mzolotukhin, tstellarAMD, llvm-commits Differential Revision: https://reviews.llvm.org/D26389 llvm-svn: 287232	2016-11-17 16:07:52 +00:00
Simon Pilgrim	67ef3b984a	Wdocumentation fix llvm-svn: 287224	2016-11-17 12:21:45 +00:00
Simon Pilgrim	8eca5520dc	[X86][SSE] Improve lowering of vXi64 multiply with known zero 32-bit halves vXi64 multiplication is lowered into 3 calls of vpmuludq with the upper/lower 32-bit halves. If any of these halves are zero then we can remove individual calls. Although there was isBuildVectorAllZeros code to do this I don't think it ever worked (maybe just for constant folded cases that don't seem to be tested for any longer). This requires additional X86ISD support for computeKnownBitsForTargetNode, so far I've just added support for X86ISD::VZEXT (VPMOVZX* - helping the AVX2+ cases). Partial fix for PR30845 Differential Revision: https://reviews.llvm.org/D26590 llvm-svn: 287223	2016-11-17 12:14:49 +00:00
Simon Pilgrim	c4d733cd6a	Fix spelling in comment. NFC. llvm-svn: 287222	2016-11-17 12:03:05 +00:00
Pablo Barrio	c41e856f53	[ARM] Relax restriction on variadic functions for tailcall optimization Summary: Variadic functions can be treated in the same way as normal functions with respect to the number and types of parameters. Reviewers: grosbach, olista01, t.p.northover, rengolin Subscribers: javed.absar, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D26748 llvm-svn: 287219	2016-11-17 10:56:58 +00:00
Oren Ben Simhon	489d6eff4f	[X86] RegCall - Handling v64i1 in 32/64 bit target Register Calling Convention defines a new behavior for v64i1 types. This type should be saved in GPR. However for 32 bit machine we need to split the value into 2 GPRs (because each is 32 bit). Differential Revision: https://reviews.llvm.org/D26181 llvm-svn: 287217	2016-11-17 09:59:40 +00:00
Sanjoy Das	43ccb38bb5	Delete dead code and add asserts instead; NFC llvm-svn: 287214	2016-11-17 07:29:43 +00:00
Sanjoy Das	4a8fe09040	[ImplicitNullCheck] Fix an edge case where we were hoisting incorrectly ImplicitNullCheck keeps track of one instruction that the memory operation depends on that it also hoists with the memory operation. When hoisting this dependency, it would sometimes clobber a live-in value to the basic block we were hoisting the two things out of. Fix this by explicitly looking for such dependencies. I also noticed two redundant checks on `MO.isDef()` in IsMIOperandSafe. They're redundant since register MachineOperands are either Defs or Uses -- there is no third kind. I'll change the checks to asserts in a later commit. llvm-svn: 287213	2016-11-17 07:29:40 +00:00
Craig Topper	05b0fcd168	[X86] Fix formatting. NFC llvm-svn: 287211	2016-11-17 05:59:55 +00:00
Dean Michael Berris	3234d3a4bd	[XRay] Support AArch64 in LLVM This patch adds XRay support in LLVM for AArch64 targets. This patch is one of a series: Clang: https://reviews.llvm.org/D26415 compiler-rt: https://reviews.llvm.org/D26413 Author: rSerge Reviewers: rengolin, dberris Subscribers: amehsan, aemerson, llvm-commits, iid_iunknown Differential Revision: https://reviews.llvm.org/D26412 llvm-svn: 287209	2016-11-17 05:15:37 +00:00
Chris Bieneman	05c279fc4b	[CMake] NFC. Updating CMake dependency specifications This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system. llvm-svn: 287206	2016-11-17 04:36:50 +00:00
Konstantin Zhuravlyov	d709efb0da	[AMDGPU] Custom lower f16 = fp_round f64 llvm-svn: 287203	2016-11-17 04:28:37 +00:00
Konstantin Zhuravlyov	3f0cdc7a11	[AMDGPU] Promote f16/i16 conversions to f32/i32 llvm-svn: 287201	2016-11-17 04:00:46 +00:00
Konstantin Zhuravlyov	662e01dfbe	[AMDGPU] Expand `br_cc` for f16 Differential Revision: https://reviews.llvm.org/D26732 llvm-svn: 287199	2016-11-17 03:49:01 +00:00
Dehao Chen	41d72a8632	Use profile info to adjust loop unroll threshold. Summary: For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold. For hot loop, we set a higher unroll threshold and allows expensive tripcount computation to allow more aggressive unrolling. Reviewers: davidxl, mzolotukhin Subscribers: sanjoy, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D26527 llvm-svn: 287186	2016-11-17 01:17:02 +00:00
Peter Collingbourne	f72a8d4e08	Introduce GlobalSplit pass. This pass splits globals into elements using inrange annotations on getelementptr indices. Differential Revision: https://reviews.llvm.org/D22295 llvm-svn: 287178	2016-11-16 23:40:26 +00:00
Dylan McKay	017a55b092	[AVR] Wrap all methods in the pseudo expansion pass in an anon namespace The '-fpermissive' compiler flag complains if the template specializations used in the class are used in a different namespace. llvm-svn: 287176	2016-11-16 23:06:14 +00:00
Dylan McKay	5810c7ee6e	[AVR] Remove unused method from AVRTargetMachine llvm-svn: 287173	2016-11-16 22:48:30 +00:00
Sanjay Patel	066139a3ec	[x86] allow FP-logic ops when one operand is FP and result is FP We save an inter-register file move this way. If there's any CPU where the FP logic is slower, we could transform this back to int-logic in MachineCombiner. This helps, but doesn't solve, PR6137: https://llvm.org/bugs/show_bug.cgi?id=6137 The 'andn' test shows that we're missing a pattern match to recognize the xor with -1 constant as a 'not' op. llvm-svn: 287171	2016-11-16 22:34:05 +00:00
Ahmed Bougacha	f33f91af24	[AsmParser] Avoid recursing when lexing ';'. NFC. This should prevent stack overflows in non-optimized builds on .ll files with lots of consecutive commented-out lines. Instead of recursing into LexToken(), continue into a 'while (true)'. llvm-svn: 287170	2016-11-16 22:25:05 +00:00
Ahmed Bougacha	bd6ce9a247	[CodeGen] Pass references, not pointers, to MMI helpers. NFC. While there, rename them to follow the coding style. llvm-svn: 287169	2016-11-16 22:25:03 +00:00
Ahmed Bougacha	996961a461	Revert "Get GlobalISel to build on Linux after r286407" This reverts commit r286962. We want to avoid depending on SelectionDAG, and AddLandingPadInfo lives in CodeGen now. llvm-svn: 287168	2016-11-16 22:24:59 +00:00
Ahmed Bougacha	456dce8a84	[CodeGen] Pull MMI helpers from FunctionLoweringInfo to MMI. NFC. They're not SelectionDAG- or FunctionLoweringInfo-specific. They are, however, specific to building MMI from IR. We could make them members, but it's nice having MMI be a "simple" data structure and this logic kept separate. This also lets us reuse them from GlobalISel. llvm-svn: 287167	2016-11-16 22:24:56 +00:00
Ahmed Bougacha	2b4c127531	[CodeGen] Cleanup MachineModuleInfo doxygen comments. NFC. Remove redundant names and only keep header comments. llvm-svn: 287166	2016-11-16 22:24:53 +00:00
Dylan McKay	a789f40002	[AVR] Add the pseudo instruction expansion pass Summary: A lot of the pseudo instructions are required because LLVM assumes that all integers of the same size as the pointer size are legal. This means that it will not currently expand 16-bit instructions to their 8-bit variants because it thinks 16-bit types are legal for the operations. This also adds all of the CodeGen tests that required the pass to run. Reviewers: arsenm, kparzysz Subscribers: wdng, mgorny, modocache, llvm-commits Differential Revision: https://reviews.llvm.org/D26577 llvm-svn: 287162	2016-11-16 21:58:04 +00:00
Peter Collingbourne	7d0c869b86	X86: Simplify X86ISD::Wrapper operand checks. NFCI. We only ever create TargetConstantPool, TargetJumpTable, TargetExternalSymbol, TargetGlobalAddress, TargetGlobalTLSAddress, MCSymbol and TargetBlockAddress nodes as operands of X86ISD::Wrapper nodes, so we can remove one check and invert the other. Also update the documentation comment for X86ISD::Wrapper. Differential Revision: https://reviews.llvm.org/D26731 llvm-svn: 287160	2016-11-16 21:48:59 +00:00
Sanjoy Das	df4b162e4d	[ImplicitNullChecks] Do not not handle call MachineInstrs We don't track callee clobbered registers correctly, so avoid hoisting across calls. Note: for this bug to trigger we need a `readonly` call target, since we already have logic to not hoist across potentially storing instructions either. llvm-svn: 287159	2016-11-16 21:45:22 +00:00
Peter Collingbourne	7a74803abf	Bitcode: Introduce initial multi-module reader API. Implement getLazyBitcodeModule() and parseBitcodeFile() in terms of it. Differential Revision: https://reviews.llvm.org/D26719 llvm-svn: 287156	2016-11-16 21:44:45 +00:00
Tim Northover	397f9d9d05	ARM: fix CodeGen for 64-bit shifts. One half of the shifts obviously needed conditional selection based on whether the shift amount is more than 32-bits, but leaving the other half as the natural shift isn't acceptable either: it's undefined behaviour to shift a 32-bit value by more than 31. llvm-svn: 287149	2016-11-16 20:54:28 +00:00
Rong Xu	66827427e1	Make block placement deterministic We fail to produce bit-to-bit matching stage2 and stage3 compiler in PGO bootstrap build. The reason is because LoopBlockSet is of SmallPtrSet type whose iterating order depends on the pointer value. This patch fixes this issue by changing to use SmallSetVector. Differential Revision: http://reviews.llvm.org/D26634 llvm-svn: 287148	2016-11-16 20:50:06 +00:00
Sanjay Patel	80baf69cb5	[InstCombine] replace unreachable with assert and remove unreachable code; NFCI llvm-svn: 287147	2016-11-16 20:40:02 +00:00
Matt Arsenault	3b36bb1d87	AMDGPU: Enable ConstrainCopy DAG mutation This fixes a probably unintended divergence from the default scheduler behavior. llvm-svn: 287146	2016-11-16 20:35:23 +00:00
Sanjay Patel	1b9560ffd6	[InstCombine] fix formatting and add FIXMEs to foldOperationIntoSelectOperand(); NFC llvm-svn: 287145	2016-11-16 20:18:34 +00:00
Geoff Berry	8301c645c8	[AArch64] Handle vector types in replaceZeroVectorStore. Summary: Extend replaceZeroVectorStore to handle more vector type stores, floating point zero vectors and set alignment more accurately on split stores. This is a follow-up change to r286875. This change fixes PR31038. Reviewers: MatzeB Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26682 llvm-svn: 287142	2016-11-16 19:35:19 +00:00
Mandeep Singh Grang	000ce9a686	[LoopVectorize] Fix for non-determinism in codegen Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718 Reviewers: mssimpso Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D26727 llvm-svn: 287135	2016-11-16 18:53:17 +00:00
Tom Stellard	0d162b1c4f	AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass Summary: 1. Don't try to copy values to and from the same register class. 2. Replace copies with of registers with immediate values with v_mov/s_mov instructions. The main purpose of this change is to make MachineSink do a better job of determining when it is beneficial to split a critical edge, since the pass assumes that copies will become move instructions. This prevents a regression in uniform-cfg.ll if we enable critical edge splitting for AMDGPU. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23408 llvm-svn: 287131	2016-11-16 18:42:17 +00:00
Sanjay Patel	4ce99d4d24	fix comment formatting; NFC llvm-svn: 287127	2016-11-16 18:09:44 +00:00
Sanjay Patel	7f3d51f840	[x86] add fake scalar FP logic instructions to ReplaceableInstrs to save some bytes We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions. Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of compilers, but logically equivalent int, float, and double variants of bitwise-logic instructions are reality in x86, and the float variant may be a shorter instruction depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all the time. This is a preliminary step towards solving PR6137: https://llvm.org/bugs/show_bug.cgi?id=6137 Differential Revision: https://reviews.llvm.org/D26712 llvm-svn: 287122	2016-11-16 17:42:40 +00:00
Reid Kleckner	3a83e76811	[sancov] Name the global containing the main source file name If the global name doesn't start with __sancov_gen, ASan will insert unecessary red zones around it. llvm-svn: 287117	2016-11-16 16:50:43 +00:00
Daniil Fukalov	e870398e48	test commit, changed tab to spaces, NFC llvm-svn: 287116	2016-11-16 16:41:40 +00:00
Pekka Jaaskelainen	8483cf0ae8	Add a little endian variant of TCE. llvm-svn: 287111	2016-11-16 15:22:23 +00:00
Simon Pilgrim	b57dd17142	[X86][AVX512] Autoupgrade lossless i32/u32 to f64 conversion intrinsics with generic IR Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic SINT_TO_FP/UINT_TO_FP calls instead of x86 intrinsics without affecting final codegen. LLVM counterpart to D26686 Differential Revision: https://reviews.llvm.org/D26736 llvm-svn: 287108	2016-11-16 14:48:32 +00:00
Simon Dardis	8ca1cbccc6	[mips] Fix unsigned/signed type error MipsFastISel uses a a class to represent addresses with a signed member to represent the offset. MipsFastISel::emitStore, emitLoad and computeAddress all treated the offset as being positive. In cases where the offset was actually negative and a frame pointer was used, this would cause the constant synthesis routine to crash as it would generate an unexpected instruction sequence when frame indexes are replaced. Reviewers: vkalintiris Differential Revision: https://reviews.llvm.org/D26192 llvm-svn: 287099	2016-11-16 11:29:07 +00:00
Simon Dardis	7b7cb8d9dd	[mips] not instruction alias This patch adds the single operand form of the not alias to microMIPS and MIPS along with additional tests. This partially resolves PR/30381. Thanks to Sean Bruno for reporting the issue! llvm-svn: 287097	2016-11-16 11:04:49 +00:00
Pavel Labath	0c20e05e89	Remove TimeValue class Summary: All uses have been replaced by appropriate std::chrono types, and the class is now unused. Reviewers: zturner, mehdi_amini Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D26447 llvm-svn: 287094	2016-11-16 10:46:48 +00:00
Ayman Musa	4d60243bfd	[X86][AVX512] Removing llvm x86 intrinsics for _mm_mask_move_{ss\|sd} intrinsics. Differential Revision: https://reviews.llvm.org/D26128 llvm-svn: 287087	2016-11-16 09:00:28 +00:00
Craig Topper	6910fa0ef4	[X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file. Reviewers: zvi, delena, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26660 llvm-svn: 287083	2016-11-16 05:24:10 +00:00
Konstantin Zhuravlyov	bf998c7003	[AMDGPU] Refactor v_mac_{f16, f32} patterns into a class NFC Differential Revision: https://reviews.llvm.org/D26711 llvm-svn: 287077	2016-11-16 03:39:12 +00:00
Matthias Braun	3d51cf0a2c	AArch64: Use DeadRegisterDefinitionsPass before regalloc. Doing this before register allocation reduces register pressure as we do not even have to allocate a register for those dead definitions. Differential Revision: https://reviews.llvm.org/D26111 llvm-svn: 287076	2016-11-16 03:38:27 +00:00
Konstantin Zhuravlyov	2a87a42035	[AMDGPU] Handle f16 select{_cc} - Select `select` to `v_cndmask_b32` - Expand `select_cc` - Refactor patterns Differential Revision: https://reviews.llvm.org/D26714 llvm-svn: 287074	2016-11-16 03:16:26 +00:00
Quentin Colombet	fb9b0cdcfe	[RegAllocGreedy] Record missed hint for late recoloring. In https://reviews.llvm.org/D25347, Geoff noticed that we still have useless copy that we can eliminate after register allocation. At the time the allocation is chosen for those copies, they are not useless but, because of changes in the surrounding code, later on they might become useless. The Greedy allocator already has a mechanism to deal with such cases with a late recoloring. However, we missed to record the some of the missed hints. This commit fixes that. llvm-svn: 287070	2016-11-16 01:07:12 +00:00
Rui Ueyama	fb1e6d22a3	Align Modi and FileInfo substreams on 32-byte offsets. This is required by DbiStream, but DbiStreamBuilder didn't align these substreams, so the output of DbiSTreamBuilder couldn't be read by DbiStream. Test will be added to LLD. llvm-svn: 287067	2016-11-16 00:59:27 +00:00
Vyacheslav Klochkov	b3dc774a99	Fixed the lost FastMathFlags for CALL operations in SLPVectorizer. Reviewer: Michael Zolotukhin. Differential Revision: https://reviews.llvm.org/D26575 llvm-svn: 287064	2016-11-16 00:55:50 +00:00
Justin Lebar	2860573529	[BypassSlowDivision] Handle division by constant numerators better. Summary: We don't do BypassSlowDivision when the denominator is a constant, but we do do it when the numerator is a constant. This patch makes two related changes to BypassSlowDivision when the numerator is a constant: * If the numerator is too large to fit into the bypass width, don't bypass slow division (because we'll never run the smaller-width code). * If we bypass slow division where the numerator is a constant, don't OR together the numerator and denominator when determining whether both operands fit within the bypass width. We need to check only the denominator. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D26699 llvm-svn: 287062	2016-11-16 00:44:47 +00:00
Justin Lebar	583b8687eb	[BypassSlowDivision] Simplify partially-tautological if statement. if (A \|\| (B && A)) --> if (A). llvm-svn: 287061	2016-11-16 00:44:43 +00:00
Rui Ueyama	507013180e	Fix Modi and File count if there are more than 65535 modules/files. These numbers are intended to be capped at 65535, but `std::max<uint16_t>(UINT16_MAX, N)` always returns N for any N because the expression is the same as `std::max((uint16_t)UINT16_MAX, (uint16_t)N)`. llvm-svn: 287060	2016-11-16 00:38:33 +00:00
Joerg Sonnenberger	8c1a9ac52b	Always use relative jump table encodings on PowerPC64. For the default, small and medium code model, use the existing difference from the jump table towards the label. For all other code models, setup the picbase and use the difference between the picbase and the block address. Overall, this results in smaller data tables at the expensive of one or two more arithmetic operation at the jump site. Given that we only create jump tables with a lot more than two entries, it is a net win in size. For larger code models the assumption remains that individual functions are no larger than 2GB. Differential Revision: https://reviews.llvm.org/D26336 llvm-svn: 287059	2016-11-16 00:37:30 +00:00
Jan Vesely	e8cc395e4f	AMDGPU/GCN: Exit early in hazard recognizer if there is no vreg argument wbinvl.* are vector instruction that do not sue vector registers. v2: check only M?BUF instructions Differential Revision: https://reviews.llvm.org/D26633 llvm-svn: 287056	2016-11-15 23:55:15 +00:00
Filipe Cabecinhas	ec350b71fa	[AddressSanitizer] Add support for (constant-)masked loads and stores. This patch adds support for instrumenting masked loads and stores under ASan, if they have a constant mask. isInterestingMemoryAccess now supports returning a mask to be applied to the loads, and instrumentMop will use it to generate additional checks. Added tests for v4i32 v8i32, and v4p0i32 (~v4i64) for both loads and stores (as well as a test to verify we don't add checks to non-constant masks). Differential Revision: https://reviews.llvm.org/D26230 llvm-svn: 287047	2016-11-15 22:37:30 +00:00
Peter Collingbourne	bc9a574657	Object: replace backslashes with slashes in embedded relative thin archive paths on Windows. This makes these thin archives portable between *nix and Windows. Differential Revision: https://reviews.llvm.org/D26696 llvm-svn: 287038	2016-11-15 21:36:35 +00:00
Chad Rosier	201fc1ed26	[AArch64] Add support for Qualcomm's Falkor CPU. Differential Revision: https://reviews.llvm.org/D26673 llvm-svn: 287036	2016-11-15 21:34:12 +00:00
Tom Stellard	d23de360db	AMDGPU/SI: Fix pattern for i16 = sign_extend i1 Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26670 llvm-svn: 287035	2016-11-15 21:25:56 +00:00
Kostya Serebryany	9d6dc7b164	[sanitizer-coverage] make sure asan does not instrument coverage guards (reported in https://github.com/google/oss-fuzz/issues/84 ) llvm-svn: 287030	2016-11-15 21:12:50 +00:00
Kuba Brecka	a3ccf3ddbd	Fix llvm-symbolizer to correctly sort a symbol array and calculate symbol sizes Sometimes, llvm-symbolizer gives wrong results due to incorrect sizes of some symbols. The reason for that was an incorrectly sorted array in computeSymbolSizes. The comparison function used subtraction of unsigned types, which is incorrect. Let's change this to return explicit -1 or 1. Differential Revision: https://reviews.llvm.org/D26537 llvm-svn: 287028	2016-11-15 21:07:03 +00:00
Tim Northover	ed55a05b01	GlobalISel: remove unused variable to silence warning. llvm-svn: 287027	2016-11-15 21:06:07 +00:00
Matt Arsenault	d4bb5e4831	AMDGPU: Enable store clustering Also respect the TII hook for these like the generic code does in case we want a flag later to disable this. llvm-svn: 287021	2016-11-15 20:22:55 +00:00
Haicheng Wu	faee2b71a7	[AArch64] Lower multiplication by a constant int to shl+add+shl Lower a = b * C where C = (2^n + 1) * 2^m to add w0, w0, w0, lsl n lsl w0, w0, m Differential Revision: https://reviews.llvm.org/D229245 llvm-svn: 287019	2016-11-15 20:16:48 +00:00
Matt Arsenault	3666629837	AMDGPU: Analyze mubuf with immediate soffset Fixes giving up on clustering common addr64 accesses with constant 0 soffset. llvm-svn: 287018	2016-11-15 20:14:27 +00:00
Matt Arsenault	12c53897f3	AMDGPU: Fix return after else llvm-svn: 287015	2016-11-15 19:58:54 +00:00
Wei Mi	37c4aaaf52	Revert r286999 which caused buildbot test failures. Some testcases need to be made target specific. llvm-svn: 287014	2016-11-15 19:42:05 +00:00
Matt Arsenault	92b355b1a9	AMDGPU: Replace assert(false) with unreachable llvm-svn: 287013	2016-11-15 19:34:37 +00:00
Stanislav Mekhanoshin	ea91cca593	[AMDGPU] Add wave barrier builtin The wave barrier represents the discardable barrier. Its main purpose is to carry convergent attribute, thus preventing illegal CFG optimizations. All lanes in a wave come to convergence point simultaneously with SIMT, thus no special instruction is needed in the ISA. The barrier is discarded during code generation. Differential Revision: https://reviews.llvm.org/D26585 llvm-svn: 287007	2016-11-15 19:00:15 +00:00
Wei Mi	7ccf7651c0	[LSR] Allow formula containing Reg for SCEVAddRecExpr related with outerloop. In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr, and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser and dropped. Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only handle inner loop now so only %for.body2 will be handled. Using the logic above, formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1reg({0,+,1}<%for.body2>) will be dropped no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related with outerloop. Only formula like reg(%array) + 1reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept because the SCEVAddRecExpr related with outerloop is folded into the initial value of the SCEVAddRecExpr related with current loop. But in some cases, we do need to share the basic induction variable reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction variables used by LSR, so we don't want to drop the formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally. From the existing comment, it tries to avoid considering multiple level loops at the same time. However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other than current loop, it is an invariant and will be simple to handle, and the formula doesn't have to be dropped. Differential Revision: https://reviews.llvm.org/D26429 llvm-svn: 286999	2016-11-15 18:35:53 +00:00
Pawel Bylica	c3f6c97f71	Integer legalization: fix MUL expansion Summary: This fixes the runtime results produces by the fallback multiplication expansion introduced in r270720. For tests I created a fuzz tester that compares the results with Boost.Multiprecision. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26628 llvm-svn: 286998	2016-11-15 18:29:24 +00:00
Zaara Syeda	a19c9e60e9	vector load store with length (left justified) llvm portion llvm-svn: 286993	2016-11-15 17:54:19 +00:00
Sanjay Patel	73d1d35d21	fix formatting; NFC llvm-svn: 286989	2016-11-15 17:47:13 +00:00
Wei Mi	d2948cef70	[IndVars] Change the order to compute WidenAddRec in widenIVUse. When both WidenIV::getWideRecurrence and WidenIV::getExtendedOperandRecurrence return non-null but different WideAddRec, if getWideRecurrence is called before getExtendedOperandRecurrence, we won't bother to call getExtendedOperandRecurrence again. But As we know it is possible that after SCEV folding, we cannot prove the legality using the SCEVAddRecExpr returned by getWideRecurrence. Meanwhile if getExtendedOperandRecurrence returns non-null WideAddRec, we know for sure that it is legal to do widening for current instruction. So it is better to put getExtendedOperandRecurrence before getWideRecurrence, which will increase the chance of successful widening. Differential Revision: https://reviews.llvm.org/D26059 llvm-svn: 286987	2016-11-15 17:34:52 +00:00
Diana Picus	895c6aa6fd	[ARM] GlobalISel: Remove unused members. NFCI This silences some warnings that I didn't see with my host compiler. llvm-svn: 286981	2016-11-15 16:42:10 +00:00
Craig Topper	c7486af9c9	[AVX-512] Add AVX-512 vector shift intrinsics to memory santitizer. Just needed to add the intrinsics to the exist switch. The code is generic enough to support the wider vectors with no changes. llvm-svn: 286980	2016-11-15 16:27:33 +00:00
Simon Pilgrim	ceffb43b1b	[X86][SSE] Improve SINT_TO_FP of boolean vector results (signum) This patch helps avoids poor legalization of boolean vector results (e.g. 8f32 -> 8i1 -> 8i16) that feed into SINT_TO_FP by inserting an early SIGN_EXTEND and so help improve the truncation logic. This is not necessary for AVX512 targets where boolean vectors are legal - AVX512 manages to lower ( sint_to_fp vXi1 ) into some form of ( select mask, 1.0f , 0.0f ) in most cases. Fix for PR13248 Differential Revision: https://reviews.llvm.org/D26583 llvm-svn: 286979	2016-11-15 16:24:40 +00:00
Pablo Barrio	4f80c93a2e	Revert "[JumpThreading] Unfold selects that depend on the same condition" This reverts commit ac54d0066c478a09c7cd28d15d0f9ff8af984afc. llvm-svn: 286976	2016-11-15 15:42:23 +00:00
Pablo Barrio	5f782bb048	Revert "[JumpThreading] Prevent non-deterministic use lists" This reverts commit f2c2f5354070469dac253373c66527ca971ddc66. llvm-svn: 286975	2016-11-15 15:42:17 +00:00
Diana Picus	90f0a84943	[ARM] Make sure GlobalISel is only initialized once. NFCI Move some code inside the proper 'if' block to make sure it is only run once, when the subtarget is first created. Things can still break if we use different ARM target machines or if we have functions with different 'target-cpu' or 'target-features', we should fix that too in the future. llvm-svn: 286974	2016-11-15 15:38:15 +00:00
Robert Lougher	b0905209dd	[LoopVectorizer] When estimating reg usage, unused insts may "end" another use The register usage algorithm incorrectly treats instructions whose value is not used within the loop (e.g. those that do not produce a value). The algorithm first calculates the usages within the loop. It iterates over the instructions in order, and records at which instruction index each use ends (in fact, they're actually recorded against the next index, as this is when we want to delete them from the open intervals). The algorithm then iterates over the instructions again, adding each instruction in turn to a list of open intervals. Instructions are then removed from the list of open intervals when they occur in the list of uses ended at the current index. The problem is, instructions which are not used in the loop are skipped. However, although they aren't used, the last use of a value may have been recorded against that instruction index. In this case, the use is not deleted from the open intervals, which may then bump up the estimated register usage. This patch fixes the issue by simply moving the "is used" check after the loop which erases the uses at the current index. Differential Revision: https://reviews.llvm.org/D26554 llvm-svn: 286969	2016-11-15 14:27:33 +00:00
Tony Jiang	5f850cd1b1	[PowerPC] Implement BE VSX load/store builtins - llvm portion. This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE, they behaves exactly the same with vec_xl and vec_xst, therefore they are simply implemented by defining a matching macro. On LE, they are implemented by defining new builtins and intrinsics. For int/float/long long/double, it is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short, we also need some extra shuffling before or after call the builtins to get the desired BE order. For int128, simply call vec_xl or vec_xst. llvm-svn: 286967	2016-11-15 14:25:56 +00:00
Diana Picus	3776e76201	Get GlobalISel to build on Linux after r286407 r286407 has introduced calls to llvm::AddLandingPadInfo, which lives in the SelectionDAG component. Add it to LLVMBuild to avoid linker failures on Linux. llvm-svn: 286962	2016-11-15 14:11:11 +00:00
Zvi Rackover	6f76f46d2c	[X86][FastISel] Assert that we are dealing with arithmetic with overflow intrinsics. NFC llvm-svn: 286961	2016-11-15 13:50:35 +00:00
Sam Kolton	c01faa383f	[AMDGPU] TableGen: change individual instruction flags to bit type from bits<1> Summary: This is needed to be able to use this flags in InstrMappings. Reviewers: tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26666 llvm-svn: 286960	2016-11-15 13:39:07 +00:00
Zvi Rackover	f0b9b57bd3	[X86][FastISel] Fix lowering of overflow result on AVX512 targets Summary: Fix a case where the overflow value of type i1, which is legal on AVX512, was assigned to a VK1 register class. We always want this value to be assigned to a GPR since the overflow return value is lowered to a SETO instruction. Fixes pr30981. Reviewers: mkuper, igorb, craig.topper, guyblank, qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D26620 llvm-svn: 286958	2016-11-15 13:29:23 +00:00
Florian Hahn	4b4dc172e7	Test commit, remove trailing space. This commit is used to test commit access. llvm-svn: 286957	2016-11-15 13:28:42 +00:00
Joerg Sonnenberger	1a7eec68a9	Introduce TLI predicative for base-relative Jump Tables. For 64bit ABIs it is common practice to use relative Jump Tables with potentially different relocation bases. As the logic for the jump table itself doesn't depend on the relocation base, make it easier for targets to use the generic logic. Start by dropping the now redundant MIPS logic. Differential Revision: https://reviews.llvm.org/D26578 llvm-svn: 286951	2016-11-15 12:39:46 +00:00
Javed Absar	f043dac25d	[ARM] Add machine scheduler for Cortex-R52 This patch adds the Sched Machine Model for Cortex-R52. Details of the pipeline and descriptions are in comments in file ARMScheduleR52.td included in this patch. Reviewers: rengolin, jmolloy Differential Revision: https://reviews.llvm.org/D26500 llvm-svn: 286949	2016-11-15 11:34:54 +00:00
Asaf Badouh	b573553424	DAGCombiner: fix combine of trunc and select bugzilla: https://llvm.org/bugs/show_bug.cgi?id=29002 pr29002 Differential Revision: https://reviews.llvm.org/D26449 llvm-svn: 286938	2016-11-15 07:55:22 +00:00
Matt Arsenault	1c8d933881	TableGen: Add operator !or llvm-svn: 286936	2016-11-15 06:49:28 +00:00
Zvi Rackover	76dbf26599	[X86][GlobalISel] Add minimal call lowering support to the IRTranslator Summary: Add basic functionality to support call lowering for X86. Currently only supports functions which return void and take zero arguments. Inspired by commit 286573. Reviewers: ab, qcolombet, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26593 llvm-svn: 286935	2016-11-15 06:34:33 +00:00
Craig Topper	e6915b85ed	[X86] Add LLVM version number for each intrinsic handled by auto upgrade for age tracking. One day we'd like to remove some of this autoupgrade support and it will be easier if we know how long some of it has been around. Differential Revision: https://reviews.llvm.org/D26321 llvm-svn: 286933	2016-11-15 05:04:51 +00:00
Matt Arsenault	c79dc70d50	AMDGPU: Fix f16 fabs/fneg llvm-svn: 286931	2016-11-15 02:25:28 +00:00
Rui Ueyama	6b77ad3546	Simplify identify_magic. This patch defines a memcmp-ish helper function to simplify identify_magic. llvm-svn: 286928	2016-11-15 01:57:05 +00:00
Greg Clayton	6f6e4dbd5d	Improve DWARF parsing speed by improving DWARFAbbreviationDeclaration This patch gets a DWARF parsing speed improvement by having DWARFAbbreviationDeclaration instances know if they have a fixed byte size. If an abbreviation has a fixed byte size that can be calculated given a DWARFUnit, then parsing a DIE becomes two steps: parse ULEB128 abbrev code, and then add constant size to the offset. This patch also adds a fixed byte size to each DWARFAbbreviationDeclaration::AttributeSpec so that attributes can quickly skip their values if needed without the need to lookup the fixed for size. Notable improvements: - DWARFAbbreviationDeclaration::findAttributeIndex() now returns an Optional<uint32_t> instead of a uint32_t and we no longer have to look for the magic -1U return value - Optional<uint32_t> DWARFAbbreviationDeclaration::findAttributeIndex(dwarf::Attribute attr) const; - DWARFAbbreviationDeclaration now has a getAttributeValue() function that extracts an attribute value given a DIE offset that takes advantage of the DWARFAbbreviationDeclaration::AttributeSpec::ByteSize - bool DWARFAbbreviationDeclaration::getAttributeValue(const uint32_t DIEOffset, const dwarf::Attribute Attr, const DWARFUnit &U, DWARFFormValue &FormValue) const; - A DWARFAbbreviationDeclaration instance can return a fixed byte size for itself so DWARF parsing is faster: - Optional<size_t> DWARFAbbreviationDeclaration::getFixedAttributesByteSize(const DWARFUnit &U) const; - Any functions that used to take a "const DWARFUnit *U" that would crash if U was NULL now take a "const DWARFUnit &U" and are only called with a valid DWARFUnit Differential Revision: https://reviews.llvm.org/D26567 llvm-svn: 286924	2016-11-15 01:23:06 +00:00
Rui Ueyama	e97c34cb60	Fix -Wswitch. llvm-svn: 286920	2016-11-15 00:58:50 +00:00
Rui Ueyama	2d02166b43	Add a file magic for CL.exe's object file created with /GL. This patch makes it possible to identify object files created by CL.exe with /GL option. Such file contains Microsoft proprietary intermediate code instead of target machine code to do LTO. I need this to print out user-friendly error message from LLD. Differential Revision: https://reviews.llvm.org/D26645 llvm-svn: 286919	2016-11-15 00:54:54 +00:00
Matt Arsenault	81da114e65	AMDGPU: Set hasExtraSrcRegAllocReq on v_div_scale_* This doesn't solve any problems I know about, but this should have more conservative assumptions about the operands' llvm-svn: 286913	2016-11-15 00:05:42 +00:00
Matt Arsenault	972034bda9	AMDGPU: Fix formatting of 1/2pi immediate llvm-svn: 286912	2016-11-15 00:04:33 +00:00
Tom Stellard	9c884e495c	MIRParser: Add support for parsing vreg reg alloc hints Reviewers: qcolombet, MatzeB Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26573 llvm-svn: 286911	2016-11-15 00:03:14 +00:00
Evandro Menezes	9fc54826e0	[AArch64] Compute the Newton series for reciprocals natively Implement the Newton series for square root, its reciprocal and reciprocal natively using the specialized instructions in AArch64 to perform each series iteration. Differential revision: https://reviews.llvm.org/D26518 llvm-svn: 286907	2016-11-14 23:29:01 +00:00
Peter Collingbourne	3cb86272fc	Linker: Remove unnecessary call to copyMetadata in IRLinker::linkGlobalVariable. This was causing us to create duplicate metadata on global variables. Debug info test case by Adrian Prantl, additional test cases by me. Fixes PR31012. Differential Revision: https://reviews.llvm.org/D26622 llvm-svn: 286905	2016-11-14 23:18:38 +00:00
Tom Stellard	11e60ff7da	RegAllocGreedy: Properly initialize this pass, so that -run-pass will work Reviewers: qcolombet, MatzeB Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26572 llvm-svn: 286895	2016-11-14 21:50:13 +00:00
Kuba Brecka	ddfdba3b01	[tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part This adds support for TSan C++ exception handling, where we need to add extra calls to __tsan_func_exit when a function is exitted via exception mechanisms. Otherwise the shadow stack gets corrupted (leaked). This patch moves and enhances the existing implementation of EscapeEnumerator that finds all possible function exit points, and adds extra EH cleanup blocks where needed. Differential Revision: https://reviews.llvm.org/D26177 llvm-svn: 286893	2016-11-14 21:41:13 +00:00
Kevin Enderby	22fc007809	Add a checkSymbolTable() method to the MachOObjectFile class. The philosophy of the error checking in libObject for Mach-O files is that the constructor will check the load commands so for their tables the offsets and sizes are properly contained in the file. But there is no checking of the entries of any of the tables. For the contents of the tables themselves the methods accessing the contents of the entries return errors as needed. In some cases this however makes it difficult or cumbersome to produce a good error message which would include the tool name, file name, archive member, and name of the architecture of a slice of a universal file the error occurred in. So idea is that there will be a method to check a table which can be called up front before using it allowing a good error message to be produced before a table is used. And if only verification of the Mach-O file and its tables are wanted a new possible method checkAllTables() could be added to call all of the methods to check all the tables at some time when such methods exist. The checkSymbolTable() is the first of such methods to check one of the Mach-O file tables. This method initially will used in llvm-objdump’s DisassembleMachO() routine before it gets the section and symbol information. As if there are problems with the symbol table currently the error is first encountered by the bool operator() in the SymbolSorter() struct which passed to std::sort(). In this case there is no context as to the file name the symbol which results a poor error message: LLVM ERROR: truncated or malformed object (bad string index: 22 for symbol at index 1) with the added call to the checkSymbolTable() method the error message includes the tool name and file name: llvm-objdump: 'macho-invalid-symbol-strx': truncated or malformed object (bad string table index: 22 past the end of string table, for symbol at index 1) llvm-svn: 286887	2016-11-14 20:57:04 +00:00
Krzysztof Parzyszek	b16a4e5869	[Hexagon] Give a predicate function a more meaningful name Change "orisadd" to "IsOrAdd" to follow the naming conventions, and change "isOrAdd" in the C++ code to "isOrEquivalentToAdd". llvm-svn: 286886	2016-11-14 20:53:09 +00:00
Tim Northover	3d38c38826	ARM: try to fix GCC 4.8 compilation again after r286881. llvm-svn: 286882	2016-11-14 20:31:53 +00:00
Tim Northover	46a6f0fbf0	Recommit: ARM: sort register lists by encoding in push/pop instructions. For example we were producing push {r8, r10, r11, r4, r5, r7, lr} This is misleading (r4, r5 and r7 are actually pushed before the rest), and other components (stack folding recently) often forget to deal with the extra complexity coming from the different order, leading to miscompiles. Finally, we warn about our own code in -no-integrated-as mode without this, which is really not a good idea. Fixed usage of std::sort so that we (hopefully) use instantiations that actually exist in GCC 4.8. llvm-svn: 286881	2016-11-14 20:28:24 +00:00
Geoff Berry	e8de67abad	[AArch64] Change some pointers to references. NFC. Follow-up change to r286875. llvm-svn: 286879	2016-11-14 19:59:11 +00:00
Geoff Berry	526c50588d	[AArch64] Split 0 vector stores into scalar store pairs. Summary: Replace a splat of zeros to a vector store by scalar stores of WZR/XZR. The load store optimizer pass will merge them to store pair stores. This should be better than a movi to create the vector zero followed by a vector store if the zero constant is not re-used, since one instructions and one register live range will be removed. For example, the final generated code should be: stp xzr, xzr, [x0] instead of: movi v0.2d, #0 str q0, [x0] Reviewers: t.p.northover, mcrosier, MatzeB, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26561 llvm-svn: 286875	2016-11-14 19:39:04 +00:00
Geoff Berry	def4bfa9d9	[AArch64] Factor out transform code from split16BStore. NFC. llvm-svn: 286874	2016-11-14 19:39:00 +00:00
Teresa Johnson	4fef68cb8d	[ThinLTO] Only promote exported locals as marked in index Summary: We have always speculatively promoted all renamable local values (except const non-address taken variables) for both the exporting and importing module. We would then internalize them back based on the ThinLink results if they weren't actually exported. This is inefficient, and results in unnecessary renames. It also meant we had to check the non-renamability of a value in the summary, which was already checked during function importing analysis in the ThinLink. Made renameModuleForThinLTO (which does the promotion/renaming) instead use the index when exporting, to avoid unnecessary renames/promotions. For importing modules, we can simply promoted all values as any local we import by definition is exported and needs promotion. This required changes to the method used by the FunctionImport pass (only invoked from 'opt' for testing) and when invoked from llvm-link, since neither does a ThinLink. We simply conservatively mark all locals in the index as promoted, which preserves the current aggressive promotion behavior. I also needed to change an llvm-lto based test where we had previously been aggressively promoting values that weren't importable (aliasees), but now will not promote. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26467 llvm-svn: 286871	2016-11-14 19:21:41 +00:00
Kostya Serebryany	6c77811a29	[libFuzzer] replace 'auto' with 'auto *' to better follow the LLVM style llvm-svn: 286870	2016-11-14 19:21:38 +00:00
Daniel Sanders	08714cdee4	Revert: r286868 - Test commit llvm-svn: 286869	2016-11-14 19:10:56 +00:00
Daniel Sanders	12432e0b04	Test commit llvm-svn: 286868	2016-11-14 19:09:33 +00:00
Tim Northover	1b66f39cf2	Revert "ARM: sort register lists by encoding in push/pop instructions." This reverts commit 286866. It broke a bot, something to do with exactly which templates std::sort accepts. llvm-svn: 286867	2016-11-14 19:05:28 +00:00
Tim Northover	e908ea844c	ARM: sort register lists by encoding in push/pop instructions. For example we were producing push {r8, r10, r11, r4, r5, r7, lr} This is misleading (r4, r5 and r7 are actually pushed before the rest), and other components (stack folding recently) often forget to deal with the extra complexity coming from the different order, leading to miscompiles. Finally, we warn about our own code in -no-integrated-as mode without this, which is really not a good idea. llvm-svn: 286866	2016-11-14 19:02:17 +00:00
Sean Fertile	a435e07de8	[PPC] Add intrinsic mapping to the xscvhpsp instruction add an intrinsic to expose the 'VSX Scalar Convert Half-Precision to Single-Precision' instruction. Differential review: https://reviews.llvm.org/D26536 llvm-svn: 286862	2016-11-14 18:43:59 +00:00
Changpeng Fang	8236fe103f	AMDGPU/SI: Support data types other than V4f32 in image intrinsics Summary: Extend image intrinsics to support data types of V1F32 and V2F32. TODO: we should define a mapping table to change the opcode for data type of V2F32 but just one channel is active, even though such case should be very rare. Reviewers: tstellarAMD Differential Revision: http://reviews.llvm.org/D26472 llvm-svn: 286860	2016-11-14 18:33:18 +00:00
Bob Wilson	d7bef6972d	Use _Unwind_Backtrace on Apple platforms. Darwin's backtrace() function does not work with sigaltstack (which was enabled when available with r270395) — it does a sanity check to make sure that the current frame pointer is within the expected stack area (which it is not when using an alternate stack) and gives up otherwise. The alternative of _Unwind_Backtrace seems to work fine on macOS, so use that when backtrace() fails. Note that we then use backtrace_symbols_fd() with the addresses from _Unwind_Backtrace, but I’ve tested that and it also seems to work fine. rdar://problem/28646552 llvm-svn: 286851	2016-11-14 17:56:18 +00:00
Adrian Prantl	1f9ac96cb1	Typo llvm-svn: 286845	2016-11-14 17:26:32 +00:00
Teresa Johnson	3624bdf60a	Restore "[ThinLTO] Prevent exporting of locals used/defined in module level asm" This restores the rest of r286297 (part was restored in r286475). Specifically, it restores the part requiring adding a dependency from the Analysis to Object library (downstream use changed to correctly model split BitReader vs BitWriter libraries). Original description of this part of patch follows: Module level asm may also contain defs of values. We need to prevent export of any refs to local values defined in module level asm (e.g. a ref in normal IR), since that also requires renaming/promotion of the local. To do that, the summary index builder looks at all values in the module level asm string that are not marked Weak or Global, which is exactly the set of locals that are defined. A summary is created for each of these local defs and flagged as NoRename. This required adding handling to the BitcodeWriter to look at GV declarations to see if they have a summary (rather than skipping them all). Finally, added an assert to IRObjectFile::CollectAsmUndefinedRefs to ensure that an MCAsmParser is available, otherwise the module asm parse would silently fail. Initialized the asm parser in the opt tool for use in testing this fix. Fixes PR30610. llvm-svn: 286844	2016-11-14 17:12:32 +00:00
Sumanth Gundapaneni	d428cf8b5f	[Hexagon] Remove unsafe load instructions that affect Stack Slot Coloring The Stack slot coloring pass removes a store that is followed by a load that deal with the same stack slot. The function isLoadFromStackSlot is supposed to consider the loads that have no side-effects. This patch fixed the issue by removing the unsafe loads from this function Eg: %vreg0<def> = L2_loadruh_io <fi#15>, 0 S2_storeri_io <fi#15>, 0, %vreg0 In this case, we load an unsigned extended half word and store this in to the same stack slot. The Stack slot coloring pass considers safe to remove the store. This patch marked all the non-vector byte and half word loads as unsafe. llvm-svn: 286843	2016-11-14 17:11:00 +00:00
Teresa Johnson	d5033a4576	[ThinLTO] Make inline assembly handling more efficient in summary Summary: The change in r285513 to prevent exporting of locals used in inline asm added all locals in the llvm.used set to the reference set of functions containing inline asm. Since these locals were marked NoRename, this automatically prevented importing of the function. Unfortunately, this caused an explosion in the summary reference lists in some cases. In my particular example, it happened for a large protocol buffer generated C++ file, where many of the generated functions contained an inline asm call. It was exacerbated when doing a ThinLTO PGO instrumentation build, where the PGO instrumentation included thousands of private __profd_* values that were added to llvm.used. We really only need to include a single llvm.used local (NoRename) value in the reference list of a function containing inline asm to block it being imported. However, it seems cleaner to add a flag to the summary that explicitly describes this situation, which is what this patch does. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26402 llvm-svn: 286840	2016-11-14 16:40:19 +00:00
Simon Pilgrim	779da8e5ea	[CostModel][X86] Added mul costs for vXi8 vectors More realistic v16i8/v32i8/v64i8 MUL costs - we have to extend to vXi16, use PMULLW and then truncate the result llvm-svn: 286838	2016-11-14 15:54:24 +00:00
Simon Pilgrim	27fed8e5d6	[X86][AVX] Fixed v16i16/v32i8 ADD/SUB costs on AVX1 subtargets Add explicit v16i16/v32i8 ADD/SUB costs, matching the costs of v4i64/v8i32 - they were missing for some reason. This has side effects on the LV max bandwidth tests (AVX1 now prefers 128-bit vectors vs AVX2 which still prefers 256-bit) llvm-svn: 286832	2016-11-14 14:45:16 +00:00
Sean Fertile	adda5b2d2b	[PPC] add intrinsics for vec extract exp/significand and vec test data class. Differential Revision: https://reviews.llvm.org/D26272 llvm-svn: 286829	2016-11-14 14:42:37 +00:00
Simon Pilgrim	475b40dab8	Remove redundant condition (PR28352) NFCI. We were already testing is the op was not a leaf, so need to then test if it was a leaf (added it to the assert instead). llvm-svn: 286817	2016-11-14 12:00:46 +00:00
James Molloy	6df8f27c95	[InlineCost] Remove skew when calculating call costs When calculating the cost of a call instruction we were applying a heuristic penalty as well as the cost of the instruction itself. However, when calculating the benefit from inlining we weren't discounting the equivalent penalty for the call instruction that would be removed! This caused skew in the calculation and meant we wouldn't inline in the following, trivial case: int g() { h(); } int f() { g(); } llvm-svn: 286814	2016-11-14 11:14:41 +00:00
Simon Pilgrim	3d8482a88d	Remove redundant condition (PR28800) NFCI. 'A \|\| (!A && B)' is equivalent to 'A \|\| B': (LoopCycle > DefCycle) \|\| (LoopCycle <= DefCycle && LoopStage <= DefStage) --> (LoopCycle > DefCycle) \|\| (LoopStage <= DefStage) llvm-svn: 286811	2016-11-14 10:40:23 +00:00
Diana Picus	bda7276120	GlobalISel: Fix indentation. NFC llvm-svn: 286808	2016-11-14 10:25:43 +00:00
Pablo Barrio	7ce2c5ecaf	[JumpThreading] Prevent non-deterministic use lists Summary: Unfolding selects was previously done with the help of a vector of pointers that was then sorted to be able to remove duplicates. As this sorting depends on the memory addresses, it was non-deterministic. A SetVector is used now so that duplicates are removed without the need of sorting first. Reviewers: mgrang, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26450 llvm-svn: 286807	2016-11-14 10:24:26 +00:00
Eric Fiselier	c79c9d8536	Add explicit (void) cast to unused unique_ptr::release() results Summary: This patch adds explicit `(void)` casts to discarded `release()` calls to suppress -Wunused-result. This patch fixes all warnings are generated as a result of [applying `[[nodiscard]]` within libc++](https://reviews.llvm.org/D26596). Similar fixes were applied to Clang in r286796. Reviewers: chandlerc, dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26598 llvm-svn: 286797	2016-11-14 07:26:17 +00:00
Saleem Abdulrasool	d6da74f22b	Demangle: only demangle mangled symbols Only attempt to demangle symbols which have the itanium C++ prefix of `_Z`. This ensures that we do not treat any symbol name as a managled named. We would previously treat a C function `f` as a mangled name and decode that to `float` incorrectly. While it is easy to add tests for this, Mehdi recommended against introducing tests for the demangler as libc++abi should cover the testing. llvm-svn: 286795	2016-11-14 04:54:47 +00:00
Craig Topper	8f85ad1755	[AVX-512] Add suffixless aliases for EVEX encoded vcvtsi2ss/vcvtsi2sd/vcvtusi2ss/vcvtusi2sd. This matches the VEX behavior. Fixes another problem from PR28850. llvm-svn: 286790	2016-11-14 02:46:58 +00:00
Craig Topper	b8596e4d1d	[X86] Cleanup 'x' and 'y' mnemonic suffixes for vcvtpd2dq/vcvttpd2dq/vcvtpd2ps and similar instructions. -Don't print the 'x' suffix for the 128-bit reg/mem VEX encoded instructions in Intel syntax. This is consistent with the EVEX versions. -Don't print the 'y' suffix for the 256-bit reg/reg VEX encoded instructions in Intel or AT&T syntax. This is consistent with the EVEX versions. -Allow the 'x' and 'y' suffixes to be used for the reg/mem forms when we're assembling using Intel syntax. -Allow the 'x' and 'y' suffixes on the reg/reg EVEX encoded instructions in Intel or AT&T syntax. This is consistent with what VEX was already allowing. This should fix at least some of PR28850. llvm-svn: 286787	2016-11-14 01:53:29 +00:00
Craig Topper	353e59b6d6	[AVX-512] Remove and autoupgrade masked dword/qword variable shift intrinsics to the new unmasked versions and selects. llvm-svn: 286786	2016-11-14 01:53:22 +00:00
Sanjay Patel	cfcc42bdc2	[ValueTracking] recognize even more variants of smin/smax Similar to: https://reviews.llvm.org/rL285499 https://reviews.llvm.org/rL286318 We can't minimally expose this in IR tests because we don't have min/max intrinsics, but the difference is visible in codegen because SelectionDAGBuilder::visitSelect() uses matchSelectPattern(). We're not canonicalizing these patterns in IR (yet), so I don't expect there to be any regressions as noted here: http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html llvm-svn: 286776	2016-11-13 20:04:52 +00:00
Craig Topper	ba13703bb3	[AVX-512] Fix a disassembler failure for AVX-512 vcmpss/vcmpsd with an immediate larger than 32. Fix the same bug with VLX vcmpps/vcmppd. Fixes PR24941. llvm-svn: 286775	2016-11-13 19:58:18 +00:00
Sanjay Patel	819f096f09	[ValueTracking] move min/max matching to helper function; NFCI llvm-svn: 286772	2016-11-13 19:30:19 +00:00
Craig Topper	987dad2bc3	[X86][IR] Reduce the number of full string comparisons in the code that autoupgrades masked shift intrinsics. llvm-svn: 286768	2016-11-13 19:09:56 +00:00
Matt Arsenault	dc45274d54	AMDGPU: Implement SGPR spilling with scalar stores nThis avoids the nasty problems caused by using memory instructions that read the exec mask while spilling / restoring registers used for control flow masking, but only for VI when these were added. This always uses the scalar stores when enabled currently, but it may be better to still try to spill to a VGPR and use this on the fallback memory path. The cache also needs to be flushed before wave termination if a scalar store is used. llvm-svn: 286766	2016-11-13 18:20:54 +00:00
Igor Breger	e2399f9e0e	revert commit r286761, some builds failed on Win platforms llvm-svn: 286765	2016-11-13 15:48:11 +00:00
Ayman Musa	c09b3769ae	[X86][AVX512] Removing llvm x86 intrinsics for _mm_mask_move_{ss\|sd} intrinsics. Differential Revision: https://reviews.llvm.org/D26128 llvm-svn: 286761	2016-11-13 14:51:25 +00:00
Ayman Musa	46af8f9c6f	[X86][AVX512] Add patterns for all variants of VMOVSS/VMOVSD instructions. Differential Revision: https://reviews.llvm.org/D26022 llvm-svn: 286758	2016-11-13 14:29:32 +00:00
Craig Topper	b4173a5a70	[InstCombine][AVX-512] Teach InstCombineCalls to handle the new unmasked AVX-512 variable shift intrinsics. llvm-svn: 286755	2016-11-13 07:26:19 +00:00
Craig Topper	43e97649a1	[AVX-512] Add unmasked intrinsics for variable shifts of dwords and qwords. These will be used to replace the masked intrinsics so that InstCombineCalls can optimize the AVX-512 variable shifts the same way it does for AVX2. llvm-svn: 286754	2016-11-13 07:26:15 +00:00
Konstantin Zhuravlyov	f86e4b7266	[AMDGPU] Add f16 support (VI+) Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753	2016-11-13 07:01:11 +00:00
Peter Collingbourne	d9445c49ad	Bitcode: Change module reader functions to return an llvm::Expected. Differential Revision: https://reviews.llvm.org/D26562 llvm-svn: 286752	2016-11-13 07:00:17 +00:00
Peter Collingbourne	8dff03911c	Analysis: Simplify the ScalarEvolution::getGEPExpr() interface. NFCI. All existing callers were manually extracting information out of an existing GEP instruction and passing it to getGEPExpr(). Simplify the interface by changing it to take a GEPOperator instead. llvm-svn: 286751	2016-11-13 06:59:50 +00:00
Peter Collingbourne	9ef5a8c501	Bitcode: More precise casting. NFCI. llvm-svn: 286750	2016-11-13 06:59:28 +00:00
Peter Collingbourne	9d5fd4db81	IR: Change the Type::get{Array,Vector,Pointer}ElementType() functions to perform the correct type assertion. Previously we were only asserting that the type was a sequential type. llvm-svn: 286749	2016-11-13 06:58:45 +00:00
Craig Topper	8b831cbb2a	[InstCombine][AVX-512] Expand vector shift handling to work on the AVX-512 shift by immediate and shift by single value. This does not include support for the AVX-512 variable shifts. That will be coming in a future patch. llvm-svn: 286739	2016-11-13 01:51:55 +00:00
Craig Topper	da6a63db1c	[AVX-512] Remove the remaining masked shift by immediate or by single value. Autoupgrade them to recently introduced unmasked versions and a select. After this I'll add the unmasked intrinsics to InstCombineCalls to finish making our handling of these types of shuffles consistent between AVX-512 and the legacy intrinsics. llvm-svn: 286725	2016-11-12 18:04:46 +00:00
Zachary Turner	17412b03b2	[Support] Add StringRef::find_lower and contains_lower. Differential Revision: https://reviews.llvm.org/D25299 llvm-svn: 286724	2016-11-12 17:17:12 +00:00
Craig Topper	9d25c5e2fa	[AVX-512] Add unmasked version of shift by immediate and shift by single element in XMM. Summary: This is the first step towards being able to add the avx512 shift by immediate intrinsics to InstCombineCalls where we aleady support the sse2 and avx2 intrinsics. We need to the unmasked versions so we can avoid having to teach InstCombineCalls that it would need to insert selects sometimes. Instead we'll just add the selects around the new instrinsics in the frontend. This change should also enable the shift by i32 intrinsics to take a non-constant shift value just like the avx2 and sse intrinsics. This will enable us to fix PR30691 once we update clang. Next I'll switch clang to use the new builtins. Then we'll come back to the backend and remove/autoupgrade the old intrinsics. Then I'll work on the same series for variable shifts. Reviewers: RKSimon, zvi, delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26333 llvm-svn: 286711	2016-11-12 05:28:24 +00:00
Craig Topper	5cb13062d2	[AVX-512] Add support for lowering shuffles to VALIGND/VALIGNQ Summary: VALIGND and VALIGNQ are similar to PALIGNR but instead of working on a 128-bit lane they work on the entire vector register. This change leverages the shuffle rotate detection code used for PALIGNR to detect these cases. Reviewers: delena, RKSimon Subscribers: Farhana, llvm-commits Differential Revision: https://reviews.llvm.org/D26297 llvm-svn: 286709	2016-11-12 05:05:27 +00:00
whitequark	ce9bb1097d	[C API] Fix several null pointer dereferences. llvm-svn: 286704	2016-11-12 03:38:23 +00:00
Kostya Serebryany	53c894d257	[libFuzzer] use a valid ASCII string for a dummy seed corpus llvm-svn: 286702	2016-11-12 02:27:21 +00:00
Kostya Serebryany	fc1c405f98	[libFuzzer] use less stack llvm-svn: 286689	2016-11-12 00:24:35 +00:00
Rui Ueyama	a8a68a993e	Remove extra semicolon. llvm-svn: 286688	2016-11-12 00:23:32 +00:00
Tom Stellard	b4c8e8e30b	AMDGPU/SI: Promote i16 = fp_[us]int f32 for VI Summary: This fixes a regression caused by r286464. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26570 llvm-svn: 286687	2016-11-12 00:19:11 +00:00
Zachary Turner	804d50286d	Fix -Werror build with clang-cl. llvm-svn: 286683	2016-11-11 23:58:11 +00:00
Zachary Turner	11db2642fb	[Support] Introduce llvm::formatv() function. This introduces a new type-safe general purpose formatting library. It provides compile-time type safety, does not require a format specifier (since the type is deduced), and provides mechanisms for extending the format capability to user defined types, and overriding the formatting behavior for existing types. This patch additionally adds documentation for the API to the LLVM programmer's manual. Mailing List Thread: http://lists.llvm.org/pipermail/llvm-dev/2016-October/105836.html Differential Revision: https://reviews.llvm.org/D25587 llvm-svn: 286682	2016-11-11 23:57:40 +00:00
Rui Ueyama	f7c9c3234c	Define DbiStreamBuilder::addSectionContribs. This patch defines a new function to add a SectionContribs stream to a PDB file. Unlike SectionMap, SectionContribs contains a list of input sections as opposed to output sections. Note that this patch needs improving because currently we do not set Module field in SectionContribs entries. In a follow-up patch, I'll add Modules and then fix it after that. Differential Revision: https://reviews.llvm.org/D26210 llvm-svn: 286677	2016-11-11 23:41:13 +00:00
Tom Stellard	9fdbec870c	AMDGPU/SI: Fix visit order assumption in SIFixSGPRCopies Summary: This pass was assuming that when a PHI instruction defined a register used by another PHI instruction that the defining insstruction would be legalized before the using instruction. This assumption was causing the pass to not legalize some PHI nodes within divergent flow-control. This fixes a bug that was uncovered by r285762. Reviewers: nhaehnle, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D26303 llvm-svn: 286676	2016-11-11 23:35:42 +00:00
Sanjay Patel	84ae943f0d	[InstCombine] use dyn_cast rather isa+cast; NFC Follow-up to r286664 cleanup as suggested by Eli. Thanks! llvm-svn: 286671	2016-11-11 23:20:01 +00:00
Kostya Serebryany	235679181b	[libFuzzer] do not initialize parts of TracePC -- let them be initialized by the linker. Add no-msan attribute to the memcmp hook. llvm-svn: 286665	2016-11-11 23:06:53 +00:00
Sanjay Patel	cb2199b2f3	[InstCombine] clean up foldSelectOpOp(); NFC llvm-svn: 286664	2016-11-11 23:01:20 +00:00
Anna Zaks	3c43737832	[tsan][llvm] Implement the function attribute to disable TSan checking at run time This implements a function annotation that disables TSan checking for the function at run time. The benefit over attribute((no_sanitize("thread"))) is that the accesses within the callees will also be suppressed. The motivation for this attribute is a guarantee given by the objective C language that the calls to the reference count decrement and object deallocation will be synchronized. To model this properly, we would need to intercept all ref count decrement calls (which are very common in ObjC due to use of ARC) and also every single message send. Instead, we propose to just ignore all accesses made from within dealloc at run time. The main downside is that this still does not introduce any synchronization, which means we might still report false positives if the code that relies on this synchronization is not executed from within dealloc. However, we have not seen this in practice so far and think these cases will be very rare. Differential Revision: https://reviews.llvm.org/D25858 llvm-svn: 286663	2016-11-11 23:01:02 +00:00
Adam Nemet	9bfbf8bbdf	[LV] Stop saying "use -Rpass-analysis=loop-vectorize" This is PR28376. Unfortunately given the current structure of optimization diagnostics we lack the capability to tell whether the user has passed -Rpass-analysis=loop-vectorize since this is local to the front-end (BackendConsumer::OptimizationRemarkHandler). So rather than printing this even if the user has already passed -Rpass-analysis, this patch just punts and stops recommending this option. I don't think that getting this right is worth the complexity. Differential Revision: https://reviews.llvm.org/D26563 llvm-svn: 286662	2016-11-11 22:51:46 +00:00
Matthias Braun	c94ca916f5	Revert "(origin/master, origin/HEAD) MachineScheduler/ScheduleDAG: Add support to skipping a node." Revert accidentally committed change. This reverts commit r286655. llvm-svn: 286656	2016-11-11 22:39:50 +00:00
Matthias Braun	66ee0bfced	MachineScheduler/ScheduleDAG: Add support to skipping a node. The DAG mutators in the scheduler cannot really remove DAG nodes as additional anlysis information such as ScheduleDAGToplogicalSort are already computed at this point and rely on a fixed number of DAG nodes. Alleviate the missing removal with a new flag: Setting the new skip flag on a node ignores it during scheduling. llvm-svn: 286655	2016-11-11 22:37:34 +00:00
Matthias Braun	40639885f5	ScheduleDAGInstrs: Move VRegUses to ScheduleDAGMILive; NFCI Push VRegUses/collectVRegUses() down the class hierarchy towards its only user ScheduleDAGMILive. NFCI: The initialization of the map happens at a later point but that should not matter. This is in preparation to allow DAG mutators to merge nodes, which relies on this map getting computed later. llvm-svn: 286654	2016-11-11 22:37:31 +00:00
Matthias Braun	69f1d123b2	MachineScheduler: Dump EntrySU/ExitSU if possible llvm-svn: 286653	2016-11-11 22:37:28 +00:00
Matthias Braun	636a5972a9	ScheduleDAG: Identify EntrySU/ExitSU when dumping node ids llvm-svn: 286652	2016-11-11 22:37:26 +00:00
Erik Eckstein	c1d52e5c53	FunctionComparator: don't rely on argument evaluation order. This is a follow-up on the recent refactoring of the FunctionMerge pass. It should fix a fail of the new FunctionComparator unittest whe compiling with MSVC. llvm-svn: 286648	2016-11-11 22:21:39 +00:00
Adrian Prantl	622bddb6e0	Simplify code and address review comments (NFC) llvm-svn: 286644	2016-11-11 22:09:25 +00:00
Adrian Prantl	6cb849e2f0	Fix a reference-to-temporary introduced in r286607. llvm-svn: 286640	2016-11-11 21:48:09 +00:00
Lang Hames	1f2bf2d3e1	[ORC] Re-apply 286620 with fixes for the ErrorSuccess class. llvm-svn: 286639	2016-11-11 21:42:09 +00:00
Nemanja Ivanovic	ec4b0c360f	[PowerPC] Add remaining vector permute builtins in altivec.h - LLVM portion This patch corresponds to review: https://reviews.llvm.org/D26480 Adds all the intrinsics used for various permute builtins that will be added to altivec.h. llvm-svn: 286638	2016-11-11 21:42:01 +00:00
Evgeniy Stepanov	1fe189d795	[cfi] Fix weak functions handling. When a function pointer is replaced with a jumptable pointer, special case is needed to preserve the semantics of extern_weak functions. Since a jumptable entry can not be extern_weak, we emulate that behaviour by replacing all references to F (the extern_weak function) with the following expression: F != nullptr ? JumpTablePtr : nullptr. Extra special care is needed for global initializers, since most (or probably all) backends can not lower an initializer that includes this kind of constant expression. Initializers like that are replaced with a global constructor (i.e. a runtime initializer). llvm-svn: 286636	2016-11-11 21:39:26 +00:00
Erik Eckstein	4d6fb72aa9	Make the FunctionComparator of the MergeFunctions pass a stand-alone utility. This is pure refactoring. NFC. This change moves the FunctionComparator (together with the GlobalNumberState utility) in to a separate file so that it can be used by other passes. For example, the SwiftMergeFunctions pass in the Swift compiler: https://github.com/apple/swift/blob/master/lib/LLVMPasses/LLVMMergeFunctions.cpp Details of the change: ) The big part is just moving code out of MergeFunctions.cpp into FunctionComparator.h/cpp ) Make FunctionComparator member functions protected (instead of private) so that a derived comparator class can use them. Following refactoring helps to share code between the base FunctionComparator class and a derived class: ) Add a beginCompare() function ) Move some basic function property comparisons into a separate function compareSignature() *) Do the GEP comparison inside cmpOperations() which now has a new needToCmpOperands reference parameter https://reviews.llvm.org/D25385 llvm-svn: 286632	2016-11-11 21:15:13 +00:00
Rui Ueyama	9a2a1d27a5	Fix -Wpessimizing-move warning. llvm-svn: 286629	2016-11-11 20:39:02 +00:00
Vyacheslav Klochkov	f1a12fe0f5	Fixed the lost FastMathFlags for FCmp operations in SLPVectorizer. Reviewer: Michael Zolotukhin. Differential Revision: https://reviews.llvm.org/D26543 llvm-svn: 286626	2016-11-11 19:55:29 +00:00
Chad Rosier	8ade03463e	[AArch64] Update a FIXME comment to reflect current state. NFC. llvm-svn: 286625	2016-11-11 19:52:45 +00:00
Peter Collingbourne	6de481a378	Bitcode: Change getModuleSummaryIndex() to return an llvm::Expected. Differential Revision: https://reviews.llvm.org/D26539 llvm-svn: 286624	2016-11-11 19:50:39 +00:00
Peter Collingbourne	cd513a41c1	Bitcode: Clean up error handling for certain bitcode query functions. The functions getBitcodeTargetTriple(), isBitcodeContainingObjCCategory(), getBitcodeProducerString() and hasGlobalValueSummary() now return errors via their return value rather than via the diagnostic handler. To make this work, re-implement these functions using non-member functions so that they can be used without the LLVMContext required by BitcodeReader. Differential Revision: https://reviews.llvm.org/D26532 llvm-svn: 286623	2016-11-11 19:50:24 +00:00
Peter Collingbourne	c0032b7178	Bitcode: Prepare to move bitcode readers to free functions. Make initStream() a free function, and change BitcodeReaderBase ctor to take a BitstreamCursor. llvm-svn: 286622	2016-11-11 19:50:10 +00:00
Lang Hames	4f734f254e	[ORC] Revert r286620 while I investigate a bot failure. llvm-svn: 286621	2016-11-11 19:46:46 +00:00
Lang Hames	ae1fdddbc4	[ORC] Refactor the ORC RPC utilities to add some new features. (1) Add support for function key negotiation. The previous version of the RPC required both sides to maintain the same enumeration for functions in the API. This means that any version skew between the client and server would result in communication failure. With this version of the patch functions (and serializable types) are defined with string names, and the derived function signature strings are used to negotiate the actual function keys (which are used for efficient call serialization). This allows clients to connect to any server that supports a superset of the API (based on the function signatures it supports). (2) Add a callAsync primitive. The callAsync primitive can be used to install a return value handler that will run as soon as the RPC function's return value is sent back from the remote. (3) Launch policies for RPC function handlers. The new addHandler method, which installs handlers for RPC functions, takes two arguments: (1) the handler itself, and (2) an optional "launch policy". When the RPC function is called, the launch policy (if present) is invoked to actually launch the handler. This allows the handler to be spawned on a background thread, or added to a work list. If no launch policy is used, the handler is run on the server thread itself. This should only be used for short-running handlers, or entirely synchronous RPC APIs. (4) Zero cost cross type serialization. You can now define serialization from any type to a different "wire" type. For example, this allows you to call an RPC function that's defined to take a std::string while passing a StringRef argument. If a serializer from StringRef to std::string has been defined for the channel type this will be used to serialize the argument without having to construct a std::string instance. This allows buffer reference types to be used as arguments to RPC calls without requiring a copy of the buffer to be made. llvm-svn: 286620	2016-11-11 19:42:44 +00:00
Geoff Berry	25fa4999ff	[AArch64] Fix bugs in isel lowering replaceSplatVectorStore. Summary: Fix off-by-one indexing error in loop checking that inserted value was a splat vector. Add code to check that INSERT_VECTOR_ELT nodes constructing the splat vector have the expected constant index values. Reviewers: t.p.northover, jmolloy, mcrosier Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26409 llvm-svn: 286616	2016-11-11 19:25:20 +00:00
Reid Kleckner	ec80354873	[sancov] Don't instrument MSVC CRT stdio config helpers They get called before initialization, which is a problem for winasan. Test coming in compiler-rt. llvm-svn: 286615	2016-11-11 19:18:45 +00:00
Evgeniy Stepanov	f48ffab554	[cfi] Implement cfi-icall using inline assembly. The current implementation is emitting a global constant that happens to evaluate to the same bytes + relocation as a jump instruction on X86. This does not work for PIE executables and shared libraries though, because we end up with a wrong relocation type. And it has no chance of working on ARM/AArch64 which use different relocation types for jump instructions (R_ARM_JUMP24) that is never generated for data. This change replaces the constant with module-level inline assembly followed by a hidden declaration of the jump table. Works fine for ARM/AArch64, but has some drawbacks. * Extra symbols are added to the static symbol table, which inflate the size of the unstripped binary a little. Stripped binaries are not affected. This happens because jump table declarations must be external (because their body is in the inline asm). * Original functions that were anonymous are now named <original name>.cfi, and it affects symbolization sometimes. This is necessary because the only user of these functions is the (inline asm) jump table, so they had to be added to @llvm.used, which does not allow unnamed functions. llvm-svn: 286611	2016-11-11 18:49:09 +00:00
Adrian Prantl	554fd99dd5	Revert "Use private linkage for MergedGlobals variables" on Darwin. This is a partial revert of r244615 (http://reviews.llvm.org/D11942), which caused a major regression in debug info quality. Turning the artificial __MergedGlobal symbols into private symbols (l__MergedGlobal) means that the linker will not include them in the symbol table of the final executable. Without a symbol table entry dsymutil is not be able to process the debug info for any of the merged globals and thus drops the debug info for all of them. This patch is enabling the old behavior for all MachO targets while leaving all other targets unaffected. rdar://problem/29160481 https://reviews.llvm.org/D26531 llvm-svn: 286607	2016-11-11 17:50:09 +00:00
Chad Rosier	d6e85ce3c3	[AArch64] Remove lots of redundant code. NFC. llvm-svn: 286606	2016-11-11 17:49:34 +00:00
Sanjay Patel	d1bf4340ef	[InstCombine] fix formatting of FoldOpIntoSelect(); NFCI llvm-svn: 286604	2016-11-11 17:42:16 +00:00
Greg Clayton	04c19286a1	Fixed issues found by Paul Robinson with my patch for: https://reviews.llvm.org/D26526 - Fixed DW_FORM_strp to be correctly sized and extracted for DWARF64 - Added some missing strp variants as well - Fixed comment typo llvm-svn: 286603	2016-11-11 17:38:14 +00:00
Chad Rosier	31ee813068	[AArch64] Early return and minor renaming/refactoring to ease code review. NFC. llvm-svn: 286601	2016-11-11 17:07:37 +00:00
Greg Clayton	82f12b149f	Clean up DWARFFormValue by reducing duplicated code and removing DWARFFormValue::getFixedFormSizes() In preparation for a follow on patch that improves DWARF parsing speed, clean up DWARFFormValue so that we have can get the fixed byte size of a form value given a DWARFUnit or given the version, address byte size and dwarf32/64. This patch cleans up code so that everyone is using one of the new DWARFFormValue functions: static Optional<uint8_t> DWARFFormValue::getFixedByteSize(dwarf::Form Form, const DWARFUnit *U = nullptr); static Optional<uint8_t> DWARFFormValue::getFixedByteSize(dwarf::Form Form, uint16_t Version, uint8_t AddrSize, bool Dwarf32); This patch changes DWARFFormValue::skipValue() to rely on the output of DWARFFormValue::getFixedByteSize(...) instead of duplicating the code in each function. This will reduce the number of changes we need to make to DWARF to fewer places in DWARFFormValue when we add support for new form. This patch also starts to support DWARF64 so that we can get correct byte sizes for forms that vary according the DWARF 32/64. To reduce the code duplication a new FormSizeHelper pure virtual class was created that can be created as a FormSizeHelperDWARFUnit when you have a DWARFUnit, or FormSizeHelperManual where you manually specify the DWARF version, address byte size and DWARF32/DWARF64. There is now a single implementation of a function that gets the fixed byte size (instead of two where one took a DWARFUnit and one took the DWARF version, address byte size and DWARFFormat enum) and one function to skip the form values. https://reviews.llvm.org/D26526 llvm-svn: 286597	2016-11-11 16:21:37 +00:00
Nemanja Ivanovic	2efc3cb968	[PowerPC] Add vector conversion builtins to altivec.h - LLVM portion This patch corresponds to review: https://reviews.llvm.org/D26307 Adds all the intrinsics used for various conversion builtins that will be added to altivec.h. These are type conversions between various types of vectors. llvm-svn: 286596	2016-11-11 14:41:19 +00:00
Chad Rosier	10c7aaaee9	[AArch64] Enable merging of adjacent zero stores for all subtargets. This optimization merges adjacent zero stores into a wider store. e.g., strh wzr, [x0] strh wzr, [x0, #2] ; becomes str wzr, [x0] e.g., str wzr, [x0] str wzr, [x0, #4] ; becomes str xzr, [x0] Previously, this was only enabled for Kryo and Cortex-A57. Differential Revision: https://reviews.llvm.org/D26396 llvm-svn: 286592	2016-11-11 14:10:12 +00:00
Sam Kolton	ce0aba74c1	[AMDGPU] TargetStreamer: Fix .note section name llvm-svn: 286591	2016-11-11 13:41:52 +00:00
Ulrich Weigand	a0e7325023	[SystemZ] Support CL(G)T instructions This adds support for the compare logical and trap (memory) instructions that were added as part of the miscellaneous instruction extensions feature with zEC12. llvm-svn: 286587	2016-11-11 12:48:26 +00:00
Ulrich Weigand	92c2c672e5	[SystemZ] Support load-and-zero-rightmost-byte facility This adds support for the LZRF/LZRG/LLZRGF instructions that were added on z13, and uses them for code generation were appropriate. SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF over RISBG where both would be possible. llvm-svn: 286586	2016-11-11 12:46:28 +00:00
Ulrich Weigand	5dc7b67c62	[SystemZ] Use LLGT(R) instructions This adds support for the 31-to-64-bit zero extension instructions LLGT and LLGTR and uses them for code generation where appropriate. Since this operation can also be performed via RISBG, we have to update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT over RISBG in case both are possible. The patch includes some simplification to the tryRISBGZero code; this is not intended to cause any (further) functional change in codegen. llvm-svn: 286585	2016-11-11 12:43:51 +00:00
Simon Pilgrim	807f9cf243	[SelectionDAG] Add support for vector demandedelts in BSWAP opcodes llvm-svn: 286582	2016-11-11 11:51:29 +00:00
Simon Pilgrim	813721e98a	[SelectionDAG] Add support for vector demandedelts in UREM/SREM opcodes llvm-svn: 286578	2016-11-11 11:23:43 +00:00
Simon Pilgrim	0652227814	[SelectionDAG] Add support for vector demandedelts in UDIV opcodes llvm-svn: 286576	2016-11-11 10:47:24 +00:00
Diana Picus	22274934f4	[ARM] Add plumbing for GlobalISel Add GlobalISel skeleton, up to the point where we can select a ret void. llvm-svn: 286573	2016-11-11 08:27:37 +00:00
Teresa Johnson	ad17679abd	Split Bitcode/ReaderWriter.h into separate reader and writer headers Summary: Split ReaderWriter.h which contains the APIs into both the BitReader and BitWriter libraries into BitcodeReader.h and BitcodeWriter.h. This is to address Chandler's concern about sharing the same API header between multiple libraries (BitReader and BitWriter). That concern is why we create a single bitcode library in our downstream build of clang, which led to r286297 being reverted as it added a dependency that created a cycle only when there is a single bitcode library (not two as in upstream). Reviewers: mehdi_amini Subscribers: dlj, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D26502 llvm-svn: 286566	2016-11-11 05:34:58 +00:00
Mehdi Amini	c1edf566b9	Prevent at compile time converting from Error::success() to Expected<T> This would trigger an assertion at runtime otherwise. Differential Revision: https://reviews.llvm.org/D26482 llvm-svn: 286562	2016-11-11 04:29:25 +00:00
Mehdi Amini	41af43092c	Make the Error class constructor protected This is forcing to use Error::success(), which is in a wide majority of cases a lot more readable. Differential Revision: https://reviews.llvm.org/D26481 llvm-svn: 286561	2016-11-11 04:28:40 +00:00
Davide Italiano	03a856807c	[lli] Simplify the code a bit. No functional change intended. llvm-svn: 286555	2016-11-11 03:07:45 +00:00
Davide Italiano	5e327343f1	[IR/DataLayout] Simplify the code using PowerOf2Ceil. NFCI. llvm-svn: 286554	2016-11-11 03:00:00 +00:00
Yaxun Liu	c5bf4b831d	AMDGPU: Attempt to fix build failure on x86-64 selfhost build Remove redundant include file. llvm-svn: 286552	2016-11-11 02:48:50 +00:00
Sean Fertile	e1ca561b0a	Add a blank line for a test commit. llvm-svn: 286550	2016-11-11 02:33:17 +00:00
Matthias Braun	325cd2c98a	ScheduleDAGInstrs: Add condjump deps to addSchedBarrierDeps() addSchedBarrierDeps() is supposed to add use operands to the ExitSU node. The current implementation adds uses for calls/barrier instruction and the MBB live-outs in all other cases. The use operands of conditional jump instructions were missed. Also added code to macrofusion to set the latencies between nodes to zero to avoid problems with the fusing nodes lingering around in the pending list now. Differential Revision: https://reviews.llvm.org/D25140 llvm-svn: 286544	2016-11-11 01:34:21 +00:00
Stanislav Mekhanoshin	6fc8a1cdaa	Revert "[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies" This reverts commit r286171, it breaks piglit test fs-discard-exit-2 llvm-svn: 286530	2016-11-11 00:22:34 +00:00
Joerg Sonnenberger	618d475c03	Fix requirements. llvm-svn: 286527	2016-11-10 23:53:45 +00:00
Matthias Braun	f29b12dca8	ScheduleDAGInstrs: Ignore dependencies of constant physregs There is no need to track dependencies for constant physregs, as they don't change their value no matter in what order you read/write to them. Differential Revision: https://reviews.llvm.org/D26221 llvm-svn: 286526	2016-11-10 23:46:44 +00:00
Matthias Braun	d67fa9dc6a	Timer: Remove group-less NamedRegionTimer constructor. The NamedRegionTimer initializer without a group name puts the Timer into the "Misc" group and is (nearly) unused. Remove it. The only user of this constructor appears to be the HexagonGenInsert pass, which creates a counter without group to count the complete execution time of that pass, however since every pass gets a counter by the PassManager anyway this should be unnecessary. Also removed the pointless TimerGroup there. Differential Revision: https://reviews.llvm.org/D25582 llvm-svn: 286524	2016-11-10 23:36:44 +00:00
Evandro Menezes	21f9ce1a0d	[DAG Combiner] Fix the native computation of the Newton series for reciprocals The generic infrastructure to compute the Newton series for reciprocal and reciprocal square root was conceived to allow a target to compute the series itself. However, the original code did not properly consider this condition if returned by a target. This patch addresses the issues to allow a target to compute the series on its own. Differential revision: https://reviews.llvm.org/D22975 llvm-svn: 286523	2016-11-10 23:31:06 +00:00
Simon Pilgrim	38f0045cb0	[SelectionDAG] Add support for vector demandedelts in ADD/SUB opcodes llvm-svn: 286516	2016-11-10 22:41:49 +00:00
Peter Collingbourne	d93620bf4d	IR: Introduce inrange attribute on getelementptr indices. If the inrange keyword is present before any index, loading from or storing to any pointer derived from the getelementptr has undefined behavior if the load or store would access memory outside of the bounds of the element selected by the index marked as inrange. This can be used, e.g. for alias analysis or to split globals at element boundaries where beneficial. As previously proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html Differential Revision: https://reviews.llvm.org/D22793 llvm-svn: 286514	2016-11-10 22:34:55 +00:00
Matthias Braun	111603f933	ScheduleDAGInstrs: Slightly simplify code; NFC llvm-svn: 286510	2016-11-10 22:11:00 +00:00
Simon Pilgrim	fe3a54371d	[SelectionDAG] Add support for splatted vectors in SUB opcode llvm-svn: 286509	2016-11-10 21:57:42 +00:00
Matthias Braun	9d62c5571b	RegisterCoalescer: Ignore interferences for constant physregs When copying to/from a constant register interferences can be ignored. Also update the documentation for isConstantPhysReg() to make it more obvious that this transformation is valid. Differential Revision: https://reviews.llvm.org/D26106 llvm-svn: 286503	2016-11-10 21:22:47 +00:00
Yaxun Liu	d6fbe65040	AMDGPU: Emit runtime metadata as a note element in .note section Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata. However there is a standard way to convey vendor specific information about how to run an ELF binary, which is called vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html). This patch lets AMDGPU backend emits runtime metadata as a note element in .note section. Differential Revision: https://reviews.llvm.org/D25781 llvm-svn: 286502	2016-11-10 21:18:49 +00:00
Zachary Turner	805d43a0b8	Fix type ambiguity with std::max llvm-svn: 286498	2016-11-10 20:35:21 +00:00
Zachary Turner	4a86af07a2	[Support] Improve flexibility of binary blob formatter. This makes it possible to indent a binary blob by a certain number of bytes, and also makes some things more idiomatic. Finally, it integrates this binary blob formatter into ScopedPrinter which used to have its own implementation of this algorithm. Differential Revision: https://reviews.llvm.org/D26477 llvm-svn: 286495	2016-11-10 20:16:45 +00:00
Davide Italiano	a22ddddfea	[Target] Rename X86/ARM Assembly printer to reflect reality. This shows up a lot profiling LTO testcases with -time-passes, so better have a non confusing name. llvm-svn: 286488	2016-11-10 18:39:31 +00:00
Nico Weber	85740f6b86	Revert r286437 r286438, they caused PR30976 llvm-svn: 286483	2016-11-10 17:55:41 +00:00
Adam Nemet	7da20c39ee	[OptDiag] Remove non-printable chars from function name The r283656 did this in the remark arguments. We also need to do this in the main function attribute as that is written to YAML as well. llvm-svn: 286482	2016-11-10 17:47:03 +00:00
Simon Pilgrim	d67af68f06	[SelectionDAG] Add support for vector demandedelts in TRUNCATE opcodes llvm-svn: 286481	2016-11-10 17:43:52 +00:00
Dehao Chen	5492f8646c	Add comments about why we put LoopSink pass at the very late stage. llvm-svn: 286480	2016-11-10 17:42:18 +00:00
Teresa Johnson	a081145ebd	Restore part of "[ThinLTO] Prevent exporting of locals used/defined in module level asm" This restores the part of r286297 that didn't require adding a dependency from the Analysis to Object library. There are two parts to the original fix, and this will address the handling for the case where locals are used in module level asm. The part that requires functionality in libObject handles local defs in module level asm, and was reverted because our downstream build of clang builds lib/Bitcode into a single library, and this new dependency introduced a cycle there. I am trying to get that fixed (see D26502), so for now that change isn't being restored llvm-svn: 286475	2016-11-10 16:57:32 +00:00
Simon Pilgrim	33fef8e865	Use common SDLoc. NFCI. llvm-svn: 286473	2016-11-10 16:47:09 +00:00
Simon Pilgrim	ee187fd6e7	[SelectionDAG] Add support for vector demandedelts in MUL opcodes llvm-svn: 286471	2016-11-10 16:27:42 +00:00
Tom Stellard	115a61560e	AMDGPU: Add VI i16 support Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464	2016-11-10 16:02:37 +00:00
Simon Pilgrim	ca57e53ded	[SelectionDAG] Add support for vector demandedelts in SRA opcodes llvm-svn: 286461	2016-11-10 15:05:09 +00:00
Simon Pilgrim	37c9034bd6	[DAGCombiner] Correctly extract the ConstOrConstSplat shift value for SHL nodes We were failing to extract a constant splat shift value if the shifted value was being masked. The (shl (and (setcc) N01CV) N1CV) -> (and (setcc) N01CV<<N1CV) combine was unnecessarily preventing this. llvm-svn: 286454	2016-11-10 14:35:09 +00:00
Simon Pilgrim	3bf99c056a	[SelectionDAG] Add support for vector demandedelts in SHL/SRL opcodes llvm-svn: 286448	2016-11-10 13:52:42 +00:00
Oliver Stannard	18ca2adf2d	[ARM] Thumb2 LDR (literal) should accept PC as the destination The version of this instruction with the .w suffix already correctly accepts this, but the alias without the .w did not. Differential Revision: https://reviews.llvm.org/D26499 llvm-svn: 286446	2016-11-10 13:20:41 +00:00
Sanjoy Das	3d75b62ffe	[SCEVExpander] Hoist unsigned divisons when safe That is, when the divisor is a constant non-zero. llvm-svn: 286438	2016-11-10 07:56:12 +00:00
Sanjoy Das	e30a281449	[SCEVExpander] Don't hoist divisions Fixes PR30942. llvm-svn: 286437	2016-11-10 07:56:09 +00:00
Craig Topper	bd298c37d1	[AVX-512] Allow legacy cvtpd2dq intrinsics to select EVEX encoded instruction when available. llvm-svn: 286435	2016-11-10 07:47:17 +00:00
Craig Topper	e0845d8e8c	[AVX-512][X86] Convert avx_cvtt_ps2dq_256 and sse2_cvttps2dq intrinsics to ISD::FP_TO_SINT in the intrinsics table and delete patterns. While nearby also move CVTDQ2PS patterns into their instructions. This allows these intrinsics to also use EVEX instructons. llvm-svn: 286434	2016-11-10 07:24:52 +00:00
Craig Topper	f37b9b9b5f	[X86] Convert int_x86_avx_cvtt_pd2dq_256 to fp_to_sint using the intrinsics table. Removes extra patterns and allows legacy intrinsic to select EVEX encoded instructions when available. llvm-svn: 286433	2016-11-10 06:45:39 +00:00
Craig Topper	2afed2c790	[X86] Move some custom patterns into the currently empty pattern of their corresponding instructions. NFC llvm-svn: 286432	2016-11-10 06:45:37 +00:00
Craig Topper	1d2e74f030	[X86] Remove some patterns still referencing int_x86_sse2_cvttpd2dq that should have been removed in r286344. NFC llvm-svn: 286431	2016-11-10 06:45:34 +00:00
Sanjoy Das	0ae390abce	[SCEV] Eta reduce some lambdas; NFC llvm-svn: 286429	2016-11-10 06:33:54 +00:00
Sanjay Patel	4e1b5a53c7	[InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923) Removing the limitation in visitInsertElementInst() causes several regressions because we're not prepared to fold sequences of shuffles or inserts and extracts separated by shuffles. Fixing that appears to be a difficult mission because we are purposely trying to avoid creating shuffles with arbitrary shuffle masks because some targets may choke on those. https://llvm.org/bugs/show_bug.cgi?id=30923 llvm-svn: 286423	2016-11-10 00:15:14 +00:00
Peter Collingbourne	32ab3a817d	Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.", with a fix for 32-bit x86. Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions that take a global address operand. llvm-svn: 286420	2016-11-09 23:53:43 +00:00
Dehao Chen	38a666d6e5	Add isHotBB helper function to ProfileSummaryInfo Summary: This will unify all BB hotness checks. Reviewers: eraman, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26353 llvm-svn: 286415	2016-11-09 23:36:02 +00:00
Eli Friedman	ddbf83ea14	Preserve assumption cache in loop-rotate. No testcase included because I can't figure out how to reduce it. (It's easy to write a testcase where rotation clones an assume, but that doesn't actually seem to trigger the crash in opt on its own; maybe an issue with the laziness?) Differential Revision: https://reviews.llvm.org/D26434 llvm-svn: 286410	2016-11-09 23:05:01 +00:00
Tim Northover	09dd2496b7	GlobalISel: fix typo. NFC llvm-svn: 286408	2016-11-09 22:40:02 +00:00
Tim Northover	a9105be437	GlobalISel: translate invoke and landingpad instructions Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj etc), but it should get things going. llvm-svn: 286407	2016-11-09 22:39:54 +00:00
Mehdi Amini	67d1a41226	Make BitcodeReader::parseIdentificationBlock() robust to EOF This method is particular: it iterates at the top-level and does not have an enclosing block. llvm-svn: 286394	2016-11-09 21:26:49 +00:00
Evgeny Stupachenko	c2698cd903	Minor unroll pass refacoring. Summary: Unrolled Loop Size calculations moved to a function. Constant representing number of optimized instructions when "back edge" becomes "fall through" replaced with variable. Some comments added. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D21719 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 286389	2016-11-09 19:56:39 +00:00
Sanjoy Das	26f28a2836	[Verifier] clang-format a section; NFC Suggested in D26438 since I'm touching related code. llvm-svn: 286388	2016-11-09 19:36:39 +00:00
Sanjoy Das	6b46a0d1e8	[SCEV] Refactor out a useful pattern; NFC llvm-svn: 286386	2016-11-09 18:22:43 +00:00
Peter Collingbourne	a9cadeddd4	Revert r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate." Suspected to be the cause of a sanitizer-windows bot failure: Assertion failed: isImm() && "Wrong MachineOperand accessor", file C:\b\slave\sanitizer-windows\llvm\include\llvm/CodeGen/MachineOperand.h, line 420 llvm-svn: 286385	2016-11-09 18:17:50 +00:00
Peter Collingbourne	4c15db45e4	X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate. A relocatable immediate is either an immediate operand or an operand that can be relocated by the linker to an immediate, such as a regular symbol in non-PIC code. Start using relocImm for 32-bit and 64-bit MOV instructions, and for operands of type "imm32_su". Remove a number of now-redundant patterns. Differential Revision: https://reviews.llvm.org/D25812 llvm-svn: 286384	2016-11-09 17:51:58 +00:00
Krzysztof Parzyszek	f817efbbb0	[Hexagon] Silence "sometimes uninitialized" warning in HexagonCopyToCombine llvm-svn: 286383	2016-11-09 17:50:46 +00:00
Peter Collingbourne	7f00d0a125	Bitcode: Change the materializer interface to return llvm::Error. Differential Revision: https://reviews.llvm.org/D26439 llvm-svn: 286382	2016-11-09 17:49:19 +00:00
Krzysztof Parzyszek	a540997ce4	[Hexagon] Separate Hexagon subreg indices for different register classes For pairs of 32-bit registers: isub_lo, isub_hi. For pairs of vector registers: vsub_lo, vsub_hi. Add generic subreg indices: ps_sub_lo, ps_sub_hi, and a function HexagonRegisterInfo::getHexagonSubRegIndex(RegClass, GenericSubreg) that returns the appropriate subreg index for RegClass. llvm-svn: 286377	2016-11-09 16:19:08 +00:00
Krzysztof Parzyszek	601d7eb11a	[Hexagon] Eliminate Insert4 pseudo-instruction, use combines instead llvm-svn: 286368	2016-11-09 14:16:29 +00:00
Jonas Paulsson	e127fe7083	[SystemZ] A few fixes in scheduler files. Review: U Weigand llvm-svn: 286362	2016-11-09 12:47:57 +00:00
Pavel Labath	c207bec388	Remove TimeValue usage from Scalar/SROA.cpp. NFC. llvm-svn: 286361	2016-11-09 12:07:12 +00:00
Pavel Labath	775bbc3736	Zero-initialize chrono duration objects The default duration constructor does not zero-initialize the object, we need to do that manually. llvm-svn: 286359	2016-11-09 11:43:57 +00:00
Jonas Paulsson	28f29487b9	[MachineScheduler] Comments fixing. The name/comment of the third argument to the ScheduleDAGMI constructor is RemoveKillFlags and not IsPostRA. Only the comments are changed. Review: A Trick llvm-svn: 286350	2016-11-09 09:59:27 +00:00
Alexandros Lamprineas	0ee3ec2fe4	[ARM] Loop Strength Reduction crashes when targeting ARM or Thumb. Scalar Evolution asserts when not all the operands of an Add Recurrence Expression are loop invariants. Loop Strength Reduction should only create affine Add Recurrences, so that both the start and the step of the expression are loop invariants. Differential Revision: https://reviews.llvm.org/D26185 llvm-svn: 286347	2016-11-09 08:53:07 +00:00
Craig Topper	f334ac19ad	[AVX-512] Add lowering to cvttpd2udq/cvttps2udq for fptoui v2f64/2f32 to 2i32 This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions. If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result. It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result. Differential Revision: https://reviews.llvm.org/D26331 llvm-svn: 286345	2016-11-09 07:48:51 +00:00
Craig Topper	731bf9c5d6	[X86] Lower AVX512 and SSE intrinsics for CVTTPD2DQ to X86ISD::CVTTPD2DQ. Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves. Reviewers: delena, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26330 llvm-svn: 286344	2016-11-09 07:31:32 +00:00
Craig Topper	28e3dfc02b	[AVX-512] Use alignedstore256 in patterns that look for stores of the lower 256-bits of a 512-bit vector to use a 256-bit aligned store. Previously we were only checking for 16 byte alignment instead of 32 byte alignment. Fixes PR30947. llvm-svn: 286342	2016-11-09 05:31:57 +00:00
Craig Topper	5c842be9a0	[AVX-512] Make VBMI instruction set enabling imply that the BWI instruction set is also enabled. Summary: This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912. Reviewers: delena, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26322 llvm-svn: 286339	2016-11-09 04:50:48 +00:00
Mehdi Amini	b6a11a7879	Revert "[ThinLTO] Prevent exporting of locals used/defined in module level asm" This reverts commit r286297. Introduces a dependency from libAnalysis to libObject, which I missed during the review. llvm-svn: 286329	2016-11-09 01:45:13 +00:00
Peter Collingbourne	7576cb0fa7	Bitcode: Remove the remnants of the BitcodeDiagnosticInfo class. The BitcodeReader no longer produces BitcodeDiagnosticInfo diagnostics. The only remaining reference was in the gold plugin; the code there has been dead since we stopped producing InvalidBitcodeSignature error codes in r225562. While at it remove the InvalidBitcodeSignature error code. llvm-svn: 286326	2016-11-09 01:09:11 +00:00
Dehao Chen	947dbe1254	Enable Loop Sink pass for functions that has profile. Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code. Reviewers: davidxl, hfinkel Subscribers: mehdi_amini, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D26155 llvm-svn: 286325	2016-11-09 00:58:19 +00:00
Peter Collingbourne	58f7f0759f	Bitcode: Change the BitcodeReader to use llvm::Error internally. Differential Revision: https://reviews.llvm.org/D26430 llvm-svn: 286323	2016-11-09 00:51:04 +00:00
Sanjay Patel	e104554412	[ValueTracking] recognize obfuscated variants of umin/umax The smallest tests that expose this are codegen tests (because SelectionDAGBuilder::visitSelect() uses matchSelectPattern to create UMAX/UMIN nodes), but it's also possible to see the effects in IR alone with folds of min/max pairs. If these were written as unsigned compares in IR, InstCombine canonicalizes the unsigned compares to signed compares. Ie, running the optimizer pessimizes the codegen for this case without this patch: define <4 x i32> @umax_vec(<4 x i32> %x) { %cmp = icmp ugt <4 x i32> %x, <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647> %sel = select <4 x i1> %cmp, <4 x i32> %x, <4 x i32> <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647> ret <4 x i32> %sel } $ ./opt umax.ll -S \| ./llc -o - -mattr=avx vpmaxud LCPI0_0(%rip), %xmm0, %xmm0 $ ./opt -instcombine umax.ll -S \| ./llc -o - -mattr=avx vpxor %xmm1, %xmm1, %xmm1 vpcmpgtd %xmm0, %xmm1, %xmm1 vmovaps LCPI0_0(%rip), %xmm2 ## xmm2 = [2147483647,2147483647,2147483647,2147483647] vblendvps %xmm1, %xmm0, %xmm2, %xmm0 Differential Revision: https://reviews.llvm.org/D26096 llvm-svn: 286318	2016-11-09 00:24:44 +00:00
Greg Clayton	bde0a1632b	Added the ability to dump hex bytes easily into a raw_ostream. Unit tests were added to verify this functionality keeps working correctly. Example output for raw hex bytes: llvm::ArrayRef<uint8_t> Bytes = ...; llvm::outs() << format_hex_bytes(Bytes); 554889e5 4881ec70 04000048 8d051002 00004c8d 05fd0100 004c8b0d d0020000 Example output for raw hex bytes with offsets: llvm::outs() << format_hex_bytes(Bytes, 0x100000d10); 0x0000000100000d10: 554889e5 4881ec70 04000048 8d051002 0x0000000100000d20: 00004c8d 05fd0100 004c8b0d d0020000 Example output for raw hex bytes with ASCII with offsets: llvm::outs() << format_hex_bytes_with_ascii(Bytes, 0x100000d10); 0x0000000100000d10: 554889e5 4881ec70 04000048 8d051002 \|UH.?H.?p...H....\| 0x0000000100000d20: 00004c8d 05fd0100 004c8b0d d0020000 \|..L..?...L..?...\| The default groups bytes into 4 byte groups, but this can be changed to 1 byte: llvm::outs() << format_hex_bytes(Bytes, 0x100000d10, 16 /NumPerLine/, 1 /ByteGroupSize/); 0x0000000100000d10: 55 48 89 e5 48 81 ec 70 04 00 00 48 8d 05 10 02 0x0000000100000d20: 00 00 4c 8d 05 fd 01 00 00 4c 8b 0d d0 02 00 00 llvm::outs() << format_hex_bytes(Bytes, 0x100000d10, 16 /NumPerLine/, 2 /ByteGroupSize/); 0x0000000100000d10: 5548 89e5 4881 ec70 0400 0048 8d05 1002 0x0000000100000d20: 0000 4c8d 05fd 0100 004c 8b0d d002 0000 llvm::outs() << format_hex_bytes(Bytes, 0x100000d10, 8 /NumPerLine/, 1 /ByteGroupSize/); 0x0000000100000d10: 55 48 89 e5 48 81 ec 70 0x0000000100000d18: 04 00 00 48 8d 05 10 02 0x0000000100000d20: 00 00 4c 8d 05 fd 01 00 0x0000000100000d28: 00 4c 8b 0d d0 02 00 00 https://reviews.llvm.org/D26405 llvm-svn: 286316	2016-11-09 00:15:54 +00:00
Sanjay Patel	4e9d6cd354	[InstCombine] fix profitability equation for max-of-nots transform As the test change shows, we can increase the critical path by adding a 'not' instruction, so make sure that we're actually removing an instruction if we do this transform. This transform could also cause us to miss folds of min/max pairs. llvm-svn: 286315	2016-11-09 00:13:11 +00:00
Sanjay Patel	99dc5feff1	[InstCombine] reduce indentation; NFC llvm-svn: 286314	2016-11-08 23:49:15 +00:00
Zachary Turner	44728f4014	Fix some size_t / uint32_t ambiguity errors. llvm-svn: 286305	2016-11-08 22:30:11 +00:00
Zachary Turner	4efa0a4201	[CodeView] Hook up CodeViewRecordIO to type serialization path. Previously support had been added for using CodeViewRecordIO to read (deserialize) CodeView type records. This patch adds support for writing those same records. With this patch, reading and writing of CodeView type records finally uses a single codepath. Differential Revision: https://reviews.llvm.org/D26253 llvm-svn: 286304	2016-11-08 22:24:53 +00:00
Adrian Prantl	3502f2089c	Emit the DW_AT_type for a C++ static member definition if it is more specific than the one in its DW_AT_specification. If a static member is an array, the translation unit containing the member definition may have a more specific type (including its length) than TUs only seeing the class declaration. This patch adds a DW_AT_type to the member's DW_TAG_variable in addition to the DW_AT_specification in these cases. The member type in the DW_AT_specification still shows the more generic type (without the length) to avoid defeating type uniquing. The DWARF standard discourages “duplicating” a DW_AT_type in a member variable definition but doesn’t explicitly forbid it. Having the more specific type (with the array length) available is what allows the debugger to print the contents of a static array member variable. https://reviews.llvm.org/D26368 rdar://problem/28706946 llvm-svn: 286302	2016-11-08 22:11:38 +00:00
David L. Jones	e09ae201f2	GlobalISel: make sure debugging variables are appropriately elided in release builds. Summary: There are two variables here that break. This change constrains both of them to debug builds (via DEBUG() or #ifndef NDEBUG). Reviewers: bkramer, t.p.northover Subscribers: mehdi_amini, vkalintiris Differential Revision: https://reviews.llvm.org/D26421 llvm-svn: 286300	2016-11-08 22:03:23 +00:00
Teresa Johnson	6955feebf3	[ThinLTO] Prevent exporting of locals used/defined in module level asm Summary: This patch uses the same approach added for inline asm in r285513 to similarly prevent promotion/renaming of locals used or defined in module level asm. All static global values defined in normal IR and used in module level asm should be included on either the llvm.used or llvm.compiler.used global. The former were already being flagged as NoRename in the summary, and I've simply added llvm.compiler.used values to this handling. Module level asm may also contain defs of values. We need to prevent export of any refs to local values defined in module level asm (e.g. a ref in normal IR), since that also requires renaming/promotion of the local. To do that, the summary index builder looks at all values in the module level asm string that are not marked Weak or Global, which is exactly the set of locals that are defined. A summary is created for each of these local defs and flagged as NoRename. This required adding handling to the BitcodeWriter to look at GV declarations to see if they have a summary (rather than skipping them all). Finally, added an assert to IRObjectFile::CollectAsmUndefinedRefs to ensure that an MCAsmParser is available, otherwise the module asm parse would silently fail. Initialized the asm parser in the opt tool for use in testing this fix. Fixes PR30610. Reviewers: mehdi_amini Subscribers: johanengelen, krasin, llvm-commits Differential Revision: https://reviews.llvm.org/D26146 llvm-svn: 286297	2016-11-08 21:53:35 +00:00
Kuba Brecka	a49dcbb743	[asan] Speed up compilation of large C++ stringmaps (tons of allocas) with ASan This addresses PR30746, <https://llvm.org/bugs/show_bug.cgi?id=30746>. The ASan pass iterates over entry-block instructions and checks each alloca whether it's in NonInstrumentedStaticAllocaVec, which is apparently slow. This patch gathers the instructions to move during visitAllocaInst. Differential Revision: https://reviews.llvm.org/D26380 llvm-svn: 286296	2016-11-08 21:30:41 +00:00
Andrew Kaylor	9604f34996	[BasicAA] Teach BasicAA to handle the inaccessiblememonly and inaccessiblemem_or_argmemonly attributes Differential Revision: https://reviews.llvm.org/D26382 llvm-svn: 286294	2016-11-08 21:07:42 +00:00
Matthias Braun	c53cbbb1d1	AArch64DeadRegisterDefinitionsPass: Fix Changed flag Fix a bug in the calculation of the changed flag introduced in r285488. llvm-svn: 286293	2016-11-08 20:59:03 +00:00
Sanjoy Das	2582e690b7	[TBAA] Drop support for "old style" scalar TBAA tags Summary: We've had support for auto upgrading old style scalar TBAA access metadata tags into the "new" struct path aware TBAA metadata for 3 years now. The only way to actually generate old style TBAA was explicitly through the IRBuilder API. I think this is a good time for dropping support for old style scalar TBAA. I'm not removing support for textual or bitcode upgrade -- if you have IR with the old style scalar TBAA tags that go through the AsmParser orf the bitcode parser before LLVM sees them, they will keep working as usual. Note: %val = load i32, i32* %ptr, !tbaa !N !N = < scalar tbaa node > is equivalent to %val = load i32, i32* %ptr, !tbaa !M !N = < scalar tbaa node > !M = !{!N, !N, 0} Reviewers: manmanren, chandlerc, sunfish Subscribers: mcrosier, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D26229 llvm-svn: 286291	2016-11-08 20:46:01 +00:00
Tim Northover	6cddfc14f9	GlobalISel: allow CodeGen to fallback on VReg type/class issues. After instruction selection we perform some checks on each VReg just before discarding the type information. These checks were assertions before, but that breaks the fallback path so this patch moves the logic into the main flow and reports a better error on failure. llvm-svn: 286289	2016-11-08 20:39:03 +00:00
Ulrich Weigand	05effca2d8	[SystemZ] Add missing FP extension instructions This completes assembler / disassembler support for all BFP instructions provided by the floating-point extensions facility. The instructions added here are not currently used for codegen. llvm-svn: 286285	2016-11-08 20:18:41 +00:00
Ulrich Weigand	4006e09d1d	[SystemZ] Add program mask and addressing mode instructions Add several instructions that operate on the program mask or the addressing mode. These are not really needed for code generation under Linux, but are provided for completeness for the assembler/disassembler. llvm-svn: 286284	2016-11-08 20:17:02 +00:00
Ulrich Weigand	fffc7110d6	[SystemZ] Model access registers as LLVM registers Add the 16 access registers as LLVM registers. This allows removing a lot of special cases in the assembler and disassembler where we were handling access registers; this can all just use the generic register code now. Also add a bunch of instructions to operate on access registers, for assembler/disassembler use only. No change in code generation intended. llvm-svn: 286283	2016-11-08 20:15:26 +00:00
Davide Italiano	11a871b227	[LoopDistribute] Preserve GlobalsAA also in the new Pass Manager. Differential Revision: https://reviews.llvm.org/D26408 llvm-svn: 286280	2016-11-08 19:52:32 +00:00
Eli Friedman	06025cf6c7	Don't store Twine in a local variable. Fixes post-commit review comment from r286177. llvm-svn: 286275	2016-11-08 19:43:56 +00:00
Dan Gohman	e81021a5cb	[WebAssembly] Convert stackified IMPLICIT_DEF into constant 0. Since IMPLIFIT_DEF instructions are omitted in the output, when the output of an IMPLICIT_DEF instruction is stackified, the resulting register lacks an explicit push, leading to a push/pop mismatch. Fix this by converting such IMPLICIT_DEFs into CONST_I32 0 instructions so that they have explicit pushes. llvm-svn: 286274	2016-11-08 19:40:38 +00:00
Ahmed Bougacha	53a03a28c4	[GlobalISel] Dump all instructions inserted by selector. This is helpful when multiple instructions are inserted. llvm-svn: 286273	2016-11-08 19:27:13 +00:00
Ahmed Bougacha	db273a1272	[GlobalISel] Permit select() to erase. Erasing reverse_iterators is problematic; iterate manually. While there, keep track of the range of inserted instructions. It can miss instructions inserted elsewhere, but those are harder to track. Differential Revision: http://reviews.llvm.org/D22924 llvm-svn: 286272	2016-11-08 19:27:10 +00:00
Davide Italiano	1e77aaca8a	[LibcallsShrinkWrap] This pass doesn't preserve the CFG. For example, it invalidates the domtree, causing assertions in later passes which need dominator infos. Make it preserve GlobalsAA, as suggested by Eli. Differential Revision: https://reviews.llvm.org/D26381 llvm-svn: 286271	2016-11-08 19:18:20 +00:00
Chad Rosier	fbc7b7d154	Fix typo in comment. NFC. llvm-svn: 286270	2016-11-08 19:10:25 +00:00
Ulrich Weigand	3d07d45089	[SystemZ] Always use semantic instruction classes Define a couple of additional semantic classes and use them throughout the .td files to make them more consistent and more easily readable. No functional change. llvm-svn: 286268	2016-11-08 18:37:48 +00:00
Ulrich Weigand	bfcfa0e207	[SystemZ] Refactor InstRR* instruction format patterns This changes the InstRR (and related) patterns to no longer automatically add an "r" at the end of the mnemonic. This makes the .td files more obviously understandable, and also allows using the patterns for those few instructions that do not follow the *r scheme. Also add some more sub-formats of the RRF format class, to match operand names and sequence from the PoP better. No functional change. llvm-svn: 286267	2016-11-08 18:36:31 +00:00
Ulrich Weigand	37bd451a55	[SystemZ] Rename some Inst* instruction format classes Now that we've added instruction format subclasses like InstRIb, it makes sense to rename the old InstRI to InstRIa. Similar for InstRX, InstRXY, InstRS, InstRSY, and InstSS. No functional change. llvm-svn: 286266	2016-11-08 18:32:50 +00:00
Nirav Dave	e833c6c61a	[MC][AArch64] Cleanup end-of-line parsing in AArch64 AsmParser. Reviewers: t.p.northover, rengolin Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D26309 llvm-svn: 286265	2016-11-08 18:31:04 +00:00
Ulrich Weigand	d2148caffc	[SystemZ] Refactor branch and conditional instruction patterns Rework patterns for branches, call & return instructions, compare-and-branch, compare-and-trap, and conditional move instructions. In particular, simplify creation of patterns for the extended opcodes of instructions that take a CC mask. Also, use semantical instruction classes for all the instructions instead of open-coding them in SystemZInstrInfo.td. Adds a couple of the basic branch instructions (that are unused for codegen) for the assembler/disassembler. llvm-svn: 286263	2016-11-08 18:30:50 +00:00
Piotr Padlewski	01659cb9fe	NFC small changes in MemDep llvm-svn: 286260	2016-11-08 18:20:51 +00:00
Wei Mi	b5cf9e53e5	[RegAllocGreedy] Another fix about NewVRegs for last chance recoloring after r281783. About when we should move a vreg from CurrentNewVRegs to NewVRegs, if the vreg in CurrentNewVRegs was added into RecoloringCandidate and was evicted, it shouldn't be added to NewVRegs because its physical register will be restored at the end of tryLastChanceRecoloring after the recoloring failed. If the vreg in CurrentNewVRegs was not in RecoloringCandidate, i.e. it was evicted in selectOrSplitImpl inside tryRecoloringCandidates, its physical register will not be restored even if the recoloring failed. In that case, we need to add the vreg to NewVRegs. Same as r281783, the problem was seen on out-of-tree target and we didn't have a test case that reproduce the problem with in-tree targets. llvm-svn: 286259	2016-11-08 18:19:36 +00:00
Tim Northover	5f7dea85c2	GlobalISel: support selecting fpext/fptrunc instructions on AArch64. llvm-svn: 286253	2016-11-08 17:44:07 +00:00
Anton Korobeynikov	243a4700ce	Fix PR27500: on MSP430 the branch destination offset is measured in words, not bytes. Summary: In addition, the branch instructions will have proper BB destinations, not offsets, like before. Reviewers: asl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23718 llvm-svn: 286252	2016-11-08 17:19:59 +00:00
Chad Rosier	c244349b85	Remove unused include. NFC. llvm-svn: 286250	2016-11-08 16:51:19 +00:00
Dehao Chen	2ca9be330b	Use the last 7 bits to represent the discriminator to fit it in 1 byte ULEB128 (NFC). From experiments, discriminator is rarely greater than 127. Here we enforce it to be no greater than 127 so that it will always fit in 1 byte. llvm-svn: 286245	2016-11-08 16:32:32 +00:00
Simon Pilgrim	778596bf59	[TargetLowering] Fix undef vector element issue with true/false result handling Fixed an issue with vector usage of TargetLowering::isConstTrueVal / TargetLowering::isConstFalseVal boolean result matching. The comment said we shouldn't handle constant splat vectors with undef elements. But the the actual code was returning false if the build vector contained no undef elements.... This patch now ignores the number of undefs (getConstantSplatNode will return null if the build vector is all undefs). The change has also unearthed a couple of missed opportunities in AVX512 comparison code that will need to be addressed. Differential Revision: https://reviews.llvm.org/D26031 llvm-svn: 286238	2016-11-08 15:07:01 +00:00
Pablo Barrio	9f45254138	[JumpThreading] Unfold selects that depend on the same condition Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: rengolin, haicheng, sebpop Subscribers: jojo, jmolloy, llvm-commits Differential Revision: https://reviews.llvm.org/D26391 llvm-svn: 286236	2016-11-08 14:53:30 +00:00
Simon Pilgrim	d02c55204b	[VectorLegalizer] Expansion of CTLZ using CTPOP when possible This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available. This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful. Differential Revision: https://reviews.llvm.org/D25910 llvm-svn: 286233	2016-11-08 14:10:28 +00:00
Roger Ferrer Ibanez	80c0f33c29	[AArch64] Fix incorrect CSEL node created Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64 transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to "a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have the same type which can lead to a wrong CSEL node that crashes later due to nonsensical copies. Differential Revision: https://reviews.llvm.org/D26394 llvm-svn: 286231	2016-11-08 13:34:41 +00:00
Amara Emerson	0b40201e13	Adds the loop end location to the loop metadata. This additional information can be used to improve the locations when generating remarks for loops. Patch by Florian Hahn. Differential Revision: https://reviews.llvm.org/D25763 llvm-svn: 286227	2016-11-08 11:18:59 +00:00
Sylvestre Ledru	3ce346d4ca	Fix memory leaks (coverity issues 1365586 & 1365591) Reviewers: hfinkel Subscribers: george.burgess.iv, malcolm.parsons, boris.ulasevich, llvm-commits Differential Revision: https://reviews.llvm.org/D26347 llvm-svn: 286223	2016-11-08 10:00:45 +00:00
Peter Collingbourne	e2dcf7c3a1	IR, Bitcode: Change bitcode reader to no longer own its memory buffer. Unique ownership is just one possible ownership pattern for the memory buffer underlying the bitcode reader. In practice, as this patch shows, ownership can often reside at a higher level. With the upcoming change to allow multiple modules in a single bitcode file, it will no longer be appropriate for modules to generally have unique ownership of their memory buffer. The C API exposes the ownership relation via the LLVMGetBitcodeModuleInContext and LLVMGetBitcodeModuleInContext2 functions, so we still need some way for the module to own the memory buffer. This patch does so by adding an owned memory buffer field to Module, and using it in a few other places where it is convenient. Differential Revision: https://reviews.llvm.org/D26384 llvm-svn: 286214	2016-11-08 06:03:43 +00:00
Peter Collingbourne	77c89b6958	Bitcode: Decouple block info block state from reader. As proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106630.html Move block info block state to a new class, BitstreamBlockInfo. Clients may set the block info for a particular cursor with the BitstreamCursor::setBlockInfo() method. At this point BitstreamReader is not much more than a container for an ArrayRef<uint8_t>, so remove it and replace all uses with direct uses of memory buffers. Differential Revision: https://reviews.llvm.org/D26259 llvm-svn: 286207	2016-11-08 04:17:11 +00:00
Peter Collingbourne	939c7d916e	Bitcode: Split out block info reading into a separate function. We're about to make this more complicated. llvm-svn: 286206	2016-11-08 04:16:57 +00:00
George Burgess IV	63b06e185c	Add a missing break statement. NFC. llvm-svn: 286203	2016-11-08 04:01:50 +00:00
Tim Northover	60f2349b50	GlobalISel: improve error diagnostics when IRTranslation fails. llvm-svn: 286190	2016-11-08 01:12:17 +00:00
Tim Northover	9ac0eba672	GlobalISel: support selecting G_SELECT on AArch64. llvm-svn: 286185	2016-11-08 00:45:29 +00:00
Tim Northover	7d88da6a46	GlobalISel: constrain PHI registers on AArch64. Self-referencing PHI nodes need their destination operands to be constrained because nothing else is likely to do so. For now we just pick a register class naively. Patch mostly by Ahmed again. llvm-svn: 286183	2016-11-08 00:34:06 +00:00
Eli Friedman	8649fc053b	[LTO] Add error message on IO error in compileOptimizedToFile. (No testcase because it's difficult to force an error here.) Differential Revision: https://reviews.llvm.org/D26371 llvm-svn: 286177	2016-11-07 23:43:07 +00:00
Stanislav Mekhanoshin	92e01ee90b	[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies Codegen prepare sinks comparisons close to a user is we have only one register for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions. Changed BE to report we have many condition registers. That way IR LICM pass would hoist an invariant comparison out of a loop and codegen prepare will not sink it. With that done a condition is calculated in one block and used in another. Current behavior is to store workitem's condition in a VGPR using v_cndmask and then restore it with yet another v_cmp instruction from that v_cndmask's result. To mitigate the issue a forward propagation of a v_cmp 64 bit result to an user is implemented. Additional side effect of this is that we may consume less VGPRs in a cost of more SGPRs in case if holding of multiple conditions is needed, and that is a clear win in most cases. llvm-svn: 286171	2016-11-07 23:04:50 +00:00
Adam Nemet	b103fc52d3	[OptDiag, opt-viewer] Save callee's location and display as link With this we get a new field in the YAML record if the value being streamed out has a debug location. For examples, please see the changes to the tests. This is then used in opt-viewer to display a link for the callee function in the inlining remarks. Differential Revision: https://reviews.llvm.org/D26366 llvm-svn: 286169	2016-11-07 22:41:13 +00:00
Sanjin Sijaric	6f020d91a1	[AArch64] Transfer memory operands when lowering vector load/store intrinsics Summary: Some vector loads and stores generated from AArch64 intrinsics alias each other unnecessarily, preventing better scheduling. We just need to transfer memory operands during lowering. Reviewers: mcrosier, t.p.northover, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26313 llvm-svn: 286168	2016-11-07 22:39:02 +00:00
Sanjoy Das	4aeb080db3	[TRE] Remove dead code Address review by Eli Friedman on rL286147. llvm-svn: 286165	2016-11-07 22:17:37 +00:00
Derek Schuff	0d41b7b3f3	[WebAssembly] Emit a BasePointer when we have overly-aligned stack objects Because we shift the stack pointer by an unknown amount, we need an additional pointer. In the case where we have variable-size objects as well, we can't reuse the frame pointer, thus three pointers. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D26263 llvm-svn: 286160	2016-11-07 22:00:48 +00:00
Dehao Chen	d74e1e161d	Reset debug loc to OldInduction in InnerLoopVectorizer::createInductionVariable. (NFC) This is to prevent SetInsertionPoint from setting debug loc to Latch->getTerminator(). llvm-svn: 286159	2016-11-07 21:59:40 +00:00
Sanjoy Das	e06ef141fc	Avoid tail recursion elimination across calls with operand bundles Summary: In some specific scenarios with well understood operand bundle types (like `"deopt"`) it may be possible to go ahead and convert recursion to iteration, but TailRecursionElimination does not have that logic today so avoid doing the right thing for now. I need some input on whether `"funclet"` operand bundles should also block tail recursion elimination. If not, I'll allow TRE across calls with `"funclet"` operand bundles and add a test case. Reviewers: rnk, majnemer, nlewycky, ahatanak Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26270 llvm-svn: 286147	2016-11-07 21:01:49 +00:00
Davide Italiano	2b5ba7bae6	[lib/Object] Modernize. NFCI. llvm-svn: 286146	2016-11-07 21:01:42 +00:00
Evgeniy Stepanov	cd729d6236	Use -fsanitize-recover instead of -mllvm -msan-keep-going. Summary: Use -fsanitize-recover instead of -mllvm -msan-keep-going. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26352 llvm-svn: 286145	2016-11-07 21:00:10 +00:00
Davide Italiano	5df6066ec1	[AArch64] Remove dead store. Found by gcc7. llvm-svn: 286137	2016-11-07 19:11:25 +00:00
Kuba Brecka	44e875ad5b	[tsan] Cast floating-point types correctly when instrumenting atomic accesses, LLVM part Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this. Differential Revision: https://reviews.llvm.org/D26266 llvm-svn: 286135	2016-11-07 19:09:56 +00:00
Matt Arsenault	f530e8b3f0	AMDGPU: Remove unnecessary and on conditional branch The comment explaining why this was necessary is incorrect in its description of v_cmp's behavior for inactive workitems. llvm-svn: 286134	2016-11-07 19:09:33 +00:00
Matt Arsenault	52f14ec596	AMDGPU: Preserve vcc undef flags when inverting branch If the branch was on a read-undef of vcc, passes that used analyzeBranch to invert the branch condition wouldn't preserve the undef flag resulting in a verifier error. Fixes verifier failures in a future commit. Also fix verifier error when inserting copy for vccz corruption bug. llvm-svn: 286133	2016-11-07 19:09:27 +00:00
Benjamin Kramer	1697d39eef	[MemCpyOpt] Don't emit IR in an unspecified order Argument evaluation order is one of the edge cases where Clang differs from GCC, yielding different IR depending on which compiler LLVM was built with. Make the order deterministic and tune the test to actually verify the order instead of trying to hide it. llvm-svn: 286126	2016-11-07 17:47:28 +00:00
Matt Arsenault	2ae5653072	AMDGPU: Try to fix (non-clang?) bot builds llvm-svn: 286120	2016-11-07 16:52:50 +00:00
Richard Smith	857efb0880	Add -O0 support for @llvm.invariant.group.barrier by discarding it if it gets to ISel. Differential Revision: https://reviews.llvm.org/D26292 llvm-svn: 286119	2016-11-07 16:47:20 +00:00
Matt Arsenault	314cbf7a3b	AMDGPU: Refactor copyPhysReg Separate the subregister splitting logic to re-use later. llvm-svn: 286118	2016-11-07 16:39:22 +00:00
Chad Rosier	611b73b1f9	Fix 80-column violations. NFC. llvm-svn: 286117	2016-11-07 16:28:04 +00:00
Sanjay Patel	86408a8048	[InstCombine] allow splat vector folds in adjustMinMax() (retry r285732) This was reverted at r285866 because there was a crash handling a scalar select of vectors. I added a check for that pattern and a test case based on the example provided in the post-commit thread for r285732. llvm-svn: 286113	2016-11-07 15:52:45 +00:00
Jonas Paulsson	4f0509fab3	[SystemZ] Correct the SchedModel regarding vector unit / instructions. * Use a generic vector unit to model the issue unit more accurately. * Update some vector instructions that actually use the vector unit for more than one cycle. Review: Ulrich Weigand llvm-svn: 286112	2016-11-07 15:45:06 +00:00
Amara Emerson	614b44bbe9	This patch adds support for 16 bit floating point registers to the inline asm register selection on AArch64. Without this patch, register allocation for the example below fails. define half @test(half %a1, half %a2) #0 { entry: %0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1 ret half %0 } Patch by Florian Hahn. Differential Revision: https://reviews.llvm.org/D25080 llvm-svn: 286111	2016-11-07 15:42:12 +00:00
Chad Rosier	d6daac4746	[AArch64] Removed the narrow load merging code in the ld/st optimizer. This feature has been disabled for some time now, so remove cruft. Differential Revision: https://reviews.llvm.org/D26248 llvm-svn: 286110	2016-11-07 15:27:22 +00:00
Jonas Paulsson	818431a61a	[SystemZ] Fixes in SchedModels for older subtargets. IssueWidth updated to reflect the capacity of the issue unit correctly. Correct number of FX and LS units modelled (2, was 1). Review: Ulrich Weigand llvm-svn: 286109	2016-11-07 14:47:25 +00:00
Chad Rosier	8f348017b0	[AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory. Differential Revision: https://reviews.llvm.org/D26252 llvm-svn: 286108	2016-11-07 14:11:45 +00:00
James Molloy	b03e0879fc	[Thumb1] Move padding earlier when synthesizing TBBs off of the PC When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding. If there is intervening padding, the calculated offsets will not be correct. One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset. In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4. llvm-svn: 286107	2016-11-07 13:38:21 +00:00
Dylan McKay	c988b334b6	[AVR] Enable the ISel, frame analyzer, and alloca passes llvm-svn: 286095	2016-11-07 06:02:55 +00:00
Craig Topper	b110e04851	[AVX-512] Remove masked pmovzx/pmovsx builtins and autoupgrade them to selects and native zext/sext. This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select. llvm-svn: 286092	2016-11-07 02:12:57 +00:00
Craig Topper	96041c6ba9	[X86] Use StringRef::startswith to reduce a few compares in the intrinsic autoupgrade code. llvm-svn: 286090	2016-11-07 00:13:42 +00:00
Craig Topper	7e545335d6	[AVX-512] Remove 128/256 masked pshufb intrinsics. Autoupgrade them to legacy intrinsics and a select. llvm-svn: 286089	2016-11-07 00:13:39 +00:00
Krzysztof Parzyszek	39d14f3bc3	Reapply r286080 with a phony change in Hexagon's CMakeLists.txt Cmake has not recognized that Hexagon.td has a new dependency in HexagonPatterns.td. All changes to that file were not visible to the build bots. llvm-svn: 286084	2016-11-06 20:55:57 +00:00
Saleem Abdulrasool	804e12eeb5	ARM: lower fpowi appropriately for Windows ARM This handles the last case of the builtin function calls that we would generate code which differed from Microsoft's ABI. Rather than generating a call to `__pow{d,s}i2` we now promote the parameter to a float or double and invoke `powf` or `pow` instead. Addresses PR30825! llvm-svn: 286082	2016-11-06 19:46:54 +00:00
Krzysztof Parzyszek	f8d38d11b9	Revert r286080: it breaks build bots llvm-svn: 286081	2016-11-06 19:36:09 +00:00
Krzysztof Parzyszek	9e3520c884	[Hexagon] Remove redundant custom selection code The clr/set/toggle-bit instructions (with the bit index given as an immediate operand) had both, custom selection code that generated them, and selection patterns at the same time. The selection patterns were not used, because the custom selection code was executed first. This patch removes the custom code in favor of the selection patterns. The custom code handled 64-bit registers as well with an immediate bit index, and so new patterns were added to implement that. It was also the same case for the instruction "Rd += asr(Rs, Rt)", except that the custom code did not offer any additional functionality, and was simply removed. llvm-svn: 286080	2016-11-06 19:03:38 +00:00
Krzysztof Parzyszek	c93815ef04	[Hexagon] Round 5 of selection pattern simplifications Remove unnecessary type casts in patterns. llvm-svn: 286079	2016-11-06 18:13:14 +00:00
Krzysztof Parzyszek	f914278f8b	[Hexagon] Round 4 of selection pattern simplifications Give simpler or more meaningful names to pat frags and xforms. llvm-svn: 286078	2016-11-06 18:09:56 +00:00
Krzysztof Parzyszek	846597d081	[Hexagon] Round 3 of selection pattern simplifications Remove unnecessary C++ functions for SDNode transforms. Move more pat frags to files where they are used. llvm-svn: 286077	2016-11-06 18:05:14 +00:00
Krzysztof Parzyszek	84755104b4	[Hexagon] Round 2 of selection pattern simplifications Add pat frags for any-, sign-, and zero-extensions. llvm-svn: 286076	2016-11-06 17:56:48 +00:00
Simon Pilgrim	39df78e384	[SelectionDAG] Add support for vector demandedelts in XOR opcodes llvm-svn: 286075	2016-11-06 16:49:19 +00:00
Craig Topper	46de41330c	[AVX-512] Remove intrinsics for 128/256-bit masked variable shift. Instead upgrade them to a select and the older AVX2 intrinsic. llvm-svn: 286073	2016-11-06 16:29:19 +00:00
Craig Topper	af9b3fe752	[AVX-512] Remove intrinsics for 128/256-bit masked shift by immediate. Instead upgrade them to a select and the older SSE/AVX2 intrinsic. llvm-svn: 286072	2016-11-06 16:29:14 +00:00
Simon Pilgrim	dd4809a603	[SelectionDAG] Add support for vector demandedelts in OR opcodes llvm-svn: 286071	2016-11-06 16:29:09 +00:00
Craig Topper	c9467ed31e	[AVX-512] Remove intrinsics for 128/256-bit masked shift by single element in xmm. Instead upgrade them to a select and the older SSE/AVX2 intrinsic. llvm-svn: 286070	2016-11-06 16:29:08 +00:00
Simon Pilgrim	b3ad5f7ebf	[X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsElementInsertion. NFCI Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available. llvm-svn: 286067	2016-11-06 14:20:29 +00:00
Benjamin Kramer	f15a8e6a50	[BitcodeWriter] Replace a manual byteswap with read32be. No functional change intended. llvm-svn: 286066	2016-11-06 13:26:39 +00:00
Amaury Sechet	74084c44ca	Kill deprecated attribute API Summary: This kill various depreacated API related to attribute : - The deprecated C API attribute based on LLVMAttribute enum. - The Raw attribute set format (planned to be removed in 4.0). Reviewers: bkramer, echristo, mehdi_amini, void Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23039 llvm-svn: 286062	2016-11-06 07:48:46 +00:00
Tim Shen	398f90f024	[APFloat] Make functions that produce APFloaat objects use correct semantics. Summary: Fixes PR30869. In D25977 I meant to change all functions that care about lifetime. I changed constructors, factory functions, but I missed member/free functions that return new instances. This patch changes them. Reviewers: hfinkel, kbarton, echristo, joerg Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D26269 llvm-svn: 286060	2016-11-06 07:38:37 +00:00
Craig Topper	5471fc29e4	[AVX-512] Add missing EVEX version of pattern for (v2f64 (extloadv2f32 addr:)) -> VCVTPS2PDZ128rm llvm-svn: 286059	2016-11-06 04:12:52 +00:00
Craig Topper	1162857ec4	[AVX-512] Lower AVX cvtpd2ps intrinsic to ISD::FP_ROUND so it can use EVEX instruction when available. llvm-svn: 286057	2016-11-06 04:12:46 +00:00
Craig Topper	9a4a3af5dd	[AVX-512] Lower SSE/AVX cvtdq2ps intrinsics directly to ISD::SINT_TO_FP so they can use EVEX instructions when available. llvm-svn: 286056	2016-11-06 04:12:42 +00:00
Krzysztof Parzyszek	2839b29f4b	[Hexagon] Relocate pattern-related bits to proper places llvm-svn: 286049	2016-11-05 21:44:50 +00:00
Krzysztof Parzyszek	4b4012a5c9	[Hexagon] Round 1 of selection pattern simplifications Consistently use register class pat frags instead of spelling out the type and class each time. llvm-svn: 286048	2016-11-05 21:02:54 +00:00
Simon Pilgrim	4a9f210412	[X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsBlend. NFCI Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available. llvm-svn: 286045	2016-11-05 18:31:57 +00:00
Simon Pilgrim	725174694a	[X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsZeroOrAnyExtend. NFCI Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available. llvm-svn: 286044	2016-11-05 18:22:13 +00:00
Simon Pilgrim	9f0afc6ae1	[X86][SSE] Reuse zeroable element mask in SSE4A EXTRQ/INSERTQ vector shuffle lowering. NFCI Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available. llvm-svn: 286043	2016-11-05 18:05:13 +00:00
Simon Pilgrim	3cae21960e	[X86][SSE] Reuse zeroable element mask in PSHUFB vector shuffle lowering. NFCI Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available. llvm-svn: 286042	2016-11-05 17:53:27 +00:00
Simon Pilgrim	64a592d0a2	[X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsInsertPS. NFCI Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available. llvm-svn: 286040	2016-11-05 17:27:48 +00:00
Simon Pilgrim	009befbd88	[X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsBitMask. NFCI Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available. llvm-svn: 286039	2016-11-05 17:12:19 +00:00
Justin Lebar	54b0be048e	[LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any valid int64_t to the set. Summary: SmallSetVector uses DenseSet, but that means we need to reserve some values for the empty and tombstone keys. It seems to me we should have a general way to let us store full-range ints inside of DenseSets, and furthermore that we probably shouldn't silently let you add ints into DenseSets without explicitly promising that they're in range. But that's a battle for another day; for now, just fix this code, since we currently do something Very Bad when compiling ffmpeg. Fixes PR30914. Reviewers: jeremyhu Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D26323 llvm-svn: 286038	2016-11-05 16:47:25 +00:00
Simon Pilgrim	1af0fc1103	[X86][SSE] Reuse zeroable element mask instead of regenerating it. NFCI We are repeatedly calling computeZeroableShuffleElements in many shuffle lowering calls for the same shuffle mask/inputs. This is a first step towards reusing the zeroable result, initially just for lowerVectorShuffleAsShift calls. llvm-svn: 286037	2016-11-05 16:40:20 +00:00
Krzysztof Parzyszek	a8d63dc289	[Hexagon] Split all selection patterns into a separate file This is just the basic separation, without any cleanup. Further changes will follow. llvm-svn: 286036	2016-11-05 15:01:38 +00:00
Simon Pilgrim	1b4e1ac966	Strip trailing whitespace. NFCI. llvm-svn: 286034	2016-11-05 14:43:04 +00:00
Craig Topper	8f354a7f8b	[AVX-512] Use an equality compare instead of StringRef::startswith in a few places in auto upgrade that were looking for the complete intrinsic name anyway. llvm-svn: 286033	2016-11-05 05:35:23 +00:00
Alina Sbirlea	f802c3f975	Correct mprotect page boundries to round up end page. Fixes PR30905. Summary: Update the boundries for mprotect. Patch by Andrew Adams. Fixes PR30905. Reviewers: loladiro, andrew.w.kaylor, chandlerc Subscribers: abadams, llvm-commits Differential Revision: https://reviews.llvm.org/D26312 llvm-svn: 286032	2016-11-05 04:22:15 +00:00
Craig Topper	a2f12e0a75	[X86] Remove broken support for autoupgrading llvm.x86.fma4.* intrinsics to llvm.x86.fma.*. It currently fires an assert if you even try. Looking back, I don't think it ever worked because it only changed the name of the function object, but not the intrinsic ID stored in it. Given that, I think it can be removed since no one has noticed or complained in the past 4 years. llvm-svn: 286031	2016-11-05 04:00:31 +00:00
Krzysztof Parzyszek	b7eb7fc892	[Hexagon] Account for <def,read-undef> when validating moves for predication llvm-svn: 286009	2016-11-04 20:41:03 +00:00
Weiming Zhao	6100118a52	Fix 24560: assembler does not share constant pool for same constants Summary: This patch returns the same label if the CP entry with the same value has been created. Reviewers: eli.friedman, rengolin, jmolloy Subscribers: majnemer, jmolloy, llvm-commits Differential Revision: https://reviews.llvm.org/D25804 llvm-svn: 286006	2016-11-04 19:17:32 +00:00
Zvi Rackover	85bc64c734	[X86] Broadcast from memory intructions aren't unfoldable Broadcast from memory instructions should be treated as moves. They can't be unfolded. Fixes pr30693. llvm-svn: 285998	2016-11-04 15:15:19 +00:00
Tom Stellard	2d2d33f1dc	Revert "AMDGPU: Add VI i16 support" This reverts commit r285939 and r285948. These broke some conformance tests. llvm-svn: 285995	2016-11-04 13:06:34 +00:00
Jonas Paulsson	baeb402014	Comment rewording in MachineScheduler.cpp. Author: A Trick llvm-svn: 285991	2016-11-04 08:31:14 +00:00
Chandler Carruth	d9ef4b6601	Only log the visit of a return instruction if we in fact found a return instruction. This avoids dereferencing null in the debug logging if the instruction was not in fact a return instruction. This potential bug was found by PVS-Studio. This actually fixes the last of the "dereferenced a pointer before checking it for null" reports in the recent PVS-Studio run. However, there are quite a few reports of this nature that I did not do anything to fix because they are pretty glaring false positives. They usually took the form of quite clear correlated checks or a check made in a separate function. I've even added asserts anywhere this correlation wasn't pretty obvious and fundamental to the code. llvm-svn: 285988	2016-11-04 06:59:50 +00:00
Chandler Carruth	0f139b4f71	Hoist check for TLI above all of the attempts to use it (including one of which that is hidden inside a separate function call) and helpfully before building expensive transaction infrastructure. This will avoid crashing when running CGP in a generic mode if we ever managed to hit this case. Note that I spent some time looking at alternatives. CGP is actually used without a TM or TLI in order to do some target-independent testing. Further, all of the neighboring optimization techniques actually have some paths that are effective even in the absence of TLI so this seemed the correct scope at which to check and bypass logic. It still isn't clear that long-term support for missing TM/TLI is the right cost/benefit tradeoff for CGP -- we seem to get relatively little for it and the code is just littered with checks (and assumptions which I suspect are still missing some checks). This at least fixes the potential bug in this code spotted by PVS-Studio, so we've got that going for us. ;] llvm-svn: 285987	2016-11-04 06:54:00 +00:00
Xinliang David Li	f450b88191	Fix typo llvm-svn: 285978	2016-11-04 03:00:52 +00:00
Justin Bogner	2c2c6ac7b5	X86: Move a non-null assert to before the pointer is dereferenced llvm-svn: 285975	2016-11-03 23:55:36 +00:00
Chandler Carruth	651f019297	Sink all of the code relying on the MachO MachineModuleInfo to live behind the test that the MachineModuleInfo analysis was actually available and can be used. While the MachO bits may well be reasonable to assume in the darwin assembly printer, the analysis isn't constructively guaranteed anywhere I could find so it seems safest to avoid crashing here. This issue was found with PVS-Studio. Pretty sure the Clang Static Anaylzer flags similar issues but we've probably never pointed it at this code effectively. llvm-svn: 285972	2016-11-03 23:33:46 +00:00
Weiming Zhao	962eaaea9c	[Cortex-M0] Atomic lowering Summary: ARMv6m supports dmb etc fench instructions but not ldrex/strex etc. So for some atomic load/store, LLVM should inline instructions instead of lowering to __sync_ calls. Reviewers: rengolin, efriedma, t.p.northover, jmolloy Subscribers: efriedma, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D26120 llvm-svn: 285969	2016-11-03 21:49:08 +00:00
Kevin Enderby	7747cb55dc	Add support for the ARM_THREAD_STATE64 and in llvm-objdump for Mach-O files add the printing of the ARM_THREAD_STATE64 in the same format as otool-classic(1) on darwin. To do this the 64-bit ARM general tread state needed to be defined in include/llvm/Support/MachO.h . rdar://28985800 llvm-svn: 285967	2016-11-03 20:51:28 +00:00
Tony Jiang	946242b5d2	NFC - Test commit. Delete an empty line at the end of README.txt file. llvm-svn: 285964	2016-11-03 20:32:21 +00:00
Adrian Prantl	dbfda63695	Add DWARF debug info support for C++11 inline namespaces. This implements the DWARF 5 DW_AT_export_symbols feature: http://dwarfstd.org/ShowIssue.php?issue=141212.1 <rdar://problem/18616046> llvm-svn: 285959	2016-11-03 19:42:02 +00:00
Kostya Serebryany	8a56917492	[libFuzzer] fix -error_exitcode=N, now with a test llvm-svn: 285958	2016-11-03 19:31:18 +00:00
Justin Bogner	f9fb2abb01	PDB: Fix some APIs to avoid use-after-frees The buffer is already owned by the PDBFile for all of these APIs, so don't pass it in separately. llvm-svn: 285953	2016-11-03 18:28:04 +00:00
Tom Stellard	cc34983181	AMDGPU/SI: Re add VIInstructions.td to unbreak bots This file is unused as of r285939, but we need to keep it around for bots that don't do full rebuilds. We should be able to delete this again in a few days. llvm-svn: 285948	2016-11-03 17:56:46 +00:00
Chandler Carruth	5589aa60c7	Remove a redundant condition found by PVS-Studio. Filed http://llvm.org/PR30897 to teach Clang to warn on this kind of stuff. llvm-svn: 285945	2016-11-03 17:42:02 +00:00
Tom Stellard	2b3379cdff	AMDGPU: Add VI i16 support Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 285939	2016-11-03 17:13:50 +00:00
Chandler Carruth	30e0029904	Delete a dead store found by PVS-Studio. Quite sad we still aren't really using aggressive dead code warnings from Clang that we could potentially use to catch this and so many other things. llvm-svn: 285936	2016-11-03 17:01:38 +00:00
Chandler Carruth	fca1ff0da2	Fix a bug found by inspection by PVS-Studio. This condition is trivially always true prior to the change. The comment at the call site makes it clear that we expect all of these to be '=', 'S', or 'I' so fix the code. We have a bug I will update to track the fact that Clang doesn't warn on this: http://llvm.org/PR13101 llvm-svn: 285930	2016-11-03 16:39:25 +00:00
Alexander Timofeev	f867a40bf6	[AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads. hange explores the fact that LDS reads may be reordered even if access the same location. Prior the change, algorithm immediately stops as soon as any memory access encountered between loads that are expected to be merged together. Although, Read-After-Read conflict cannot affect execution correctness. Improves hcBLAS CGEMM manually loop-unrolled kernels performance by 44%. Also improvement expected on any massive sequences of reads from LDS. Differential Revision: https://reviews.llvm.org/D25944 llvm-svn: 285919	2016-11-03 14:37:13 +00:00
Zvi Rackover	a455864fdf	Refactor creation of X86ISD::SETCC nodes to a helper function. NFC. llvm-svn: 285917	2016-11-03 14:25:24 +00:00
Nicolai Haehnle	bea772c6dc	DAGCombiner: fix use-after-free when merging consecutive stores Summary: Have MergeConsecutiveStores explicitly return information about the stores that were merged, so that we can safely determine whether the starting node has been freed. Reviewers: chandlerc, bogner, niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25601 llvm-svn: 285916	2016-11-03 14:25:04 +00:00
James Molloy	e7d97368f2	Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently" This reverts commit r285893. It caused (probably) http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/83 . llvm-svn: 285912	2016-11-03 14:08:01 +00:00
James Molloy	b60d8b1987	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently This recommits r281323, which was backed out for two reasons. One, a selfhost failure, and two, it apparently caused Chromium failures. Actually, the latter was a red herring. The log has expired from the former, but I suspect that was a red herring too (actually caused by another problematic patch of mine). Therefore reapplying, and will watch the bots like a hawk. For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 285893	2016-11-03 10:18:20 +00:00
George Rimar	e1924f06d1	[Object/ELF] - Make getSymbol() return Error. That is consistent with other methods around and helps to handle error on a caller side. Differential revision: https://reviews.llvm.org/D26247 llvm-svn: 285886	2016-11-03 08:40:55 +00:00
Craig Topper	7b9cc1474f	[AVX-512] Use 'vnot' instead of 'not' in patterns involving vXi1 vectors. This fixes selection of KANDN instructions and allows us to remove an extra set of patterns for KNOT and KXNOR. Reviewers: delena, igorb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26134 llvm-svn: 285878	2016-11-03 06:04:28 +00:00
Elena Demikhovsky	caaceef4b3	Expandload and Compressstore intrinsics 2 new intrinsics covering AVX-512 compress/expand functionality. This implementation includes syntax, DAG builder, operation lowering and tests. Does not include: handling of illegal data types, codegen prepare pass and the cost model. llvm-svn: 285876	2016-11-03 03:23:55 +00:00
Teresa Johnson	0515fb8d4b	[ThinLTO] Handle distributed backend case when doing renaming Summary: The recent change I made to consult the summary when deciding whether to rename (to handle inline asm) in r285513 broke the distributed build case. In a distributed backend we will only have a portion of the combined index, specifically for imported modules we only have the summaries for any imported definitions. When renaming on import we were asserting because no summary entry was found for a local reference being linked in (def wasn't imported). We only need to consult the summary for a renaming decision for the exporting module. For imports, we would have prevented importing any references to NoRename values already. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26250 llvm-svn: 285871	2016-11-03 01:07:16 +00:00
Greg Bedwell	5fc6f94591	Revert "[InstCombine] allow splat vector folds in adjustMinMax()" This reverts commit r285732. This change introduced a new assertion failure in the following testcase at -O2: typedef short __v8hi __attribute__((__vector_size__(16))); __v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) { __v8hi Result = V1; if (mask & 0x80) Result[0] = V2[0]; return Result; } llvm-svn: 285866	2016-11-02 23:17:05 +00:00
Adrian McCarthy	4333daab1c	Emit S_COMPILE3 record once per TU rather than once per function This has some ripple effects in several tests. llvm-svn: 285862	2016-11-02 21:30:35 +00:00
Kevin Enderby	fbebe1632a	Add the rest of the additional error checks for invalid Mach-O files when the offsets and sizes of an element of the Mach-O file overlaps with another element in the Mach-O file. Some other tests for malformed Mach-O files now run into these checks so their tests were also adjusted. llvm-svn: 285860	2016-11-02 21:08:39 +00:00
Eli Friedman	b6befc3bc4	DCE math library calls with a constant operand. On platforms which use -fmath-errno, math libcalls without any uses require some extra checks to figure out if they are actually dead. Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 . Differential Revision: https://reviews.llvm.org/D25970 llvm-svn: 285857	2016-11-02 20:48:11 +00:00
Krzysztof Parzyszek	ead77016d8	[Hexagon] Remove registers coalesced in expand-condsets from live intervals llvm-svn: 285846	2016-11-02 17:59:54 +00:00
Davide Italiano	6b2bba14a9	[lli/COFF] Set the correct alignment for common symbols Otherwise we set it always to zero, which is not correct, and we assert inside alignTo (Assertion failed: Align != 0u && "Align can't be 0."). Differential Revision: https://reviews.llvm.org/D26173 llvm-svn: 285841	2016-11-02 17:32:19 +00:00
Zachary Turner	7251ede7c5	Add CodeViewRecordIO for reading and writing. Using a pattern similar to that of YamlIO, this allows us to have a single codepath for translating codeview records to and from serialized byte streams. The current patch only hooks this up to the reading of CodeView type records. A subsequent patch will hook it up for writing of CodeView type records, and then a third patch will hook up the reading and writing of CodeView symbols. Differential Revision: https://reviews.llvm.org/D26040 llvm-svn: 285836	2016-11-02 17:05:19 +00:00
Nicolai Haehnle	368972c3b3	AMDGPU: Allow additional implicit operands on MOVRELS instructions Summary: The post-RA scheduler occasionally uses additional implicit operands when the vector implicit operand as a whole is killed, but some subregisters are still live because they are directly referenced later. Unfortunately, this seems incredibly subtle to reproduce. Fixes piglit spec/glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-wr.shader_test and others. Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25656 llvm-svn: 285835	2016-11-02 17:03:11 +00:00
Malcolm Parsons	06ac79c210	Fix Clang-tidy readability-redundant-string-cstr warnings Reviewers: beanz, lattner, jlebar Subscribers: jholewinski, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D26235 llvm-svn: 285832	2016-11-02 16:43:50 +00:00
Nirav Dave	0a392a8e7f	[ARM][MC] Cleanup ARM Target Assembly Parser Summary: Correctly parse end-of-statement tokens and handle preprocessor end-of-line comments in ARM assembly processor. Reviewers: rnk, majnemer Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26152 llvm-svn: 285830	2016-11-02 16:22:51 +00:00
Adrian Prantl	f148d6989c	Improve and cleanup comments in DwarfExpression.h llvm-svn: 285829	2016-11-02 16:20:37 +00:00
Matt Arsenault	44deb7914e	BranchRelaxation: Fix computing indirect branch block size llvm-svn: 285828	2016-11-02 16:18:29 +00:00
Adrian Prantl	54286bda2c	Simplify control flow in the the DWARF expression compiler by refactoring common code into a DwarfExpressionCursor wrapper. llvm-svn: 285827	2016-11-02 16:12:20 +00:00
Adrian Prantl	56527dc804	Emit DW_OP_piece also if the previous value was a constant. This fixes a bug in the DWARF backend. llvm-svn: 285826	2016-11-02 16:12:16 +00:00
Simon Pilgrim	93f2f7fb6c	Use !operator to test if APInt is zero/non-zero. NFCI. Avoids APInt construction and slower comparisons. llvm-svn: 285822	2016-11-02 15:41:15 +00:00
Vasileios Kalintiris	e3bb72ea78	[mips] Always run the MipsOptimizePICCall pass. Summary: Remove this pass from addMachineSSAOptimization() and register it unconditionally in through addPreRegAlloc(). This pass is required for generating correct PIC calls. Reviewers: sdardis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26036 llvm-svn: 285814	2016-11-02 15:11:27 +00:00
Joerg Sonnenberger	bef3621ad0	Create the virtual register for the global base in the intersection of GPRC and GPRC_NOR0 (or the 64bit equivalent) and not just the latter. GPRC_NOR0 contains ZERO as alternative meaning of r0 and is therefore not a true subclass of GPRC. llvm-svn: 285813	2016-11-02 15:00:31 +00:00
Aaron Ballman	3ac3a7efff	Removing a switch statement that contains a default label, but no case labels. Silences an MSVC warning; NFC. llvm-svn: 285806	2016-11-02 13:58:57 +00:00
Joerg Sonnenberger	5e31b3ad93	Simplify. llvm-svn: 285802	2016-11-02 12:45:28 +00:00
Ulrich Weigand	75d2f1b10d	[SystemZ] Fix compiler warnings introduced by r285574 SystemZAsmParser::parseOperand returns a bool, not an enum. llvm-svn: 285800	2016-11-02 11:32:28 +00:00
Kirill Bobyrev	1f1751182e	[llvm] FIx if-clause -Wmisleading-indentation issue. While bootstrapping Clang with recent `gcc 6.2.0` I found a bug related to misleading indentation. I believe, a pair of `{}` was forgotten, especially given the above similar piece of code: ``` if (!RDef \|\| !HII->isPredicable(*RDef)) { Done = coalesceRegisters(RD, RegisterRef(S1)); if (Done) { UpdRegs.insert(RD.Reg); UpdRegs.insert(S1.getReg()); } } ``` Reviewers: kparzysz Differential Revision: https://reviews.llvm.org/D26204 llvm-svn: 285794	2016-11-02 10:00:40 +00:00
Bjorn Pettersson	7424c8ccd1	[Reassociate] Skip analysis of dead code to avoid infinite loop. Summary: It was detected that the reassociate pass could enter an inifite loop when analysing dead code. Simply skipping to analyse basic blocks that are dead avoids such problems (and as a side effect we avoid spending time on optimising dead code). The solution is using the same Reverse Post Order ordering of the basic blocks when doing the optimisations, as when building the precalculated rank map. A nice side-effect of this solution is that we now know that we only try to do optimisations for blocks with ranked instructions. Fixes https://llvm.org/bugs/show_bug.cgi?id=30818 Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini Subscribers: dberlin Differential Revision: https://reviews.llvm.org/D26154 llvm-svn: 285793	2016-11-02 08:55:19 +00:00
Dylan McKay	7549b0a013	[AVR] Add instruction selection lowering code Summary: This adds AVRISelLowering.cpp Reviewers: arsenm, kparzysz Subscribers: llvm-commits, modocache, japaric, wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25034 llvm-svn: 285790	2016-11-02 06:47:40 +00:00
Peter Collingbourne	ff2c2ec6b2	Bitcode: Check file size before reading bitcode header. Should unbreak ocaml binding tests. Also added an llvm-dis test that checks for the same thing. llvm-svn: 285777	2016-11-02 00:39:11 +00:00
Peter Collingbourne	4e76019e34	Support: Remove MemoryObject and DataStreamer interfaces. These interfaces are no longer used. Differential Revision: https://reviews.llvm.org/D26222 llvm-svn: 285774	2016-11-02 00:08:37 +00:00
Peter Collingbourne	028eb5a3f8	Bitcode: Change reader interface to take memory buffers. As proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106595.html This change also fixes an API oddity where BitstreamCursor::Read() would return zero for the first read past the end of the bitstream, but would report_fatal_error for subsequent reads. Now we always report_fatal_error for all reads past the end. Updated clients to check for the end of the bitstream before reading from it. I also needed to add padding to the invalid bitcode tests in test/Bitcode/. This is because the streaming interface was not checking that the file size is a multiple of 4. Differential Revision: https://reviews.llvm.org/D26219 llvm-svn: 285773	2016-11-02 00:08:19 +00:00
Alex Bradbury	6b2cca7f8f	[RISCV] Add bare-bones RISC-V MCTargetDesc This is enough to compile and link but doesn't yet do anything particularly useful. Once an ASM parser and printer are added in the next two patches, the whole thing can be usefully tested. Differential Revision: https://reviews.llvm.org/D23562 llvm-svn: 285770	2016-11-01 23:47:30 +00:00
Alex Bradbury	24d9b13b36	[RISCV 4/10] Add basic RISCV{InstrFormats,InstrInfo,RegisterInfo,}.td For now, only add instruction definitions for basic ALU operations. Our initial target is a working MC layer rather than codegen, so appropriate SelectionDAG patterns will come later. Differential Revision: https://reviews.llvm.org/D23561 llvm-svn: 285769	2016-11-01 23:40:28 +00:00
Matt Arsenault	c507cdb4bc	AMDGPU: Handle CopyToReg in getOperandRegClass llvm-svn: 285768	2016-11-01 23:22:17 +00:00
Matt Arsenault	663ab8c119	AMDGPU: Use brev for materializing SGPR constants This is already done with VGPR immediates and saves 4 bytes. llvm-svn: 285765	2016-11-01 23:14:20 +00:00
Matt Arsenault	3d463193a9	AMDGPU: Default to using scalar mov to materialize immediate This is the conservatively correct way because it's easy to move or replace a scalar immediate. This was incorrect in the case when the register class wasn't known from the static instruction definition, but still needed to be an SGPR. The main example of this is inlineasm has an SGPR constraint. Also start verifying the register classes of inlineasm operands. llvm-svn: 285762	2016-11-01 22:55:07 +00:00
Eric Christopher	690f8e587e	Move the initialization of PreferredLoopExit into runOnMachineFunction to be near the other function specific initializations. llvm-svn: 285758	2016-11-01 22:15:50 +00:00
Sam McCall	2a36eee4ca	Fix uninitialized access in MachineBlockPlacement. Summary: Currently PreferredLoopExit is set only in buildLoopChains, which is never called if there are no MachineLoops. MSan is currently broken by this: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/145/steps/check-llvm%20msan/logs/stdio This is a naive fix to get things green again. iteratee: you may have a better fix. This change will also mean PreferredLoopExit will not carry over if buildCFGChains() is called a second time in runOnMachineFunction, this appears to be the right thing. Reviewers: bkramer, iteratee, echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26069 llvm-svn: 285757	2016-11-01 22:02:14 +00:00
Matt Arsenault	a6319b82ca	AMDGPU: Stop creating unused virtual registers These are only used in the spill to VMEM path. Move them to the one use. llvm-svn: 285756	2016-11-01 21:58:07 +00:00
George Burgess IV	66837aba0a	[MemorySSA] Tighten up types to make our API prettier. NFC. Patch by bryant. Differential Revision: https://reviews.llvm.org/D26126 llvm-svn: 285750	2016-11-01 21:17:46 +00:00
Sanjay Patel	9840cdad4c	[ValueTracking] remove TODO comment; NFC InstCombine should always canonicalize patterns like the one shown in the comment when visiting 'select' insts in adjustMinMax(). Scalars were already handled there, and vector splats are handled after: https://reviews.llvm.org/rL285732 llvm-svn: 285744	2016-11-01 20:43:00 +00:00
Matt Arsenault	2d8c289b4b	AMDGPU: Workaround for instruction size with literals Instructions with a 32-bit base encoding with an optional 32-bit literal encoded after them report their size as 4 for the disassembler. Consider these when computing the MachineInstr size. This fixes problems caused by size estimate consistency in BranchRelaxation. llvm-svn: 285743	2016-11-01 20:42:24 +00:00
Sanjay Patel	c3d89842ad	[InstCombine] allow splat vector folds in adjustMinMax() llvm-svn: 285732	2016-11-01 20:08:02 +00:00
Sanjay Patel	c0339c77ef	[InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons. This transforms %a = shl nuw %x, c1 %b = icmp {ugt\|ule} %a, c0 into %b = icmp {ugt\|ule} %x, (c0 >> c1) z3: (declare-const x (_ BitVec 64)) (declare-const c0 (_ BitVec 64)) (declare-const c1 (_ BitVec 64)) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvugt (bvshl x c1) c0) (bvugt x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvule (bvshl x c1) c0) (bvule x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) Patch by bryant! Differential Revision: https://reviews.llvm.org/D25913 llvm-svn: 285729	2016-11-01 19:19:29 +00:00
Krzysztof Parzyszek	654dc11b79	[Hexagon] Rename operand/predicate names for unshifted integers For example, rename s6Ext to s6_0Ext. The names for shifted integers include the underscore and this will make the naming consistent. It also exposed a few duplicates that were removed. llvm-svn: 285728	2016-11-01 19:02:10 +00:00
Matt Arsenault	cb578f84e0	BranchRelaxation: Expand unconditional branches first It's likely if a conditional branch needs to be expanded, the following unconditional branch will also need expansion. By expanding the unconditional branch first, the conditional branch can be simply inverted to jump over the inserted indirect branch block. If the conditional branch is expanded first, it results in an additional branch. This avoids test regressions in future commits. llvm-svn: 285722	2016-11-01 18:34:00 +00:00
Sanjay Patel	644d7c3b8a	[InstCombine] clean up adjustMinMax(); NFCI 1. Change param names for readability 2. Change pointer param to ref 3. Early exit to reduce indent 4. Change switch to if/else llvm-svn: 285718	2016-11-01 18:15:03 +00:00
Konstantin Zhuravlyov	d971a1123f	[AMDGPU] Check if type transforms to i16 (VI+) when getting AMDGPUISD::FFBH_U32 This will prevent following regression when enabling i16 support (D18049): test/CodeGen/AMDGPU/ctlz.ll test/CodeGen/AMDGPU/ctlz_zero_undef.ll Differential Revision: https://reviews.llvm.org/D25802 llvm-svn: 285716	2016-11-01 17:49:33 +00:00
Sanjay Patel	7ce658388b	[InstCombine] add helper function for adjustMinMax(); NFCI This is just a cut and paste; clean-up and enhancements to follow. llvm-svn: 285715	2016-11-01 17:46:08 +00:00
Alex Bradbury	b2e5472d85	[RISCV] Add stub backend This contains just enough for lib/Target/RISCV to compile. Notably a basic RISCVTargetMachine and RISCVTargetInfo. At this point you can attempt llc -march=riscv32 myinput.ll and will find it fails due to the lack of MCAsmInfo. See http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html for further discussion Differential Revision: https://reviews.llvm.org/D23560 llvm-svn: 285712	2016-11-01 17:27:54 +00:00
Tom Stellard	9677b60288	AMDGPU: Fix buildbots broken by r285704 llvm-svn: 285711	2016-11-01 17:20:03 +00:00
Alex Bradbury	1524f62b97	[RISCV] Add RISC-V ELF defines Add the necessary definitions for RISC-V ELF files, including relocs. Also make necessary trivial change to ELFYaml, llvm-objdump, and llvm-readobj in order to work with RISC-V ELFs. Differential Revision: https://reviews.llvm.org/D23557 llvm-svn: 285708	2016-11-01 16:59:37 +00:00
Alex Bradbury	b6e784a240	[RISCV] Recognise riscv32 and riscv64 in triple parsing code This is the first in a series of 10 initial patches that incrementally add an MC layer for RISC-V to LLVM. See <http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html> for more discussion. Differential Revision: https://reviews.llvm.org/D23557 llvm-svn: 285707	2016-11-01 16:47:54 +00:00
Alex Bradbury	58eba09949	[TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.h As it stands, the OperandMatchResultTy is only included in the generated header if there is custom operand parsing. However, almost all backends make use of MatchOperand_Success and friends from OperandMatchResultTy for e.g. parseRegister. This is a pain when starting an AsmParser for a new backend that doesn't yet have custom operand parsing. Move the enum to MCTargetAsmParser.h. This patch is a prerequisite for D23563 Differential Revision: https://reviews.llvm.org/D23496 llvm-svn: 285705	2016-11-01 16:32:05 +00:00
Tom Stellard	94c21bc088	AMDGPU: Implement expansion of f16 = FP_TO_FP16 f64 I wanted to implement this as a target independent expansion, however when targets say they want to expand FP_TO_FP16 what they actually want is the unsafe math expansion when possible and expansion to a libcall in all other cases. The only way to make this work as a target independent would be to add logic to target's TargetLowering construction to mark theses nodes as Expand when LegalizeDAG can use the unsafe expansion and mark them as LibCall when it cannot. I think this would be possible, but I think it would be too fragile and complex as it would require targets to keep their expansion logic up to date with the code in LegalizeDAG. Reviewers: bogner, ab, t.p.northover, arsenm Subscribers: wdng, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D25999 llvm-svn: 285704	2016-11-01 16:31:48 +00:00
Simon Pilgrim	6dd8fab443	[InstCombine] Folding of shifts by the sum of positive values This patch introduces the combine: (C1 shift (A add C2)) -> ((C1 shift C2) shift A) iff A and C2 are both positive If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift. Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive). Differential Revision: https://reviews.llvm.org/D26000 llvm-svn: 285696	2016-11-01 15:40:30 +00:00
James Molloy	70a3d6df52	[Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables [Reapplying r284580 and r285917 with fix and testing to ensure emitted jump tables for Thumb-1 have 4-byte alignment] The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions. It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size. TBB example: Before: lsls r0, r0, #2 After: add r0, pc adr r1, .LJTI0_0 ldrb r0, [r0, #6] ldr r0, [r0, r1] lsls r0, r0, #1 mov pc, r0 add pc, r0 => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4. The only case that can increase dynamic instruction count is the TBH case: Before: lsls r0, r4, #2 After: lsls r4, r4, #1 adr r1, .LJTI0_0 add r4, pc ldr r0, [r0, r1] ldrh r4, [r4, #6] mov pc, r0 lsls r4, r4, #1 add pc, r4 => 1 more instruction in prologue. Jump table shrunk by a factor of 2. So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!) llvm-svn: 285690	2016-11-01 13:37:41 +00:00
Valery Pykhtin	8a89d3662a	[AMDGPU] Expand vector mulhu/mulhs Differential revision: https://reviews.llvm.org/D26077 llvm-svn: 285684	2016-11-01 10:26:48 +00:00
Nemanja Ivanovic	e70fa63390	[PowerPC] Implement vector shift builtins - llvm portion This patch corresponds to review https://reviews.llvm.org/D26095. Committing on behalf of Tony Jiang. llvm-svn: 285681	2016-11-01 09:42:32 +00:00
Serge Pavlov	6ac8e034f6	Allow resolving response file names relative to including file If a response file included by construct @file itself includes a response file and that file is specified by relative file name, current behavior is to resolve the name relative to the current working directory. The change adds additional flag to ExpandResponseFiles that may be used to resolve nested response file names relative to including file. With the new mode a set of related response files may be kept together and reference each other with short position independent names. Differential Revision: https://reviews.llvm.org/D24917 llvm-svn: 285675	2016-11-01 06:53:29 +00:00
Sanjoy Das	9c729067e9	[TBAA] Use wrapper objects instead of raw getOperand s; NFC This is intended to make the semantic intent clearer. The wrapper objects are now generic to avoid `const_cast` s. Since `const` ness is part of the API of `MDNode::getMostGenericTBAA` (and therefore I can't make things `const` all the way through without some code churn outside TypeBasedAliasAnalysis.cpp), this seemed like the cleanest solution. llvm-svn: 285665	2016-11-01 02:58:30 +00:00
Sanjoy Das	a1a77f3a21	[TBAA] Rename accessors to be more idiomatic; NFC llvm-svn: 285661	2016-11-01 01:21:57 +00:00
Peter Collingbourne	d3a6c70b2d	Bitcode: Simplify BitstreamWriter::EnterBlockInfoBlock() interface. No block info block should need to define local abbreviations, so we can always use a code width of 2. Also change all block info block writers to use EnterBlockInfoBlock. Differential Revision: https://reviews.llvm.org/D26168 llvm-svn: 285660	2016-11-01 01:18:57 +00:00
Matt Arsenault	f3dd863031	AMDGPU: Whitespace fixes llvm-svn: 285659	2016-11-01 00:55:14 +00:00
Sanjay Patel	70c5f02d25	[DAG] disable nsw/nuw for add/sub/mul when simplifying based on demanded bits (PR30841) This bug was exposed by using nsw/nuw for more aggressive folds in: https://reviews.llvm.org/rL284844 The changes mimic the IR demanded bits logic in InstCombiner::SimplifyDemandedUseBits(), but we can't just flip flag bits in the DAG; we have to create a new node that has the bits cleared. This should fix: https://llvm.org/bugs/show_bug.cgi?id=30841 llvm-svn: 285656	2016-10-31 23:28:45 +00:00
Davide Italiano	51cbe13a3f	[Hexagon] Garbage collect dead code. llvm-svn: 285654	2016-10-31 22:56:56 +00:00
Evgeniy Stepanov	1bd9fc7098	Fix a typo. Found with PVS-Studio here: http://www.viva64.com/en/b/0446/ llvm-svn: 285652	2016-10-31 22:42:39 +00:00
Saleem Abdulrasool	e1aa782bd0	CodeGen: further loosen -O0 CG for WoA division Generate the slowest possible codepath for noopt CodeGen. Even trying to be clever with the negated jump can cause out-of-range jumps. Use a wide branch instead. Although the code is modelled simplistically, the later optimizations would recombine the branching into `cbz` if possible. This re-enables the previous optimization as well as hopefully gives us working code in all cases. Addresses PR30356! llvm-svn: 285649	2016-10-31 22:12:37 +00:00
Teresa Johnson	002af9bbce	[ThinLTO] Disable importing and other cross-module optis at -O0 Summary: There is no point to importing at -O0, since we won't inline. We should also disable other cross-module optimizations. (Plan to backport this fix to the 3.9 branch to fix PR30774) Reviewers: pcc Subscribers: johanengelen, mehdi_amini Differential Revision: https://reviews.llvm.org/D25918 llvm-svn: 285648	2016-10-31 22:12:21 +00:00
Justin Lebar	ed1e312f05	[NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass. Summary: This has been replaced by the NVPTXInferAddressSpaces pass. We've had the new one as the default with the old one accessible via a flag for some months now, and we've had no problems. Reviewers: tra Subscribers: llvm-commits, jholewinski, jingyue, mgorny Differential Revision: https://reviews.llvm.org/D26165 llvm-svn: 285642	2016-10-31 21:51:42 +00:00
Kevin Enderby	d503940e8f	More additional error checks for invalid Mach-O files when the offsets and sizes of an element of the file overlaps with another element in the Mach-O file. This shows the approach to this testing for three elements and contains for tests for their overlap. Checking for all the remain elements will be added next. llvm-svn: 285632	2016-10-31 20:29:48 +00:00
Nemanja Ivanovic	60bdfe5a7c	[PPC] add absolute difference altivec instructions and matching intrinsics This patch corresponds to review https://reviews.llvm.org/D26072. Committing on behalf of Sean Fertile. llvm-svn: 285627	2016-10-31 19:47:52 +00:00
Victor Leschuk	e1156c2eb0	DebugInfo: make DW_TAG_atomic_type valid DW_TAG_atomic_type was already included in Dwarf.defs and emitted correctly, however Verifier didn't recognize it as valid. Thus we introduce the following changes: * Make DW_TAG_atomic_type valid tag for IR and DWARF (enabled only with -gdwarf-5) * Add it to related docs * Add DebugInfo tests Differential Revision: https://reviews.llvm.org/D26144 llvm-svn: 285624	2016-10-31 19:09:38 +00:00
Kuba Brecka	a28c9e8f09	[asan] Move instrumented null-terminated strings to a special section, LLVM part On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings. Differential Revision: https://reviews.llvm.org/D25026 llvm-svn: 285619	2016-10-31 18:51:58 +00:00
Tim Northover	037af52c8b	GlobalISel: allow truncating pointer casts on AArch64. llvm-svn: 285615	2016-10-31 18:31:09 +00:00
Tim Northover	cdf23f1d93	GlobalISel: translate stack protector intrinsics llvm-svn: 285614	2016-10-31 18:30:59 +00:00
Rui Ueyama	ddc79225c3	Define DbiStreamBuilder::addSectionMap. This change enables LLD to construct a Section Map stream in a PDB file. I do not understand all these fields in the Section Map yet, but it seems like a copy of a COFF section header in another format. With this patch, DbiStreamBuilder can emit a Section Map which llvm-pdbdump can dump. Differential Revision: https://reviews.llvm.org/D26112 llvm-svn: 285606	2016-10-31 17:38:56 +00:00
Greg Clayton	cddab279f6	Modify DWARFFormValue to remember the DWARFUnit that it was decoded with. Modifying DWARFFormValue to remember the DWARFUnit that it was encoded with can simplify the usage of instances of this class. Previously users would have to try and pass in the same DWARFUnit that was used to decode the form value and there was a possibility that a different DWARFUnit might be supplied to the functions that extract values (strings, CU relative references, addresses) and cause problems. This fixes this potential issue by storing the DWARFUnit inside the DWARFFormValue so that this mistake can't be made. Instances of DWARFFormValue are not stored permanently and are used as temporary values, so the increase in size of an instance of DWARFFormValue isn't a big deal. This makes decoding form values more bullet proof and is a change that will be used by future modifications. https://reviews.llvm.org/D26052 llvm-svn: 285594	2016-10-31 16:46:02 +00:00
Michael Zuckerman	68a5c53616	[x86][inline-asm][AVX512][llvm][PART-2] Introducing "k" and "Yk" constraints for extended inline assembly, enabling use of AVX512 masked vectorized instructions. Commit on behalf of mharoush Extending inline assembly support, compatible with GCC as folowing: "k" constraint hints the compiler to select any of AVX512 k0-k7 registers. "Yk" constraint is a subset of "k" excluding k0 which is not allowd to be used as a mask. Reviewer: 1. rnk Differential Revision: https://reviews.llvm.org/D25062 llvm-svn: 285591	2016-10-31 16:19:58 +00:00
Artem Tamazov	54bfd548aa	[AMDGPU][MC][gfx8] Support 20-bit immediate offset in SMEM instructions. Fixes Bug 30808. Note that passing subtarget information to predicates seems too complicated, so gfx8-specific def smrd_offset_20 introduced. Old gfx6/7-specific def renamed to smrd_offset_8 for clarity. Lit tests updated. Differential Revision: https://reviews.llvm.org/D26085 llvm-svn: 285590	2016-10-31 16:07:39 +00:00
Krzysztof Parzyszek	22586dcb2a	[Hexagon] Don't expand mux instructions with both sources identical llvm-svn: 285588	2016-10-31 15:45:09 +00:00
Ulrich Weigand	2e5e51b3f3	[SystemZ] Rework processor feature definitions and add -mcpu=archX support This patch implements two changes: - Move processor feature definition into a new file SystemZFeatures.td, and provide explicit lists of supported and unsupported features for each level of the z/Architecture. This allows specifying unsupported features in the scheduler definition files for each processor. - Add optional aliases for the -mcpu processor names according to the level of the z/Architecture, for compatibility with other compilers on the platform. The supported aliases are: -mcpu=arch8 equals -mcpu=z10 -mcpu=arch9 equals -mcpu=z196 -mcpu=arch10 equals -mcpu=zEC12 -mcpu=arch11 equals -mcpu=z13 llvm-svn: 285577	2016-10-31 14:33:29 +00:00
Ulrich Weigand	d28be373d4	[SystemZ] Guard LEFR/LFER with FeatureVector The LEFR/LFER pseudos are aliases for vector instructions and should therefore be guared by FeatureVector. If they aren't, the TableGen scheduler definition checking might complain that there is no data for those pseudos for pre-z13 machines. No functional change intended. llvm-svn: 285576	2016-10-31 14:28:43 +00:00
Ulrich Weigand	d9001301d9	[SystemZ] Correctly diagnose missing features in AsmParser Currently, when using an instruction that is not supported on the currently selected architecture, the LLVM assembler is likely to diagnose an "invalid operand" instead of a "missing feature". This is because many operands require a custom parser in order to be processed correctly, and if an instruction is not available according to the current feature set, the generated parser code will also not detect the associated custom operand parsers. Fixed by temporarily enabling all features while parsing operands. The missing features will then be correctly detected when actually parsing the instruction itself. llvm-svn: 285575	2016-10-31 14:25:05 +00:00
Ulrich Weigand	ec5d779eb8	[SystemZ] Fix encoding of MVCK and .insn ss LLVM currently treats the first operand of MVCK as if it were a regular base+index+displacement address. However, it is in fact a base+displacement combined with a length register field. While the two might look syntactically similar, there are two semantic differences: - %r0 is a valid length register, even though it cannot be used as an index register. - In an expression with just a single register like 0(%rX), the register is treated as base with normal addresses, while it is treated as the length register (with an empty base) for MVCK. Fixed by adding a new operand parser class BDRAddr and reworking the assembler parser to distinguish between address + length register operands and regular addresses. llvm-svn: 285574	2016-10-31 14:21:36 +00:00
Dorit Nuzman	bf2c15b5dc	Second attempt at r285517. llvm-svn: 285568	2016-10-31 13:17:31 +00:00
Jonas Paulsson	6788ddeac9	[SystemZ] Model 2 VBU units (not 1) in SystemZScheduleZ13.td. NFC. Review: Ulrich Weigand. llvm-svn: 285566	2016-10-31 13:05:48 +00:00
Alexey Bataev	d07c731d86	Improved cost model for FDIV and FSQRT, by Andrew Tischenko There is a bug describing poor cost model for floating point operations: Bug 29083 - [X86][SSE] Improve costs for floating point operations. This patch is the second one in series of patches dealing with cost model. Differential Revision: https://reviews.llvm.org/D25722 llvm-svn: 285564	2016-10-31 12:10:53 +00:00
Craig Topper	d4e580705d	[AVX-512] Add missing patterns for selecting masked vector extracts that started from shuffles. llvm-svn: 285546	2016-10-31 05:55:57 +00:00
Sanjoy Das	1707869db5	[SCEV] Try to order n-ary expressions in CompareValueComplexity llvm-svn: 285535	2016-10-31 03:32:43 +00:00
Sanjoy Das	299e67291c	[SCEV] In CompareValueComplexity, order global values by their name llvm-svn: 285529	2016-10-30 23:52:56 +00:00
Sanjoy Das	b4830a84b9	[SCEV] Use auto for consistency with an upcoming change; NFC llvm-svn: 285528	2016-10-30 23:52:53 +00:00
Sanjay Patel	339a51ac13	[DAG] x \| x --> x llvm-svn: 285522	2016-10-30 18:19:35 +00:00
Sanjay Patel	13aee345ca	[DAG] x & x --> x llvm-svn: 285521	2016-10-30 18:13:30 +00:00
Dorit Nuzman	06903d16af	Revert r285517 due to build failures. llvm-svn: 285518	2016-10-30 14:34:57 +00:00
Dorit Nuzman	3c1c658f24	[LoopVectorize] Make interleaved-accesses analysis less conservative about possible pointer-wrap-around concerns, in some cases. Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold. Differential Revision: https://reviews.llvm.org/D25276 llvm-svn: 285517	2016-10-30 12:23:26 +00:00
Craig Topper	b7781a95fd	[X86] Use intrinsics table for PMADDUBSW and PMADDWD so that we can use the legacy intrinsics to select EVEX encoded instructions when available. This removes a couple tablegen classes that become unused after this change. Another class gained an additional parameter to allow PMADDUBSW to specify a different result type from its input type. llvm-svn: 285515	2016-10-30 06:56:16 +00:00
Teresa Johnson	bf28c8fa45	[ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm Summary: Instead of using the workaround of suppressing the entire index for modules that call inline asm that may reference locals, use the NoRename flag on the summary for any locals in the llvm.used set, and add a reference edge from any functions containing inline asm. This avoids issues from having no summaries despite the module defining global values, which was preventing more aggressive index-based optimization. It will be followed by a subsequent patch to make a similar fix for local references in module level asm (to fix PR30610). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26121 llvm-svn: 285513	2016-10-30 05:40:44 +00:00
Teresa Johnson	3bc8abdffc	[ThinLTO] Correctly resolve linkonce when importing aliasee Summary: When we have an aliasee that is linkonce, while we can't convert the non-prevailing copies to available_externally, we still need to convert the prevailing copy to weak. If a reference to the aliasee is exported, not converting a copy to weak will result in undefined references when the linkonce is removed in its original module. Add a new test and update existing tests. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26076 llvm-svn: 285512	2016-10-30 05:15:23 +00:00
Craig Topper	bf9e5a16a4	[X86] Don't use loadv2i64 on SSE version of PMULHRSW. Use memopv2i64 instead. This bug was introduced in r285501. llvm-svn: 285510	2016-10-30 00:02:55 +00:00
NAKAMURA Takumi	ff76cfefc0	NativeFormatting.cpp: Fix build for mingw. Where would writePadding() be? llvm-svn: 285509	2016-10-29 23:14:18 +00:00
Teresa Johnson	38d4df714c	[ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC) Rename as suggested in code review for D26063. llvm-svn: 285508	2016-10-29 21:52:23 +00:00
Teresa Johnson	1b9c2be8f4	[ThinLTO] Use NoPromote flag in summary during promotion Summary: Replace the check of whether a GV has a section with the flag check in the summary. This is in preparation for using the NoPromote flag to convey other situations when we can't promote (e.g. locals used in inline asm). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26063 llvm-svn: 285507	2016-10-29 21:31:48 +00:00
Peter Collingbourne	310474f576	IR: Remove a no longer needed assert. This assert was checking for a miscompile in a version of GCC that we no longer support. llvm-svn: 285506	2016-10-29 20:57:12 +00:00
Craig Topper	defe9ffbb5	[X86] Use intrinsics table for VPMULHRSW intrincis so that the legacy intrinsics can select EVEX encoded instructions when available. This requires a minor rename of the instructions due to the use of different tablegen classes and how the names are concatenated. llvm-svn: 285501	2016-10-29 18:41:45 +00:00
Sanjay Patel	36eeb6d6f6	[ValueTracking] recognize more variants of smin/smax Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding smax pattern and commuted variants. The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR. Differential Revision: https://reviews.llvm.org/D26091 llvm-svn: 285499	2016-10-29 16:21:19 +00:00
Sanjay Patel	978f827d12	[InstCombine] re-use bitcasted compare operands in selects (PR28001) These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6 Differential Revision: https://reviews.llvm.org/D25943 llvm-svn: 285495	2016-10-29 15:22:04 +00:00
Simon Pilgrim	75a697a17e	[DAGCombiner] (REAPPLIED) Add vector demanded elements support to computeKnownBits Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used. I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course. DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit. This looked like this had caused compile time regressions on some buildbots (and was reverted in rL285381), but appears to have just been a harmless bystander! Differential Revision: https://reviews.llvm.org/D25691 llvm-svn: 285494	2016-10-29 11:29:39 +00:00
Elena Demikhovsky	519b4ccd70	Fixed FMA + FNEG combine. Masked form of FMA should be omitted in this optimization. Differential Revision: https://reviews.llvm.org/D25984 llvm-svn: 285492	2016-10-29 08:44:46 +00:00
Matt Arsenault	c88ba36eab	AMDGPU: Use 1/2pi inline imm on VI I'm guessing at how it is supposed to be printed llvm-svn: 285490	2016-10-29 04:05:06 +00:00
Matthias Braun	7d78614ae9	AArch64DeadRegisterDefinitionsPass: Cleanup; NFC - Fix doxygen file comment - reduce indentation in loop - Factor out some common subexpressions - Move independent helper function out of class - Fix Changed flag (this is not strictly NFC but a bugfix, but the flag seems ignored anyway) llvm-svn: 285488	2016-10-29 01:03:41 +00:00
Rui Ueyama	77be2403f6	Define calculateDbgStreamSize for consistency. llvm-svn: 285487	2016-10-29 00:56:44 +00:00
Tim Shen	1bab9cfbe5	[APFloat] Remove the redundent function body of uninitialized ctor, which should be done in r285468 llvm-svn: 285486	2016-10-29 00:51:41 +00:00
Zachary Turner	5b2243e884	Resubmit "Add support for advanced number formatting." This resubmits r284436 and r284437, which were reverted in r284462 as they were breaking the AArch64 buildbot. The breakage on AArch64 turned out to be a miscompile which is still not fixed, but is actively tracked at llvm.org/pr30748. This resubmission re-writes the code in a way so as to make the miscompile not happen. llvm-svn: 285483	2016-10-29 00:27:22 +00:00
Davide Italiano	86168b23cf	[DAGCombiner] Fix a crash visiting `AND` nodes. Instead of asserting that the shift count is != 0 we just bail out as it's not profitable trying to optimize a node which will be removed anyway. Differential Revision: https://reviews.llvm.org/D26098 llvm-svn: 285480	2016-10-28 23:55:32 +00:00
Tom Stellard	6695ba0440	AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions Summary: Flat instruction can return out of order, so we need always need to wait for all the outstanding flat operations. Reviewers: tony-tye, arsenm Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D25998 llvm-svn: 285479	2016-10-28 23:53:48 +00:00
Matt Arsenault	4e9c1e3a79	AMDGPU: Fix instruction flags for s_endpgm Set isReturn, remove hasSideEffects. Also remove hasCtrlDep, I'm not really sure what that does. llvm-svn: 285476	2016-10-28 23:00:38 +00:00
Adrian Prantl	3cd37d0aeb	Refactor DW_LNE_* into Dwarf.def llvm-svn: 285475	2016-10-28 22:57:02 +00:00
Adrian Prantl	79deba6446	Refactor DW_LNS_* into Dwarf.def llvm-svn: 285474	2016-10-28 22:56:59 +00:00
Adrian Prantl	8580d3f3d3	Refactor DW_APPLE_PROPERTY_* into Dwarf.def llvm-svn: 285473	2016-10-28 22:56:56 +00:00
Adrian Prantl	44a4461b16	Refactor DW_CFA_* into Dwarf.def llvm-svn: 285472	2016-10-28 22:56:53 +00:00
Adrian Prantl	23865816d5	Refactor all DW_FORM_* constants into Dwarf.def llvm-svn: 285470	2016-10-28 22:56:45 +00:00
Tim Shen	b4991548c8	[APFloat] Fix memory bugs revealed by MSan Reviewers: eugenis, hfinkel, kbarton, iteratee, echristo Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D26102 llvm-svn: 285468	2016-10-28 22:45:33 +00:00
Justin Bogner	db6b6a7f0c	SDAG: Make sure we use an allocatable reg class when we create this vreg As per the discussion on r280783, if constrainRegClass fails we need to call getAllocatableClass like we did before that commit. llvm-svn: 285467	2016-10-28 22:42:54 +00:00
Matt Arsenault	7b6475568d	AMDGPU: Add definitions for scalar store instructions Also add glc bit to the scalar loads since they exist on VI and change the caching behavior. This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it. llvm-svn: 285463	2016-10-28 21:55:15 +00:00
Matt Arsenault	4b6a6cc8e9	AMDGPU: Rename glc operand type While trying to add the glc bit to SMEM instructions on VI with the new refactoring I ran into some kind of shadowing problem for the glc operand when using the pseudoinstruction as a multiclass parameter. Everywhere that currently uses it defines the operand to have the same name as its type, i.e. glc:$glc which works. For some reason now it conflicts, and its up evaluating to the wrong thing. For the real encoding classes, let Inst{16} = !if(ps.has_glc, glc, ?); was not being evaluated and still visible in the Inst initializer in the expanded td file. In other cases I got a a different error about an illegal operand where this was using { 0 } initializer from the bits<1> glc initializer instead of evaluating it as false in the if. For consistency all of the operand types should probably be captialized to avoid conflicting with the variable names unless somebody has a better idea of how to fix this. llvm-svn: 285462	2016-10-28 21:55:08 +00:00
Justin Lebar	f0a80ba385	[NVPTX] Compute 'rem' using the result of 'div', if possible. Summary: In isel, transform Num % Den into Num - (Num / Den) * Den if the result of Num / Den is already available. Reviewers: tra Subscribers: hfinkel, llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D26090 llvm-svn: 285461	2016-10-28 21:44:00 +00:00

... 9 10 11 12 13 ...

97308 Commits