llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kuperstein	c5edcdeb0e	[LV] Use vector phis for some secondary induction variables Previously, we materialized secondary vector IVs from the primary scalar IV, by offsetting the primary to match the correct start value, and then broadcasting it - inside the loop body. Instead, we can use a real vector IV, like we do for the primary. This enables using vector IVs for secondary integer IVs whose type matches the type of the primary. Differential Revision: http://reviews.llvm.org/D20932 llvm-svn: 272283	2016-06-09 18:03:15 +00:00
Xinliang David Li	ecde1c7f3d	Revert r272194 No need for it if loop Analysis Manager is used llvm-svn: 272243	2016-06-09 03:22:39 +00:00
Teresa Johnson	7ab1f69272	[ThinLTO/gold] Enable summary-based internalization Summary: Enable existing summary-based importing support in the gold-plugin. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21080 llvm-svn: 272239	2016-06-09 01:14:13 +00:00
Michael Zolotukhin	8e7e76729d	[LoopSimplify] Preserve LCSSA when merging exit blocks. Summary: This fixes PR26682. Also add LCSSA as a preserved pass to LoopSimplify, that looks correct to me and allows to write a test for the issue. Reviewers: chandlerc, bogner, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21112 llvm-svn: 272224	2016-06-08 23:13:21 +00:00
Michael Zolotukhin	aa547616d2	[LoopUnroll] Check that DT is available before trying to verify it. llvm-svn: 272221	2016-06-08 22:49:59 +00:00
Michael Zolotukhin	987ab631fa	[SLPVectorizer] Handle GEP with differing constant index types Summary: This fixes PR27617. Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64. The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert. Reviewers: nadav, mzolotukhin Subscribers: JesperAntonsson, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20685 Patch by Jesper Antonsson! llvm-svn: 272206	2016-06-08 21:55:16 +00:00
Davide Italiano	02861d8695	[PM] Add missing caching of GlobalsAA to EarlyCSE. llvm-svn: 272204	2016-06-08 21:31:55 +00:00
Sanjay Patel	3929313811	[InstCombine] move fold of select of add/sub to helper function; NFCI llvm-svn: 272199	2016-06-08 21:10:01 +00:00
Sanjay Patel	384d0f219d	[InstCombine] fix outdated comment, simplify logic; NFCI llvm-svn: 272196	2016-06-08 20:31:52 +00:00
Evgeny Stupachenko	3e2f389a7e	The patch sets the unroll disable pragma when unrolling with a user-specified count has been applied. Summary: Previously SetLoopAlreadyUnrolled() set the disable pragma only if there was some loop metadata. Now it sets the pragma in all cases. This helps to prevent repeated unrolling when -unroll-count=N is given. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D20765 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 272195	2016-06-08 20:21:24 +00:00
Xinliang David Li	572135f717	[PM] Refactor LoopAccessInfo analysis code This is the preparation patch to port the analysis to the new PM Differential Revision: http://reviews.llvm.org/D20560 llvm-svn: 272194	2016-06-08 20:15:37 +00:00
Sanjay Patel	10a2c38d83	[InstCombine] reduce indent; NFC llvm-svn: 272193	2016-06-08 20:09:04 +00:00
Tim Shen	7aa0ad65ce	[MemCpyOpt] Do not exchange llvm.lifetime.start and llvm.memcpy Reviewers: iteratee Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21087 llvm-svn: 272192	2016-06-08 19:42:32 +00:00
Sanjay Patel	916f8a0cdb	[InstCombine] use copyIRFlags() ; NFCI llvm-svn: 272191	2016-06-08 19:33:52 +00:00
Benjamin Kramer	c321e53402	Apply most suggestions of clang-tidy's performance-unnecessary-value-param Avoids unnecessary copies. All changes audited & pass tests with asan. No functional change intended. llvm-svn: 272190	2016-06-08 19:09:22 +00:00
Davide Italiano	2d5ab0a56a	[PM] LoopSimplify. Remove unneeded pass dependencies. NFCI. llvm-svn: 272140	2016-06-08 13:56:59 +00:00
Davide Italiano	d8d83f4773	[PM/SimplifyCFG] Preserve GlobalsAA even if the IR is mutated. llvm-svn: 272139	2016-06-08 13:32:23 +00:00
Benjamin Kramer	46e38f3678	Avoid copies of std::strings and APInt/APFloats where we only read from it As suggested by clang-tidy's performance-unnecessary-copy-initialization. This can easily hit lifetime issues, so I audited every change and ran the tests under asan, which came back clean. llvm-svn: 272126	2016-06-08 10:01:20 +00:00
Davide Italiano	16e96d4b16	[PM] Preserve GlobalsAA for SROA. Differential Revision: http://reviews.llvm.org/D21040 llvm-svn: 272009	2016-06-07 13:21:17 +00:00
Simon Pilgrim	db9893fb90	[InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit). If the shift amount is constant we can sometimes convert these instructions to native shifts: 1 - if all shift amounts are in range then the conversion is trivial. 2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion. 3 - logical shifts just return zero if all elements have out of range shift amounts. In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts. Differential Revision: http://reviews.llvm.org/D19675 llvm-svn: 271996	2016-06-07 10:27:15 +00:00
Simon Pilgrim	91e3ac8293	[InstCombine][SSE] Add MOVMSK constant folding (PR27982) This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions. The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs. Differential Revision: http://reviews.llvm.org/D20998 llvm-svn: 271990	2016-06-07 08:18:35 +00:00
Michael Kuperstein	a0c6ae02a5	[InstCombine] scalarizePHI should not assume the code it sees has been CSE'd scalarizePHI only looked for phis that have exactly two uses - the "latch" use, and an extract. Unfortunately, we can not assume all equivalent extracts are CSE'd, since InstCombine itself may create an extract which is a duplicate of an existing one. This extends it to handle several distinct extracts from the same index. This should fix at least some of the performance regressions from PR27988. Differential Revision: http://reviews.llvm.org/D20983 llvm-svn: 271961	2016-06-06 23:38:33 +00:00
Davide Italiano	fea0a4c5b2	[PM] Preserve the correct set of analyses for GVN. llvm-svn: 271934	2016-06-06 20:01:50 +00:00
Davide Italiano	82c447823b	[GVN] Switch dump() definition over to LLVM_DUMP_METHOD. llvm-svn: 271932	2016-06-06 19:24:27 +00:00
Geoff Berry	43e5160d0e	Reapply [LSR] Create fewer redundant instructions. Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Originally reviewed in http://reviews.llvm.org/D18001 Reviewers: atrick Subscribers: llvm-commits, mzolotukhin, mcrosier Differential Revision: http://reviews.llvm.org/D18480 llvm-svn: 271929	2016-06-06 19:10:46 +00:00
Sanjay Patel	6a333c3ed9	[InstCombine] limit icmp transform to ConstantInt (PR28011) In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check above this to work for any Constant rather than ConstantInt. AFAICT, that part makes sense if we can determine that the shrunken/extended constant remained equal. But it doesn't make sense for this later transform where we assume that the constant DID change. This could assert for a ConstantExpr: https://llvm.org/bugs/show_bug.cgi?id=28011 And it could be wrong for a vector as shown in the added regression test. llvm-svn: 271908	2016-06-06 16:56:57 +00:00
Eli Friedman	ee89505799	LICM: Don't sink stores out of loops that may throw. Summary: This hasn't been caught before because it requires noalias or similarly strong alias analysis to actually reproduce. Fixes http://llvm.org/PR27952 . Reviewers: hfinkel, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20944 llvm-svn: 271858	2016-06-05 22:13:52 +00:00
Sanjoy Das	b7e861a488	Add safety check to InstCombiner::commonIRemTransforms Since FoldOpIntoPhi speculates the binary operation to potentially each of the predecessors of the PHI node (pulling it out of arbitrary control dependence in the process), we can FoldOpIntoPhi only if we know the operation doesn't have UB. This also brings up an interesting profitability question -- the way it is written today, commonIRemTransforms will hoist out work from dynamically dead code into code that will execute at runtime. Perhaps that isn't the best canonicalization? Fixes PR27968. llvm-svn: 271857	2016-06-05 21:17:04 +00:00
Sanjoy Das	4d4339d1e8	[PM] Port IndVarSimplify to the new pass manager Summary: There are some rough corners, since the new pass manager doesn't have (as far as I can tell) LoopSimplify and LCSSA, so I've updated the tests to run them separately in the old pass manager in the lit tests. We also don't have an equivalent for AU.setPreservesCFG() in the new pass manager, so I've left a FIXME. Reviewers: bogner, chandlerc, davide Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20783 llvm-svn: 271846	2016-06-05 18:01:19 +00:00
Sanjoy Das	f90e28d6fd	[IndVars] Remove -liv-reduce It is an off-by-default option that no one seems to use[0], and given that SCEV directly understands the overflow intrinsics there is no real need for it anymore. [0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html llvm-svn: 271845	2016-06-05 18:01:12 +00:00
Sanjay Patel	a6fbc82392	[InstCombine] allow vector icmp bool transforms llvm-svn: 271843	2016-06-05 17:49:45 +00:00
Sanjay Patel	5f0217f42e	fix documentation comments and other clean-ups; NFC llvm-svn: 271839	2016-06-05 16:46:18 +00:00
Xinliang David Li	64dbb295b6	[PM] Port GCOVProfiler pass to the new pass manager llvm-svn: 271823	2016-06-05 05:12:23 +00:00
Xinliang David Li	fb3137c3b3	[PM] code refactoring /NFC llvm-svn: 271822	2016-06-05 03:40:03 +00:00
Sanjay Patel	6f8f47b358	[InstCombine] less 'CI' confusion; NFC Change the name of the ICmpInst to 'ICmp' and the Constant (was a ConstantInt) to 'C', so that it's hopefully clearer that 'CI' refers to CastInst in this context. While we're scrubbing, fix the documentation comment and use 'auto' with 'dyn_cast'. llvm-svn: 271817	2016-06-05 00:12:32 +00:00
David Majnemer	2482e1c017	[SimplifyCFG] Don't kill empty cleanuppads with multiple uses A basic block could contain: %cp = cleanuppad [] cleanupret from %cp unwind to caller This basic block is empty and is thus a candidate for removal. However, there can be other uses of %cp outside of this basic block. This is only possible in unreachable blocks. Make our transform more correct by checking that the pad has a single user before removing the BB. This fixes PR28005. llvm-svn: 271816	2016-06-04 23:50:03 +00:00
Sanjay Patel	ea8a211169	[InstCombine] allow vector constants for cast+icmp fold This is step 1 of unknown towards fixing PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001 llvm-svn: 271810	2016-06-04 22:04:05 +00:00
Sanjay Patel	c774f8c265	clean-up; NFC llvm-svn: 271807	2016-06-04 21:20:44 +00:00
Sanjay Patel	4c204230fc	fix formatting, punctuation; NFC llvm-svn: 271804	2016-06-04 20:39:22 +00:00
Simon Pilgrim	fda22d66fc	[InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614 Requires a minor tweak as llvm.x86.mmx.pmovmskb takes a x86_mmx argument - so we have to be explicit about the implied v8i8 vector type. llvm-svn: 271789	2016-06-04 13:42:46 +00:00
Xinliang David Li	6c44e9e33d	[pgo] extend r271532 to darwin platform llvm-svn: 271746	2016-06-03 23:02:28 +00:00
Derek Bruening	9ef5772154	[esan|wset] Optionally assume intra-cache-line accesses Summary: Adds an option -esan-assume-intra-cache-line which causes esan to assume that a single memory access touches just one cache line, even if it is not aligned, for better performance at a potential accuracy cost. Experiments show that the performance difference can be 2x or more, and accuracy loss is typically negligible, so we turn this on by default. This currently applies just to the working set tool. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20978 llvm-svn: 271743	2016-06-03 22:29:52 +00:00
Derek Bruening	4252a16c35	[esan] Specify which tool via a global variable Summary: Adds a global variable to specify the tool, to support handling early interceptors that invoke instrumented code and require shadow memory to be initialized prior to __esan_init() being invoked. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20973 llvm-svn: 271715	2016-06-03 19:40:37 +00:00
Sanjay Patel	6cf18af1c5	[InstCombine] look through bitcasts to find selects There was concern that creating bitcasts for the simpler potential select pattern: define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) { %a2 = add <2 x i64> %a, %a %sext = sext <4 x i1> %cmp to <4 x i32> %bc = bitcast <4 x i32> %sext to <2 x i64> %and = and <2 x i64> %a2, %bc ret <2 x i64> %and } might lead to worse code for some targets, so this patch is matching the larger patterns seen in the test cases. The motivating example for this patch is this IR produced via SSE intrinsics in C: define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) { %t0 = bitcast <2 x i64> %a to <4 x i32> %t1 = bitcast <2 x i64> %b to <4 x i32> %cmp = icmp sgt <4 x i32> %t0, %t1 %sext = sext <4 x i1> %cmp to <4 x i32> %t2 = bitcast <4 x i32> %sext to <2 x i64> %and = and <2 x i64> %t2, %a %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1> %neg2 = bitcast <4 x i32> %neg to <2 x i64> %and2 = and <2 x i64> %neg2, %b %or = or <2 x i64> %and, %and2 ret <2 x i64> %or } For an AVX target, this is currently: vpcmpgtd %xmm1, %xmm0, %xmm2 vpand %xmm0, %xmm2, %xmm0 vpandn %xmm1, %xmm2, %xmm1 vpor %xmm1, %xmm0, %xmm0 retq With this patch, it becomes: vpmaxsd %xmm1, %xmm0, %xmm0 Differential Revision: http://reviews.llvm.org/D20774 llvm-svn: 271676	2016-06-03 14:42:07 +00:00
Qin Zhao	c14c249343	[esan|cfrag] Instrument GEP instr for struct field access. Summary: Instrument GEP instructions to count the number of struct field address calculations, as an approximation of the number of struct field accesses. Adds test struct_field_count_basic.ll to test the struct field instrumentation. Reviewers: bruening, aizatsky Subscribers: junbuml, zhaoqin, llvm-commits, eugenis, vitalybuka, kcc, bruening Differential Revision: http://reviews.llvm.org/D20892 llvm-svn: 271619	2016-06-03 02:33:04 +00:00
Michael Zolotukhin	585649895f	[LoopUnroll] Set correct thresholds for new recently enabled unrolling heuristic. In r270478, where I enabled the new heuristic I posted testing results, which I got when explicitly passed the thresholds values via CL options. However, setting the CL options init-values is not enough to change the default values of thresholds, so I'm changing them in another place now. llvm-svn: 271615	2016-06-03 00:16:46 +00:00
Davide Italiano	8738363339	[TailRecursionElimination] Refactor/cleanup. In preparation for porting to the new PM. Patch by Jake VanAdrighem! (review mainly by me/Justin) Differential Revision: http://reviews.llvm.org/D20610 llvm-svn: 271607	2016-06-02 23:02:44 +00:00
Manuel Jacob	a485984c0c	[PM] Schedule InstSimplify after late LICM run, to clean up LCSSA nodes. Summary: The module pass pipeline includes a late LICM run after loop unrolling. LCSSA is implicitly run as a pass dependency of LICM. However no cleanup pass was run after this, so the LCSSA nodes ended in the optimized output. Reviewers: hfinkel, mehdi_amini Subscribers: majnemer, bruno, mzolotukhin, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D20606 llvm-svn: 271602	2016-06-02 22:14:26 +00:00
Davide Italiano	6dfdbf1f46	[PM] LoadCombine preserves GlobalsAA, doesn't depend on it. llvm-svn: 271601	2016-06-02 22:05:59 +00:00
Davide Italiano	84e1414522	[PM/LoadCombine] Inline getAnalysisUsage(). NFCI. llvm-svn: 271600	2016-06-02 22:04:43 +00:00
Sanjay Patel	dba8b4c04d	transform obscured FP sign bit ops into a fabs/fneg using TLI hook This is effectively a revert of: http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886) and: http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions. This is intended to resolve the objections raised on the dev list: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html and: https://llvm.org/bugs/show_bug.cgi?id=24886#c4 In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later. Differential Revision: http://reviews.llvm.org/D19391 llvm-svn: 271573	2016-06-02 20:01:37 +00:00
Sanjay Patel	5c0bc02878	[InstCombine] remove guard for generating a vector select This is effectively NFC because we already do this transform after r175380: http://reviews.llvm.org/rL175380 and also via foldBoolSextMaskToSelect(). This change should just make it a bit more efficient to match the pattern. The original guard was added in r95058: http://reviews.llvm.org/rL95058 A sampling of codegen for current in-tree targets shows no problems. This makes sense given that we're already producing the vector selects via the other transforms. llvm-svn: 271554	2016-06-02 18:03:05 +00:00
Qin Zhao	6d3bd6866b	[esan|cfrag] Create the cfrag struct array for the runtime Summary: Fills the cfrag struct variable with an array of struct information variables. Reviewers: aizatsky, bruening Subscribers: bruening, kcc, vitalybuka, eugenis, llvm-commits, zhaoqin Differential Revision: http://reviews.llvm.org/D20661 llvm-svn: 271547	2016-06-02 17:30:47 +00:00
Xinliang David Li	7008ce3f98	[profile] value profiling bug fix -- missing icall targets in profile-use Inline virtual functions have linkonce_odr linkage (emitted in comdat on supporting targets). If the vtable for the class is not emitted in the defining module, the function won't be address taken, thus its address is not recorded. At the mercy of the linker, if the per-func prf_data from this module (in comdat) is picked at link time, we will lose the mapping from function address to its hash val. This leads to missing icall promotion. The second test case (currently disabled) in compiler_rt (r271528): instrprof-icall-prom.test demonstrates the bug. The first profile-use subtest is fine due to linker order difference. With this change, no missing icall targets are found in instrumented clang's raw profile. llvm-svn: 271532	2016-06-02 16:33:41 +00:00
Xinliang David Li	0b29330612	make icall pass name consistent /NFC llvm-svn: 271467	2016-06-02 01:52:05 +00:00
Vitaly Buka	7b8ed4f223	[asan] Rename UAR into UseAfterReturn Summary: To improve readability. PR27453 Reviewers: kcc, eugenis, aizatsky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20761 llvm-svn: 271447	2016-06-02 00:06:42 +00:00
Geoff Berry	b96d3b2dd8	[MemorySSA] Port to new pass manager Add support for the new pass manager to MemorySSA pass. Change MemorySSA to be computed eagerly upon construction. Change MemorySSAWalker to be owned by the MemorySSA object that creates it. Reviewers: dberlin, george.burgess.iv Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19664 llvm-svn: 271432	2016-06-01 21:30:40 +00:00
Michael Kuperstein	3a3c64d23e	[LV] For some IVs, use vector phis instead of widening in the loop body Previously, whenever we needed a vector IV, we would create it on the fly, by splatting the scalar IV and adding a step vector. Instead, we can create a real vector IV. This tends to save a couple of instructions per iteration. This only changes the behavior for the most basic case - integer primary IVs with a constant step. Differential Revision: http://reviews.llvm.org/D20315 llvm-svn: 271410	2016-06-01 17:16:46 +00:00
Peter Collingbourne	382d81cacf	IR: Allow multiple global metadata attachments with the same type. This will be necessary to allow the global merge pass to attach multiple debug info metadata nodes to global variables once we reverse the edge from DIGlobalVariable to GlobalVariable. Differential Revision: http://reviews.llvm.org/D20414 llvm-svn: 271358	2016-06-01 01:17:57 +00:00
Guozhi Wei	b994f4cdbc	[SLP] Pass in correct alignment when querying memory access cost This patch fixes bug https://llvm.org/bugs/show_bug.cgi?id=27897. When querying memory access cost, current SLP always passes in an alignment value of 1 (unaligned), so it gets a very high cost for scalar memory access, and wrongly vectorizes memory loads in the test case. It can be fixed by simply giving the correct alignment. llvm-svn: 271333	2016-05-31 20:41:19 +00:00
Davide Italiano	bdc2971434	[PM] BDCE: Fix caching of analyses. Another chapter in the story. GlobalsAA should be preserved, as well as the CFG. llvm-svn: 271307	2016-05-31 17:53:22 +00:00
Davide Italiano	688616ff74	[PM] ADCE: Fix caching of analyses. When this pass was originally ported, AA wasn't available for the new PM. Now it is, so we can cache properly. llvm-svn: 271303	2016-05-31 17:39:39 +00:00
Erik Eckstein	0c48dd8ca5	Fix a crash in MergeFunctions related to ordering of weak/strong functions The assumption, made in insert() that weak functions are always inserted after strong functions, is only true in the first round of adding functions. In subsequent rounds this is no longer guaranteed , because we might remove a strong function from the tree (because it's modified) and add it later, where an equivalent weak function already exists in the tree. This change removes the assert in insert() and explicitly enforces a weak->strong order. This also removes the need of two separate loops in runOnModule(). llvm-svn: 271299	2016-05-31 17:20:23 +00:00
Qin Zhao	1762eef572	[esan|cfrag] Create the skeleton of cfrag variable for the runtime Summary: Creates a global variable containing preliminary information for the cache-fragmentation tool runtime. Passes a pointer to the variable (null if no variable is created) to the compilation unit init and exit routines in the runtime. Reviewers: aizatsky, bruening Subscribers: filcab, kubabrecka, bruening, kcc, vitalybuka, eugenis, llvm-commits, zhaoqin Differential Revision: http://reviews.llvm.org/D20541 llvm-svn: 271298	2016-05-31 17:14:02 +00:00
Saleem Abdulrasool	d2f705ddf9	X86: permit using SjLj EH on x86 targets as an option This adds support to the backend to actually support SjLj EH as an exception model. This is NOT the default model, and requires explicitly opting into it from the frontend. GCC supports this model and for MinGW can still be enabled via the `--using-sjlj-exceptions` options. Addresses PR27749! llvm-svn: 271244	2016-05-31 01:48:07 +00:00
Craig Topper	8287fd8abd	[X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions. llvm-svn: 271236	2016-05-30 23:15:56 +00:00
Sanjoy Das	3e5ce2b737	[IndVars] Assert that the incoming IR is in LCSSA Since we already assert that the outgoing IR is in LCSSA, it is easy to get misled into thinking that -indvars broke LCSSA if the incoming IR is non-LCSSA. Checking this pre-condition will make such cases break in more obvious ways. Inspired by (but does _not_ fix) PR26682. llvm-svn: 271196	2016-05-30 01:37:39 +00:00
Sanjoy Das	496f274257	[IndVarSimplify] Extract the logic of `-indvars` out into a class; NFC This will be used later to port IndVarSimplify to the new pass manager. llvm-svn: 271190	2016-05-29 21:42:00 +00:00
Benjamin Kramer	728f4448a9	Remove some 'const' specifiers that do nothing but prevent moving the argument. Found by clang-tidy's misc-move-const-arg. While there drop some obsolete c_str() calls. llvm-svn: 271181	2016-05-29 10:46:35 +00:00
Davide Italiano	39893bd41c	[PM] Reassociate: cache analyses more aggressively. While here, add a FIXME for setPreserveCFG(). llvm-svn: 271159	2016-05-29 00:41:17 +00:00
Sanjoy Das	ae09b3cd4c	[IndVars] Eliminate op.with.overflow when possible (re-apply) Summary: If we can prove that an op.with.overflow intrinsic does not overflow, we can get rid of the intrinsic, and replace it with non-wrapping arithmetic. This was first checked in at r265913 but reverted in r265950 because it exposed some issues around how SCEV handled post-inc add recurrences. Those issues have now been fixed. Reviewers: atrick, regehr Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18685 llvm-svn: 271153	2016-05-29 00:36:25 +00:00
Davide Italiano	484b5ab39d	[PM] SCCP should preserve GlobalsAA even if the IR is mutated. llvm-svn: 271149	2016-05-29 00:31:15 +00:00
Simon Pilgrim	9602d678cb	[X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused. Reapplied now that the companion patch (D20684), which removes/auto-upgrades the clang intrinsics, has been committed. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 271131	2016-05-28 18:03:41 +00:00
Mehdi Amini	bcc47419d9	ValueMapper: fix assertion when null-mapping a constant for linking metadata Summary: When RF_NullMapMissingGlobalValues is set, mapValue can return null for GlobalValue. When mapping the operands of a constant that is referenced from metadata, we need to handle this case and actually return null instead of mapping this constant. Reviewers: dexonsmith, rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20713 llvm-svn: 271129	2016-05-28 17:26:03 +00:00
Sean Silva	42cc3422eb	Add a comment about why we need to buffer the attribute changes. llvm-svn: 271097	2016-05-28 04:24:39 +00:00
Sean Silva	8c7e12136c	Small cleanup. Centralize assertion. Clean up max loop. llvm-svn: 271094	2016-05-28 04:19:45 +00:00
Sean Silva	2e8f095b2a	Inline this into its only use. NFC. The name was out of date at this point and it seems simple enough to have in-line. llvm-svn: 271093	2016-05-28 04:19:40 +00:00
Sean Silva	02b9d892c5	Bring back r271090 in a way that doesn't depend on r271089. llvm-svn: 271092	2016-05-28 04:05:36 +00:00
Sean Silva	9dd4b5c51d	Revert r271089 and r271090. It was triggering an msan bot. Revert "[IRPGO] Set the function entry count metadata." This reverts commit r271090. Revert "[IRPGO] Centralize the function attribute inliner hint logic. NFC." This reverts commit r271089. llvm-svn: 271091	2016-05-28 03:56:25 +00:00
Sean Silva	7884633c5b	[IRPGO] Set the function entry count metadata. llvm-svn: 271090	2016-05-28 03:02:54 +00:00
Sean Silva	2a73019f3e	[IRPGO] Centralize the function attribute inliner hint logic. NFC. This keeps the logic in the same function. llvm-svn: 271089	2016-05-28 03:02:50 +00:00
Evgeny Stupachenko	b787522d28	The patch fixes r271071 Summary: unused variables in Release mode: BasicBlock *Header unsigned OrigCount put under DEBUG From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 271076	2016-05-28 00:14:58 +00:00
Xinliang David Li	d38392ecd6	[PM] Port the Sample FDO to new PM (part-2) llvm-svn: 271072	2016-05-27 23:20:16 +00:00
Evgeny Stupachenko	ea2aef4a1d	The patch refactors unroll pass. Summary: Unroll factor (Count) calculations moved to a new function. Early exits on pragma and "-unroll-count" defined factor added. New type of unrolling "Force" introduced (previously used implicitly). New unroll preference "AllowRemainder" introduced and set to "true" by default (should be set to false for architectures that suffer from it). Reviewers: hfinkel, mzolotukhin, zzheng Differential Revision: http://reviews.llvm.org/D19553 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 271071	2016-05-27 23:15:06 +00:00
Vitaly Buka	1e75fa4ad8	[asan] Add option to enable asan-use-after-scope from clang. Clang will have -fsanitize-address-use-after-scope flag. PR27453 Reviewers: kcc, eugenis, aizatsky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20750 llvm-svn: 271067	2016-05-27 22:55:10 +00:00
Xinliang David Li	e897edbd36	[PM] Port the Sample FDO to new PM (part-1) llvm-svn: 271062	2016-05-27 22:30:44 +00:00
Sanjay Patel	74d23ad498	[InstCombine] move and/sext fold to helper function; NFCI We need to enhance the pattern matching on these to look through bitcasts. llvm-svn: 271051	2016-05-27 21:41:29 +00:00
Davide Italiano	88a7892a07	[LCSSA] Simplify. Suggested by Sanjoy. llvm-svn: 271041	2016-05-27 20:25:31 +00:00
Sanjoy Das	6fff9dc932	[GVN] Preserve !range metadata when PRE'ing loads Reviewers: dberlin, reames, george.burgess.iv Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20743 llvm-svn: 271034	2016-05-27 19:03:10 +00:00
Benjamin Kramer	f6f815bf39	Use StringRef::startswith instead of find(...) == 0. It's faster and easier to read. llvm-svn: 271018	2016-05-27 16:54:57 +00:00
Tim Northover	10a1e8b1fe	Vectorizer: track non-fast FP instructions through phis when finding reductions. When we traced through a phi node looking for floating-point reductions, we forgot whether we'd ever seen an instruction without fast-math flags (that would block vectorization). This propagates it through to the end. llvm-svn: 271015	2016-05-27 16:40:27 +00:00
Xinliang David Li	11c849c10b	Reapply r270865 -- previous bot failure is unrelated llvm-svn: 271014	2016-05-27 16:22:03 +00:00
Dehao Chen	80b16d4135	Remove sample profile dependency to instcombine, which is not a analysis pass. Summary: This patch removes dependency from sample profile pass to instcombine pass. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20501 llvm-svn: 271009	2016-05-27 16:14:15 +00:00
Benjamin Kramer	82de7d323d	Apply clang-tidy's misc-move-constructor-init throughout LLVM. No functionality change intended, maybe a tiny performance improvement. llvm-svn: 270997	2016-05-27 14:27:24 +00:00
Igor Laevsky	df9db45c94	[RewriteStatepointsForGC] All constants should have a null base pointer Currently we consider each constant to have itself as a base value, i.e. "base(const) = const". This introduces a couple of problems when we are trying to avoid reporting constants in statepoint live sets: 1. When querying "base( phi(const1, const2) )" we will get "phi(const1, const2)" as a base pointer. Since it's not a constant we will record it in a stack map. However, in practice we don't want this to happen (constants are never relocated). 2. base( phi(const, gc ptr) ) = phi( const, base(gc ptr) ). This particular case imposes a challenge on our runtime - we don't expect to see constant base pointers other than null. These problems can be avoided by treating all constants as if they were derived from a null pointer base. I.e. in the first case we will not include the constant pointer in a stack map at all. In the second case we will get "phi(null, base(gc ptr))" as a base pointer, which is a lot more convenient. Differential Revision: http://reviews.llvm.org/D20584 llvm-svn: 270993	2016-05-27 13:13:59 +00:00
Benjamin Kramer	4fed928f53	Avoid some copies by using const references. clang-tidy's performance-unnecessary-copy-initialization with some manual fixes. No functional changes intended. llvm-svn: 270988	2016-05-27 12:30:51 +00:00
Simon Pilgrim	4642a57fbf	Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) llvm-svn: 270976	2016-05-27 09:02:25 +00:00
Simon Pilgrim	c013e5737b	[X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm) This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused. A companion patch (D20684) removes/auto-upgrade the clang intrinsics. Differential Revision: http://reviews.llvm.org/D20686 llvm-svn: 270973	2016-05-27 08:49:15 +00:00
Pete Cooper	1929b5539a	Form objc_storeStrong in the presence of bitcasts. objc_storeStrong can be formed from a sequence such as %0 = tail call i8* @objc_retain(i8* %p) nounwind %tmp = load i8*, i8** @x, align 8 store i8* %0, i8** @x, align 8 tail call void @objc_release(i8* %tmp) nounwind The code was already looking through bitcasts for most of the values involved, but had missed one case where the pointer operand for the store was a bitcast. Ultimately the pointer for the load and store have to be the same value, after stripping casts. llvm-svn: 270955	2016-05-27 02:13:53 +00:00
Mehdi Amini	9ee054aea8	ValueMapper: fix typo in minor optimization on constant mapping (NFC) If all operands of a constant map to themselves, and the type does not change, we have an early exit as acknowledged in the comment: // Otherwise, we have some other constant to remap. Start by checking to see // if all operands have an identity remapping. However, instead of checking for identity, the code was checking if the operands were mapped to the constant itself, which is rarely true. As a consequence, the coverage report showed that the early exit was never taken. llvm-svn: 270944	2016-05-27 00:32:12 +00:00
Easwaran Raman	5fe04a1d8e	Attach profile summary in IR based instrumentation pass. Differential revision: http://reviews.llvm.org/D20655 llvm-svn: 270933	2016-05-26 22:57:11 +00:00
Michael Zolotukhin	1ecdedad8d	[LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost. Condition might be simplified to a Constant, but it doesn't have to be ConstantInt, so we should dyn_cast, instead of cast. This fixes PR27886. llvm-svn: 270924	2016-05-26 21:42:51 +00:00
David Majnemer	d99068d26d	[MemCpyOpt] Don't perform callslot optimization across may-throw calls An exception could prevent a store from occurring but MemCpyOpt's callslot optimization would fire anyway, causing the store to occur. This fixes PR27849. llvm-svn: 270892	2016-05-26 19:24:24 +00:00
Michael Kuperstein	9a81b62a01	[BBVectorize] Don't vectorize selects with a scalar condition and vector operands. This fixes PR27879. Differential Revision: http://reviews.llvm.org/D20659 llvm-svn: 270888	2016-05-26 18:43:57 +00:00
Xinliang David Li	b02f3b141c	Revert 270865 -- unexplained bot failure on linux/ppcle llvm-svn: 270876	2016-05-26 17:27:22 +00:00
Xinliang David Li	0777a93bee	Use new interface in Triple /NFC llvm-svn: 270865	2016-05-26 16:28:01 +00:00
Chad Rosier	e5819e2732	[InstCombine] Catch more bswap cases missed due to zext and truncs. Fixes PR27824. Differential Revision: http://reviews.llvm.org/D20591. llvm-svn: 270853	2016-05-26 14:58:51 +00:00
John Brawn	3546c2f158	Add auto-exporting of symbols from tools so that plugins work on Windows The problem with plugins on Windows is that when building a plugin DLL it needs to explicitly link against something (an exe or DLL) if it uses symbols from that thing, and that thing must explicitly export those symbols. Also there's a limit of 65535 symbols that can be exported. This means that currently plugins only work on Windows when using BUILD_SHARED_LIBS, and that doesn't work with MSVC. This patch adds an LLVM_EXPORT_SYMBOLS_FOR_PLUGINS option, which when enabled automatically exports from all LLVM tools the symbols that a plugin could want to use so that a plugin can link against a tool directly. Plugins can specify what tool they link against by using PLUGIN_TOOL argument to llvm_add_library. The option can also be enabled on Linux, though there all it should do is restrict the set of symbols that are exported as by default all symbols are exported. This option is currently OFF by default, as while I've verified that it works with MSVC, linux gcc, and cygwin gcc, I haven't tried mingw gcc and I have no idea what will happen on OSX. Also unfortunately we can't turn on LLVM_ENABLE_PLUGINS when the option is ON as bugpoint-passes needs to be loaded by both bugpoint.exe and opt.exe which is incompatible with this approach. Also currently clang plugins don't work with this approach, which will be fixed in future patches. Differential Revision: http://reviews.llvm.org/D18826 llvm-svn: 270839	2016-05-26 11:16:43 +00:00
David Majnemer	474512576e	[MergedLoadStoreMotion] Don't transform across may-throw calls It is unsafe to hoist a load before a function call which may throw, the throw might prevent a pointer dereference. Likewise, it is unsafe to sink a store after a call which may throw. The caller might be able to observe the difference. This fixes PR27858. llvm-svn: 270828	2016-05-26 07:11:09 +00:00
David Majnemer	8cce333abd	[MergedLoadStoreMotion] Small cleanup No functional change is intended. llvm-svn: 270824	2016-05-26 05:43:12 +00:00
Peter Collingbourne	b9aa1f4a03	MemorySSA: Revert r269678 and r268068; replace with special casing in MemorySSA. It turns out that too many passes are relying on alias analysis results for control dependencies. Until we fix that by introducing a more accurate modelling of control dependencies, special case assume in MemorySSA instead. Also introduce tests to ensure we don't regress the FunctionAttrs or LICM passes. Differential Revision: http://reviews.llvm.org/D20658 llvm-svn: 270823	2016-05-26 04:58:46 +00:00
Craig Topper	a423aa4642	[X86] Add the AVX storeu intrinsics to InstCombine and LoopStrengthReduce in the same places that the SSE/SSE2 storeu intrinsics appear. I don't really know how to test this. Just seemed like we should be consistent. llvm-svn: 270819	2016-05-26 04:28:45 +00:00
Sanjoy Das	ee77a4828e	[IRCE] Use C++11 style initializers; NFC llvm-svn: 270815	2016-05-26 01:50:18 +00:00
Peter Collingbourne	ffecb1441b	MemorySSA: Remove argument to createNewAccess function. There is only one caller of MemorySSA::createNewAccess, and it passes true as the IgnoreNonMemory argument. Remove that argument and fold its behavior into createNewAccess. llvm-svn: 270812	2016-05-26 01:19:17 +00:00
Sanjoy Das	a099268e85	[IRCE] Optimize conjunctions of range checks After this change, we do the expected thing for cases like ``` Check0Passed = /* range check IRCE can optimize */ Check1Passed = /* range check IRCE can optimize */ if (!(Check0Passed && Check1Passed)) throw_Exception(); ``` llvm-svn: 270804	2016-05-26 00:09:02 +00:00
Sanjoy Das	8fe8892c2d	[IRCE] Refactor out a parseRangeCheckFromCond; NFC This will later hold more general logic to parse conjunctions of range checks. llvm-svn: 270802	2016-05-26 00:08:24 +00:00
Davide Italiano	1021c68e92	[PM] Port PartiallyInlineLibCalls to the new pass manager. llvm-svn: 270798	2016-05-25 23:38:53 +00:00
Peter Collingbourne	fad596aa81	Move whole-program virtual call optimization pass after function attribute inference in LTO pipeline. As a result of D18634 we no longer infer certain attributes on linkonce_odr functions at compile time, and may only infer them at LTO time. The readnone attribute in particular is required for virtual constant propagation (part of whole-program virtual call optimization) to work correctly. This change moves the whole-program virtual call optimization pass after the function attribute inference passes, and enables the attribute inference passes at opt level 1, so that virtual constant propagation has a chance to work correctly for linkonce_odr functions. Differential Revision: http://reviews.llvm.org/D20643 llvm-svn: 270765	2016-05-25 21:26:14 +00:00
Sanjay Patel	6be09ee827	fix typo; NFC llvm-svn: 270760	2016-05-25 21:03:31 +00:00
Mehdi Amini	cc8c107e6a	ValueMaterializer: rename materializeDeclFor() to materialize() It may materialize a declaration, or a definition. The name could be misleading. This is following a merge of materializeInitFor() into materializeDeclFor(). Differential Revision: http://reviews.llvm.org/D20593 llvm-svn: 270759	2016-05-25 21:03:21 +00:00
Mehdi Amini	53a6672e21	ValueMaterializer: fuse materializeDeclFor and materializeInitFor (NFC) They were originally separated to handle the co-recursion between the ValueMapper and the ValueMaterializer. This recursion does not exist anymore: the ValueMapper now uses a Worklist and the ValueMaterializer is scheduling job on the Worklist. Differential Revision: http://reviews.llvm.org/D20593 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 270758	2016-05-25 21:01:51 +00:00
Davide Italiano	d85ac997b8	[PM] CorrelatedValuePropagation: pass state to function. NFCI. While here, convert the logic of the pass to use static function(s). This is in preparation for porting this pass to the new PM. llvm-svn: 270734	2016-05-25 17:39:54 +00:00
Xinliang David Li	a228608b26	Use new triple API to check if comdat is supported llvm-svn: 270727	2016-05-25 17:17:51 +00:00
Chad Rosier	a00df49dc5	Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC. Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom, so the ordering of the MatchBSwaps and MatchBitReversals arguments are consistent with the function name. llvm-svn: 270715	2016-05-25 16:22:14 +00:00
Teresa Johnson	04c9a2d63d	[ThinLTO] Refactor ODR resolution and internalization (NFC) Move the now index-based ODR resolution and internalization routines out of ThinLTOCodeGenerator.cpp and into either LTO.cpp (index-based analysis) or FunctionImport.cpp (index-driven optimizations). This is to enable usage by other linkers. llvm-svn: 270698	2016-05-25 14:03:11 +00:00
Simon Pilgrim	4298d06d0f	[X86][SSE] Replace (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) lossless conversion intrinsics with generic IR Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead. Differential Revision: http://reviews.llvm.org/D20568 llvm-svn: 270678	2016-05-25 08:59:18 +00:00
Craig Topper	12e322a8cf	[X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time. llvm-svn: 270677	2016-05-25 06:56:32 +00:00
David Majnemer	124bdb7497	[FunctionAttrs] Volatile loads should disable readonly A volatile load has side effects beyond what callers expect readonly to signify. For example, it is not safe to reorder two function calls which each perform a volatile load to the same memory location. llvm-svn: 270671	2016-05-25 05:53:04 +00:00
Davide Italiano	655a145e83	[PM] Port BDCE to the new pass manager. llvm-svn: 270647	2016-05-25 01:57:04 +00:00
Derek Bruening	5662b93985	[esan|wset] EfficiencySanitizer working set tool fastpath Summary: Adds fastpath instrumentation for esan's working set tool. The instrumentation for an intra-cache-line load or store consists of an inlined write to shadow memory bits for the corresponding cache line. Adds a basic test for this instrumentation. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20483 llvm-svn: 270640	2016-05-25 00:17:24 +00:00
Michael Zolotukhin	8f7a242c7b	Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time. This reverts commit r270577. llvm-svn: 270630	2016-05-24 23:00:05 +00:00
Derek Bruening	0b872d9399	[esan] Add calls from the ctor/dtor to the runtime library Summary: Adds createEsanInitToolGV for creating a tool-specific variable passed to the runtime library. Adds dtor "esan.module_dtor" and inserts calls from the dtor to "__esan_exit" in the runtime library. Updates the EfficiencySanitizer test. Patch by Qin Zhao. Reviewers: aizatsky Subscribers: bruening, kcc, vitalybuka, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D20488 llvm-svn: 270627	2016-05-24 22:48:24 +00:00
Sanjoy Das	be99153aca	[GuardWidening] Tighten the interface of the RangeCheck struct; NFC Make `GuardWideningImpl::RangeCheck` into a class and add accessors. llvm-svn: 270611	2016-05-24 20:54:45 +00:00
Xinliang David Li	f4edae6076	[profile] Fix runtime hook linkage bug for COFF Patch by: Johan Engelen the user hook has linkonceODR linkage and it needs to be in comdatAny group. llvm-svn: 270596	2016-05-24 18:47:38 +00:00
Sanjoy Das	5fd7ac452e	[IRCE] Return a Value, not SCEV from parseRangeCheck; NFC This is better layering, since the caller needs to check if the index was an add-rec anyway. llvm-svn: 270582	2016-05-24 17:19:56 +00:00
Sanjay Patel	929ebf5a54	fix typos; NFC llvm-svn: 270579	2016-05-24 16:51:26 +00:00
Hans Wennborg	b64e4390a3	Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling analysis by default." Chromium builds are still hitting the assert in PR27874. llvm-svn: 270577	2016-05-24 16:10:12 +00:00
Michael Zolotukhin	96c150d154	Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."" This reverts commit r270512 and reapplies r270478. Originally it caused PR27847, but it was fixed in r270517. llvm-svn: 270518	2016-05-24 01:22:20 +00:00
Hans Wennborg	6951028b61	Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default." This caused PR27847. llvm-svn: 270512	2016-05-23 23:42:35 +00:00
Sanjoy Das	aa83c47bab	[IRCE] Optimize "uses" not branches; NFCI This changes IRCE to optimize uses, and not branches. This change is NFCI since the uses we do inspect are in practice only ever going to be the condition use in conditional branches; but this flexibility will later allow us to analyze more complex expressions than just a direct branch on a range check. llvm-svn: 270500	2016-05-23 22:16:45 +00:00
Andrew Kaylor	9c81d0fdeb	Avoid including AlwaysInliner pass in opt-bisect search. Differential Revision: http://reviews.llvm.org/D19640 llvm-svn: 270495	2016-05-23 21:57:54 +00:00
Xinliang David Li	e45207608c	tune lowering parameter for small apps (sjeng) llvm-svn: 270480	2016-05-23 19:29:26 +00:00
Gerolf Hoflehner	00e7092f68	[InstCombine] Fix assertion when bitcast is converted to gep When an aggregate contains an opaque type its size cannot be determined. This triggers an "Invalid GetElementPtrInst indices for type" assert in function checkGEPType. The fix suppresses the conversion in this case. http://reviews.llvm.org/D20319 llvm-svn: 270479	2016-05-23 19:23:17 +00:00
Michael Zolotukhin	be080fc51d	[LoopUnroll] Enable advanced unrolling analysis by default. Summary: This patch turns on LoopUnrollAnalyzer by default. To mitigate compile time regressions, I chose very conservative thresholds for now. Later we can make them more aggressive, but it might require being smarter in which loops we're optimizing. E.g. currently the biggest issue is that with more aggressive thresholds we unroll many cold loops, which increases compile time for no performance benefit (performance of those loops is improved, but it doesn't matter since they are cold). Test results for compile time (using 4 samples to reduce noise): ``` MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19% SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect 4.19% MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow 3.39% MultiSource/Applications/JM/lencod/lencod 1.47% MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06% ``` I didn't see any performance changes in the testsuite, but it improves some internal tests. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20482 llvm-svn: 270478	2016-05-23 19:10:19 +00:00
Sanjay Patel	a8ef4a5737	reduce indent; NFC llvm-svn: 270372	2016-05-22 17:08:52 +00:00
Xinliang David Li	b628dd3568	[profile] Static counter allocation for value profiling (part-1) Differential Revision: http://reviews.llvm.org/D20459 llvm-svn: 270336	2016-05-21 22:55:34 +00:00
Chad Rosier	56def258e3	Fix 80-column violation. llvm-svn: 270329	2016-05-21 21:12:06 +00:00
David Majnemer	9f92f4c497	[SimplifyCFG] Remove cleanuppads which are empty except for calls to lifetime.end A cleanuppad is not cheap, they turn into many instructions and result in additional spills and fills. It is not worth keeping a cleanuppad around if all it does is hold a lifetime.end instruction. N.B. We first try to merge the cleanuppad with another cleanuppad to avoid dropping the lifetime and debug info markers. llvm-svn: 270314	2016-05-21 05:12:32 +00:00
Sanjoy Das	c5b1169de2	[IRCE] Don't use an allocator for range checks; NFC The InductiveRangeCheck struct is only five words long; so passing these around value is fine. The allocator makes the code look more complex than it is. llvm-svn: 270309	2016-05-21 02:52:13 +00:00
Sanjoy Das	59776734a3	[IRCE] Don't pass IRBuilder<> where unnecessary; NFC llvm-svn: 270308	2016-05-21 02:31:51 +00:00
Sanjoy Das	be6c7a12cb	[GuardWidening] Fix incorrect use of remove_if I had used `std::remove_if` under the assumption that it moves the predicate matching elements to the end, but actually the elements remaining towards the end (after the iterator returned by `std::remove_if`) are indeterminate. Fix the bug (and make the code more straightforward) by using a temporary SmallVector, and add a test case demonstrating the issue. llvm-svn: 270306	2016-05-21 02:24:44 +00:00
Derek Bruening	bc0a68e688	[esan] Use ModulePass for EfficiencySanitizerPass. Summary: Uses ModulePass instead of FunctionPass for EfficiencySanitizerPass to better support global variable creation for a forthcoming struct field counter tool. Patch by Qin Zhao. Reviewers: aizatsky Subscribers: llvm-commits, eugenis, vitalybuka, bruening, kcc Differential Revision: http://reviews.llvm.org/D20458 llvm-svn: 270263	2016-05-20 20:00:05 +00:00
Mark Lacey	9b5fcf65ec	Functions with differing phis should not be merged. Check that the incoming blocks of phi nodes are identical, and block function merging if they are not. rdar://problem/26255167 Differential Revision: http://reviews.llvm.org/D20462 llvm-svn: 270250	2016-05-20 18:39:11 +00:00
Davide Italiano	f7211fd44d	[PM/PartiallyInlineLibCalls] Fix pass dependencies. Inline getAnalysisUsage() while I'm here. llvm-svn: 270231	2016-05-20 16:23:14 +00:00
Davide Italiano	8749dfd1bf	[PartiallyInlineLibCalls] Remove dead includes. NFC. llvm-svn: 270228	2016-05-20 15:52:23 +00:00
Davide Italiano	08713bd1ed	[PM/PartiallyInlineLibCalls] Convert to static function in preparation for porting this pass to the new PM. llvm-svn: 270225	2016-05-20 15:43:39 +00:00
Sanjay Patel	75892a1543	[SimplifyCFG] eliminate switch cases based on known range of switch condition This was noted in PR24766: https://llvm.org/bugs/show_bug.cgi?id=24766#c2 We may not know whether the sign bit(s) are zero or one, but we can still optimize based on knowing that the sign bit is repeated. Differential Revision: http://reviews.llvm.org/D20275 llvm-svn: 270222	2016-05-20 14:53:09 +00:00
Sanjoy Das	2351975860	Add const qualifiers to appease bots; NFC llvm-svn: 270155	2016-05-19 23:15:59 +00:00
Sanjoy Das	f5f0331a3b	[GuardWidening] Introduce range check merging Sequences of range checks expressed using guards, like guard((I - 2) u< L) guard((I - 1) u< L) guard((I + 0) u< L) guard((I + 1) u< L) guard((I + 2) u< L) can sometimes be combined into a smaller sequence: guard((I - 2) u< L AND (I + 2) u< L) if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of checks expressed in the previous sequence. This change teaches GuardWidening to do this kind of merging when feasible. llvm-svn: 270151	2016-05-19 22:55:46 +00:00
Guozhi Wei	b1d37199cc	[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703. If there is a sequence of one or more load instructions in which each loaded value is used as the address of a later load instruction, the bitcast is necessary to change the value type, so don't optimize it. llvm-svn: 270135	2016-05-19 21:07:01 +00:00
Wei Mi	0456d9dd18	Recommit r255691 since PR26509 has been fixed. llvm-svn: 270113	2016-05-19 20:38:03 +00:00
Davide Italiano	46f249b4cd	[SCCP] Prefer class to struct. llvm-svn: 270074	2016-05-19 15:58:02 +00:00
Vedant Kumar	9152fd17e9	Retry^3 "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC" Transition InstrProf and Coverage over to the stricter Error/Expected interface. Changes since the initial commit: - Fix error message printing in llvm-profdata. - Check errors in loadTestingFormat() + annotateAllFunctions(). - Defer error handling in InstrProfIterator to InstrProfReader. - Remove the base ProfError class to work around an MSVC ICE. Differential Revision: http://reviews.llvm.org/D19901 llvm-svn: 270020	2016-05-19 03:54:45 +00:00
Sanjoy Das	b784ed36c0	[GuardWidening] Use getEquivalentICmp to fold constant compares `ConstantRange::getEquivalentICmp` is more general, and better factored. llvm-svn: 270019	2016-05-19 03:53:17 +00:00
Sanjoy Das	52bbde2bbc	[LowerGuards] Rename variable; NFC PredicatePassProbability is a better name for what LikelyBranchWeight was trying to express. llvm-svn: 269999	2016-05-18 23:16:27 +00:00
Sanjoy Das	083f38939b	New pass: guard widening Summary: Implement guard widening in LLVM. Description from GuardWidening.cpp: The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM transform it so that it fails more often than it did before the transform. This optimization is called "widening" and can be used to hoist and common runtime checks in situations like these: ``` %cmp0 = 7 u< Length call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ] call @unknown_side_effects() %cmp1 = 9 u< Length call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ] ... ``` to ``` %cmp0 = 9 u< Length call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ] call @unknown_side_effects() ... ``` If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back to a generic implementation of the same function, which will have the correct semantics from that point onward. It is always _legal_ to deoptimize (so replacing `%cmp0` with false is "correct"), though it may not always be profitable to do so. NB! This pass is a work in progress. It hasn't been tuned to be "production ready" yet. It is known to have quadratic running time and will not scale to large numbers of guards. Reviewers: reames, atrick, bogner, apilipenko, nlewycky Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20143 llvm-svn: 269997	2016-05-18 22:55:34 +00:00
Dehao Chen	f16376b505	Follow-up patch of http://reviews.llvm.org/D19948 to handle missing profiles when simplifying CFG. Summary: Set default branch weight to 1:1 if one of the branch has profile missing when simplifying CFG. Reviewers: spatel, davidxl Subscribers: danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D20307 llvm-svn: 269995	2016-05-18 22:41:03 +00:00
Michael Zolotukhin	d2268a73bc	[LoopUnrollAnalyzer] Take into account cost of instructions controlling branches, along with their operands. Previously, we didn't add their cost or their operands' cost, which could've resulted in unrolling loops for no actual benefit. llvm-svn: 269985	2016-05-18 21:20:12 +00:00
Dehao Chen	f6c0083b55	clang-format SimplifyCFG.cpp. llvm-svn: 269974	2016-05-18 19:44:21 +00:00
Davide Italiano	98f7e0e790	[PM] Port per-function SCCP to the new pass manager. llvm-svn: 269937	2016-05-18 15:18:25 +00:00
James Molloy	a854c0a0c3	[VectorUtils] Fix nasty use-after-free In truncateToMinimalBitwidths() we were RAUW'ing an instruction then erasing it. However, that instruction could be cached in the map we're iterating over. The first check is "I->use_empty()" which in most cases would return true, as the (deleted) object was RAUW'd first so would have zero use count. However in some cases the object could have been polluted or written over and this wouldn't be the case. Also it makes valgrind, asan and traditionalists who don't like their compiler to crash sad. No testcase as there are no externally visible symptoms apart from a crash if the stars align. Fixes PR26509. llvm-svn: 269908	2016-05-18 11:57:58 +00:00
Justin Bogner	594e07bd78	[PM] Port DSE to the new pass manager Patch by JakeVanAdrighem. Thanks! llvm-svn: 269847	2016-05-17 21:38:13 +00:00
Xinliang David Li	7d0fed74f0	minor cleanup /NFC llvm-svn: 269839	2016-05-17 21:06:16 +00:00
Sanjay Patel	22b01febd4	[InstCombine] add another test for wrong icmp constant (PR27792) It doesn't matter if the comparison is unsigned; the inc/dec is always signed. llvm-svn: 269831	2016-05-17 20:20:40 +00:00
Xinliang David Li	8da773bf74	Simple refactoring /NFC llvm-svn: 269829	2016-05-17 20:19:03 +00:00
Davide Italiano	bfe3801d16	[LCSSA] Use llvm::any_of instead of std::size_of. The API is simpler. Suggested by David Blaikie! llvm-svn: 269800	2016-05-17 19:01:02 +00:00
Sanjay Patel	86564cad06	[InstCombine] fix constant to be signed for signed comparisons This bug was introduced in r269728 and is the likely cause of many stage 2 ubsan bot failures. I'll add a test in a follow-up commit assuming this fixes things properly. llvm-svn: 269797	2016-05-17 18:38:55 +00:00
Sanjoy Das	fd67038c8b	[Guards] Add branch metadata when lowering Guards are expected to basically never fail. Reflect this in the branch probabilities in their lowered form. llvm-svn: 269791	2016-05-17 17:51:19 +00:00
Davide Italiano	a0e0feea1d	[PM/LCSSA] Fix dependency list. Some passes are preserved, not required. llvm-svn: 269768	2016-05-17 14:32:12 +00:00
Davide Italiano	b75b16e2ff	[LCSSA] Use any_of() to simplify the code. NFCI. llvm-svn: 269767	2016-05-17 14:24:41 +00:00
Igor Laevsky	953f2d2a54	[RewriteStatepointsForGC] Remove obsolete assertion This assertion is no longer necessary since we never record constants in the live set anyway. (They are never recorded in the initial live set, and constant bases are removed near line 2119) Differential Revision: http://reviews.llvm.org/D20293 llvm-svn: 269764	2016-05-17 13:54:10 +00:00
Benjamin Kramer	ca9a0fe2b9	[InstCombine] Don't crash when trying to take an element of a ConstantExpr. Fixes PR27786. llvm-svn: 269757	2016-05-17 12:08:55 +00:00
Sanjay Patel	18254935c9	try to avoid unused variable warning in release build; NFCI llvm-svn: 269729	2016-05-17 01:12:31 +00:00
Sanjay Patel	e9b2c32e7f	[InstCombine] check vector elements before trying to transform LE/GE vector icmp (PR27756) Fix a bug introduced with rL269426 : [InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819) We were assuming that a ConstantDataVector / ConstantVector / ConstantAggregateZero operand of an ICMP was composed of ConstantInt elements, but it might have ConstantExpr or UndefValue elements. Handle those appropriately. Also, refactor this function to join the scalar and vector paths and eliminate the switches. Differential Revision: http://reviews.llvm.org/D20289 llvm-svn: 269728	2016-05-17 00:57:57 +00:00
Vedant Kumar	85c973d3f0	Revert "Retry^2 "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC"" This reverts commit r269694. MSVC says: error C2086: 'char llvm::ProfErrorInfoBase<enum llvm::instrprof_error>::ID' : redefinition llvm-svn: 269700	2016-05-16 21:03:38 +00:00
Vedant Kumar	7cb2fd5904	Retry^2 "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC" Transition InstrProf and Coverage over to the stricter Error/Expected interface. Changes since the initial commit: - Address undefined-var-template warning. - Fix error message printing in llvm-profdata. - Check errors in loadTestingFormat() + annotateAllFunctions(). - Defer error handling in InstrProfIterator to InstrProfReader. Differential Revision: http://reviews.llvm.org/D19901 llvm-svn: 269694	2016-05-16 20:49:39 +00:00
Xinliang David Li	f3c7a35238	[PM] Port indirect call promotion pass to new pass manager llvm-svn: 269660	2016-05-16 16:31:07 +00:00
Matthew Simpson	e43198dc4b	[LV] Ensure safe VF for loops with interleaved accesses The selection of the vectorization factor currently doesn't consider interleaved accesses. The vectorization factor is based on the maximum safe dependence distance computed by LAA. However, for loops with interleaved groups, we should instead base the vectorization factor on the maximum safe dependence distance divided by the maximum interleave factor of all the interleaved groups. Interleaved accesses not in a group will be scalarized. Differential Revision: http://reviews.llvm.org/D20241 llvm-svn: 269659	2016-05-16 15:08:20 +00:00
Davide Italiano	6f852eedbf	[PM] RewriterStatepointForGC: add missing dependency. llvm-svn: 269624	2016-05-16 02:29:53 +00:00
Benjamin Kramer	a65b610bd2	Move helper classes into anonymous namespaces. NFC. llvm-svn: 269591	2016-05-15 15:18:11 +00:00
Davide Italiano	e62c54375d	[PM/SCCP] Fix pass dependencies. TargetLibraryInfoWrapperPass is a dependency of SCCP but it's not listed as such. Chandler pointed out this is an easy mistake to make which only surfaces in weird crashes with some flag combinations. This code will go away anyway at some point in the future, but as long as it's (still) exercised, try to make it correct. llvm-svn: 269589	2016-05-15 08:04:28 +00:00
Xinliang David Li	72616180df	Rename pass name to prepare to new PM porting /NFC llvm-svn: 269586	2016-05-15 01:04:24 +00:00
Davide Italiano	e7c56c5c4f	[SCCP] Use range-based for loops. NFC. llvm-svn: 269578	2016-05-14 20:59:09 +00:00
Chandler Carruth	5957375902	Revert "Retry "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC"" This reverts commit r269491. It triggers warnings with Clang, breaking builds for -Werror users including several build bots. llvm-svn: 269547	2016-05-14 05:26:26 +00:00
Marcin Koscielnicki	a4fcd3681f	[MSan] [PowerPC] Implement PowerPC64 vararg helper. Differential Revision: http://reviews.llvm.org/D20000 llvm-svn: 269518	2016-05-13 23:55:33 +00:00
Davide Italiano	9922344178	[PM] Port LowerAtomic to the new pass manager. llvm-svn: 269511	2016-05-13 22:52:35 +00:00
Sanjay Patel	abbc2ac231	use 'match' for less indenting; NFCI llvm-svn: 269494	2016-05-13 21:51:17 +00:00
Vedant Kumar	df41bd89a5	Retry "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC" Transition InstrProf and Coverage over to the stricter Error/Expected interface. Changes since the initial commit: - Fix error message printing in llvm-profdata. - Check errors in loadTestingFormat() + annotateAllFunctions(). - Defer error handling in InstrProfIterator to InstrProfReader. Differential Revision: http://reviews.llvm.org/D19901 llvm-svn: 269491	2016-05-13 21:50:56 +00:00
Michael Zolotukhin	963a6d9c69	Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."" This reverts commit r269395. Try to reapply with a fix from chapuni. llvm-svn: 269486	2016-05-13 21:23:25 +00:00
Matthew Simpson	c326d050ca	Correct spelling in comment (NFC) llvm-svn: 269482	2016-05-13 21:01:07 +00:00
Sanjay Patel	5d5134f676	use range-loops; NFCI llvm-svn: 269471	2016-05-13 20:24:53 +00:00
Vedant Kumar	064535c1ea	Revert "(HEAD -> master, origin/master, origin/HEAD) [ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC" This reverts commit r269462. It fails two llvm-profdata tests. llvm-svn: 269466	2016-05-13 20:09:39 +00:00
Vedant Kumar	ac25219d20	[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC Transition InstrProf and Coverage over to the stricter Error/Expected interface. Differential Revision: http://reviews.llvm.org/D19901 llvm-svn: 269462	2016-05-13 20:01:27 +00:00
Jun Bum Lim	be11bdc4b0	Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC. Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes fixed in r269433. Reviewers: joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20248 llvm-svn: 269456	2016-05-13 18:38:35 +00:00
Geoff Berry	2f64c20284	[EarlyCSE] Change key type of AvailableCalls to Instruction*. NFCI. llvm-svn: 269445	2016-05-13 17:54:58 +00:00
Sanjay Patel	0c8f3f9332	[InstCombine] handle zero constant vectors for LE/GE comparisons too Enhancement to: http://reviews.llvm.org/rL269426 With discussion in: http://reviews.llvm.org/D17859 This should complete the fixes for: PR26701, PR26819: https://llvm.org/bugs/show_bug.cgi?id=26701 https://llvm.org/bugs/show_bug.cgi?id=26819 llvm-svn: 269439	2016-05-13 17:28:12 +00:00
Rong Xu	0698de9218	[PGO] Add flags to control IRPGO warnings. Currently there is no reasonable way to control the warnings in the 'use' phase of the IRPGO pass. This is problematic because the output can be somewhat spammy. This patch adds some flags which allow us to optionally disable these warnings. The current upstream behavior will remain the default. Patch by Jake VanAdrighem (jvanadrighem@gmail.com) Differential Revision: http://reviews.llvm.org/D20195 llvm-svn: 269437	2016-05-13 17:26:06 +00:00
Jun Bum Lim	f28beac419	[MemCpyOpt] Use MaxIntSize in byte instead of bit Summary: This change fixes the bug in isProfitableToUseMemset() where MaxIntSize should be in bytes, not bits. Reviewers: arsenm, joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20176 llvm-svn: 269433	2016-05-13 16:52:24 +00:00
Sanjay Patel	b79ab27853	[InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819) *We don't currently handle the edge case constants (min/max values), so it's not a complete canonicalization. To fully solve the motivating bugs, we need to enhance this to recognize a zero vector too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector or a ConstantDataVector. Differential Revision: http://reviews.llvm.org/D17859 llvm-svn: 269426	2016-05-13 15:10:46 +00:00
Michael Zolotukhin	9be3b8b9bb	Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..." This reverts commit r269388. It caused some bots to fail, I'm reverting it until I investigate the issue. llvm-svn: 269395	2016-05-13 06:32:25 +00:00
Adam Nemet	eff76646f5	[LoopDist] Only run LAA for loops with the pragma This should fix some compile-time regressions after r267672. Thanks to Chris Matthews for bisecting it. llvm-svn: 269392	2016-05-13 04:20:31 +00:00
Michael Zolotukhin	b7b8052982	[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... Summary: ...loop after the last iteration. This is really hard to do correctly. The core problem is that we need to model liveness through the induction PHIs from iteration to iteration in order to get the correct results, and we need to correctly de-duplicate the common subgraphs of instructions feeding some subset of the induction PHIs. All of this can be driven either from a side effect at some iteration or from the loop values used after the loop finishes. This patch implements this by storing the forward-propagating analysis of each instruction in a cache to recall whether it was free and whether it has become live and thus counted toward the total unroll cost. Then, at each sink for a value in the loop, we recursively walk back through every value that feeds the sink, including looping back through the iterations as needed, until we have marked the entire input graph as live. Because we cache this, we never visit instructions more than twice -- once when we analyze them and put them into the cache, and once when we count their cost towards the unrolled loop. Also, because the cache is only two bits and because we are dealing with relatively small iteration counts, we can store all of this very densely in memory to avoid this from becoming an excessively slow analysis. The code here is still pretty gross. I would appreciate suggestions about better ways to factor or split this up, I've stared too long at the algorithmic side to really have a good sense of what the design should probably look at. Also, it might seem like we should do all of this bottom-up, but I think that is a red herring. Specifically, the simplification power is much greater working top-down. 
We can forward propagate very effectively, even across strange and interesting recurrences around the backedge. Because we use data to propagate, this doesn't cause a state space explosion. Doing this level of constant folding, etc, would be very expensive to do bottom-up because it wouldn't be until the last moment that you could collapse everything. The current solution is essentially a top-down simplification with a bottom-up cost accounting which seems to get the best of both worlds. It makes the simplification incremental and powerful while leaving everything dead until we know it is needed. Finally, a core property of this approach is its monotonicity. At all times, the current UnrolledCost is a conservatively low estimate. This ensures that we will never early-exit from the analysis due to exceeding a threshold when if we had continued, the cost would have gone back below the threshold. These kinds of bugs can cause incredibly hard to track down random changes to behavior. We could use a technique similar (but much simpler) within the inliner as well to avoid considering speculated code in the inline cost. Reviewers: chandlerc Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D11758 llvm-svn: 269388	2016-05-13 01:42:39 +00:00
Chandler Carruth	49c22190d0	[PM] Port of the DependenceAnalysis to the new PM. Ported DA to the new PM by splitting the former DependenceAnalysis Pass into a DependenceInfo result type and DependenceAnalysisWrapperPass type and adding a new PM-style DependenceAnalysis analysis pass returning the DependenceInfo. Patch by Philip Pfaffe, most of the review by Justin. Differential Revision: http://reviews.llvm.org/D18834 llvm-svn: 269370	2016-05-12 22:19:39 +00:00
Simon Pilgrim	3ac4d831ee	Tidied up switch cases. NFCI. Split FCMP/ICMP/SEL from the basic arithmetic cost functions. They were not sharing any notable code path (just the return) and were repeatedly testing the opcode. llvm-svn: 269348	2016-05-12 21:01:20 +00:00
Davide Italiano	851f879f32	[PM] Make LowerAtomic a FunctionPass. Differential Revision: http://reviews.llvm.org/D20025 llvm-svn: 269322	2016-05-12 18:49:32 +00:00
Michael Kuperstein	82e7df5a58	[LoopVectorizer] LoopVectorBody doesn't need to be a vector. NFC. LoopVectorBody was changed from a single pointer to a SmallVector when store predication was introduced in r200270. Since r247139, store predication no longer splits the vector loop body in-place, so we can go back to having a single LoopVectorBody block. This reverts the no-longer-needed changes from r200270. llvm-svn: 269321	2016-05-12 18:44:51 +00:00
David Majnemer	96f0d383a7	[SCCP] Resolve shifts beyond the bitwidth to undef Shifts beyond the bitwidth are undef but SCCP resolved them to zero. Instead, DTRT and resolve them to undef. This reimplements the transform which caused PR27712. llvm-svn: 269269	2016-05-12 03:07:40 +00:00
Sanjoy Das	e0aa414acf	All llvm.deoptimize declarations must use the same calling convention This new verifier rule lets us unambiguously pick a calling convention when creating a new declaration for `@llvm.experimental.deoptimize.<ty>`. It is also congruent with our lowering strategy -- since all calls to `@llvm.experimental.deoptimize` are lowered to calls to `__llvm_deoptimize`, it is reasonable to enforce a unique calling convention. Some of the tests that were breaking this verifier rule have had to be split up into different .ll files. The inliner was violating this rule as well, and has been fixed to avoid producing invalid IR. llvm-svn: 269261	2016-05-12 01:17:38 +00:00
Davide Italiano	cd7c84bd8b	Revert "[SCCP] Partially propagate informations when the input is not fully defined." This reverts commit r269105 as it caused PR27712. llvm-svn: 269252	2016-05-11 23:06:10 +00:00
Teresa Johnson	2e03094d45	[ThinLTO] Don't re-analyze callee at same threshold unnecessarily This should just be a compile-time change. Correct the check for whether we have already analyzed the callee when making summary based decisions. There is no need to reprocess one at the same threshold as when it was last processed. llvm-svn: 269251	2016-05-11 22:56:19 +00:00
Rafael Espindola	83658d6e7a	Return a StringRef from getSection. This is similar to how getName is handled. llvm-svn: 269218	2016-05-11 18:21:59 +00:00
Rafael Espindola	f329be8394	Delete mayBeOverridden. It is the same as isInterposable which seems to be the preferred name. llvm-svn: 269150	2016-05-11 01:26:06 +00:00
Rong Xu	ca28a0afb6	[PGO] Use WeakAny linkage for __llvm_profile_raw_version Use WeakAny linkage instead of LinkOnceAny, as the symbol can be removed with LinkOnceAny in O2 (not referenced). llvm-svn: 269146	2016-05-11 00:31:59 +00:00
Dehao Chen	b76e5d948a	Propagate branch metadata when some branch probability is missing. Summary: In a sample profile, some branches may be missing profile data due to profile inaccuracy. We want the existing branch probabilities to remain valid after propagation. Reviewers: hfinkel, davidxl, spatel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19948 llvm-svn: 269137	2016-05-10 23:07:19 +00:00
Xinliang David Li	da1955835d	[PM]: port IR based profUse pass to new pass manager llvm-svn: 269129	2016-05-10 21:59:52 +00:00
Tim Northover	3961735f03	Revert "MemCpyOpt: combine local load/store sequences into memcpy." This reverts commit r269125. It was in my tree when I ran "git svn dcommit". It's really still under review. llvm-svn: 269127	2016-05-10 21:49:40 +00:00
Tim Northover	6c65c71639	MemCpyOpt: combine local load/store sequences into memcpy. Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block that really implements a memcpy operation, backends can benefit from seeing this. llvm-svn: 269125	2016-05-10 21:48:11 +00:00
Hans Wennborg	719b26ba54	Loop unroller: set thresholds for optsize and minsize functions to zero Before r268509, Clang would disable the loop unroll pass when optimizing for size. That commit enabled it to be able to support unroll pragmas in -Os builds. However, this regressed binary size in one of Chromium's DLLs by ~100 KB. This restores the original behaviour of no unrolling at -Os, but doing it in LLVM instead of Clang makes more sense, and also allows the pragmas to keep working. Differential revision: http://reviews.llvm.org/D20115 llvm-svn: 269124	2016-05-10 21:45:55 +00:00
Lawrence Hu	e58a814c07	Enable loopreroll for sext of loop control only IV This patch extends loopreroll to allow the instruction chain of a loop-control-only IV to contain a sext. Differential Revision: http://reviews.llvm.org/D19820 llvm-svn: 269121	2016-05-10 21:16:49 +00:00
Lawrence Hu	fe7c87beac	Revert r26084: Enable loopreroll for sext of loop control only IV llvm-svn: 269119	2016-05-10 21:11:09 +00:00
Peter Collingbourne	dba995601b	Cloning: Clean up the interface to the CloneFunction function. Remove the ModuleLevelChanges argument, and the ability to create new subprograms for cloned functions. The latter was added without review in r203662, but it has no in-tree clients (all non-test callers pass false for ModuleLevelChanges [1], so it isn't reachable outside of tests). It also isn't clear that adding a duplicate subprogram to the compile unit is always the right thing to do when cloning a function within a module. If this functionality comes back it should be accompanied with a more concrete use case. Furthermore, all in-tree clients add the returned function to the module. Since that's pretty much the only sensible thing you can do with the function, just do that in CloneFunction. [1] http://llvm-cs.pcc.me.uk/lib/Transforms/Utils/CloneFunction.cpp/rCloneFunction Differential Revision: http://reviews.llvm.org/D18628 llvm-svn: 269110	2016-05-10 20:23:24 +00:00
Chad Rosier	4e6cda2db5	[InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1. This patch adds support for two optimizations: icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1) icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1 Differential Revision: http://reviews.llvm.org/D20123 llvm-svn: 269109	2016-05-10 20:22:09 +00:00
Davide Italiano	7860c9bbf4	[SCCP] Partially propagate informations when the input is not fully defined. With this patch: %r1 = lshr i64 -1, 4294967296 -> undef Before this patch: %r1 = lshr i64 -1, 4294967296 -> 0 llvm-svn: 269105	2016-05-10 19:49:47 +00:00
Peter Collingbourne	ccdc225c27	Re-apply r269081 and r269082 with a fix for MSVC. llvm-svn: 269094	2016-05-10 18:07:21 +00:00
Peter Collingbourne	4d41cb6cc6	Revert r269081 and r269082 while I try to find the right incantation to fix MSVC build. llvm-svn: 269091	2016-05-10 17:54:43 +00:00
Rong Xu	b6211a0b4f	[PGO] resubmit r268969 Put the test into a target specific directory. llvm-svn: 269090	2016-05-10 17:45:33 +00:00
Lawrence Hu	8cc3b37d2c	Enable loopreroll for sext of loop control only IV This patch extends loopreroll to allow the instruction chain of a loop-control-only IV to contain a sext. llvm-svn: 269084	2016-05-10 17:42:27 +00:00
Peter Collingbourne	0df2b085bc	WholeProgramDevirt: Move logic for finding devirtualizable call sites to Analysis. The plan is to eventually make this logic simpler, however I expect it to be a little tricky for the foreseeable future (at least until we're rid of pointee types), so move it here so that it can be reused to build a summary index for devirtualization. Differential Revision: http://reviews.llvm.org/D20005 llvm-svn: 269081	2016-05-10 17:34:21 +00:00
Teresa Johnson	8570fe47ef	[ThinLTO] Add option to emit imports files for distributed backends Summary: Add support for emission of plaintext lists of the imported files for each distributed backend compilation. Used for distributed build file staging. Invoked with new gold-plugin thinlto-emit-imports-files option, which is only valid with thinlto-index-only (i.e. for distributed builds), or from llvm-lto with new -thinlto-action=emitimports value. Depends on D19556. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19636 llvm-svn: 269067	2016-05-10 15:54:09 +00:00
Teresa Johnson	84174c3771	Restore "[ThinLTO] Emit individual index files for distributed backends" This restores commit r268627: Summary: When launching ThinLTO backends in a distributed build (currently supported in gold via the thinlto-index-only plugin option), emit an individual index file for each backend process as described here: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098272.html ... Differential Revision: http://reviews.llvm.org/D19556 Address msan failures by avoiding std::prev on map.end(), the theory is that this is causing issues due to some known UB problems in __tree. llvm-svn: 269059	2016-05-10 13:48:23 +00:00
Chuang-Yu Cheng	175741d5a7	Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation Loop rotation clones instructions from the old header into the preheader. If there were uses of values produced by these instructions that were outside the loop, we have to insert PHI nodes to merge the two values. If the values are used by DbgIntrinsics they will be used as a MetadataAsValue of a ValueAsMetadata of the original values, and iterating all of the uses of the original value will not update the DbgIntrinsics. The new code checks if the values are used by DbgIntrinsics and if so, updates them using essentially the same logic as the original code. The attached testcase demonstrates the issue. Without the fix, the DbgIntrinsic outside the loop uses values computed inside the loop, even though these values do not dominate the DbgIntrinsic. Author: Thomas Jablin (tjablin) Reviewers: dblaikie aprantl kbarton hfinkel cycheng http://reviews.llvm.org/D19564 llvm-svn: 269034	2016-05-10 09:45:44 +00:00
Arnaud A. de Grandmaison	333ef381b8	[InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges. When a va_start or va_copy is immediately followed by a va_end (ignoring debug information or other start/end in between), then it is safe to remove the pair. As this code shares some commonalities with the lifetime markers, this has been factored to helper functions. This InstCombine pattern kicks-in 3 times when running the LLVM test suite. llvm-svn: 269033	2016-05-10 09:24:49 +00:00
Renato Golin	d876eecf02	Revert "[PGO] Fix __llvm_profile_raw_version linkage in MACHO IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support." This reverts commits r268969, r268979 and r268984. They had target specific test in generic directories without the correct specifiers and made it hard for us to come up with a good solution by rapidly committing untested changes. This test needs to be in a target specific directory or have the correct REQUIRED identifier. llvm-svn: 269027	2016-05-10 08:23:57 +00:00
Elena Demikhovsky	c434d091c5	[LoopVectorize] Handling induction variable with non-constant step. Allow vectorization when the step is a loop-invariant variable. This is the loop example that is getting vectorized after the patch: int int_inc; int bar(int init, int *restrict A, int N) { int x = init; for (int i=0;i<N;i++){ A[i] = x; x += int_inc; } return x; } "x" is an induction variable with loop-invariant step. But it is not a primary induction. Primary induction variable with non-constant step is not handled yet. Differential Revision: http://reviews.llvm.org/D19258 llvm-svn: 269023	2016-05-10 07:33:35 +00:00
Denis Zobnin	15d1e64b2b	[LAA] Rename "isStridedPtr" with "getPtrStride". NFC. Changing misleading function name was approved in http://reviews.llvm.org/D17268. Patch by Roman Shirokiy. llvm-svn: 269021	2016-05-10 05:55:16 +00:00
Justin Lebar	50deb6d028	Minor formatting fixes in LoopUnroll.cpp. llvm-svn: 268995	2016-05-10 00:31:23 +00:00
Adam Nemet	c6bbd80d59	[IndirectCallPromotion] Remove duplicate comment. NFC llvm-svn: 268986	2016-05-09 23:03:06 +00:00
Chad Rosier	58919cc6f8	Typo. NFC. llvm-svn: 268975	2016-05-09 21:37:43 +00:00
Xinliang David Li	dfa21c310d	Cleanup followup of r268710 - [PM] port IR based PGO prof-gen pass to new pass manager llvm-svn: 268974	2016-05-09 21:37:12 +00:00
Rong Xu	a12f6d3c7b	[PGO] Fix __llvm_profile_raw_version linkage in MACHO IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support. But LinkOnceODR linkage might have a .weak_def_can_be_hidden assembly directive, while the weak variable in the run-time has a .weak_definition directive. The linker will not merge these two symbols even though they have the same name. The end result is that IR profiles are not properly flagged in MACHO. This patch changes the linkage for __llvm_profile_raw_version in each module to LinkOnceAny so that it has the same .weak_definition directive as in the run-time. Differential Revision: http://reviews.llvm.org/D20078 llvm-svn: 268969	2016-05-09 21:03:06 +00:00
Marcin Koscielnicki	60b3cbe095	[MSan] [AArch64] Fix vararg helper for >1 or non-int fixed arguments. This fixes http://llvm.org/PR27646 on AArch64. There are three issues here: - The GR save area is 7 words in size, instead of 8. This is not enough if none of the fixed arguments is passed in GRs (they're all floats or aggregates). - The first argument is ignored (which counteracts the above if it's passed in GR). - Like x86_64, fixed arguments landing in the overflow area are wrongly counted towards the overflow offset. Differential Revision: http://reviews.llvm.org/D20023 llvm-svn: 268967	2016-05-09 20:57:36 +00:00
Chad Rosier	131a42ccdf	[InstCombine] Fold icmp eq/ne (udiv i32 A, B), 0 -> icmp ugt/ule B, A. Differential Revision: http://reviews.llvm.org/D20036 llvm-svn: 268960	2016-05-09 19:30:20 +00:00
Joerg Sonnenberger	8ffe7ab7c2	Optimize a printf with a double percent to putchar. llvm-svn: 268922	2016-05-09 14:36:16 +00:00
Junmo Park	955298746d	Minor code cleanups. NFC. llvm-svn: 268888	2016-05-08 23:22:58 +00:00
Xinliang David Li	d55827f7b2	[PM] code refactoring -- preparation for new PM porting /NFC llvm-svn: 268851	2016-05-07 05:39:12 +00:00
Philip Reames	6f4d0088c6	Reapply 267210 with fix for PR27490 Original Commit Message Extend load/store type canonicalization to handle unordered operations Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case. Note that the concern about lowering is now much less likely. PR27490 proved that we already were mucking with the types of ordered atomics and volatiles. As a result, this change doesn't introduce as much new behavior as originally thought. llvm-svn: 268809	2016-05-06 22:17:01 +00:00
Philip Reames	4a3c3b66d7	[GVN] PRE of unordered loads Again, fairly simple. Only change is ensuring that we actually copy the property of the load correctly. The aliasing legality constraints were already handled by the FRE patches. There's nothing special about unordered atomics from the perspective of the PRE algorithm itself. llvm-svn: 268804	2016-05-06 21:43:51 +00:00
Sanjoy Das	091fcfa3a7	[RS4GC] Fix typo in comment llvm-svn: 268790	2016-05-06 20:39:33 +00:00
Marcin Koscielnicki	b088ad1e09	[MSan] [X86] Fix vararg helper for fixed arguments in overflow area. This fixes http://llvm.org/PR27646 on x86_64. Differential Revision: http://reviews.llvm.org/D19997 llvm-svn: 268783	2016-05-06 19:36:56 +00:00
Philip Reames	1fdce639d2	[GVN] Handle unordered atomics in cross block FRE You'll note there are essentially no code changes here. Cross block FRE heavily reuses code from the block local FRE. All of the tricky parts were done as part of the previous patch and the refactoring that removed the original code duplication. llvm-svn: 268775	2016-05-06 18:46:45 +00:00
Philip Reames	ae8997f496	[GVN] Do local FRE for unordered atomic loads This patch is the first in a small series teaching GVN to optimize unordered loads aggressively. This change just handles block local FRE because that's the simplest thing which lets me test MDA, and the AvailableValue pieces. Somewhat surprisingly, MDA appears fine and only a couple of small changes are needed in GVN. Once this is in, I'll tackle non-local FRE and PRE. The former looks like a natural extension of this, the latter will require a couple of minor changes. Differential Revision: http://reviews.llvm.org/D19440 llvm-svn: 268770	2016-05-06 18:17:13 +00:00
Mehdi Amini	31407ba009	Tweak the ThinLTO pass pipeline Summary: The original ThinLTO pipeline was derived from some work I did tuning FullLTO on the test suite and SPEC. This patch reduces the amount of work done in the "linker phase" of the build, and extends the function simplification passes performed during the "compile phase". This helps the build time by reducing the IR as much as possible during the compile phase and limiting the work to be performed during the "link phase", while keeping the performance "on par" with the existing pipeline. Reviewers: tejohnson Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19773 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268769	2016-05-06 18:17:03 +00:00
Sanjay Patel	1cb6241a89	[SimplifyCFG] propagate branch metadata when creating select (retry r268550 / r268751 with possible fix) Retrying r268550/r268751 which were reverted at r268577/r268765 due to a memory sanitizer failure. I have not been able to reproduce that failure, but I've taken another guess at fixing the problem in this version of the patch and will watch for another failure. Original commit message: Unlike earlier similar fixes, we need to recalculate the branch weights in this case. Differential Revision: http://reviews.llvm.org/D19674 llvm-svn: 268767	2016-05-06 18:07:46 +00:00
Sanjay Patel	84a0bf64a8	revert r268751 - caused same failures on msan bot llvm-svn: 268765	2016-05-06 17:51:37 +00:00
Sanjay Patel	6609510c32	[SimplifyCFG] propagate branch metadata when creating select (retry r268550 with possible fix) Retrying r268550 which was reverted at r268577 due to a memory sanitizer failure. I have not been able to reproduce that failure, but I've taken a guess at fixing the problem in this version of the patch and will watch for another failure. Original commit message: Unlike earlier similar fixes, we need to recalculate the branch weights in this case. Differential Revision: http://reviews.llvm.org/D19674 llvm-svn: 268751	2016-05-06 17:07:47 +00:00
Chad Rosier	4ab37c0037	[SimplifyCFG] Prefer a simplification based on a dominating condition. Rather than merge two branches with a common destination. Differential Revision: http://reviews.llvm.org/D19743 llvm-svn: 268735	2016-05-06 14:25:14 +00:00
Ryan Govostes	6194ae69fe	Fix whitespace and line wrapping. NFC. llvm-svn: 268725	2016-05-06 11:22:11 +00:00
Ryan Govostes	3f37df0326	[asan] add option to set shadow mapping offset Allow overriding the default ASAN shadow mapping offset with the -asan-shadow-offset option, and allow zero to be specified for both offset and scale. Patch by Aaron Carroll <aaronc@apple.com>. llvm-svn: 268724	2016-05-06 10:25:22 +00:00
Mehdi Amini	3b132e34b0	ThinLTO: fix assertion and refactor check for hidden use from inline ASM in a helper function This test was crashing, and currently it breaks bootstrapping clang with debuginfo Differential Revision: http://reviews.llvm.org/D20008 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268715	2016-05-06 08:25:33 +00:00
Xinliang David Li	8aebf44c97	[PM] port IR based PGO prof-gen pass to new pass manager llvm-svn: 268710	2016-05-06 05:49:19 +00:00
Philip Reames	32b55181fa	[EarlyCSE] Rename a variable for clarity [NFC] llvm-svn: 268701	2016-05-06 01:13:58 +00:00
Davide Italiano	f54f2f0893	[PM] Port Interprocedural SCCP to the new pass manager. llvm-svn: 268684	2016-05-05 21:05:36 +00:00
Dehao Chen	f50c67ce7c	Revert http://reviews.llvm.org/D19926 as it breaks tests. llvm-svn: 268681	2016-05-05 20:47:53 +00:00
Dehao Chen	e48b4ee98c	Simplify CFG before assigning discriminator. Summary: We need to clean up CFG before assigning discriminator to minimize the impact of optimization on debug info. Reviewers: davidxl, dblaikie, dnovillo Subscribers: dnovillo, danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D19926 llvm-svn: 268675	2016-05-05 20:18:49 +00:00
Marcin Koscielnicki	60061c21cb	[MSan] [MIPS64] Fix vararg helper for >1 fixed argument. This fixes http://llvm.org/PR27646 on Mips64. Differential Revision: http://reviews.llvm.org/D19989 llvm-svn: 268673	2016-05-05 20:13:17 +00:00
Vitaly Buka	1df2338bb6	Revert "[ThinLTO] Emit individual index files for distributed backends" MemorySanitizer: use-of-uninitialized-value in lib/Bitcode/Writer/BitcodeWriter.cpp:364:70 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/12544/steps/check-llvm%20msan/logs/stdio This reverts commit 0c4a898ea550699d1b2f4fe3767251c8f9a48d52. llvm-svn: 268660	2016-05-05 18:31:00 +00:00
Chad Rosier	b438a327d7	Remove dead include. NFC. llvm-svn: 268655	2016-05-05 17:55:51 +00:00
Chad Rosier	799e4c6fc3	Remove dead include. NFC. llvm-svn: 268654	2016-05-05 17:53:43 +00:00
Silviu Baranga	28eb344140	Fix unused variable warning after r268632 llvm-svn: 268634	2016-05-05 15:27:57 +00:00
Silviu Baranga	c05bab8a9c	[LV] Identify more induction PHIs by coercing expressions to AddRecExprs Summary: Some PHIs can have expressions that are not AddRecExprs due to the presence of sext/zext instructions. In order to prevent the Loop Vectorizer from bailing out when encountering these PHIs, we now coerce the SCEV expressions to AddRecExprs using SCEV predicates (when possible). We only do this when the alternative would be to not vectorize. Reviewers: mzolotukhin, anemet Subscribers: mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17153 llvm-svn: 268633	2016-05-05 15:20:39 +00:00
Silviu Baranga	7e0d4353f2	[LV] Refactor the validation of PHI inductions. NFC This moves the validation of PHI inductions into a separate method, making it easier to reuse this logic. llvm-svn: 268632	2016-05-05 15:14:01 +00:00
Teresa Johnson	9254ebe3c0	[ThinLTO] Emit individual index files for distributed backends Summary: When launching ThinLTO backends in a distributed build (currently supported in gold via the thinlto-index-only plugin option), emit an individual index file for each backend process as described here: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098272.html The individual index file encodes the summary and module information required for implementing the importing/exporting decisions made for a given module in the thin link step. This is in place of the current mechanism that uses the combined index to make importing decisions in each back end independently. It is an enabler for doing global summary based optimizations in the thin link step (which will be recorded in the individual index files), and reduces the size of the index that must be sent to each backend process, and the amount of work to scan it in the backends. Rather than create entirely new ModuleSummaryIndex structures (and all the included unique_ptrs) for each backend index file, a map is created to record all of the GUID and summary pointers needed for a particular index file. The IndexBitcodeWriter walks this map instead of the full index (hiding the details of managing the appropriate summary iteration in a new iterator subclass). This is more efficient than walking the entire combined index and filtering out just the needed summaries during each backend bitcode index write. Depends on D19481. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19556 llvm-svn: 268627	2016-05-05 13:44:56 +00:00
Davide Italiano	344e838fea	[PM] Port EliminateAvailableExternally pass to the new pass manager. llvm-svn: 268599	2016-05-05 02:37:32 +00:00
Ryan Govostes	8c21be6b3e	Revert "[asan] add option to set shadow mapping offset" This reverts commit ba89768f97b1d4326acb5e33c14eb23a05c7bea7. llvm-svn: 268588	2016-05-05 01:27:04 +00:00
Ryan Govostes	097c5b051c	[asan] add option to set shadow mapping offset Allow overriding the default ASAN shadow mapping offset with the -asan-shadow-offset option, and allow zero to be specified for both offset and scale. llvm-svn: 268586	2016-05-05 01:14:39 +00:00
Dehao Chen	d55bc4c7ab	clang-format some files in preparation of coming patch reviews. llvm-svn: 268583	2016-05-05 00:54:54 +00:00
Davide Italiano	164b9bc6fe	[PM] Port ConstantMerge to the new pass manager. llvm-svn: 268582	2016-05-05 00:51:09 +00:00
Adam Nemet	3c5eabfcbc	[LoopDataPrefetch] Add optimization remark With -Rpass=loop-data-prefetch, show the memory access that got prefetched. llvm-svn: 268578	2016-05-05 00:08:15 +00:00
Vitaly Buka	fdcea9d78a	Revert "[SimplifyCFG] propagate branch metadata when creating select" MemorySanitizer: use-of-uninitialized-value 0x4910e47 in count /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:159:12 0x4910e47 in countLeadingZeros<unsigned long> /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:183 0x4910e47 in FitWeights /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:855 0x4910e47 in SimplifyCondBranchToCondBranch /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:2895 This reverts commit 609f4dd4bf3bc735c8c047a4d4b0a8e9e4d202e2. llvm-svn: 268577	2016-05-04 23:59:33 +00:00
Davide Italiano	a7f5e88932	Revert "[SCCP] Throw away dead code. NFC." This reverts commit r268568, as it broke the bots. llvm-svn: 268570	2016-05-04 23:27:13 +00:00
Davide Italiano	fc1214fee2	[SCCP] Throw away dead code. NFC. llvm-svn: 268568	2016-05-04 23:05:59 +00:00
Balaram Makam	569eaec5f3	"Reapply r268521 "[InstCombine] Canonicalize icmp instructions based on dominating conditions."" This reapplies commit r268521, that was reverted in r268530 due to a test failure in select-implied.ll Modified the test case to reflect the new change. llvm-svn: 268557	2016-05-04 21:32:14 +00:00
Sanjay Patel	7e8c285814	[SimplifyCFG] propagate branch metadata when creating select Unlike earlier similar fixes, we need to recalculate the branch weights in this case. Differential Revision: http://reviews.llvm.org/D19674 llvm-svn: 268550	2016-05-04 20:48:24 +00:00
Balaram Makam	31e7e13789	Revert "[InstCombine] Canonicalize icmp instructions based on dominating conditions." This reverts commit 573a40f79b35cf3e71db331bb00f6a84f03b835d. llvm-svn: 268530	2016-05-04 18:37:35 +00:00
Balaram Makam	cf3bcb2625	[InstCombine] Canonicalize icmp instructions based on dominating conditions. Summary: This patch canonicalizes conditions based on the constant range information of the dominating branch condition. For example: %cmp = icmp slt i64 %a, 0 br i1 %cmp, label %land.lhs.true, label %lor.rhs lor.rhs: %cmp2 = icmp sgt i64 %a, 0 Would now be canonicalized into: %cmp = icmp slt i64 %a, 0 br i1 %cmp, label %land.lhs.true, label %lor.rhs lor.rhs: %cmp2 = icmp ne i64 %a, 0 Reviewers: mcrosier, gberry, t.p.northover, llvm-commits, reames, hfinkel, sanjoy, majnemer Subscribers: MatzeB, majnemer, mcrosier Differential Revision: http://reviews.llvm.org/D18841 llvm-svn: 268521	2016-05-04 17:34:20 +00:00
Hans Wennborg	0c3518e84b	[SimplifyCFG] isSafeToSpeculateStore now ignores debug info This patch fixes PR27615. @llvm.dbg.value instructions no longer count towards the maximum number of instructions to look back at in the instruction list when searching for a store instruction. This should make the output consistent between debug and non-debug build. Patch by Henric Karlsson <henric.karlsson@ericsson.com>! Differential Revision: http://reviews.llvm.org/D19912 llvm-svn: 268512	2016-05-04 15:40:57 +00:00
Chad Rosier	7ab9a7b203	Use a uniform name for the load combine pass. NFC. llvm-svn: 268507	2016-05-04 15:19:02 +00:00
Igor Laevsky	fb1811d3a0	[RS4GC] Use SetVector/MapVector instead of DenseSet/DenseMap to guarantee stable ordering The goal of this change is to guarantee stable ordering of the statepoint arguments and other newly inserted values such as gc.relocates. Previously we had explicit sorting in a couple of places. However, for unnamed values the ordering was partial and overall we didn't have any strong invariant regarding it. This change switches all data structures to SetVectors and MapVectors, which allow deterministic iteration over them. Explicit sorting is now redundant and was removed. Differential Revision: http://reviews.llvm.org/D19669 llvm-svn: 268502	2016-05-04 14:55:36 +00:00
Davide Italiano	17da174b8b	[IPO/ConstantMerge] Convert to static function, to facilitate transition to the new PM. llvm-svn: 268476	2016-05-04 03:21:20 +00:00
David Majnemer	95549497ec	[GlobalDCE, Misc] Don't remove functions referenced by ifuncs We forgot to consider the target of ifuncs when considering if a function was alive or dead. N.B. Also update a few auxiliary tools like bugpoint and verify-uselistorder. This fixes PR27593. llvm-svn: 268468	2016-05-04 00:20:48 +00:00
Andrew Kaylor	50271f787e	Add opt-bisect support to additional passes that can be skipped Differential Revision: http://reviews.llvm.org/D19882 llvm-svn: 268457	2016-05-03 22:32:30 +00:00
Justin Bogner	d0d2341f30	PM: Port LoopRotation to the new loop pass manager llvm-svn: 268452	2016-05-03 22:02:31 +00:00
Justin Bogner	ab6a513b4e	PM: Port LoopSimplifyCFG to the new pass manager llvm-svn: 268446	2016-05-03 21:47:32 +00:00
Davide Italiano	c91e0b2fde	[IPO/ConstantMerge] Garbage collect dead code. NFC. llvm-svn: 268442	2016-05-03 21:30:10 +00:00
Davide Italiano	296d12cd40	[IPO/IPCP] Convert to use static functions. NFC. In preparation for porting this pass to the new PM. llvm-svn: 268429	2016-05-03 20:08:24 +00:00
Davide Italiano	66228c4cf1	[IPO/GlobalDCE] Port to the new pass manager. Differential Revision: http://reviews.llvm.org/D19782 llvm-svn: 268425	2016-05-03 19:39:15 +00:00
Jack Liu	f101c0f7a1	[SROA] Function canConvertValue needs to check whether both NewTy and OldTy pointers are pointing to the same addr space. This can prevent SROA from creating a bitcast between pointers with different addr spaces. Differential Revision: http://reviews.llvm.org/D19697 llvm-svn: 268424	2016-05-03 19:30:48 +00:00
Jack Liu	430e2c2140	Revert 268409 due to missing comment. llvm-svn: 268421	2016-05-03 19:15:02 +00:00
Jack Liu	1ff4a0b7ee	(no commit message) llvm-svn: 268409	2016-05-03 18:01:43 +00:00
Sanjoy Das	4ae3920c5b	[LICM] Kill SCEV loop dispositions if needed SCEV caches whether SCEV expressions are loop invariant, variant or computable. LICM breaks this cache, almost by definition; so clear the SCEV disposition cache if LICM changed anything. llvm-svn: 268408	2016-05-03 17:50:11 +00:00
Sanjoy Das	7e7a5a050a	Use all_of instead of a raw loop; NFC Added some tests despite being NFC, since it looks like nothing was exercising the "all incoming values to exit PHIs are same" logic. llvm-svn: 268407	2016-05-03 17:50:06 +00:00
Sanjoy Das	905fc27ebf	[LoopDeletion] Clear SCEV loop dispositions `Loop::makeLoopInvariant` can hoist instructions out of loops, so loop dispositions for the loop it operated on may need to be cleared. We can be smarter here (especially around how `forgetLoopDispositions` is implemented), but let's be correct first. Fixes PR27570. llvm-svn: 268406	2016-05-03 17:50:02 +00:00
Vedant Kumar	43cba7333c	[ProfileData] Add error codes for compression failures Be more specific in describing compression failures. Also, check for this kind of error in emitNameData(). This is part of a series of patches to transition ProfileData over to the stricter Error/Expected interface. llvm-svn: 268400	2016-05-03 16:53:17 +00:00
Mehdi Amini	7f7d8be518	Move "Eliminate Available Externally" immediately after the inliner This pass is supposed to reduce the size of the IR for compile time purpose. We should run it ASAP, except when we prepare for LTO or ThinLTO, and we want to keep them available for link-time inline. Differential Revision: http://reviews.llvm.org/D19813 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268394	2016-05-03 15:46:00 +00:00
Kristof Beyls	c08f70588d	Mark that SpeculativeExecution preserves Globals Alias Analysis. A few benchmarks with lots of accesses to global variables in the hot loops regressed a lot since r266399, which added the SpeculativeExecution pass to the default pipeline. The problem is that this pass doesn't mark Globals Alias Analysis as preserved. Globals Alias Analysis is computed in a module pass, whereas SpeculativeExecution is a function pass, and a lot of passes dependent on the Globals Alias Analysis to optimize these benchmarks are also function passes. As such, the Globals Alias Analysis information cannot be recomputed between SpeculativeExecution and the following function passes needing that information. SpeculativeExecution doesn't invalidate Globals Alias Analysis, so mark it as such to fix those performance regressions. Differential Revision: http://reviews.llvm.org/D19806 llvm-svn: 268370	2016-05-03 08:33:26 +00:00
David Majnemer	3d90bb79c4	[LoopUnroll] Unroll loops which have exit blocks to EH pads We were overly cautious in our analysis of loops which have invokes which unwind to EH pads. The loop unroll transform is safe because it only clones blocks in the loop body, it does not try to split critical edges involving EH pads. Instead, move the necessary safety check to LoopUnswitch. N.B. The safety check for loop unswitch is covered by an existing test which fails without it. llvm-svn: 268357	2016-05-03 03:57:40 +00:00
Mehdi Amini	5b85d8d67b	ThinLTO: do not import function whose linkage prevents inlining. There is no point in importing a "weak" or a "linkonce" function since we won't be able to inline it anyway. We already had a targeted check for WeakAny; this uses the same check on GlobalValue as the inliner, i.e. isMayBeOverriddenLinkage() From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268341	2016-05-03 00:27:28 +00:00
Mehdi Amini	1e918c9cb3	Revert "ThinLTO: do not import function whose linkage prevents inlining." This reverts commit r268315, the tests are not passing. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268317	2016-05-02 22:26:04 +00:00
Mehdi Amini	bda9b2ae9e	ThinLTO: do not import function whose linkage prevents inlining. There is no point in importing a "weak" or a "linkonce" function since we won't be able to inline it anyway. We already had a targeted check for WeakAny; this uses the same check on GlobalValue as the inliner, i.e. isMayBeOverriddenLinkage() From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268315	2016-05-02 22:11:27 +00:00
Xinliang David Li	5ad7c820fc	Code refactoring -- preparation for new PM porting /NFC llvm-svn: 268301	2016-05-02 20:33:59 +00:00
Reid Kleckner	bca59d2a43	Revert "[SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics" This reverts commit r268254. This change causes assertion failures while building Chromium. Reduced test case coming soon. llvm-svn: 268288	2016-05-02 19:43:22 +00:00
Chad Rosier	fcb2210812	Typo. NFC. llvm-svn: 268280	2016-05-02 19:06:04 +00:00
Chad Rosier	4466ff50eb	Use false rather than 0 for a boolean value. NFC. llvm-svn: 268279	2016-05-02 19:06:02 +00:00
Mehdi Amini	0ddf404cf4	ReversePostOrderFunctionAttrs is not modifying the call graph, let's preserve it. When running cc1 with -flto=thin, it is followed by GlobalOpt, which requires the callgraph. This saves rebuilding one. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268266	2016-05-02 18:03:33 +00:00
Hans Wennborg	b7599329fc	[SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics Make it possible for TryToSimplifyUncondBranchFromEmptyBlock to merge an empty basic block that contains lifetime intrinsics, as well as phi nodes and an unconditional branch, into its successor or predecessor(s). If the successor of the empty block has a single predecessor, all contents including lifetime intrinsics are sunk into the successor. Otherwise, they are hoisted into its predecessor(s) and then merged into the predecessor(s). Patch by Josh Yoon <josh.yoon@samsung.com>! Differential Revision: http://reviews.llvm.org/D19257 llvm-svn: 268254	2016-05-02 17:22:54 +00:00
Mehdi Amini	45c7b3ecb5	Move createReversePostOrderFunctionAttrsPass right after the inliner is done This is where it was originally, until LoopVersioningLICM was inserted before in r259986, I don't believe it was on purpose. Differential Revision: http://reviews.llvm.org/D19809 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 268252	2016-05-02 16:53:16 +00:00
Adam Nemet	d02872c7b4	[LLE] Fix typo from r263058 This was meant to check unit stride for both the load and the store. Thanks to Roman Shirokiy for noticing this. llvm-svn: 268251	2016-05-02 16:52:00 +00:00
Simon Pilgrim	ca140b17cb	[InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements. llvm-svn: 268206	2016-05-01 20:43:02 +00:00
Simon Pilgrim	eeacc40e27	[InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements. llvm-svn: 268204	2016-05-01 20:22:42 +00:00
Simon Pilgrim	e5e8c2fde0	[InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements. llvm-svn: 268202	2016-05-01 19:26:21 +00:00
Simon Pilgrim	8cddf8b3c6	[InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector. llvm-svn: 268199	2016-05-01 16:41:22 +00:00
Marcin Koscielnicki	57290f934a	[ASan] Add shadow offset for SystemZ. SystemZ on Linux currently has a 53-bit address space. In theory, the hardware could support a full 64-bit address space, but that's not supported due to kernel limitations (it'd require 5-level page tables), and there are no plans for that. The default process layout stays within the first 4TB of the address space (to avoid creating 4-level page tables), so any offset >= (1 << 42) is fine. Let's use 1 << 52 here, i.e. exactly half the address space. I originally used 7 << 50 (which uses the top 1/8th of the address space), but the ASan runtime assumes there's some space after the shadow area. While this is fixable, it's simpler to avoid the issue entirely. Also, I originally wanted to have the shadow aligned to 1/8th the address space, so that we can use OR like X86 to assemble the offset. I no longer think it's a good idea, since using ADD enables us to load the constant just once and use it with register + register indexed addressing. Differential Revision: http://reviews.llvm.org/D19650 llvm-svn: 268161	2016-04-30 09:57:34 +00:00
Simon Pilgrim	640f9964c7	[InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate elements. NFCI. Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements. llvm-svn: 268158	2016-04-30 07:23:30 +00:00
Sanjoy Das	47cf2affbd	[LowerGuardIntrinsics] Keep track of !make.implicit metadata If a guard call being lowered by LowerGuardIntrinsics has the `!make.implicit` metadata attached, then reattach the metadata to the branch in the resulting expanded form of the intrinsic. This allows us to implement null checks as guards and still get the benefit of implicit null checks. llvm-svn: 268148	2016-04-30 00:55:59 +00:00
Lawrence Hu	1befea2bdc	Reroll loops with multiple IV and negative step part 3 support multiple induction variables This patch enables loop reroll for the following case: for(int i=0; i<N; i += 2) { S += a++; S += a++; }; Differential Revision: http://reviews.llvm.org/D16550 llvm-svn: 268147	2016-04-30 00:51:22 +00:00
Sanjoy Das	52c68bb0f5	[LowerGuardIntrinsics] Preserve calling conv when lowering llvm-svn: 268142	2016-04-30 00:17:47 +00:00
Xinliang David Li	4b2fdccad9	Reapply r268107 after fixing a bug that broke the debug build. Make the new method set the data needed by the debug dump. llvm-svn: 268130	2016-04-29 22:59:36 +00:00
Sanjoy Das	107aefc2fc	Mark guards on true as "trivially dead" This moves some logic added to EarlyCSE in rL268120 into `llvm::isInstructionTriviallyDead`. Adds a test case for DCE to demonstrate that passes other than EarlyCSE can now pick up on the new information. llvm-svn: 268126	2016-04-29 22:23:16 +00:00
Sanjoy Das	ee81b23fe7	[EarlyCSE] Simplify guard intrinsics Summary: This change teaches EarlyCSE some basic properties of guard intrinsics: - Guard intrinsics read all memory, but don't write to any memory - After a guard has executed, the condition it was guarding on can be assumed to be true - Guard intrinsics on a constant `true` are no-ops Reviewers: reames, hfinkel Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19578 llvm-svn: 268120	2016-04-29 21:52:58 +00:00
Xinliang David Li	0552521b03	Revert r268107 -- debug build failure llvm-svn: 268116	2016-04-29 21:43:28 +00:00
Simon Pilgrim	bf60cc492c	[InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate elements. NFCI. Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements. llvm-svn: 268115	2016-04-29 21:34:54 +00:00
Xinliang David Li	1ffa28a3f1	[inliner]: Refactor inline deferring logic into its own method /NFC The implemented heuristic has a large body of code which better sits in its own function for better readability. It also allows adding more heuristics easier in the future. llvm-svn: 268107	2016-04-29 21:21:44 +00:00
Chad Rosier	cd62bf5821	[InstCombine] Determine the result of a select based on a dominating condition. Differential Revision: http://reviews.llvm.org/D19550 llvm-svn: 268104	2016-04-29 21:12:31 +00:00
Sanjay Patel	9190b4add8	[InstCombine] clean up; NFC llvm-svn: 268099	2016-04-29 20:54:56 +00:00
George Burgess IV	1b1fef30d0	[MemorySSA] Fix bugs in walker; refactor unittests a bit. This patch fixes two somewhat related bugs in MemorySSA's caching walker. These bugs were found because D19695 brought up the problem that we'd have defs cached to themselves, which is incorrect. The bugs this fixes are: - We would sometimes skip the nearest clobber of a MemoryAccess, because we would query our cache for a given potential clobber before checking if the potential clobber is the clobber we're looking for. The cache entry for the potential clobber would point to the nearest clobber of the potential clobber, so if that was a cache hit, we'd ignore the potential clobber entirely. - There are times (sometimes in DFS, sometimes in the getClobbering... functions) where we would insert cache entries that say a def clobbers itself. There's a bit of common code between the fixes for the bugs, so they aren't split out into multiple commits. This patch also adds a few unit tests, and refactors existing tests a bit to reduce the duplication of setup code. llvm-svn: 268087	2016-04-29 18:42:55 +00:00
Dehao Chen	21aefaec97	Do not read callee name when matching IR to profile as it is not used. Summary: Callee name is not used to identify a callsite now, so do not read it during annotation. Reviewers: davidxl, dnovillo Subscribers: dnovillo, danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D19704 llvm-svn: 268069	2016-04-29 17:19:10 +00:00
Sanjay Patel	d5b0e54b49	[InstCombine] add helper function for ICmp with constant canonicalization; NFCI As suggested in http://reviews.llvm.org/D17859 , we should enhance this to support vectors. llvm-svn: 268059	2016-04-29 16:22:25 +00:00
Filipe Cabecinhas	0da9937517	Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them. Summary: Historically, we had a switch in the Makefiles for turning on "expensive checks". This has never been ported to the cmake build, but the (dead-ish) code is still around. This will also make it easier to turn it on in buildbots. Reviewers: chandlerc Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D19723 llvm-svn: 268050	2016-04-29 15:22:48 +00:00
David Majnemer	fadc6db036	[GlobalOpt] Propagate operand bundles We neglected to transfer operand bundles for some transforms. These were found via inspection, I'll try to come up with some test cases. llvm-svn: 268011	2016-04-29 08:07:22 +00:00
David Majnemer	231a68cc22	[InstCombine] Propagate operand bundles We neglected to transfer operand bundles for some transforms. These were found via inspection, I'll try to come up with some test cases. llvm-svn: 268010	2016-04-29 08:07:20 +00:00
David Majnemer	1a5799fe3e	[DeadArgumentElimination] Propagate operand bundles to promoted call sites We neglected to transfer operand bundles when performing argument promotion. llvm-svn: 268008	2016-04-29 07:22:36 +00:00
Adam Nemet	88ec491830	[LoopDist] Also emit optimization remark on success (-Rpass=) The option -Rpass=loop-distribute now reports the loops that were distributed. llvm-svn: 268006	2016-04-29 07:10:46 +00:00
Adam Nemet	4338d6769e	[LoopDist] Pass 'Function' to main class. NFC Next patch will add another use for 'Function' inside the class. llvm-svn: 268005	2016-04-29 07:10:39 +00:00
David Majnemer	13d5526392	[SLPVectorizer] Add operand bundles to vectorized functions SLPVectorizing a call site should result in further propagation of its bundles. llvm-svn: 268004	2016-04-29 07:09:51 +00:00
David Majnemer	50ddc0e1b6	[LoopVectorize] Add operand bundles to vectorized functions Also, do not crash when calculating a cost model for loop-invariant token values. llvm-svn: 268003	2016-04-29 07:09:48 +00:00
David Majnemer	cd24bb1d3a	[ArgumentPromotion] Propagate operand bundles to promoted call sites We neglected to transfer operand bundles when performing argument promotion. This fixes PR27568. llvm-svn: 267986	2016-04-29 04:56:12 +00:00
Michael Zolotukhin	1816d03b7d	[PR25281] Remove AAResultsWrapper from preserved analyses of loop vectorizer. We don't preserve AAResults, because, for one, we don't preserve SCEV-AA. That fixes PR25281. llvm-svn: 267980	2016-04-29 03:31:25 +00:00
Ivan Krasin	8dafa2da8e	Fix build by casting to the proper int type. Reviewers: eugenis Differential Revision: http://reviews.llvm.org/D19706 llvm-svn: 267974	2016-04-29 02:09:57 +00:00
Hal Finkel	1b66f7e3c8	[LoopVectorize] Keep hints from original loop on the vector loop We need to keep loop hints from the original loop on the new vector loop. Failure to do this meant that, for example: void foo(int *b) { #pragma clang loop unroll(disable) for (int i = 0; i < 16; ++i) b[i] = 1; } this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the hints that unrolling should be disabled, and then we'd unroll it. llvm-svn: 267970	2016-04-29 01:27:40 +00:00
Evgeniy Stepanov	35f3e5e4e7	[msan] Handle vector compare x86 intrinsics. This handles SSE and SSE2 cmp_* and comiXX_* intrinsics. llvm-svn: 267966	2016-04-29 01:19:52 +00:00
Adam Nemet	0ba164bbcb	[LoopDist] Emit optimization remarks (-Rpass) I closely followed the precedents set by the vectorizer: * With -Rpass-missed, the loop is reported with further details pointing to -Rpass-analysis. * -Rpass-analysis reports the details of why distribution has failed. * Regardless of -Rpass*, when distribution fails for a loop where distribution was forced with the pragma, a warning is produced according to -Wpass-failed. In this case the analysis info is also printed even without -Rpass-analysis. llvm-svn: 267952	2016-04-28 23:08:32 +00:00
Adam Nemet	adeccf7658	[LoopDist] Improve debug messages The next patch will start using these for -Rpass-analysis so they won't be internal-only anymore. Move the 'Skipping; ' prefix that some of the message are using into the 'fail' function. We don't want to include this prefix in the -Rpass-analysis report. llvm-svn: 267951	2016-04-28 23:08:30 +00:00
Adam Nemet	7f38e1199a	[LoopDist] Add helper to print debug message when distribution fails. NFC This will form the basis to emit optimization remarks (-Rpass*). llvm-svn: 267950	2016-04-28 23:08:27 +00:00
Hal Finkel	50316d95a9	[Inliner] Preserve llvm.mem.parallel_loop_access metadata When inlining a call site with llvm.mem.parallel_loop_access metadata, this metadata needs to be propagated to all cloned memory-accessing instructions. Otherwise, inlining parts of the loop body will invalidate the annotation. With this functionality, we now vectorize the following as expected: void Body(int *res, int *c, int *d, int *p, int i) { res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } void Test(int *res, int *c, int *d, int *p, int n) { int i; #pragma clang loop vectorize(assume_safety) for (i = 0; i < 1600; i++) { Body(res, c, d, p, i); } } llvm-svn: 267949	2016-04-28 23:00:04 +00:00
Rong Xu	62d5e473ce	[PGO] Fix incorrect Twine usage in emitting optimization remarks. Should not store Twine objects to local variables. This fixes the test failures with r267815 in the VS2015 X64 build. llvm-svn: 267908	2016-04-28 17:49:56 +00:00
Rong Xu	08afb05491	Minor format change and fixing typos in the comments. NFC. llvm-svn: 267905	2016-04-28 17:31:22 +00:00
Arch D. Robison	0e61034018	[SLPVectorizer] Extend SLP Vectorizer to deal with aggregates. The refactoring portion part was done as r267748. http://reviews.llvm.org/D14185 llvm-svn: 267899	2016-04-28 16:11:45 +00:00
Chad Rosier	712b7d7630	[GVN] Minor code cleanup. NFC. Differential Revision: http://reviews.llvm.org/D18828 Patch by Aditya Kumar! llvm-svn: 267898	2016-04-28 16:00:15 +00:00
Geoff Berry	5ae272c2c1	[EarlyCSE] Change LoadValue field Value Data to Instruction Inst. NFC. Made in preparation for adding MemorySSA support to EarlyCSE. llvm-svn: 267893	2016-04-28 15:22:37 +00:00
Geoff Berry	354fac2a69	[EarlyCSE] Sort includes. NFC. Reviewers: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19617 llvm-svn: 267890	2016-04-28 14:59:27 +00:00
Ahmed Bougacha	17482a5696	[InstCombine] Remove trailing whitespace. NFC. r267873. llvm-svn: 267887	2016-04-28 14:36:07 +00:00
Simon Pilgrim	bd4a3be7d2	[InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits The MOVMSK instructions copy the sign bits of a vector's elements to the low bits of a scalar register and zero the high bits. This patch adds MOVMSK support to SimplifyDemandedUseBits so that it's aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero. Differential Revision: http://reviews.llvm.org/D19614 llvm-svn: 267873	2016-04-28 12:22:53 +00:00
Rong Xu	6e34c490ff	[PGO] Promote indirect calls to conditional direct calls with value-profile This patch implements the transformation that promotes indirect calls to conditional direct calls when the indirect-call value profile meta-data is available. Differential Revision: http://reviews.llvm.org/D17864 llvm-svn: 267815	2016-04-27 23:20:27 +00:00
Sanjay Patel	facf45a82f	[SimplifyCFG] propagate branch metadata when creating select There's no existing test for this path, and I don't know how to expose it in a regression test, but I'm assuming there's some reason this path exists. llvm-svn: 267813	2016-04-27 23:14:12 +00:00
Rong Xu	af5aebaa32	[PGO] Prohibit address recording if the function is both internal and COMDAT Differential Revision: http://reviews.llvm.org/D19515 llvm-svn: 267792	2016-04-27 21:17:30 +00:00
Ahmed Bougacha	ace97c1f7d	[LIR] Set attributes on memset_pattern16. "inferattrs" will deduce the attribute, but it will be too late for many optimizations. Set it ourselves when creating the call. Differential Revision: http://reviews.llvm.org/D17598 llvm-svn: 267762	2016-04-27 19:04:50 +00:00
Ahmed Bougacha	7f97193dd7	[LIR] Reuse variable. NFCI. llvm-svn: 267761	2016-04-27 19:04:46 +00:00
Ahmed Bougacha	44c19876c7	[InferAttrs] Mark memset_pattern16 params nocapture. Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760	2016-04-27 19:04:43 +00:00
Ahmed Bougacha	b0624a2cb4	[TLI] Unify LibFunc attribute inference. NFCI. Now the pass is just a tiny wrapper around the util. This lets us reuse the logic elsewhere (done here for BuildLibCalls) instead of duplicating it. The next step is to have something like getOrInsertLibFunc that also sets the attributes. Differential Revision: http://reviews.llvm.org/D19470 llvm-svn: 267759	2016-04-27 19:04:40 +00:00
Ahmed Bougacha	d765a82b54	[TLI] Unify LibFunc signature checking. NFCI. I tried to be as close as possible to the strongest check that existed before; cleaning these up properly is left for future work. Differential Revision: http://reviews.llvm.org/D19469 llvm-svn: 267758	2016-04-27 19:04:35 +00:00
Matthew Simpson	622b95be7b	[LV] Reallow positive-stride interleaved load groups with gaps We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751	2016-04-27 18:21:36 +00:00
Arch D. Robison	aca7c412b4	[SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live. This is the first of two commits for extending SLP Vectorizer to deal with aggregates. This commit merely refactors existing logic. http://reviews.llvm.org/D14185 llvm-svn: 267748	2016-04-27 17:46:25 +00:00
Matthew Simpson	e5dfb08fcb	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Teresa Johnson	df5ef8711f	[ThinLTO] Refine fix to avoid renaming of uses in inline assembly. Summary: Refine the workaround from r266877 that attempts to prevent renaming of locals in inline assembly, so that, in addition to looking for a llvm.used local value, it also checks that there is at least one inline assembly call in the module. Otherwise, debug functions added to the llvm.used can block importing/exporting unnecessarily. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19573 llvm-svn: 267717	2016-04-27 14:19:38 +00:00
Artur Pilipenko	9bb6beabf4	isSafeToLoadUnconditionally support queries without a context This is required to use this function from isSafeToSpeculativelyExecute Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16231 llvm-svn: 267692	2016-04-27 11:00:48 +00:00
Adam Nemet	d2fa414718	[LoopDist] Add llvm.loop.distribute.enable loop metadata Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modify this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672	2016-04-27 05:28:18 +00:00
Vaivaswatha Nagaraj	08efb0efcd	[Cloning] cloneLoopWithPreheader(): add assert to ensure no sub-loops Summary: cloneLoopWithPreheader() does not update LoopInfo for sub-loop of the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned. Reviewers: anemet, ashutosh.nema, hfinkel Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D15922 llvm-svn: 267671	2016-04-27 05:25:09 +00:00
Evgeny Stupachenko	23ce61b663	The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662	2016-04-27 03:04:54 +00:00
Sanjoy Das	5253a089ba	Fix typo in comment; NFC llvm-svn: 267653	2016-04-27 01:44:31 +00:00
Mehdi Amini	b4e1e8297b	ThinLTO: do not promote GlobalVariable that have a specific section. Differential Revision: http://reviews.llvm.org/D18298 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267646	2016-04-27 00:32:13 +00:00
Matt Arsenault	ba437c67d2	SLSR: Use UnknownAddressSpace instead of 0 for pure arithmetic. In the case where isLegalAddressingMode is used for cases not related to addressing modes, such as pure adds and muls, it should not be using address space 0. LSR already passes -1 as the address space in these cases. llvm-svn: 267645	2016-04-27 00:32:09 +00:00
Adam Nemet	61399ac424	[LoopDist] Split main class. NFC This splits out the per-loop functionality from the Pass class. With this the fact whether the loop is forced-distribute with the new metadata/pragma can be cached in the per-loop class rather than passed around. llvm-svn: 267643	2016-04-27 00:31:03 +00:00
Justin Bogner	c2bf63d29d	PM: Port Reassociate to the new pass manager llvm-svn: 267631	2016-04-26 23:39:29 +00:00
Justin Bogner	cb8a21c88e	Reassociate: Convert another functor into a lambda. NFC Also move the explanatory comment with it. llvm-svn: 267628	2016-04-26 23:32:00 +00:00
Sanjay Patel	29dea0d230	[SimplifyCFG] propagate branch metadata when creating select llvm-svn: 267624	2016-04-26 23:15:48 +00:00
Sanjay Patel	d2d2aa52cd	[LowerExpectIntrinsic] make default likely/unlikely ratio bigger We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615	2016-04-26 22:23:38 +00:00
Justin Bogner	90744d215b	Reassociate: Simplify using lambdas. NFC llvm-svn: 267614	2016-04-26 22:22:18 +00:00
David Majnemer	abb9f55c80	Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes" The destination buffer that sprintf uses is restrict qualified, we do not need to worry about derived pointers referenced via format specifiers. This reverts commit r267580. llvm-svn: 267605	2016-04-26 21:04:47 +00:00
Elena Demikhovsky	308a7eb0d2	Masked Store in Loop Vectorizer - bugfix Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597	2016-04-26 20:18:04 +00:00
Justin Bogner	4563a06cee	PM: Port Internalize to the new pass manager llvm-svn: 267596	2016-04-26 20:15:52 +00:00
David Majnemer	8cd77baebc	[SimplifyLibCalls] sprintf doesn't copy null bytes sprintf doesn't read or copy the terminating null byte from its string operands. sprintf will append its own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580	2016-04-26 18:16:49 +00:00
Dehao Chen	5d6d4841ed	Tune basic block annotation algorithm. Summary: Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary. This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19301 llvm-svn: 267519	2016-04-26 04:59:11 +00:00
Hal Finkel	e4c0c1679b	[SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging When SimplifyCFG merges identical instructions from both sides of a diamond, it can preserve !llvm.mem.parallel_loop_access (as it does with most of the other metadata). There's no real data or control dependency change in this case. llvm-svn: 267515	2016-04-26 02:06:06 +00:00
Hal Finkel	411d31ad72	[LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int *res, int *c, int *d, int *p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent dereferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load intrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514	2016-04-26 02:00:36 +00:00
David Majnemer	30ffc4ce45	[SROA] Don't falsely report that changes have occurred We would report that the function changed despite creating no new allocas or performing any promotion. This fixes PR27316. llvm-svn: 267507	2016-04-26 01:05:00 +00:00
Justin Bogner	1a07501379	PM: Port GlobalOpt to the new pass manager llvm-svn: 267499	2016-04-26 00:28:01 +00:00
Justin Bogner	d2f3d0a79d	PM: Convert the logic for GlobalOpt into static functions. NFC Pass all of the state we need around as arguments, so that these functions are easier to reuse. There is one part of this that is unusual: we pass around a functor to look up a DomTree for a function. This will be a necessary abstraction when we try to use this code in both the legacy and the new pass manager. llvm-svn: 267498	2016-04-26 00:27:56 +00:00
Arch D. Robison	be0490a6e8	Optimize store of "bitcast" from vector to aggregate. This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store. Differential review: http://reviews.llvm.org/D14260 llvm-svn: 267482	2016-04-25 22:22:39 +00:00
Teresa Johnson	c851d216e2	[ThinLTO] Introduce typedef for commonly-used map type (NFC) Add a typedef for the std::map<GlobalValue::GUID, GlobalValueSummary *> map that is passed around to identify summaries for values defined in a particular module. This shortens up declarations in a variety of places. llvm-svn: 267471	2016-04-25 21:09:51 +00:00
Etienne Bergeron	50f02aa3fa	Cleanup redundant expression in InstCombineAndOrXor. Summary: The expression is redundant on both sides of operator |. Detected by: http://reviews.llvm.org/D19451 Reviewers: rnk, majnemer Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D19459 llvm-svn: 267458	2016-04-25 20:15:33 +00:00
Chad Rosier	e2cbd13e56	[ValueTracking] Improve isImpliedCondition when the dominating cond is false. llvm-svn: 267430	2016-04-25 17:23:36 +00:00
Anna Thomas	95f68aa7eb	Test commit: modified comment. NFC llvm-svn: 267406	2016-04-25 13:58:05 +00:00
James Molloy	eb040cc55f	[GlobalOpt] Allow constant globals to be SRA'd The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!). There seems to be no reason why we can't SRA constants too, so let's do it. llvm-svn: 267393	2016-04-25 10:48:29 +00:00
Mehdi Amini	bf4513b9aa	Run GlobalOpt before emitting the bitcode for ThinLTO This is motivated by reducing the size of the IR and thus reduce compile time. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267385	2016-04-25 08:47:49 +00:00
Mehdi Amini	f72ca86b71	ThinLTO: Move createNameAnonFunctionPass insertion in PassManagerBuilder (NFC) It is just code motion, but makes more sense this way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267384	2016-04-25 08:47:37 +00:00
Simon Pilgrim	4c564ad4dd	Tweak comments to make it clear that these combines are for SSE scalar instructions. llvm-svn: 267360	2016-04-24 19:31:56 +00:00
Simon Pilgrim	4b5462f119	[InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns. llvm-svn: 267359	2016-04-24 18:35:59 +00:00
Simon Pilgrim	83020942d3	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2) Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND). 2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements 3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns). Differential Revision: http://reviews.llvm.org/D19318 llvm-svn: 267357	2016-04-24 18:23:14 +00:00
Simon Pilgrim	424da1637a	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2) This patch improves support for determining the demanded vector elements through SSE scalar intrinsics: 1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input) 2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input Differential Revision: http://reviews.llvm.org/D17490 llvm-svn: 267356	2016-04-24 18:12:42 +00:00
Simon Pilgrim	1c9a9f255c	[InstCombine] Avoid updating argument demanded elements in separate passes. As discussed on D17490, we should attempt to update an intrinsic's arguments demanded elements in one pass if we can. llvm-svn: 267355	2016-04-24 17:57:27 +00:00
Simon Pilgrim	2f6097d113	[X86][InstCombine] Tidyup VPERMILVAR -> shufflevector conversion to helper function. NFCI. llvm-svn: 267352	2016-04-24 17:23:46 +00:00
Simon Pilgrim	c0c56e747a	[X86][InstCombine] Tidyup PSHUFB -> shufflevector conversion to helper function. NFCI. llvm-svn: 267351	2016-04-24 17:00:34 +00:00
Teresa Johnson	28e457bccd	[ThinLTO] Remove GlobalValueInfo class from index Summary: Remove the GlobalValueInfo and change the ModuleSummaryIndex to directly reference summary objects. The info structure was there to support lazy parsing of the combined index summary objects, which is no longer needed and not supported. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19462 llvm-svn: 267344	2016-04-24 14:57:11 +00:00
Mehdi Amini	cb87494f4c	Always traverse GlobalVariable initializer when computing the export list Summary: We are always importing the initializer for a GlobalVariable. So if a GlobalVariable is in the export-list, we pull in any refs as well. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19102 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267303	2016-04-23 23:29:24 +00:00
Sanjay Patel	dc88bd6e1f	replace duplicated static functions for profile metadata access with BranchInst member function; NFCI llvm-svn: 267295	2016-04-23 20:01:22 +00:00
Sanjay Patel	85ce0f1f1f	improve documentation comments; NFC llvm-svn: 267292	2016-04-23 16:31:48 +00:00
Nico Weber	0aa9845d15	Revert r267210, it makes clang assert (PR27490). llvm-svn: 267232	2016-04-22 22:08:42 +00:00
Andrew Kaylor	aa641a5171	Re-commit optimization bisect support (r267022) without new pass manager support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231	2016-04-22 22:06:11 +00:00
Rong Xu	f8f051cbf5	[PGO] change the interface for createPGOFuncNameMetadata() This patch changes the interface for createPGOFuncNameMetadata() where we add another PGOFuncName argument. Differential Revision: http://reviews.llvm.org/D19433 llvm-svn: 267216	2016-04-22 21:00:17 +00:00
Philip Reames	5f0e36947b	[unordered] sink unordered stores at end of blocks The existing code turned out to be completely correct when audited. Thus, only minor code changes and adding a couple of tests. llvm-svn: 267215	2016-04-22 20:53:32 +00:00
Sanjoy Das	f97229d6ba	Fold compares for distinct allocations Summary: We can fold compares to false when two distinct allocations within a function are compared for equality. Patch by Anna Thomas! Reviewers: majnemer, reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19390 llvm-svn: 267214	2016-04-22 20:52:25 +00:00
Philip Reames	eedef73b63	[unordered] Extend load/store type canonicalization to handle unordered operations Extend the type canonicalization logic to work for unordered atomic loads and stores. Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case. llvm-svn: 267210	2016-04-22 20:33:48 +00:00
Justin Bogner	b93949089e	PM: Port SinkingPass to the new pass manager llvm-svn: 267199	2016-04-22 19:54:10 +00:00
Justin Bogner	82077c4ab0	PM: Reorder the functions used for SinkingPass. NFC This will make the port to the new PM easier to follow. llvm-svn: 267198	2016-04-22 19:54:04 +00:00
Jun Bum Lim	d29a24e4fd	[DeadStoreElimination] Shorten beginning of memset overwritten by later stores Summary: This change will shorten memset if the beginning of memset is overwritten by later stores. Reviewers: hfinkel, eeckstein, dberlin, mcrosier Subscribers: mgrang, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18906 llvm-svn: 267197	2016-04-22 19:51:29 +00:00
Justin Bogner	395c2127ed	PM: Port DCE to the new pass manager Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to. llvm-svn: 267196	2016-04-22 19:40:41 +00:00
Adam Nemet	fe3def7c2a	[LoopUtils] Extend findStringMetadataForLoop to return the value for metadata E.g. for: !1 = {"llvm.distribute", i32 1} it now returns the MDOperand for 1. I will use this in LoopDistribution to check the value of the metadata. Note that the change is backward-compatible with its current use in LoopVersioningLICM. An Optional implicitly converts to a bool depending whether it contains a value or not. llvm-svn: 267190	2016-04-22 19:10:05 +00:00
Chad Rosier	1a4bc110f5	[EarlyCSE/CVP] Add stats for CVPs and make sure to account for any Changes. llvm-svn: 267187	2016-04-22 18:47:21 +00:00
Geoff Berry	9fe26e6dc9	[MemorySSA] Fix bug in CachingMemorySSAWalker::invalidateInfo Summary: CachingMemorySSAWalker::invalidateInfo was using IsCall to determine which cache map needed to be cleared of entries referring to the invalidated MemoryAccess, but there could also be entries referring to it in the other cache map (value entries, not key entries). This change just clears both tables to be conservatively correct. Also add a verifyRemoved() function, called when expensive checks (i.e. XDEBUG) are enabled to verify that the invalidated MemoryAccess object is not referenced in any of the caches. Reviewers: dberlin, george.burgess.iv Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19388 llvm-svn: 267157	2016-04-22 14:44:10 +00:00
David Majnemer	bfd695d591	[EarlyCSE] Don't add the overflow flags to the hash We take the intersection of overflow flags while CSE'ing. This permits us to consider two instructions with different overflow behavior to be replaceable. llvm-svn: 267153	2016-04-22 14:12:50 +00:00
Silviu Baranga	e985c76b90	[InstCombine] Preserve fast math flags when combining PHIs Summary: When optimizing PHIs which have inputs floating point binary operators, we preserve all IR flags except the fast math flags. This change removes the logic which tracked some of the IR flags (no wrap, exact) and replaces it by doing an and on the IR flags of all inputs to the PHI - which will also handle the fast math flags. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19370 llvm-svn: 267139	2016-04-22 11:21:36 +00:00
Vedant Kumar	6013f45f92	Revert "Initial implementation of optimization bisect support." This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115	2016-04-22 06:51:37 +00:00
David Majnemer	d0ce8f1485	[GVN] Respect fast-math-flags on fcmps We assumed that flags were only present on binary operators. This is not true, they may also be present on calls and fcmps. llvm-svn: 267113	2016-04-22 06:37:51 +00:00
David Majnemer	9554c1339c	[EarlyCSE] Take the intersection of flags on instructions EarlyCSE had inconsistent behavior with regards to flag'd instructions: - In some cases, it would pessimize if the available instruction had different flags by not performing CSE. - In other cases, it would miscompile if it replaced an instruction which had no flags with an instruction which has flags. Fix this by being more consistent with our flag handling by utilizing andIRFlags. llvm-svn: 267111	2016-04-22 06:37:45 +00:00
Duncan P. N. Exon Smith	71480bd0c7	ValueMapper/Enumerator: Clean up code in post-order traversals, NFC Re-layer the functions in the new (i.e., newly correct) post-order traversals in ValueEnumerator (r266947) and ValueMapper (r266949). Instead of adding a node to the worklist in a helper function and returning a flag to say what happened, return the node itself. This makes the code way cleaner: the worklist is local to the main function, there is no flag for an early loop exit (since we can cleanly bury the loop), and it's perfectly clear when pointers into the worklist might be invalidated. I'm fixing both algorithms in the same commit to avoid repeating the commit message; if you take the time to understand one the other should be easy. The diff itself isn't entirely obvious since the traversals have some noise (i.e., things to do), but here's the high-level change: auto helper = [&WL](T Op) { auto helper = [](T &I, T E) { => while (I != E) { if (shouldVisit(Op)) { T Op = I++; WL.push(Op, Op->begin()); if (shouldVisit(Op)) { return true; return Op; } } return false; return nullptr; }; }; => WL.push(S, S->begin()); WL.push(S, S->begin()); while (!empty()) { while (!empty()) { auto N = WL.top().N; auto N = WL.top().N; auto &I = WL.top().I; auto &I = WL.top().I; bool DidChange = false; while (I != N->end()) if (helper(I++)) { => if (T *Op = helper(I, N->end()) { DidChange = true; WL.push(Op, Op->begin()); break; continue; } } if (DidChange) continue; POT.push(WL.pop()); => POT.push(WL.pop()); } } Thanks to Mehdi for helping me find a better way to layer this. llvm-svn: 267099	2016-04-22 02:33:06 +00:00
Mike Aizatsky	243b71fd8b	Fixed flag description Summary: asan-use-after-return control feature we call use-after-return or stack-use-after-return. Reviewers: kcc, aizatsky, eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19284 llvm-svn: 267064	2016-04-21 22:00:13 +00:00
Derek Bruening	d862c178b0	[esan] EfficiencySanitizer instrumentation pass Summary: Adds an instrumentation pass for the new EfficiencySanitizer ("esan") performance tuning family of tools. Multiple tools will be supported within the same framework. Preliminary support for a cache fragmentation tool is included here. The shared instrumentation includes: + Turn mem{set,cpy,move} intrinsics into library calls. + Slowpath instrumentation of loads and stores via callouts to the runtime library. + Fastpath instrumentation will be per-tool. + Which memory accesses to ignore will be per-tool. Reviewers: eugenis, vitalybuka, aizatsky, filcab Subscribers: filcab, vkalintiris, pcc, silvas, llvm-commits, zhaoqin, kcc Differential Revision: http://reviews.llvm.org/D19167 llvm-svn: 267058	2016-04-21 21:30:22 +00:00
JF Bastien	c22d29982b	NFC: fix copy / paste comment llvm-svn: 267039	2016-04-21 19:53:39 +00:00
JF Bastien	3e2e69f607	NFC: fix nonsensical comment llvm-svn: 267036	2016-04-21 19:41:48 +00:00
Sanjoy Das	a085cfc150	Folding compares with unescaped allocations Summary: If we know that the pointer allocated within a function does not escape, we can fold away comparisons that are done with global pointers Patch by Anna Thomas! Reviewers: reames, majnemer, sanjoy Subscribers: mgrang, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D19276 llvm-svn: 267035	2016-04-21 19:26:45 +00:00
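
Note on commit 23ce61b663 (PR27392): the message argues that comparing TripCount (BECount + 1) with Count can overflow, while comparing BECount with (Count - 1) cannot. The following is a minimal sketch of that arithmetic only, not the LLVM implementation. The function names, the 32-bit width, and the direction of the comparison are assumptions chosen for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration; these are not LLVM APIs.

// Unsafe form: materializes TripCount = BECount + 1 first. When BECount is
// the maximum unsigned value, the addition wraps around to 0.
bool enterUnrolledLoopUnsafe(std::uint32_t BECount, std::uint32_t Count) {
  std::uint32_t TripCount = BECount + 1; // wraps if BECount == UINT32_MAX
  return TripCount >= Count;
}

// Safe form: compares BECount with Count - 1, so BECount is never
// incremented. Assumes Count >= 1, otherwise Count - 1 would wrap instead.
bool enterUnrolledLoopSafe(std::uint32_t BECount, std::uint32_t Count) {
  return BECount >= Count - 1;
}
```

For ordinary values the two forms agree. They differ only when BECount is the maximum value, where the unsafe form sees a trip count of 0 and would skip the unrolled loop.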

15688 Commits
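
Note on commit 71480bd0c7: the message describes a worklist-based post-order traversal in which the helper returns the next node to visit (or nullptr) instead of pushing it and returning a flag. The sketch below illustrates that shape on a toy graph. The `Node` type, the `postOrder` function, and the use of a visited set as the "should visit" test are assumptions made for the example and are not taken from ValueMapper or ValueEnumerator.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical node type: an id plus operand pointers.
struct Node {
  int Id;
  std::vector<Node *> Ops;
};

// Returns node ids in post-order, using an explicit worklist of
// (node, index of next operand to examine) pairs.
std::vector<int> postOrder(Node *Root) {
  std::set<Node *> Visited;
  std::vector<std::pair<Node *, std::size_t>> WL;
  std::vector<int> POT;

  // Advances I past operands that need no visit. Returns the first operand
  // that does need one, or nullptr once all operands are exhausted.
  auto Helper = [&Visited](Node *N, std::size_t &I) -> Node * {
    while (I != N->Ops.size()) {
      Node *Op = N->Ops[I++];
      if (Visited.insert(Op).second)
        return Op;
    }
    return nullptr;
  };

  Visited.insert(Root);
  WL.push_back({Root, 0});
  while (!WL.empty()) {
    Node *N = WL.back().first;
    std::size_t &I = WL.back().second;
    if (Node *Op = Helper(N, I)) {
      // The push may invalidate I, so nothing touches I after this point.
      WL.push_back({Op, 0});
      continue;
    }
    // All operands are done: emit the node and drop it from the worklist.
    POT.push_back(N->Id);
    WL.pop_back();
  }
  return POT;
}
```

Because the helper hands back the node, the worklist stays local to the main loop and no "did change" flag is needed, which is the simplification the commit message points out.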