Commit Graph

15688 Commits

Author SHA1 Message Date
Michael Kuperstein c5edcdeb0e [LV] Use vector phis for some secondary induction variables
Previously, we materialized secondary vector IVs from the primary scalar IV,
by offseting the primary to match the correct start value, and then broadcasting
it - inside the loop body. Instead, we can use a real vector IV, like we do for
the primary.

This enables using vector IVs for secondary integer IVs whose type matches the
type of the primary.

Differential Revision: http://reviews.llvm.org/D20932

llvm-svn: 272283
2016-06-09 18:03:15 +00:00
Xinliang David Li ecde1c7f3d Revert r272194 No need for it if loop Analysis Manager is used
llvm-svn: 272243
2016-06-09 03:22:39 +00:00
Teresa Johnson 7ab1f69272 [ThinLTO/gold] Enable summary-based internalization
Summary: Enable existing summary-based importing support in the gold-plugin.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21080

llvm-svn: 272239
2016-06-09 01:14:13 +00:00
Michael Zolotukhin 8e7e76729d [LoopSimplify] Preserve LCSSA when merging exit blocks.
Summary:
This fixes PR26682. Also add LCSSA as a preserved pass to LoopSimplify,
that looks correct to me and allows to write a test for the issue.

Reviewers: chandlerc, bogner, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21112

llvm-svn: 272224
2016-06-08 23:13:21 +00:00
Michael Zolotukhin aa547616d2 [LoopUnroll] Check that DT is available before trying to verify it.
llvm-svn: 272221
2016-06-08 22:49:59 +00:00
Michael Zolotukhin 987ab631fa [SLPVectorizer] Handle GEP with differing constant index types
Summary:
This fixes PR27617.

Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64.

The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert.

Reviewers: nadav, mzolotukhin

Subscribers: JesperAntonsson, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20685

Patch by Jesper Antonsson!

llvm-svn: 272206
2016-06-08 21:55:16 +00:00
Davide Italiano 02861d8695 [PM] Add missing caching of GlobalsAA to EarlyCSE.
llvm-svn: 272204
2016-06-08 21:31:55 +00:00
Sanjay Patel 3929313811 [InstCombine] move fold of select of add/sub to helper function; NFCI
llvm-svn: 272199
2016-06-08 21:10:01 +00:00
Sanjay Patel 384d0f219d [InstCombine] fix outdated comment, simplify logic; NFCI
llvm-svn: 272196
2016-06-08 20:31:52 +00:00
Evgeny Stupachenko 3e2f389a7e The patch set unroll disable pragma when unroll
with user specified count has been applied.

Summary:
Previously SetLoopAlreadyUnrolled() set the disable pragma only if
there was some loop metadata.
Now it set the pragma in all cases. This helps to prevent multiple
unroll when -unroll-count=N is given.

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D20765

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 272195
2016-06-08 20:21:24 +00:00
Xinliang David Li 572135f717 [PM] Refector LoopAccessInfo analysis code
This is the preparation patch to port the analysis to new PM

Differential Revision: http://reviews.llvm.org/D20560

llvm-svn: 272194
2016-06-08 20:15:37 +00:00
Sanjay Patel 10a2c38d83 [InstCombine] reduce indent; NFC
llvm-svn: 272193
2016-06-08 20:09:04 +00:00
Tim Shen 7aa0ad65ce [MemCpyOpt] Do not exchange llvm.lifetime.start and llvm.memcpy
Reviewers: iteratee

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21087

llvm-svn: 272192
2016-06-08 19:42:32 +00:00
Sanjay Patel 916f8a0cdb [InstCombine] use copyIRFlags() ; NFCI
llvm-svn: 272191
2016-06-08 19:33:52 +00:00
Benjamin Kramer c321e53402 Apply most suggestions of clang-tidy's performance-unnecessary-value-param
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

llvm-svn: 272190
2016-06-08 19:09:22 +00:00
Davide Italiano 2d5ab0a56a [PM] LoopSimplify. Remove unneeded pass dependencies. NFCI.
llvm-svn: 272140
2016-06-08 13:56:59 +00:00
Davide Italiano d8d83f4773 [PM/SimplifyCFG] Preserve GlobalsAA even if the IR is mutated.
llvm-svn: 272139
2016-06-08 13:32:23 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Davide Italiano 16e96d4b16 [PM] Preserve GlobalsAA for SROA.
Differential Revision:  http://reviews.llvm.org/D21040

llvm-svn: 272009
2016-06-07 13:21:17 +00:00
Simon Pilgrim db9893fb90 [InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).

If the shift amount is constant we can sometimes convert these instructions to native shifts:

1 - if all shift amounts are in range then the conversion is trivial.
2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion.
3 - logical shifts just return zero if all elements have out of range shift amounts.

In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.

Differential Revision: http://reviews.llvm.org/D19675

llvm-svn: 271996
2016-06-07 10:27:15 +00:00
Simon Pilgrim 91e3ac8293 [InstCombine][SSE] Add MOVMSK constant folding (PR27982)
This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions.

The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs.

Differential Revision: http://reviews.llvm.org/D20998

llvm-svn: 271990
2016-06-07 08:18:35 +00:00
Michael Kuperstein a0c6ae02a5 [InstCombine] scalarizePHI should not assume the code it sees has been CSE'd
scalarizePHI only looked for phis that have exactly two uses - the "latch"
use, and an extract. Unfortunately, we can not assume all equivalent extracts
are CSE'd, since InstCombine itself may create an extract which is a duplicate
of an existing one. This extends it to handle several distinct extracts from
the same index.

This should fix at least some of the  performance regressions from PR27988.

Differential Revision: http://reviews.llvm.org/D20983

llvm-svn: 271961
2016-06-06 23:38:33 +00:00
Davide Italiano fea0a4c5b2 [PM] Preserve the correct set of analyses for GVN.
llvm-svn: 271934
2016-06-06 20:01:50 +00:00
Davide Italiano 82c447823b [GVN] Switch dump() definition over to LLVM_DUMP_METHOD.
llvm-svn: 271932
2016-06-06 19:24:27 +00:00
Geoff Berry 43e5160d0e Reapply [LSR] Create fewer redundant instructions.
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs.  This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.

Originally reviewed in http://reviews.llvm.org/D18001

Reviewers: atrick

Subscribers: llvm-commits, mzolotukhin, mcrosier

Differential Revision: http://reviews.llvm.org/D18480

llvm-svn: 271929
2016-06-06 19:10:46 +00:00
Sanjay Patel 6a333c3ed9 [InstCombine] limit icmp transform to ConstantInt (PR28011)
In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check
above this to work for any Constant rather than ConstantInt. AFAICT, 
that part makes sense if we can determine that the shrunken/extended 
constant remained equal. But it doesn't make sense for this later 
transform where we assume that the constant DID change. 

This could assert for a ConstantExpr:
https://llvm.org/bugs/show_bug.cgi?id=28011

And it could be wrong for a vector as shown in the added regression test.

llvm-svn: 271908
2016-06-06 16:56:57 +00:00
Eli Friedman ee89505799 LICM: Don't sink stores out of loops that may throw.
Summary:
This hasn't been caught before because it requires noalias or similarly
strong alias analysis to actually reproduce.

Fixes http://llvm.org/PR27952 .

Reviewers: hfinkel, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20944

llvm-svn: 271858
2016-06-05 22:13:52 +00:00
Sanjoy Das b7e861a488 Add safety check to InstCombiner::commonIRemTransforms
Since FoldOpIntoPhi speculates the binary operation to potentially each
of the predecessors of the PHI node (pulling it out of arbitrary control
dependence in the process), we can FoldOpIntoPhi only if we know the
operation doesn't have UB.

This also brings up an interesting profitability question -- the way it
is written today, commonIRemTransforms will hoist out work from
dynamically dead code into code that will execute at runtime.  Perhaps
that isn't the best canonicalization?

Fixes PR27968.

llvm-svn: 271857
2016-06-05 21:17:04 +00:00
Sanjoy Das 4d4339d1e8 [PM] Port IndVarSimplify to the new pass manager
Summary:
There are some rough corners, since the new pass manager doesn't have
(as far as I can tell) LoopSimplify and LCSSA, so I've updated the
tests to run them separately in the old pass manager in the lit tests.
We also don't have an equivalent for AU.setPreservesCFG() in the new
pass manager, so I've left a FIXME.

Reviewers: bogner, chandlerc, davide

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20783

llvm-svn: 271846
2016-06-05 18:01:19 +00:00
Sanjoy Das f90e28d6fd [IndVars] Remove -liv-reduce
It is an off-by-default option that no one seems to use[0], and given
that SCEV directly understands the overflow instrinsics there is no real
need for it anymore.

[0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html

llvm-svn: 271845
2016-06-05 18:01:12 +00:00
Sanjay Patel a6fbc82392 [InstCombine] allow vector icmp bool transforms
llvm-svn: 271843
2016-06-05 17:49:45 +00:00
Sanjay Patel 5f0217f42e fix documentation comments and other clean-ups; NFC
llvm-svn: 271839
2016-06-05 16:46:18 +00:00
Xinliang David Li 64dbb295b6 [PM] Port GCOVProfiler pass to the new pass manager
llvm-svn: 271823
2016-06-05 05:12:23 +00:00
Xinliang David Li fb3137c3b3 [PM] code refactoring /NFC
llvm-svn: 271822
2016-06-05 03:40:03 +00:00
Sanjay Patel 6f8f47b358 [InstCombine] less 'CI' confusion; NFC
Change the name of the ICmpInst to 'ICmp' and the Constant (was a ConstantInt) to 'C',
so that it's hopefully clearer that 'CI' refers to CastInst in this context.

While we're scrubbing, fix the documentation comment and use 'auto' with 'dyn_cast'.

llvm-svn: 271817
2016-06-05 00:12:32 +00:00
David Majnemer 2482e1c017 [SimplifyCFG] Don't kill empty cleanuppads with multiple uses
A basic block could contain:
  %cp = cleanuppad []
  cleanupret from %cp unwind to caller

This basic block is empty and is thus a candidate for removal.  However,
there can be other uses of %cp outside of this basic block.  This is
only possible in unreachable blocks.

Make our transform more correct by checking that the pad has a single
user before removing the BB.

This fixes PR28005.

llvm-svn: 271816
2016-06-04 23:50:03 +00:00
Sanjay Patel ea8a211169 [InstCombine] allow vector constants for cast+icmp fold
This is step 1 of unknown towards fixing PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001

llvm-svn: 271810
2016-06-04 22:04:05 +00:00
Sanjay Patel c774f8c265 clean-up; NFC
llvm-svn: 271807
2016-06-04 21:20:44 +00:00
Sanjay Patel 4c204230fc fix formatting, punctuation; NFC
llvm-svn: 271804
2016-06-04 20:39:22 +00:00
Simon Pilgrim fda22d66fc [InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX
Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614

Requires a minor tweak as llvm.x86.mmx.pmovmskb takes a x86_mmx argument - so we have to be explicit about the implied v8i8 vector type.

llvm-svn: 271789
2016-06-04 13:42:46 +00:00
Xinliang David Li 6c44e9e33d [pgo] extend r271532 to darwin platform
llvm-svn: 271746
2016-06-03 23:02:28 +00:00
Derek Bruening 9ef5772154 [esan|wset] Optionally assume intra-cache-line accesses
Summary:
Adds an option -esan-assume-intra-cache-line which causes esan to assume
that a single memory access touches just one cache line, even if it is not
aligned, for better performance at a potential accuracy cost.  Experiments
show that the performance difference can be 2x or more, and accuracy loss
is typically negligible, so we turn this on by default.  This currently
applies just to the working set tool.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D20978

llvm-svn: 271743
2016-06-03 22:29:52 +00:00
Derek Bruening 4252a16c35 [esan] Specify which tool via a global variable
Summary:
Adds a global variable to specify the tool, to support handling early
interceptors that invoke instrumented code and require shadow memory to be
initialized prior to __esan_init() being invoked.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D20973

llvm-svn: 271715
2016-06-03 19:40:37 +00:00
Sanjay Patel 6cf18af1c5 [InstCombine] look through bitcasts to find selects
There was concern that creating bitcasts for the simpler potential select pattern:

define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) {
  %a2 = add <2 x i64> %a, %a
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %bc = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %a2, %bc
  ret <2 x i64> %and
}

might lead to worse code for some targets, so this patch is matching the larger
patterns seen in the test cases.

The motivating example for this patch is this IR produced via SSE intrinsics in C:

define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) {
  %t0 = bitcast <2 x i64> %a to <4 x i32>
  %t1 = bitcast <2 x i64> %b to <4 x i32>
  %cmp = icmp sgt <4 x i32> %t0, %t1
  %sext = sext <4 x i1> %cmp to <4 x i32>
  %t2 = bitcast <4 x i32> %sext to <2 x i64>
  %and = and <2 x i64> %t2, %a
  %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1>
  %neg2 = bitcast <4 x i32> %neg to <2 x i64>
  %and2 = and <2 x i64> %neg2, %b
  %or = or <2 x i64> %and, %and2
  ret <2 x i64> %or
}

For an AVX target, this is currently:

vpcmpgtd  %xmm1, %xmm0, %xmm2
vpand     %xmm0, %xmm2, %xmm0
vpandn    %xmm1, %xmm2, %xmm1
vpor      %xmm1, %xmm0, %xmm0
retq

With this patch, it becomes:

vpmaxsd   %xmm1, %xmm0, %xmm0

Differential Revision: http://reviews.llvm.org/D20774

llvm-svn: 271676
2016-06-03 14:42:07 +00:00
Qin Zhao c14c249343 [esan|cfrag] Instrument GEP instr for struct field access.
Summary:
Instrument GEP instruction for counting the number of struct field
address calculation to approximate the number of struct field accesses.

Adds test struct_field_count_basic.ll to test the struct field
instrumentation.

Reviewers: bruening, aizatsky

Subscribers: junbuml, zhaoqin, llvm-commits, eugenis, vitalybuka, kcc, bruening

Differential Revision: http://reviews.llvm.org/D20892

llvm-svn: 271619
2016-06-03 02:33:04 +00:00
Michael Zolotukhin 585649895f [LoopUnroll] Set correct thresholds for new recently enabled unrolling heuristic.
In r270478, where I enabled the new heuristic I posted testing results,
which I got when explicitly passed the thresholds values via CL options.
However, setting the CL options init-values is not enough to change the
default values of thresholds, so I'm changing them in another place now.

llvm-svn: 271615
2016-06-03 00:16:46 +00:00
Davide Italiano 8738363339 [TailRecursionElimination] Refactor/cleanup.
In preparation for porting to the new PM.
Patch by Jake VanAdrighem! (review mainly by me/Justin)

Differential Revision:  http://reviews.llvm.org/D20610

llvm-svn: 271607
2016-06-02 23:02:44 +00:00
Manuel Jacob a485984c0c [PM] Schedule InstSimplify after late LICM run, to clean up LCSSA nodes.
Summary:
The module pass pipeline includes a late LICM run after loop
unrolling.  LCSSA is implicitly run as a pass dependency of LICM.  However no
cleanup pass was run after this, so the LCSSA nodes ended in the optimized output.

Reviewers: hfinkel, mehdi_amini

Subscribers: majnemer, bruno, mzolotukhin, mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D20606

llvm-svn: 271602
2016-06-02 22:14:26 +00:00
Davide Italiano 6dfdbf1f46 [PM] LoadCombine preserves GlobalsAA, doesn't depend on it.
llvm-svn: 271601
2016-06-02 22:05:59 +00:00
Davide Italiano 84e1414522 [PM/LoadCombine] Inline getAnalysisUsage(). NFCI.
llvm-svn: 271600
2016-06-02 22:04:43 +00:00
Sanjay Patel dba8b4c04d transform obscured FP sign bit ops into a fabs/fneg using TLI hook
This is effectively a revert of:
http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886)
and:
http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits
and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions.

This is intended to resolve the objections raised on the dev list:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html
and:
https://llvm.org/bugs/show_bug.cgi?id=24886#c4

In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later.

Differential Revision: http://reviews.llvm.org/D19391

llvm-svn: 271573
2016-06-02 20:01:37 +00:00
Sanjay Patel 5c0bc02878 [InstCombine] remove guard for generating a vector select
This is effectively NFC because we already do this transform after r175380:
http://reviews.llvm.org/rL175380

and also via foldBoolSextMaskToSelect().

This change should just make it a bit more efficient to match the pattern. 
The original guard was added in r95058:
http://reviews.llvm.org/rL95058

A sampling of codegen for current in-tree targets shows no problems. This
makes sense given that we're already producing the vector selects via the
other transforms.

llvm-svn: 271554
2016-06-02 18:03:05 +00:00
Qin Zhao 6d3bd6866b [esan|cfrag] Create the cfrag struct array for the runtime
Summary:
Fills the cfrag struct variable with an array of struct information
variables.

Reviewers: aizatsky, bruening

Subscribers: bruening, kcc, vitalybuka, eugenis, llvm-commits, zhaoqin

Differential Revision: http://reviews.llvm.org/D20661

llvm-svn: 271547
2016-06-02 17:30:47 +00:00
Xinliang David Li 7008ce3f98 [profile] value profiling bug fix -- missing icall targets in profile-use
Inline virtual functions has linkeonceodr linkage (emitted in comdat on 
supporting targets). If the vtable for the class is not emitted in the
defining module, function won't be address taken thus its address is not
recorded. At the mercy of the linker, if the per-func prf_data from this
module (in comdat) is picked at link time, we will lose mapping from
function address to its hash val. This leads to missing icall promotion.
The second test case (currently disabled) in compiler_rt (r271528): 
instrprof-icall-prom.test demostrates the bug. The first profile-use
subtest is fine due to linker order difference.

With this change, no missing icall targets is found in instrumented clang's
raw profile.

llvm-svn: 271532
2016-06-02 16:33:41 +00:00
Xinliang David Li 0b29330612 make icall pass name consistent /NFC
llvm-svn: 271467
2016-06-02 01:52:05 +00:00
Vitaly Buka 7b8ed4f223 [asan] Rename *UAR* into *UseAfterReturn*
Summary:
To improve readability.

PR27453

Reviewers: kcc, eugenis, aizatsky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20761

llvm-svn: 271447
2016-06-02 00:06:42 +00:00
Geoff Berry b96d3b2dd8 [MemorySSA] Port to new pass manager
Add support for the new pass manager to MemorySSA pass.

Change MemorySSA to be computed eagerly upon construction.

Change MemorySSAWalker to be owned by the MemorySSA object that creates
it.

Reviewers: dberlin, george.burgess.iv

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19664

llvm-svn: 271432
2016-06-01 21:30:40 +00:00
Michael Kuperstein 3a3c64d23e [LV] For some IVs, use vector phis instead of widening in the loop body
Previously, whenever we needed a vector IV, we would create it on the fly,
by splatting the scalar IV and adding a step vector. Instead, we can create a
real vector IV. This tends to save a couple of instructions per iteration.

This only changes the behavior for the most basic case - integer primary
IVs with a constant step.

Differential Revision: http://reviews.llvm.org/D20315

llvm-svn: 271410
2016-06-01 17:16:46 +00:00
Peter Collingbourne 382d81cacf IR: Allow multiple global metadata attachments with the same type.
This will be necessary to allow the global merge pass to attach
multiple debug info metadata nodes to global variables once we reverse
the edge from DIGlobalVariable to GlobalVariable.

Differential Revision: http://reviews.llvm.org/D20414

llvm-svn: 271358
2016-06-01 01:17:57 +00:00
Guozhi Wei b994f4cdbc [SLP] Pass in correct alignment when query memory access cost
This patch fixes bug https://llvm.org/bugs/show_bug.cgi?id=27897.

When query memory access cost, current SLP always passes in alignment value of 1 (unaligned), so it gets a very high cost of scalar memory access, and wrongly vectorize memory loads in the test case.

It can be fixed by simply giving correct alignment.

llvm-svn: 271333
2016-05-31 20:41:19 +00:00
Davide Italiano bdc2971434 [PM] BDCE: Fix caching of analyses.
Another chapter in the story. GlobalsAA should be preserved, as
 well as the CFG.

llvm-svn: 271307
2016-05-31 17:53:22 +00:00
Davide Italiano 688616ff74 [PM] ADCE: Fix caching of analyses.
When this pass was originally ported, AA wasn't available for the
new PM. Now it is, so we can cache properly.

llvm-svn: 271303
2016-05-31 17:39:39 +00:00
Erik Eckstein 0c48dd8ca5 Fix a crash in MergeFunctions related to ordering of weak/strong functions
The assumption, made in insert() that weak functions are always inserted after strong functions,
is only true in the first round of adding functions.
In subsequent rounds this is no longer guaranteed , because we might remove a strong function from the tree (because it's modified) and add it later,
where an equivalent weak function already exists in the tree.
This change removes the assert in insert() and explicitly enforces a weak->strong order.
This also removes the need of two separate loops in runOnModule().

llvm-svn: 271299
2016-05-31 17:20:23 +00:00
Qin Zhao 1762eef572 [esan|cfrag] Create the skeleton of cfrag variable for the runtime
Summary:
Creates a global variable containing preliminary information
for the cache-fragmentation tool runtime.

Passes a pointer to the variable (null if no variable is created) to the
compilation unit init and exit routines in the runtime.

Reviewers: aizatsky, bruening

Subscribers: filcab, kubabrecka, bruening, kcc, vitalybuka, eugenis, llvm-commits, zhaoqin

Differential Revision: http://reviews.llvm.org/D20541

llvm-svn: 271298
2016-05-31 17:14:02 +00:00
Saleem Abdulrasool d2f705ddf9 X86: permit using SjLj EH on x86 targets as an option
This adds support to the backed to actually support SjLj EH as an exception
model.  This is *NOT* the default model, and requires explicitly opting into it
from the frontend.  GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.

Addresses PR27749!

llvm-svn: 271244
2016-05-31 01:48:07 +00:00
Craig Topper 8287fd8abd [X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions.
llvm-svn: 271236
2016-05-30 23:15:56 +00:00
Sanjoy Das 3e5ce2b737 [IndVars] Assert that the incoming IR is in LCSSA
Since we already assert that the outgoing IR is in LCSSA, it is easy to
get misled into thinking that -indvars broke LCSSA if the incoming IR is
non-LCSSA.  Checking this pre-condition will make such cases break in
more obvious ways.

Inspired by (but does _not_ fix) PR26682.

llvm-svn: 271196
2016-05-30 01:37:39 +00:00
Sanjoy Das 496f274257 [IndVarSimplify] Extract the logic of `-indvars` out into a class; NFC
This will be used later to port IndVarSimplify to the new pass manager.

llvm-svn: 271190
2016-05-29 21:42:00 +00:00
Benjamin Kramer 728f4448a9 Remove some 'const' specifiers that do nothing but prevent moving the argument.
Found by clang-tidy's misc-move-const-arg. While there drop some
obsolete c_str() calls.

llvm-svn: 271181
2016-05-29 10:46:35 +00:00
Davide Italiano 39893bd41c [PM] Reassociate: cache analyses more aggressively.
While here, add a FIXME for setPreserveCFG().

llvm-svn: 271159
2016-05-29 00:41:17 +00:00
Sanjoy Das ae09b3cd4c [IndVars] Eliminate op.with.overflow when possible (re-apply)
Summary:
If we can prove that an op.with.overflow intrinsic does not overflow, we
can get rid of the intrinsic, and replace it with non-wrapping
arithmetic.

This was first checked in at r265913 but reverted in r265950 because it
exposed some issues around how SCEV handled post-inc add recurrences.
Those issues have now been fixed.

Reviewers: atrick, regehr

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18685

llvm-svn: 271153
2016-05-29 00:36:25 +00:00
Davide Italiano 484b5ab39d [PM] SCCP should preserve GlobalsAA even if the IR is mutated.
llvm-svn: 271149
2016-05-29 00:31:15 +00:00
Simon Pilgrim 9602d678cb [X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

Reapplied now that the the companion patch (D20684) removes/auto-upgrade the clang intrinsics has been committed.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 271131
2016-05-28 18:03:41 +00:00
Mehdi Amini bcc47419d9 ValueMapper: fix assertion when null-mapping a constant for linking metadata
Summary:
When RF_NullMapMissingGlobalValues is set, mapValue can return null
for GlobalValue. When mapping the operands of a constant that is
referenced from metadata, we need to handle this case and actually
return null instead of mapping this constant.

Reviewers: dexonsmith, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20713

llvm-svn: 271129
2016-05-28 17:26:03 +00:00
Sean Silva 42cc3422eb Add a comment about why we need to buffer the attribute changes.
llvm-svn: 271097
2016-05-28 04:24:39 +00:00
Sean Silva 8c7e12136c Small cleanup.
Centralize assertion.
Clean up max loop.

llvm-svn: 271094
2016-05-28 04:19:45 +00:00
Sean Silva 2e8f095b2a Inline this into its only use. NFC.
The name was out of date at this point and it seems simple enough to
have in-line.

llvm-svn: 271093
2016-05-28 04:19:40 +00:00
Sean Silva 02b9d892c5 Bring back r271090 in a way that doesn't depend on r271089.
llvm-svn: 271092
2016-05-28 04:05:36 +00:00
Sean Silva 9dd4b5c51d Revert r271089 and r271090.
It was triggering an msan bot.

Revert "[IRPGO] Set the function entry count metadata."

This reverts commit r271090.

Revert "[IRPGO] Centralize the function attribute inliner hint logic. NFC."

This reverts commit r271089.

llvm-svn: 271091
2016-05-28 03:56:25 +00:00
Sean Silva 7884633c5b [IRPGO] Set the function entry count metadata.
llvm-svn: 271090
2016-05-28 03:02:54 +00:00
Sean Silva 2a73019f3e [IRPGO] Centralize the function attribute inliner hint logic. NFC.
This keeps the logic in the same function.

llvm-svn: 271089
2016-05-28 03:02:50 +00:00
Evgeny Stupachenko b787522d28 The patch fixes r271071
Summary:
unused variables in Release mode:
  BasicBlock *Header
  unsigned OrigCount
put under DEBUG

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 271076
2016-05-28 00:14:58 +00:00
Xinliang David Li d38392ecd6 [PM] Port the Sample FDO to new PM (part-2)
llvm-svn: 271072
2016-05-27 23:20:16 +00:00
Evgeny Stupachenko ea2aef4a1d The patch refactors unroll pass.
Summary:
Unroll factor (Count) calculations moved to a new function.
Early exits on pragma and "-unroll-count" defined factor added.
New type of unrolling "Force" introduced (previously used implicitly).
New unroll preference "AllowRemainder" introduced and set "true" by default.
(should be set to false for architectures that suffers from it).

Reviewers: hfinkel, mzolotukhin, zzheng

Differential Revision: http://reviews.llvm.org/D19553

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 271071
2016-05-27 23:15:06 +00:00
Vitaly Buka 1e75fa4ad8 [asan] Add option to enable asan-use-after-scope from clang.
Clang will have -fsanitize-address-use-after-scope flag.

PR27453

Reviewers: kcc, eugenis, aizatsky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20750

llvm-svn: 271067
2016-05-27 22:55:10 +00:00
Xinliang David Li e897edbd36 [PM] Port the Sample FDO to new PM (part-1)
llvm-svn: 271062
2016-05-27 22:30:44 +00:00
Sanjay Patel 74d23ad498 [InstCombine] move and/sext fold to helper function; NFCI
We need to enhance the pattern matching on these to look through bitcasts.

llvm-svn: 271051
2016-05-27 21:41:29 +00:00
Davide Italiano 88a7892a07 [LCSSA] Simplify. Suggested by Sanjoy.
llvm-svn: 271041
2016-05-27 20:25:31 +00:00
Sanjoy Das 6fff9dc932 [GVN] Preserve !range metadata when PRE'ing loads
Reviewers: dberlin, reames, george.burgess.iv

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20743

llvm-svn: 271034
2016-05-27 19:03:10 +00:00
Benjamin Kramer f6f815bf39 Use StringRef::startswith instead of find(...) == 0.
It's faster and easier to read.

llvm-svn: 271018
2016-05-27 16:54:57 +00:00
Tim Northover 10a1e8b1fe Vectorizer: track non-fast FP instructions through phis when finding reductions.
When we traced through a phi node looking for floating-point reductions, we
forgot whether we'd ever seen an instruction without fast-math flags (that
would block vectorization). This propagates it through to the end.

llvm-svn: 271015
2016-05-27 16:40:27 +00:00
Xinliang David Li 11c849c10b Reapply r270865 -- previous bot failure is unrelated
llvm-svn: 271014
2016-05-27 16:22:03 +00:00
Dehao Chen 80b16d4135 Remove sample profile dependency to instcombine, which is not a analysis pass.
Summary: This patch removes dependency from sample profile pass to instcombine pass.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20501

llvm-svn: 271009
2016-05-27 16:14:15 +00:00
Benjamin Kramer 82de7d323d Apply clang-tidy's misc-move-constructor-init throughout LLVM.
No functionality change intended, maybe a tiny performance improvement.

llvm-svn: 270997
2016-05-27 14:27:24 +00:00
Igor Laevsky df9db45c94 [RewriteStatepointsForGC] All constant should have null base pointer
Currently we consider that each constant has itself as a base value. I.e "base(const) = const". 
This introduces couple of problems when we are trying to avoid reporting constants in statepoint live sets:

1. When querying "base( phi(const1, const2) )" we will get "phi(const1, const2)" as a base pointer. Since 
   it's not a constant we will record it in a stack map. However on practice we don't want this to happen
   (constant are never relocated).
2. base( phi(const, gc ptr) ) = phi( const, base(gc ptr) ). This particular case imposes challenge on our 
   runtime - we don't expect to see constant base pointers other than null. This problems can be avoided 
   by treating all constant as if they were derived from null pointer base. I.e in a first case we will 
   not include constant pointer in a stack map at all. In a second case we will get "phi(null, base(gc ptr))" 
   as a base pointer which is a lot more convenient.

Differential Revision: http://reviews.llvm.org/D20584

llvm-svn: 270993
2016-05-27 13:13:59 +00:00
Benjamin Kramer 4fed928f53 Avoid some copies by using const references.
clang-tidy's performance-unnecessary-copy-initialization with some manual
fixes. No functional changes intended.

llvm-svn: 270988
2016-05-27 12:30:51 +00:00
Simon Pilgrim 4642a57fbf Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
llvm-svn: 270976
2016-05-27 09:02:25 +00:00
Simon Pilgrim c013e5737b [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

A companion patch (D20684) removes/auto-upgrade the clang intrinsics.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 270973
2016-05-27 08:49:15 +00:00
Pete Cooper 1929b5539a Form objc_storeStrong in the presence of bitcasts.
objc_storeStrong can be formed from a sequence such as

  %0 = tail call i8* @objc_retain(i8* %p) nounwind
  %tmp = load i8*, i8** @x, align 8
  store i8* %0, i8** @x, align 8
  tail call void @objc_release(i8* %tmp) nounwind

The code was already looking through bitcasts for most of the values
involved, but had missed one case where the pointer operand for the
store was a bitcast.  Ultimately the pointer for the load and store
have to be the same value, after stripping casts.

llvm-svn: 270955
2016-05-27 02:13:53 +00:00
Mehdi Amini 9ee054aea8 ValueMapper: fix typo in minor optimization on constant mapping (NFC)
If every operands of a constant are mapping to themselves, and the
type does not change, we have an early exit as acknowledged in the
comment:

  // Otherwise, we have some other constant to remap.  Start by checking to see
  // if all operands have an identity remapping.

However instead of checking for identity the code was checking if the
operands were mapped to the constant itself, which is rarely true.

As a consequence, the coverage report showed that the early exit was
never taken.

llvm-svn: 270944
2016-05-27 00:32:12 +00:00
Easwaran Raman 5fe04a1d8e Attach profile summary in IR based instrumentation pass.
Differential revision: http://reviews.llvm.org/D20655

llvm-svn: 270933
2016-05-26 22:57:11 +00:00
Michael Zolotukhin 1ecdedad8d [LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.
Condition might be simplified to a Constant, but it doesn't have to be
ConstantInt, so we should dyn_cast, instead of cast.

This fixes PR27886.

llvm-svn: 270924
2016-05-26 21:42:51 +00:00
David Majnemer d99068d26d [MemCpyOpt] Don't perform callslot optimization across may-throw calls
An exception could prevent a store from occurring but MemCpyOpt's
callslot optimization would fire anyway, causing the store to occur.

This fixes PR27849.

llvm-svn: 270892
2016-05-26 19:24:24 +00:00
Michael Kuperstein 9a81b62a01 [BBVectorize] Don't vectorize selects with a scalar condition and vector operands.
This fixes PR27879.

Differential Revision: http://reviews.llvm.org/D20659

llvm-svn: 270888
2016-05-26 18:43:57 +00:00
Xinliang David Li b02f3b141c Revert 270865 -- unexplained bot failure on linux/ppcle
llvm-svn: 270876
2016-05-26 17:27:22 +00:00
Xinliang David Li 0777a93bee Use new interface in Triple /NFC
llvm-svn: 270865
2016-05-26 16:28:01 +00:00
Chad Rosier e5819e2732 [InstCombine] Catch more bswap cases missed due to zext and truncs.
Fixes PR27824.
Differential Revision: http://reviews.llvm.org/D20591.

llvm-svn: 270853
2016-05-26 14:58:51 +00:00
John Brawn 3546c2f158 Add auto-exporting of symbols from tools so that plugins work on Windows
The problem with plugins on Windows is that when building a plugin DLL it needs
to explicitly link against something (an exe or DLL) if it uses symbols from
that thing, and that thing must explicitly export those symbols. Also there's a
limit of 65535 symbols that can be exported. This means that currently plugins
only work on Windows when using BUILD_SHARED_LIBS, and that doesn't work with
MSVC.

This patch adds an LLVM_EXPORT_SYMBOLS_FOR_PLUGINS option, which when enabled
automatically exports from all LLVM tools the symbols that a plugin could want
to use so that a plugin can link against a tool directly. Plugins can specify
what tool they link against by using PLUGIN_TOOL argument to llvm_add_library.
The option can also be enabled on Linux, though there all it should do is
restrict the set of symbols that are exported as by default all symbols are
exported.

This option is currently OFF by default, as while I've verified that it works
with MSVC, linux gcc, and cygwin gcc, I haven't tried mingw gcc and I have no
idea what will happen on OSX. Also unfortunately we can't turn on
LLVM_ENABLE_PLUGINS when the option is ON as bugpoint-passes needs to be
loaded by both bugpoint.exe and opt.exe which is incompatible with this
approach. Also currently clang plugins don't work with this approach, which
will be fixed in future patches.

Differential Revision: http://reviews.llvm.org/D18826

llvm-svn: 270839
2016-05-26 11:16:43 +00:00
David Majnemer 474512576e [MergedLoadStoreMotion] Don't transform across may-throw calls
It is unsafe to hoist a load before a function call which may throw, the
throw might prevent a pointer dereference.

Likewise, it is unsafe to sink a store after a call which may throw.
The caller might be able to observe the difference.

This fixes PR27858.

llvm-svn: 270828
2016-05-26 07:11:09 +00:00
David Majnemer 8cce333abd [MergedLoadStoreMotion] Small cleanup
No functional change is intended.

llvm-svn: 270824
2016-05-26 05:43:12 +00:00
Peter Collingbourne b9aa1f4a03 MemorySSA: Revert r269678 and r268068; replace with special casing in MemorySSA.
It turns out that too many passes are relying on alias analysis results
for control dependencies. Until we fix that by introducing a more accurate
modelling of control dependencies, special case assume in MemorySSA instead.

Also introduce tests to ensure we don't regress the FunctionAttrs or LICM
passes.

Differential Revision: http://reviews.llvm.org/D20658

llvm-svn: 270823
2016-05-26 04:58:46 +00:00
Craig Topper a423aa4642 [X86] Add the AVX storeu intrinsics to InstCombine and LoopStrengthReduce in the same places that the SSE/SSE2 storeu intrinsics appear.
I don't really know how to test this. Just seemed like we should be consistent.

llvm-svn: 270819
2016-05-26 04:28:45 +00:00
Sanjoy Das ee77a4828e [IRCE] Use C++11 style initializers; NFC
llvm-svn: 270815
2016-05-26 01:50:18 +00:00
Peter Collingbourne ffecb1441b MemorySSA: Remove argument to createNewAccess function.
There is only one caller of MemorySSA::createNewAccess, and it passes true
as the IgnoreNonMemory argument. Remove that argument and fold its behavior
into createNewAccess.

llvm-svn: 270812
2016-05-26 01:19:17 +00:00
Sanjoy Das a099268e85 [IRCE] Optimize conjunctions of range checks
After this change, we do the expected thing for cases like

```
Check0Passed = /* range check IRCE can optimize */
Check1Passed = /* range check IRCE can optimize */
if (!(Check0Passed && Check1Passed))
  throw_Exception();
```

llvm-svn: 270804
2016-05-26 00:09:02 +00:00
Sanjoy Das 8fe8892c2d [IRCE] Refactor out a parseRangeCheckFromCond; NFC
This will later hold more general logic to parse conjunctions of range
checks.

llvm-svn: 270802
2016-05-26 00:08:24 +00:00
Davide Italiano 1021c68e92 [PM] Port PartiallyInlineLibCalls to the new pass manager.
llvm-svn: 270798
2016-05-25 23:38:53 +00:00
Peter Collingbourne fad596aa81 Move whole-program virtual call optimization pass after function attribute inference in LTO pipeline.
As a result of D18634 we no longer infer certain attributes on linkonce_odr
functions at compile time, and may only infer them at LTO time. The readnone
attribute in particular is required for virtual constant propagation (part
of whole-program virtual call optimization) to work correctly.

This change moves the whole-program virtual call optimization pass after
the function attribute inference passes, and enables the attribute inference
passes at opt level 1, so that virtual constant propagation has a chance to
work correctly for linkonce_odr functions.

Differential Revision: http://reviews.llvm.org/D20643

llvm-svn: 270765
2016-05-25 21:26:14 +00:00
Sanjay Patel 6be09ee827 fix typo; NFC
llvm-svn: 270760
2016-05-25 21:03:31 +00:00
Mehdi Amini cc8c107e6a ValueMaterializer: rename materializeDeclFor() to materialize()
It may materialize a declaration, or a definition. The name could
be misleading. This is following a merge of materializeInitFor()
into materializeDeclFor().

Differential Revision: http://reviews.llvm.org/D20593

llvm-svn: 270759
2016-05-25 21:03:21 +00:00
Mehdi Amini 53a6672e21 ValueMaterializer: fuse materializeDeclFor and materializeInitFor (NFC)
They were originally separated to handle the co-recursion between
the ValueMapper and the ValueMaterializer. This recursion does not
exist anymore: the ValueMapper now uses a Worklist and the
ValueMaterializer is scheduling job on the Worklist.

Differential Revision: http://reviews.llvm.org/D20593

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 270758
2016-05-25 21:01:51 +00:00
Davide Italiano d85ac997b8 [PM] CorrelatedValuePropagation: pass state to function. NFCI.
While here, convert the logic of the pass to use static function(s).
This is in preparation for porting this pass to the new PM.

llvm-svn: 270734
2016-05-25 17:39:54 +00:00
Xinliang David Li a228608b26 Use new triple API to check if comdat is supported
llvm-svn: 270727
2016-05-25 17:17:51 +00:00
Chad Rosier a00df49dc5 Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC.
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments are
consistent with the function name.

llvm-svn: 270715
2016-05-25 16:22:14 +00:00
Teresa Johnson 04c9a2d63d [ThinLTO] Refactor ODR resolution and internalization (NFC)
Move the now index-based ODR resolution and internalization routines out
of ThinLTOCodeGenerator.cpp and into either LTO.cpp (index-based
analysis) or FunctionImport.cpp (index-driven optimizations).
This is to enable usage by other linkers.

llvm-svn: 270698
2016-05-25 14:03:11 +00:00
Simon Pilgrim 4298d06d0f [X86][SSE] Replace (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) lossless conversion intrinsics with generic IR
Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead.

Differential Revision: http://reviews.llvm.org/D20568

llvm-svn: 270678
2016-05-25 08:59:18 +00:00
Craig Topper 12e322a8cf [X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time.
llvm-svn: 270677
2016-05-25 06:56:32 +00:00
David Majnemer 124bdb7497 [FunctionAttrs] Volatile loads should disable readonly
A volatile load has side effects beyond what callers expect readonly to
signify.  For example, it is not safe to reorder two function calls
which each perform a volatile load to the same memory location.

llvm-svn: 270671
2016-05-25 05:53:04 +00:00
Davide Italiano 655a145e83 [PM] Port BDCE to the new pass manager.
llvm-svn: 270647
2016-05-25 01:57:04 +00:00
Derek Bruening 5662b93985 [esan|wset] EfficiencySanitizer working set tool fastpath
Summary:
Adds fastpath instrumentation for esan's working set tool.  The
instrumentation for an intra-cache-line load or store consists of an
inlined write to shadow memory bits for the corresponding cache line.

Adds a basic test for this instrumentation.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D20483

llvm-svn: 270640
2016-05-25 00:17:24 +00:00
Michael Zolotukhin 8f7a242c7b Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time.
This reverts commit r270577.

llvm-svn: 270630
2016-05-24 23:00:05 +00:00
Derek Bruening 0b872d9399 [esan] Add calls from the ctor/dtor to the runtime library
Summary:
Adds createEsanInitToolGV for creating a tool-specific variable passed
to the runtime library.

Adds dtor "esan.module_dtor" and inserts calls from the dtor to
"__esan_exit" in the runtime library.

Updates the EfficiencySanitizer test.

Patch by Qin Zhao.

Reviewers: aizatsky

Subscribers: bruening, kcc, vitalybuka, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D20488

llvm-svn: 270627
2016-05-24 22:48:24 +00:00
Sanjoy Das be99153aca [GuardWidening] Tighten the interface of the RangeCheck struct; NFC
Make `GuardWideningImpl::RangeCheck` into a class and add accessors.

llvm-svn: 270611
2016-05-24 20:54:45 +00:00
Xinliang David Li f4edae6076 [profile] Fix runtime hook linkage bug for COFF
Patch by: Johan Engelen

the user hook has linkonceODR linkage and it needs to be
in comdatAny group.

llvm-svn: 270596
2016-05-24 18:47:38 +00:00
Sanjoy Das 5fd7ac452e [IRCE] Return a Value*, not SCEV* from parseRangeCheck; NFC
This is better layering, since the caller needs to check if the index
was an add-rec anyway.

llvm-svn: 270582
2016-05-24 17:19:56 +00:00
Sanjay Patel 929ebf5a54 fix typos; NFC
llvm-svn: 270579
2016-05-24 16:51:26 +00:00
Hans Wennborg b64e4390a3 Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling analysis by default.
Chromium builds are still hitting the assert in PR27874.

llvm-svn: 270577
2016-05-24 16:10:12 +00:00
Michael Zolotukhin 96c150d154 Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default.""
This reverts commit r270512 and reapplies r270478. Originally it caused
PR27847, but it was fixed in r270517.

llvm-svn: 270518
2016-05-24 01:22:20 +00:00
Hans Wennborg 6951028b61 Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."
This caused PR27847.

llvm-svn: 270512
2016-05-23 23:42:35 +00:00
Sanjoy Das aa83c47bab [IRCE] Optimize "uses" not branches; NFCI
This changes IRCE to optimize uses, and not branches.  This change is
NFCI since the uses we do inspect are in practice only ever going to be
the condition use in conditional branches; but this flexibility will
later allow us to analyze more complex expressions than just a direct
branch on a range check.

llvm-svn: 270500
2016-05-23 22:16:45 +00:00
Andrew Kaylor 9c81d0fdeb Avoid including AlwaysInliner pass in opt-bisect search.
Differential Revision: http://reviews.llvm.org/D19640

llvm-svn: 270495
2016-05-23 21:57:54 +00:00
Xinliang David Li e45207608c tune lowering parameter for small apps (sjeng)
llvm-svn: 270480
2016-05-23 19:29:26 +00:00
Gerolf Hoflehner 00e7092f68 [InstCombine] Fix assertion when bitcast is converted to gep
When an aggregate contains an opaque type its size cannot be
determined. This triggers an "Invalid GetElementPtrInst indices for type" assert
in function checkGEPType. The fix suppresses the conversion in this case.

http://reviews.llvm.org/D20319

llvm-svn: 270479
2016-05-23 19:23:17 +00:00
Michael Zolotukhin be080fc51d [LoopUnroll] Enable advanced unrolling analysis by default.
Summary:
This patch turns on LoopUnrollAnalyzer by default. To mitigate compile
time regressions, I chose very conservative thresholds for now. Later we
can make them more aggressive, but it might require being smarter in
which loops we're optimizing. E.g. currently the biggest issue is that
with more agressive thresholds we unroll many cold loops, which
increases compile time for no performance benefit (performance of those
loops is improved, but it doesn't matter since they are cold).

Test results for compile time(using 4 samples to reduce noise):
```
MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19%
SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect  4.19%
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow  3.39%
MultiSource/Applications/JM/lencod/lencod 1.47%
MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06%
```

I didn't see any performance changes in the testsuite, but it improves
some internal tests.

Reviewers: hfinkel, chandlerc

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20482

llvm-svn: 270478
2016-05-23 19:10:19 +00:00
Sanjay Patel a8ef4a5737 reduce indent; NFC
llvm-svn: 270372
2016-05-22 17:08:52 +00:00
Xinliang David Li b628dd3568 [profile] Static counter allocation for value profiling (part-1)
Differential Revision: http://reviews.llvm.org/D20459

llvm-svn: 270336
2016-05-21 22:55:34 +00:00
Chad Rosier 56def258e3 Fix 80-column violation.
llvm-svn: 270329
2016-05-21 21:12:06 +00:00
David Majnemer 9f92f4c497 [SimplifyCFG] Remove cleanuppads which are empty except for calls to lifetime.end
A cleanuppad is not cheap, they turn into many instructions and result
in additional spills and fills.  It is not worth keeping a cleanuppad
around if all it does is hold a lifetime.end instruction.

N.B.  We first try to merge the cleanuppad with another cleanuppad to
avoid dropping the lifetime and debug info markers.

llvm-svn: 270314
2016-05-21 05:12:32 +00:00
Sanjoy Das c5b1169de2 [IRCE] Don't use an allocator for range checks; NFC
The InductiveRangeCheck struct is only five words long; so passing these
around value is fine.  The allocator makes the code look more complex
than it is.

llvm-svn: 270309
2016-05-21 02:52:13 +00:00
Sanjoy Das 59776734a3 [IRCE] Don't pass IRBuilder<> where unnecessary; NFC
llvm-svn: 270308
2016-05-21 02:31:51 +00:00
Sanjoy Das be6c7a12cb [GuardWidening] Fix incorrect use of remove_if
I had used `std::remove_if` under the assumption that it moves the
predicate matching elements to the end, but actaully the elements
remaining towards the end (after the iterator returned by
`std::remove_if`) are indeterminate.  Fix the bug (and make the code
more straightforward) by using a temporary SmallVector, and add a test
case demonstrating the issue.

llvm-svn: 270306
2016-05-21 02:24:44 +00:00
Derek Bruening bc0a68e688 [esan] Use ModulePass for EfficiencySanitizerPass.
Summary:
Uses ModulePass instead of FunctionPass for EfficiencySanitizerPass to
better support global variable creation for a forthcoming struct field
counter tool.

Patch by Qin Zhao.

Reviewers: aizatsky

Subscribers: llvm-commits, eugenis, vitalybuka, bruening, kcc

Differential Revision: http://reviews.llvm.org/D20458

llvm-svn: 270263
2016-05-20 20:00:05 +00:00
Mark Lacey 9b5fcf65ec Functions with differing phis should not be merged.
Check that the incoming blocks of phi nodes are identical, and block
function merging if they are not.

rdar://problem/26255167

Differential Revision: http://reviews.llvm.org/D20462

llvm-svn: 270250
2016-05-20 18:39:11 +00:00
Davide Italiano f7211fd44d [PM/PartiallyInlineLibCalls] Fix pass dependencies.
Inline getAnalysisUsage() while I'm here.

llvm-svn: 270231
2016-05-20 16:23:14 +00:00
Davide Italiano 8749dfd1bf [PartiallyInlineLibCalls] Remove dead includes. NFC.
llvm-svn: 270228
2016-05-20 15:52:23 +00:00
Davide Italiano 08713bd1ed [PM/PartiallyInlineLibCalls] Convert to static function in preparation for porting this pass to the new PM.
llvm-svn: 270225
2016-05-20 15:43:39 +00:00
Sanjay Patel 75892a1543 [SimplifyCFG] eliminate switch cases based on known range of switch condition
This was noted in PR24766:
https://llvm.org/bugs/show_bug.cgi?id=24766#c2

We may not know whether the sign bit(s) are zero or one, but we can still
optimize based on knowing that the sign bit is repeated.

Differential Revision: http://reviews.llvm.org/D20275

llvm-svn: 270222
2016-05-20 14:53:09 +00:00
Sanjoy Das 2351975860 Add const qualifiers to appease bots; NFC
llvm-svn: 270155
2016-05-19 23:15:59 +00:00
Sanjoy Das f5f0331a3b [GuardWidening] Introduce range check merging
Sequences of range checks expressed using guards, like

  guard((I - 2) u< L)
  guard((I - 1) u< L)
  guard((I + 0) u< L)
  guard((I + 1) u< L)
  guard((I + 2) u< L)

can sometimes be combined into a smaller sequence:

  guard((I - 2) u< L AND (I + 2) u< L)

if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of checks
expressed in the previous sequence.

This change teaches GuardWidening to do this kind of merging when
feasible.

llvm-svn: 270151
2016-05-19 22:55:46 +00:00
Guozhi Wei b1d37199cc [InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions
This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703.

If there is a sequence of one or more load instructions, each loaded value is used as address of later load instruction, bitcast is necessary to change the value type, don't optimize it.

llvm-svn: 270135
2016-05-19 21:07:01 +00:00
Wei Mi 0456d9dd18 Recommit r255691 since PR26509 has been fixed.
llvm-svn: 270113
2016-05-19 20:38:03 +00:00
Davide Italiano 46f249b4cd [SCCP] Prefer class to struct.
llvm-svn: 270074
2016-05-19 15:58:02 +00:00
Vedant Kumar 9152fd17e9 Retry^3 "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC"
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.

Changes since the initial commit:
- Fix error message printing in llvm-profdata.
- Check errors in loadTestingFormat() + annotateAllFunctions().
- Defer error handling in InstrProfIterator to InstrProfReader.
- Remove the base ProfError class to work around an MSVC ICE.

Differential Revision: http://reviews.llvm.org/D19901

llvm-svn: 270020
2016-05-19 03:54:45 +00:00
Sanjoy Das b784ed36c0 [GuardWidening] Use getEquivalentICmp to fold constant compares
`ConstantRange::getEquivalentICmp` is more general, and better
factored.

llvm-svn: 270019
2016-05-19 03:53:17 +00:00
Sanjoy Das 52bbde2bbc [LowerGuards] Rename variable; NFC
PredicatePassProbability is a better name for what LikelyBranchWeight
was trying to express.

llvm-svn: 269999
2016-05-18 23:16:27 +00:00
Sanjoy Das 083f38939b New pass: guard widening
Summary:
Implement guard widening in LLVM. Description from GuardWidening.cpp:

The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM
transform it so that it fails more often that it did before the
transform.  This optimization is called "widening" and can be used hoist
and common runtime checks in situations like these:

```
%cmp0 = 7 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
%cmp1 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ]
...
```

to

```
%cmp0 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
...
```

If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back
to a generic implementation of the same function, which will have the
correct semantics from that point onward.  It is always _legal_ to
deoptimize (so replacing `%cmp0` with false is "correct"), though it may
not always be profitable to do so.

NB! This pass is a work in progress.  It hasn't been tuned to be
"production ready" yet.  It is known to have quadriatic running time and
will not scale to large numbers of guards

Reviewers: reames, atrick, bogner, apilipenko, nlewycky

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20143

llvm-svn: 269997
2016-05-18 22:55:34 +00:00
Dehao Chen f16376b505 Follow-up patch of http://reviews.llvm.org/D19948 to handle missing profiles when simplifying CFG.
Summary: Set default branch weight to 1:1 if one of the branch has profile missing when simplifying CFG.

Reviewers: spatel, davidxl

Subscribers: danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D20307

llvm-svn: 269995
2016-05-18 22:41:03 +00:00
Michael Zolotukhin d2268a73bc [LoopUnrollAnalyzer] Take into account cost of instructions controlling branches, along with their operands.
Previously, we didn't add their and their operands cost, which could've
resulted in unrolling loops for no actual benefit.

llvm-svn: 269985
2016-05-18 21:20:12 +00:00
Dehao Chen f6c0083b55 clang-format SimplifyCFG.cpp.
llvm-svn: 269974
2016-05-18 19:44:21 +00:00
Davide Italiano 98f7e0e790 [PM] Port per-function SCCP to the new pass manager.
llvm-svn: 269937
2016-05-18 15:18:25 +00:00
James Molloy a854c0a0c3 [VectorUtils] Fix nasty use-after-free
In truncateToMinimalBitwidths() we were RAUW'ing an instruction then erasing it. However, that intruction could be cached in the map we're iterating over. The first check is "I->use_empty()" which in most cases would return true, as the (deleted) object was RAUW'd first so would have zero use count. However in some cases the object could have been polluted or written over and this wouldn't be the case. Also it makes valgrind, asan and traditionalists who don't like their compiler to crash sad.

No testcase as there are no externally visible symptoms apart from a crash if the stars align.

Fixes PR26509.

llvm-svn: 269908
2016-05-18 11:57:58 +00:00
Justin Bogner 594e07bd78 [PM] Port DSE to the new pass manager
Patch by JakeVanAdrighem. Thanks!

llvm-svn: 269847
2016-05-17 21:38:13 +00:00
Xinliang David Li 7d0fed74f0 minor cleanup /NFC
llvm-svn: 269839
2016-05-17 21:06:16 +00:00
Sanjay Patel 22b01febd4 [InstCombine] add another test for wrong icmp constant (PR27792)
It doesn't matter if the comparison is unsigned; the inc/dec is always signed.

llvm-svn: 269831
2016-05-17 20:20:40 +00:00
Xinliang David Li 8da773bf74 Simple refactoring /NFC
llvm-svn: 269829
2016-05-17 20:19:03 +00:00
Davide Italiano bfe3801d16 [LCSSA] Use llvm::any_of instead of std::size_of.
The API is simpler. Suggested by David Blaikie!

llvm-svn: 269800
2016-05-17 19:01:02 +00:00
Sanjay Patel 86564cad06 [InstCombine] fix constant to be signed for signed comparisons
This bug was introduced in r269728 and is the likely cause of many stage 2 ubsan bot failures.
I'll add a test in a follow-up commit assuming this fixes things properly.

llvm-svn: 269797
2016-05-17 18:38:55 +00:00
Sanjoy Das fd67038c8b [Guards] Add branch metadata when lowering
Guards are expected to basically never fail.  Reflect this in the branch
probabilities in their lowered form.

llvm-svn: 269791
2016-05-17 17:51:19 +00:00
Davide Italiano a0e0feea1d [PM/LCSSA] Fix dependency list. Some passes are preserved, not required.
llvm-svn: 269768
2016-05-17 14:32:12 +00:00
Davide Italiano b75b16e2ff [LCSSA] Use any_of() to simplify the code. NFCI.
llvm-svn: 269767
2016-05-17 14:24:41 +00:00
Igor Laevsky 953f2d2a54 [RewriteStatepointsForGC] Remove obsolete assertion
This is assertion is no longer necessary since we never record
constants in the live set anyway. (They are never recorded in 
the initial live set, and constant bases are removed near line 2119)

Differential Revision: http://reviews.llvm.org/D20293

llvm-svn: 269764
2016-05-17 13:54:10 +00:00
Benjamin Kramer ca9a0fe2b9 [InstCombine] Don't crash when trying to take an element of a ConstantExpr.
Fixes PR27786.

llvm-svn: 269757
2016-05-17 12:08:55 +00:00
Sanjay Patel 18254935c9 try to avoid unused variable warning in release build; NFCI
llvm-svn: 269729
2016-05-17 01:12:31 +00:00
Sanjay Patel e9b2c32e7f [InstCombine] check vector elements before trying to transform LE/GE vector icmp (PR27756)
Fix a bug introduced with rL269426 :
[InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819)

We were assuming that a ConstantDataVector / ConstantVector / ConstantAggregateZero operand of
an ICMP was composed of ConstantInt elements, but it might have ConstantExpr or UndefValue 
elements. Handle those appropriately.

Also, refactor this function to join the scalar and vector paths and eliminate the switches.

Differential Revision: http://reviews.llvm.org/D20289

llvm-svn: 269728
2016-05-17 00:57:57 +00:00
Vedant Kumar 85c973d3f0 Revert "Retry^2 "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC""
This reverts commit r269694. MSVC says:

error C2086: 'char llvm::ProfErrorInfoBase<enum llvm::instrprof_error>::ID' : redefinition

llvm-svn: 269700
2016-05-16 21:03:38 +00:00
Vedant Kumar 7cb2fd5904 Retry^2 "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC"
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.

Changes since the initial commit:
- Address undefined-var-template warning.
- Fix error message printing in llvm-profdata.
- Check errors in loadTestingFormat() + annotateAllFunctions().
- Defer error handling in InstrProfIterator to InstrProfReader.

Differential Revision: http://reviews.llvm.org/D19901

llvm-svn: 269694
2016-05-16 20:49:39 +00:00
Xinliang David Li f3c7a35238 [PM] Port indirect call promotion pass to new pass manager
llvm-svn: 269660
2016-05-16 16:31:07 +00:00
Matthew Simpson e43198dc4b [LV] Ensure safe VF for loops with interleaved accesses
The selection of the vectorization factor currently doesn't consider
interleaved accesses. The vectorization factor is based on the maximum safe
dependence distance computed by LAA. However, for loops with interleaved
groups, we should instead base the vectorization factor on the maximum safe
dependence distance divided by the maximum interleave factor of all the
interleaved groups. Interleaved accesses not in a group will be scalarized.

Differential Revision: http://reviews.llvm.org/D20241

llvm-svn: 269659
2016-05-16 15:08:20 +00:00
Davide Italiano 6f852eedbf [PM] RewriterStatepointForGC: add missing dependency.
llvm-svn: 269624
2016-05-16 02:29:53 +00:00
Benjamin Kramer a65b610bd2 Move helper classes into anonymous namespaces. NFC.
llvm-svn: 269591
2016-05-15 15:18:11 +00:00
Davide Italiano e62c54375d [PM/SCCP] Fix pass dependencies.
TargetLibraryInfoWrapperPass is a dependency of
SCCP but it's not listed as such. Chandler pointed
out this is an easy mistake to make which only
surfaces in weird crashes with some flag combinations.
This code will go away anyway at some point in the
future, but as long as it's (still) exercised, try
to make it correct.

llvm-svn: 269589
2016-05-15 08:04:28 +00:00
Xinliang David Li 72616180df Rename pass name to prepare to new PM porting /NFC
llvm-svn: 269586
2016-05-15 01:04:24 +00:00
Davide Italiano e7c56c5c4f [SCCP] Use range-based for loops. NFC.
llvm-svn: 269578
2016-05-14 20:59:09 +00:00
Chandler Carruth 5957375902 Revert "Retry "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC""
This reverts commit r269491. It triggers warnings with Clang, breaking
builds for -Werror users including several build bots.

llvm-svn: 269547
2016-05-14 05:26:26 +00:00
Marcin Koscielnicki a4fcd3681f [MSan] [PowerPC] Implement PowerPC64 vararg helper.
Differential Revision: http://reviews.llvm.org/D20000

llvm-svn: 269518
2016-05-13 23:55:33 +00:00
Davide Italiano 9922344178 [PM] Port LowerAtomic to the new pass manager.
llvm-svn: 269511
2016-05-13 22:52:35 +00:00
Sanjay Patel abbc2ac231 use 'match' for less indenting; NFCI
llvm-svn: 269494
2016-05-13 21:51:17 +00:00
Vedant Kumar df41bd89a5 Retry "[ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC"
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.

Changes since the initial commit:
- Fix error message printing in llvm-profdata.
- Check errors in loadTestingFormat() + annotateAllFunctions().
- Defer error handling in InstrProfIterator to InstrProfReader.

Differential Revision: http://reviews.llvm.org/D19901

llvm-svn: 269491
2016-05-13 21:50:56 +00:00
Michael Zolotukhin 963a6d9c69 Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...""
This reverts commit r269395.

Try to reapply with a fix from chapuni.

llvm-svn: 269486
2016-05-13 21:23:25 +00:00
Matthew Simpson c326d050ca Correct spelling in comment (NFC)
llvm-svn: 269482
2016-05-13 21:01:07 +00:00
Sanjay Patel 5d5134f676 use range-loops; NFCI
llvm-svn: 269471
2016-05-13 20:24:53 +00:00
Vedant Kumar 064535c1ea Revert "(HEAD -> master, origin/master, origin/HEAD) [ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC"
This reverts commit r269462. It fails two llvm-profdata tests.

llvm-svn: 269466
2016-05-13 20:09:39 +00:00
Vedant Kumar ac25219d20 [ProfileData] (llvm) Use Error in InstrProf and Coverage, NFC
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.

Differential Revision: http://reviews.llvm.org/D19901

llvm-svn: 269462
2016-05-13 20:01:27 +00:00
Jun Bum Lim be11bdc4b0 Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC.
Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes  fixed in r269433.

Reviewers: joker.eph, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20248

llvm-svn: 269456
2016-05-13 18:38:35 +00:00
Geoff Berry 2f64c20284 [EarlyCSE] Change key type of AvailableCalls to Instruction*. NFCI.
llvm-svn: 269445
2016-05-13 17:54:58 +00:00
Sanjay Patel 0c8f3f9332 [InstCombine] handle zero constant vectors for LE/GE comparisons too
Enhancement to: http://reviews.llvm.org/rL269426
With discussion in: http://reviews.llvm.org/D17859

This should complete the fixes for: PR26701, PR26819:
https://llvm.org/bugs/show_bug.cgi?id=26701
https://llvm.org/bugs/show_bug.cgi?id=26819
 

llvm-svn: 269439
2016-05-13 17:28:12 +00:00
Rong Xu 0698de9218 [PGO] Add flags to control IRPGO warnings.
Currently there is no reasonable way to control the warnings in the 'use' phase
of the IRPGO pass. This is problematic because the output can be somewhat
spammy. This patch adds some flags which allow us to optionally disable these
warnings. The current upstream behavior will remain the default.

Patch by Jake VanAdrighem (jvanadrighem@gmail.com)

Differential Revision: http://reviews.llvm.org/D20195

llvm-svn: 269437
2016-05-13 17:26:06 +00:00
Jun Bum Lim f28beac419 [MemCpyOpt] Use MaxIntSize in byte instead of bit
Summary: This change fix the bug in isProfitableToUseMemset() where MaxIntSize shoule be in byte, not bit.

Reviewers: arsenm, joker.eph, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20176

llvm-svn: 269433
2016-05-13 16:52:24 +00:00
Sanjay Patel b79ab27853 [InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819)
*We don't currently handle the  edge case constants (min/max values), so it's not a complete
canonicalization.

To fully solve the motivating bugs, we need to enhance this to recognize a zero vector
too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector
or a ConstantDataVector.

Differential Revision: http://reviews.llvm.org/D17859 

llvm-svn: 269426
2016-05-13 15:10:46 +00:00
Michael Zolotukhin 9be3b8b9bb Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."
This reverts commit r269388.

It caused some bots to fail, I'm reverting it until I investigate the
issue.

llvm-svn: 269395
2016-05-13 06:32:25 +00:00
Adam Nemet eff76646f5 [LoopDist] Only run LAA for loops with the pragma
This should fix some compile-time regressions after r267672.  Thanks to
Chris Matthews for bisecting it.

llvm-svn: 269392
2016-05-13 04:20:31 +00:00
Michael Zolotukhin b7b8052982 [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...
Summary:
...loop after the last iteration.

This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.

This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
avoid this from becoming an excessively slow analysis.

The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up, I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look at.

Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrances around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc, would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.

Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when if we had continued, the cost would have gone back
below the threshold. These kinds of bugs can cause incredibly hard to
track down random changes to behavior.

We could use a techinque similar (but much simpler) within the inliner
as well to avoid considering speculated code in the inline cost.

Reviewers: chandlerc

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11758

llvm-svn: 269388
2016-05-13 01:42:39 +00:00
Chandler Carruth 49c22190d0 [PM] Port of the DepndenceAnalysis to the new PM.
Ported DA to the new PM by splitting the former DependenceAnalysis Pass
into a DependenceInfo result type and DependenceAnalysisWrapperPass type
and adding a new PM-style DependenceAnalysis analysis pass returning the
DependenceInfo.

Patch by Philip Pfaffe, most of the review by Justin.

Differential Revision: http://reviews.llvm.org/D18834

llvm-svn: 269370
2016-05-12 22:19:39 +00:00
Simon Pilgrim 3ac4d831ee Tidied up switch cases. NFCI.
Split FCMP//ICMP/SEL from the basic arithmetic cost functions. They were not sharing any notable code path (just the return) and were repeatedly testing the opcode.

llvm-svn: 269348
2016-05-12 21:01:20 +00:00
Davide Italiano 851f879f32 [PM] Make LowerAtomic a FunctionPass.
Differential Revision: http://reviews.llvm.org/D20025

llvm-svn: 269322
2016-05-12 18:49:32 +00:00
Michael Kuperstein 82e7df5a58 [LoopVectorizer] LoopVectorBody doesn't need to be a vector. NFC.
LoopVectorBody was changed from a single pointer to a SmallVector when
store predication was introduced in r200270. Since r247139, store predication
no longer splits the vector loop body in-place, so we can go back to having
a single LoopVectorBody block.

This reverts the no-longer-needed changes from r200270.

llvm-svn: 269321
2016-05-12 18:44:51 +00:00
David Majnemer 96f0d383a7 [SCCP] Resolve shifts beyond the bitwidth to undef
Shifts beyond the bitwidth are undef but SCCP resolved them to zero.
Instead, DTRT and resolve them to undef.

This reimplements the transform which caused PR27712.

llvm-svn: 269269
2016-05-12 03:07:40 +00:00
Sanjoy Das e0aa414acf All llvm.deoptimize declarations must use the same calling convention
This new verifier rule lets us unambigously pick a calling convention
when creating a new declaration for
`@llvm.experimental.deoptimize.<ty>`.  It is also congruent with our
lowering strategy -- since all calls to `@llvm.experimental.deoptimize`
are lowered to calls to `__llvm_deoptimize`, it is reasonable to enforce
a unique calling convention.

Some of the tests that were breaking this verifier rule have had to be
split up into different .ll files.

The inliner was violating this rule as well, and has been fixed to avoid
producing invalid IR.

llvm-svn: 269261
2016-05-12 01:17:38 +00:00
Davide Italiano cd7c84bd8b Revert "[SCCP] Partially propagate informations when the input is not fully defined."
This reverts commit r269105 as it caused PR27712.

llvm-svn: 269252
2016-05-11 23:06:10 +00:00
Teresa Johnson 2e03094d45 [ThinLTO] Don't re-analyze callee at same threshold unnecessarily
This should just be a compile-time change. Correct the check for whether
we have already analyzed the callee when making summary based decisions.
There is no need to reprocess one at the same threshold as when it was
last processed.

llvm-svn: 269251
2016-05-11 22:56:19 +00:00
Rafael Espindola 83658d6e7a Return a StringRef from getSection.
This is similar to how getName is handled.

llvm-svn: 269218
2016-05-11 18:21:59 +00:00
Rafael Espindola f329be8394 Delete mayBeOverridden.
It is the same as isInterposable which seems to be the preferred name.

llvm-svn: 269150
2016-05-11 01:26:06 +00:00
Rong Xu ca28a0afb6 [PGO] Use WeakAny linkage for __llvm_profile_raw_version
Use WeakAny linkage instead of LinkOnceAny, as the symbol can be removed with
LinkOnceAny in O2 (not referenced).

llvm-svn: 269146
2016-05-11 00:31:59 +00:00
Dehao Chen b76e5d948a Propagate branch metadata when some branch probability is missing.
Summary: In sample profile, some branches may have profile missing due to profile inaccuracy. We want existing branch probability still valid after propagation.

Reviewers: hfinkel, davidxl, spatel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19948

llvm-svn: 269137
2016-05-10 23:07:19 +00:00
Xinliang David Li da1955835d [PM]: port IR based profUse pass to new pass manager
llvm-svn: 269129
2016-05-10 21:59:52 +00:00
Tim Northover 3961735f03 Revert "MemCpyOpt: combine local load/store sequences into memcpy."
This reverts commit r269125. It was in my tree when I ran "git svn dcommit".
It's really still under review.

llvm-svn: 269127
2016-05-10 21:49:40 +00:00
Tim Northover 6c65c71639 MemCpyOpt: combine local load/store sequences into memcpy.
Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block
that really implements a memcpy operation, backends can benefit from seeing
this.

llvm-svn: 269125
2016-05-10 21:48:11 +00:00
Hans Wennborg 719b26ba54 Loop unroller: set thresholds for optsize and minsize functions to zero
Before r268509, Clang would disable the loop unroll pass when optimizing
for size. That commit enabled it to be able to support unroll pragmas
in -Os builds. However, this regressed binary size in one of Chromium's
DLLs with ~100 KB.

This restores the original behaviour of no unrolling at -Os, but doing it
in LLVM instead of Clang makes more sense, and also allows the pragmas to
keep working.

Differential revision: http://reviews.llvm.org/D20115

llvm-svn: 269124
2016-05-10 21:45:55 +00:00
Lawrence Hu e58a814c07 Enable loopreroll for sext of loop control only IV
This patch extend loopreroll to allow the instruction chain
        of loop control only IV has sext.

        Differential Revision: http://reviews.llvm.org/D19820

llvm-svn: 269121
2016-05-10 21:16:49 +00:00
Lawrence Hu fe7c87beac Revert r26084: Enable loopreroll for sext of loop control only IV
llvm-svn: 269119
2016-05-10 21:11:09 +00:00
Peter Collingbourne dba995601b Cloning: Clean up the interface to the CloneFunction function.
Remove the ModuleLevelChanges argument, and the ability to create new
subprograms for cloned functions. The latter was added without review in
r203662, but it has no in-tree clients (all non-test callers pass false
for ModuleLevelChanges [1], so it isn't reachable outside of tests). It
also isn't clear that adding a duplicate subprogram to the compile unit is
always the right thing to do when cloning a function within a module. If
this functionality comes back it should be accompanied with a more concrete
use case.

Furthermore, all in-tree clients add the returned function to the module.
Since that's pretty much the only sensible thing you can do with the function,
just do that in CloneFunction.

[1] http://llvm-cs.pcc.me.uk/lib/Transforms/Utils/CloneFunction.cpp/rCloneFunction

Differential Revision: http://reviews.llvm.org/D18628

llvm-svn: 269110
2016-05-10 20:23:24 +00:00
Chad Rosier 4e6cda2db5 [InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1.
This patch adds support for two optimizations:
icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1)
icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1

Differential Revision: http://reviews.llvm.org/D20123

llvm-svn: 269109
2016-05-10 20:22:09 +00:00
Davide Italiano 7860c9bbf4 [SCCP] Partially propagate informations when the input is not fully defined.
With this patch:
%r1 = lshr i64 -1, 4294967296 -> undef

Before this patch:
%r1 = lshr i64 -1, 4294967296 -> 0

llvm-svn: 269105
2016-05-10 19:49:47 +00:00
Peter Collingbourne ccdc225c27 Re-apply r269081 and r269082 with a fix for MSVC.
llvm-svn: 269094
2016-05-10 18:07:21 +00:00
Peter Collingbourne 4d41cb6cc6 Revert r269081 and r269082 while I try to find the right incantation to fix MSVC build.
llvm-svn: 269091
2016-05-10 17:54:43 +00:00
Rong Xu b6211a0b4f [PGO] resubmit r268969
Put the test into a target specific directory.

llvm-svn: 269090
2016-05-10 17:45:33 +00:00
Lawrence Hu 8cc3b37d2c Enable loopreroll for sext of loop control only IV
This patch extend loopreroll to allow the instruction chain
    of loop control only IV has sext.

llvm-svn: 269084
2016-05-10 17:42:27 +00:00
Peter Collingbourne 0df2b085bc WholeProgramDevirt: Move logic for finding devirtualizable call sites to Analysis.
The plan is to eventually make this logic simpler, however I expect it to
be a little tricky for the foreseeable future (at least until we're rid of
pointee types), so move it here so that it can be reused to build a summary
index for devirtualization.

Differential Revision: http://reviews.llvm.org/D20005

llvm-svn: 269081
2016-05-10 17:34:21 +00:00
Teresa Johnson 8570fe47ef [ThinLTO] Add option to emit imports files for distributed backends
Summary:
Add support for emission of plaintext lists of the imported files for
each distributed backend compilation. Used for distributed build file
staging.

Invoked with new gold-plugin thinlto-emit-imports-files option, which is
only valid with thinlto-index-only (i.e. for distributed builds), or
from llvm-lto with new -thinlto-action=emitimports value.

Depends on D19556.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19636

llvm-svn: 269067
2016-05-10 15:54:09 +00:00
Teresa Johnson 84174c3771 Restore "[ThinLTO] Emit individual index files for distributed backends"
This restores commit r268627:
    Summary:
    When launching ThinLTO backends in a distributed build (currently
    supported in gold via the thinlto-index-only plugin option), emit
    an individual index file for each backend process as described here:
    http://lists.llvm.org/pipermail/llvm-dev/2016-April/098272.html

    ...

    Differential Revision: http://reviews.llvm.org/D19556

Address msan failures by avoiding std::prev on map.end(), the
theory is that this is causing issues due to some known UB problems
in __tree.

llvm-svn: 269059
2016-05-10 13:48:23 +00:00
Chuang-Yu Cheng 175741d5a7 Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation
Loop rotation clones instruction from the old header into the preheader. If
there were uses of values produced by these instructions that were outside
the loop, we have to insert PHI nodes to merge the two values. If the values
are used by DbgIntrinsics they will be used as a MetadataAsValue of a
ValueAsMetadata of the original values, and iterating all of the uses of the
original value will not update the DbgIntrinsics. The new code checks if the
values are used by DbgIntrinsics and if so, updates them using essentially
the same logic as the original code.

The attached testcase demonstrates the issue. Without the fix, the
DbgIntrinic outside the loop uses values computed inside the loop, even
though these values do not dominate the DbgIntrinsic.

Author: Thomas Jablin (tjablin)
Reviewers: dblaikie aprantl kbarton hfinkel cycheng

http://reviews.llvm.org/D19564

llvm-svn: 269034
2016-05-10 09:45:44 +00:00
Arnaud A. de Grandmaison 333ef381b8 [InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges.
When a va_start or va_copy is immediately followed by a va_end (ignoring
debug information or other start/end in between), then it is safe to
remove the pair. As this code shares some commonalities with the lifetime
markers, this has been factored to helper functions.

This InstCombine pattern kicks-in 3 times when running the LLVM test
suite.

llvm-svn: 269033
2016-05-10 09:24:49 +00:00
Renato Golin d876eecf02 Revert "[PGO] Fix __llvm_profile_raw_version linkage in MACHO IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support."
This reverts commits r268969, r268979 and r268984. They had target specific test
in generic directories without the correct specifiers and made it hard for us to
come up with a good solution by rapidly committing untested changes.

This test needs to be in a target specific directory or have the correct REQUIRED
identifier.

llvm-svn: 269027
2016-05-10 08:23:57 +00:00
Elena Demikhovsky c434d091c5 [LoopVectorize] Handling induction variable with non-constant step.
Allow vectorization when the step is a loop-invariant variable.
This is the loop example that is getting vectorized after the patch:

 int int_inc;
 int bar(int init, int *restrict A, int N) {

  int x = init;
  for (int i=0;i<N;i++){
    A[i] = x;
    x += int_inc;
  }
  return x;
 }

"x" is an induction variable with *loop-invariant* step.
But it is not a primary induction. Primary induction variable with non-constant step is not handled yet.

Differential Revision: http://reviews.llvm.org/D19258

llvm-svn: 269023
2016-05-10 07:33:35 +00:00
Denis Zobnin 15d1e64b2b [LAA] Rename "isStridedPtr" with "getPtrStride". NFC.
Changing misleading function name was approved in http://reviews.llvm.org/D17268.
Patch by Roman Shirokiy.

llvm-svn: 269021
2016-05-10 05:55:16 +00:00
Justin Lebar 50deb6d028 Minor formatting fixes in LoopUnroll.cpp.
llvm-svn: 268995
2016-05-10 00:31:23 +00:00
Adam Nemet c6bbd80d59 [IndirectCallPromotion] Remove duplicate comment. NFC
llvm-svn: 268986
2016-05-09 23:03:06 +00:00
Chad Rosier 58919cc6f8 Typo. NFC.
llvm-svn: 268975
2016-05-09 21:37:43 +00:00
Xinliang David Li dfa21c310d Cleanup followup of r268710 - [PM] port IR based PGO prof-gen pass to new pass manager
llvm-svn: 268974
2016-05-09 21:37:12 +00:00
Rong Xu a12f6d3c7b [PGO] Fix __llvm_profile_raw_version linkage in MACHO
IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to
overwrite the same symbol in profile run-time to distinguish IR profiles from
Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the
lack of COMDAT support.

But LinkOnceODR linkage might have .weak_def_can_be_hidden assembly directive,
while the weak variable in run-time has a .weak_definition directive. Linker
will not merge these two symbols even they have the same name. The end result
is IR profiles are not properly flagged in MACHO.

This patch changes the linkage for __llvm_profile_raw_version in each module to
LinkOnceAny so that it has same .weak_definition directive as in the run-time.

Differential Revision: http://reviews.llvm.org/D20078

llvm-svn: 268969
2016-05-09 21:03:06 +00:00
Marcin Koscielnicki 60b3cbe095 [MSan] [AArch64] Fix vararg helper for >1 or non-int fixed arguments.
This fixes http://llvm.org/PR27646 on AArch64.

There are three issues here:

- The GR save area is 7 words in size, instead of 8.  This is not enough
  if none of the fixed arguments is passed in GRs (they're all floats or
  aggregates).
- The first argument is ignored (which counteracts the above if it's passed
  in GR).
- Like x86_64, fixed arguments landing in the overflow area are wrongly
  counted towards the overflow offset.

Differential Revision: http://reviews.llvm.org/D20023

llvm-svn: 268967
2016-05-09 20:57:36 +00:00
Chad Rosier 131a42ccdf [InstCombine] Fold icmp eq/ne (udiv i32 A, B), 0 -> icmp ugt/ule B, A.
Differential Revision: http://reviews.llvm.org/D20036

llvm-svn: 268960
2016-05-09 19:30:20 +00:00
Joerg Sonnenberger 8ffe7ab7c2 Optimize a printf with a double procent to putchar.
llvm-svn: 268922
2016-05-09 14:36:16 +00:00
Junmo Park 955298746d Minor code cleanups. NFC.
llvm-svn: 268888
2016-05-08 23:22:58 +00:00
Xinliang David Li d55827f7b2 [PM] code refactoring -- preparation for new PM porting /NFC
llvm-svn: 268851
2016-05-07 05:39:12 +00:00
Philip Reames 6f4d0088c6 Reapply 267210 with fix for PR27490
Original Commit Message
Extend load/store type canonicalization to handle unordered operations

Extend the type canonicalization logic to work for unordered atomic loads and stores.  Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before.  Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered.  If you see problems, feel free to revert this change, but please make sure you collect a test case. 

Note that the concern about lowering is now much less likely.  PR27490 proved that we already *were* mucking with the types of ordered atomics and volatiles.  As a result, this change doesn't introduce as much new behavior as originally thought.

llvm-svn: 268809
2016-05-06 22:17:01 +00:00
Philip Reames 4a3c3b66d7 [GVN] PRE of unordered loads
Again, fairly simple.  Only change is ensuring that we actually copy the property of the load correctly.  The aliasing legality constraints were already handled by the FRE patches.  There's nothing special about unorder atomics from the perspective of the PRE algorithm itself.

llvm-svn: 268804
2016-05-06 21:43:51 +00:00
Sanjoy Das 091fcfa3a7 [RS4GC] Fix typo in comment
llvm-svn: 268790
2016-05-06 20:39:33 +00:00
Marcin Koscielnicki b088ad1e09 [MSan] [X86] Fix vararg helper for fixed arguments in overflow area.
This fixes http://llvm.org/PR27646 on x86_64.

Differential Revision: http://reviews.llvm.org/D19997

llvm-svn: 268783
2016-05-06 19:36:56 +00:00
Philip Reames 1fdce639d2 [GVN] Handle unordered atomics in cross block FRE
You'll note there are essentially no code changes here.  Cross block FRE heavily reuses code from the block local FRE.  All of the tricky parts were done as part of the previous patch and the refactoring that removed the original code duplication.  

llvm-svn: 268775
2016-05-06 18:46:45 +00:00
Philip Reames ae8997f496 [GVN] Do local FRE for unordered atomic loads
This patch is the first in a small series teaching GVN to optimize unordered loads aggressively. This change just handles block local FRE because that's the simplest thing which lets me test MDA, and the AvailableValue pieces. Somewhat suprisingly, MDA appears fine and only a couple of small changes are needed in GVN.

Once this is in, I'll tackle non-local FRE and PRE. The former looks like a natural extension of this, the later will require a couple of minor changes.

Differential Revision: http://reviews.llvm.org/D19440

llvm-svn: 268770
2016-05-06 18:17:13 +00:00
Mehdi Amini 31407ba009 Tweak the ThinLTO pass pipeline
Summary:
The original ThinLTO pipeline was derived from some
work I did tuning FullLTO on the test suite and SPEC. This
patch reduces the amount of work done in the "linker phase" of
the build, and extend the function simplifications passes
performed during the "compile phase". This helps the build time
by reducing the IR as much as possible during the compile phase
and limiting the work to be performed during the "link phase",
while keeping the performance "on par" with the existing pipeline.

Reviewers: tejohnson

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19773

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268769
2016-05-06 18:17:03 +00:00
Sanjay Patel 1cb6241a89 [SimplifyCFG] propagate branch metadata when creating select (retry r268550 / r268751 with possible fix)
Retrying r268550/r268751 which were reverted at r268577/r268765 due a memory sanitizer failure.
I have not been able to reproduce that failure, but I've taken another guess at fixing
the problem in this version of the patch and will watch for another failure.

Original commit message:
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.

Differential Revision: http://reviews.llvm.org/D19674

llvm-svn: 268767
2016-05-06 18:07:46 +00:00
Sanjay Patel 84a0bf64a8 revert r268751 - caused same failures on msan bot
llvm-svn: 268765
2016-05-06 17:51:37 +00:00
Sanjay Patel 6609510c32 [SimplifyCFG] propagate branch metadata when creating select (retry r268550 with possible fix)
Retrying r268550 which was reverted at r268577 due a memory sanitizer failure.
I have not been able to reproduce that failure, but I've taken a guess at fixing
the problem in this version of the patch and will watch for another failure.

Original commit message:
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.

Differential Revision: http://reviews.llvm.org/D19674

llvm-svn: 268751
2016-05-06 17:07:47 +00:00
Chad Rosier 4ab37c0037 [SimplifyCFG] Prefer a simplification based on a dominating condition.
Rather than merge two branches with a common destination.
Differential Revision: http://reviews.llvm.org/D19743

llvm-svn: 268735
2016-05-06 14:25:14 +00:00
Ryan Govostes 6194ae69fe Fix whitespace and line wrapping. NFC.
llvm-svn: 268725
2016-05-06 11:22:11 +00:00
Ryan Govostes 3f37df0326 [asan] add option to set shadow mapping offset
Allowing overriding the default ASAN shadow mapping offset with the
-asan-shadow-offset option, and allow zero to be specified for both offset and
scale.

Patch by Aaron Carroll <aaronc@apple.com>.

llvm-svn: 268724
2016-05-06 10:25:22 +00:00
Mehdi Amini 3b132e34b0 ThinLTO: fix assertion and refactor check for hidden use from inline ASM in a helper function
This test was crashing, and currently it breaks bootstrapping clang with debuginfo

Differential Revision: http://reviews.llvm.org/D20008

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268715
2016-05-06 08:25:33 +00:00
Xinliang David Li 8aebf44c97 [PM] port IR based PGO prof-gen pass to new pass manager
llvm-svn: 268710
2016-05-06 05:49:19 +00:00
Philip Reames 32b55181fa [EarlyCSE] Rename a variable for clarity [NFC]
llvm-svn: 268701
2016-05-06 01:13:58 +00:00
Davide Italiano f54f2f0893 [PM] Port Interprocedural SCCP to the new pass manager.
llvm-svn: 268684
2016-05-05 21:05:36 +00:00
Dehao Chen f50c67ce7c Revert http://reviews.llvm.org/D19926 as it breaks tests.
llvm-svn: 268681
2016-05-05 20:47:53 +00:00
Dehao Chen e48b4ee98c Simplify CFG before assigning discriminator.
Summary: We need to clean up CFG before assigning discriminator to minimize the impact of optimization on debug info.

Reviewers: davidxl, dblaikie, dnovillo

Subscribers: dnovillo, danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D19926

llvm-svn: 268675
2016-05-05 20:18:49 +00:00
Marcin Koscielnicki 60061c21cb [MSan] [MIPS64] Fix vararg helper for >1 fixed argument.
This fixes http://llvm.org/PR27646 on Mips64.

Differential Revision: http://reviews.llvm.org/D19989

llvm-svn: 268673
2016-05-05 20:13:17 +00:00
Vitaly Buka 1df2338bb6 Revert "[ThinLTO] Emit individual index files for distributed backends"
MemorySanitizer: use-of-uninitialized-value in lib/Bitcode/Writer/BitcodeWriter.cpp:364:70
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/12544/steps/check-llvm%20msan/logs/stdio

This reverts commit 0c4a898ea550699d1b2f4fe3767251c8f9a48d52.

llvm-svn: 268660
2016-05-05 18:31:00 +00:00
Chad Rosier b438a327d7 Remove dead include. NFC.
llvm-svn: 268655
2016-05-05 17:55:51 +00:00
Chad Rosier 799e4c6fc3 Remove dead include. NFC.
llvm-svn: 268654
2016-05-05 17:53:43 +00:00
Silviu Baranga 28eb344140 Fix unused variable warning after r268632
llvm-svn: 268634
2016-05-05 15:27:57 +00:00
Silviu Baranga c05bab8a9c [LV] Identify more induction PHIs by coercing expressions to AddRecExprs
Summary:
Some PHIs can have expressions that are not AddRecExprs due to the presence
of sext/zext instructions. In order to prevent the Loop Vectorizer from
bailing out when encountering these PHIs, we now coerce the SCEV
expressions to AddRecExprs using SCEV predicates (when possible).

We only do this when the alternative would be to not vectorize.

Reviewers: mzolotukhin, anemet

Subscribers: mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17153

llvm-svn: 268633
2016-05-05 15:20:39 +00:00
Silviu Baranga 7e0d4353f2 [LV] Refactor the validation of PHI inductions. NFC
This moves the validation of PHI inductions into a
separate method, making it easier to reuse this
logic.

llvm-svn: 268632
2016-05-05 15:14:01 +00:00
Teresa Johnson 9254ebe3c0 [ThinLTO] Emit individual index files for distributed backends
Summary:
When launching ThinLTO backends in a distributed build (currently
supported in gold via the thinlto-index-only plugin option), emit
an individual index file for each backend process as described here:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098272.html

The individual index file encodes the summary and module information
required for implementing the importing/exporting decisions made
for a given module in the thin link step.
This is in place of the current mechanism that uses the combined index
to make importing decisions in each back end independently. It is an
enabler for doing global summary based optimizations in the thin link
step (which will be recorded in the individual index files), and reduces
the size of the index that must be sent to each backend process, and
the amount of work to scan it in the backends.

Rather than create entirely new ModuleSummaryIndex structures (and all
the included unique_ptrs) for each backend index file, a map is created
to record all of the GUID and summary pointers needed for a particular
index file. The IndexBitcodeWriter walks this map instead of the full
index (hiding the details of managing the appropriate summary iteration
in a new iterator subclass). This is more efficient than walking the
entire combined index and filtering out just the needed summaries during
each backend bitcode index write.

Depends on D19481.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19556

llvm-svn: 268627
2016-05-05 13:44:56 +00:00
Davide Italiano 344e838fea [PM] Port EliminateAvailableExternally pass to the new pass manager.
llvm-svn: 268599
2016-05-05 02:37:32 +00:00
Ryan Govostes 8c21be6b3e Revert "[asan] add option to set shadow mapping offset"
This reverts commit ba89768f97b1d4326acb5e33c14eb23a05c7bea7.

llvm-svn: 268588
2016-05-05 01:27:04 +00:00
Ryan Govostes 097c5b051c [asan] add option to set shadow mapping offset
Allowing overriding the default ASAN shadow mapping offset with the
-asan-shadow-offset option, and allow zero to be specified for both offset and
scale.

llvm-svn: 268586
2016-05-05 01:14:39 +00:00
Dehao Chen d55bc4c7ab clang-format some files in preparation of coming patch reviews.
llvm-svn: 268583
2016-05-05 00:54:54 +00:00
Davide Italiano 164b9bc6fe [PM] Port ConstantMerge to the new pass manager.
llvm-svn: 268582
2016-05-05 00:51:09 +00:00
Adam Nemet 3c5eabfcbc [LoopDataPrefetch] Add optimization remark
With -Rpass=loop-data-prefetch, show the memory access that got
prefetched.

llvm-svn: 268578
2016-05-05 00:08:15 +00:00
Vitaly Buka fdcea9d78a Revert "[SimplifyCFG] propagate branch metadata when creating select"
MemorySanitizer: use-of-uninitialized-value
0x4910e47 in count /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:159:12
0x4910e47 in countLeadingZeros<unsigned long> /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/Support/MathExtras.h:183
0x4910e47 in FitWeights /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:855
0x4910e47 in SimplifyCondBranchToCondBranch /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:2895

This reverts commit 609f4dd4bf3bc735c8c047a4d4b0a8e9e4d202e2.

llvm-svn: 268577
2016-05-04 23:59:33 +00:00
Davide Italiano a7f5e88932 Revert "[SCCP] Throw away dead code. NFC."
This reverts commit r268568, as it broke the bots.

llvm-svn: 268570
2016-05-04 23:27:13 +00:00
Davide Italiano fc1214fee2 [SCCP] Throw away dead code. NFC.
llvm-svn: 268568
2016-05-04 23:05:59 +00:00
Balaram Makam 569eaec5f3 "Reapply r268521 "[InstCombine] Canonicalize icmp instructions based on dominating conditions.""
This reapplies commit r268521, that was reverted in r268530 due to a test failure in select-implied.ll
Modified the test case to reflect the new change.

llvm-svn: 268557
2016-05-04 21:32:14 +00:00
Sanjay Patel 7e8c285814 [SimplifyCFG] propagate branch metadata when creating select
Unlike earlier similar fixes, we need to recalculate the branch weights
in this case.

Differential Revision: http://reviews.llvm.org/D19674

llvm-svn: 268550
2016-05-04 20:48:24 +00:00
Balaram Makam 31e7e13789 Revert "[InstCombine] Canonicalize icmp instructions based on dominating conditions."
This reverts commit 573a40f79b35cf3e71db331bb00f6a84f03b835d.

llvm-svn: 268530
2016-05-04 18:37:35 +00:00
Balaram Makam cf3bcb2625 [InstCombine] Canonicalize icmp instructions based on dominating conditions.
Summary:
    This patch canonicalizes conditions based on the constant range information
    of the dominating branch condition.
    For example:

      %cmp = icmp slt i64 %a, 0
      br i1 %cmp, label %land.lhs.true, label %lor.rhs
      lor.rhs:
        %cmp2 = icmp sgt i64 %a, 0

    Would now be canonicalized into:

      %cmp = icmp slt i64 %a, 0
      br i1 %cmp, label %land.lhs.true, label %lor.rhs
      lor.rhs:
        %cmp2 = icmp ne i64 %a, 0

Reviewers: mcrosier, gberry, t.p.northover, llvm-commits, reames, hfinkel, sanjoy, majnemer

Subscribers: MatzeB, majnemer, mcrosier

Differential Revision: http://reviews.llvm.org/D18841

llvm-svn: 268521
2016-05-04 17:34:20 +00:00
Hans Wennborg 0c3518e84b [SimplifyCFG] isSafeToSpeculateStore now ignores debug info
This patch fixes PR27615.

@llvm.dbg.value instructions no longer count towards the maximum number of
instructions to look back at in the instruction list when searching for a
store instruction. This should make the output consistent between debug and
non-debug build.

Patch by Henric Karlsson <henric.karlsson@ericsson.com>!

Differential Revision: http://reviews.llvm.org/D19912

llvm-svn: 268512
2016-05-04 15:40:57 +00:00
Chad Rosier 7ab9a7b203 Use a uniform name for the load combine pass. NFC.
llvm-svn: 268507
2016-05-04 15:19:02 +00:00
Igor Laevsky fb1811d3a0 [RS4GC] Use SetVector/MapVector instead of DenseSet/DenseMap to guarantee stable ordering
Goal of this change is to guarantee stable ordering of the statepoint arguments and other 
newly inserted values such as gc.relocates. Previously we had explicit sorting in a couple
of places. However for unnamed values ordering was partial and overall we didn't have any 
strong invariant regarding it. This change switches all data structures to use SetVector's
and MapVector's which provide possibility for deterministic iteration over them.
Explicit sorting is now redundant and was removed.

Differential Revision: http://reviews.llvm.org/D19669

llvm-svn: 268502
2016-05-04 14:55:36 +00:00
Davide Italiano 17da174b8b [IPO/ConstantMerge] Convert to static function, to facilitate transition to the new PM.
llvm-svn: 268476
2016-05-04 03:21:20 +00:00
David Majnemer 95549497ec [GlobalDCE, Misc] Don't remove functions referenced by ifuncs
We forgot to consider the target of ifuncs when considering if a
function was alive or dead.

N.B. Also update a few auxiliary tools like bugpoint and
verify-uselistorder.

This fixes PR27593.

llvm-svn: 268468
2016-05-04 00:20:48 +00:00
Andrew Kaylor 50271f787e Add opt-bisect support to additional passes that can be skipped
Differential Revision: http://reviews.llvm.org/D19882

llvm-svn: 268457
2016-05-03 22:32:30 +00:00
Justin Bogner d0d2341f30 PM: Port LoopRotation to the new loop pass manager
llvm-svn: 268452
2016-05-03 22:02:31 +00:00
Justin Bogner ab6a513b4e PM: Port LoopSimplifyCFG to the new pass manager
llvm-svn: 268446
2016-05-03 21:47:32 +00:00
Davide Italiano c91e0b2fde [IPO/ConstantMerge] Garbage collect dead code. NFC.
llvm-svn: 268442
2016-05-03 21:30:10 +00:00
Davide Italiano 296d12cd40 [IPO/IPCP] Convert to use static functions. NFC.
In preparation for porting this pass to the new PM.

llvm-svn: 268429
2016-05-03 20:08:24 +00:00
Davide Italiano 66228c4cf1 [IPO/GlobalDCE] Port to the new pass manager.
Differential Revision:  http://reviews.llvm.org/D19782

llvm-svn: 268425
2016-05-03 19:39:15 +00:00
Jack Liu f101c0f7a1 [SROA] Function canConvertValue needs to check whether both NewTy and OldTy pointers are
pointing to the same addr space. This can prevent SROA from creating a bitcast
between pointers with different addr spaces.

Differential Revision: http://reviews.llvm.org/D19697

llvm-svn: 268424
2016-05-03 19:30:48 +00:00
Jack Liu 430e2c2140 Revert 268409 due to missing comment.
llvm-svn: 268421
2016-05-03 19:15:02 +00:00
Jack Liu 1ff4a0b7ee (no commit message)
llvm-svn: 268409
2016-05-03 18:01:43 +00:00
Sanjoy Das 4ae3920c5b [LICM] Kill SCEV loop dispositions if needed
SCEV caches whether SCEV expressions are loop invariant, variant or
computable.  LICM breaks this cache, almost by definition; so clear the
SCEV disposition cache if LICM changed anything.

llvm-svn: 268408
2016-05-03 17:50:11 +00:00
Sanjoy Das 7e7a5a050a Use all_of instead of a raw loop; NFC
Added some tests despite being NFC, since it looks like nothing was
exercising the "all incoming values to exit PHIs are same" logic.

llvm-svn: 268407
2016-05-03 17:50:06 +00:00
Sanjoy Das 905fc27ebf [LoopDeletion] Clear SCEV loop dispositions
`Loop::makeLoopInvariant` can hoist instructions out of loops, so loop
dispositions for the loop it operated on may need to be cleared.  We can
be smarter here (especially around how `forgetLoopDispositions` is
implemented), but let's be correct first.

Fixes PR27570.

llvm-svn: 268406
2016-05-03 17:50:02 +00:00
Vedant Kumar 43cba7333c [ProfileData] Add error codes for compression failures
Be more specific in describing compression failures. Also, check for
this kind of error in emitNameData().

This is part of a series of patches to transition ProfileData over to
the stricter Error/Expected interface.

llvm-svn: 268400
2016-05-03 16:53:17 +00:00
Mehdi Amini 7f7d8be518 Move "Eliminate Available Externally" immediately after the inliner
This pass is supposed to reduce the size of the IR for compile time
purpose. We should run it ASAP, except when we prepare for LTO or
ThinLTO, and we want to keep them available for link-time inline.

Differential Revision: http://reviews.llvm.org/D19813

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268394
2016-05-03 15:46:00 +00:00
Kristof Beyls c08f70588d Mark that SpeculativeExecution preserves Globals Alias Analysis.
A few benchmarks with lots of accesses to global variables in the hot
loops regressed a lot since r266399, which added the
SpeculativeExecution pass to the default pipeline. The problem is that
this pass doesn't mark Globals Alias Analysis as preserved. Globals
Alias Analysis is computed in a module pass, whereas
SpeculativeExecution is a function pass, and a lot of passes dependent
on the Globals Alias Analysis to optimize these benchmarks are also
function passes. As such, the Globals Alias Analysis information cannot
be recomputed between SpeculativeExecution and the following function
passes needing that information.

SpeculativeExecution doesn't invalidate Globals Alias Analysis, so mark
it as such to fix those performance regressions.

Differential Revision: http://reviews.llvm.org/D19806

llvm-svn: 268370
2016-05-03 08:33:26 +00:00
David Majnemer 3d90bb79c4 [LoopUnroll] Unroll loops which have exit blocks to EH pads
We were overly cautious in our analysis of loops which have invokes
which unwind to EH pads.  The loop unroll transform is safe because it
only clones blocks in the loop body, it does not try to split critical
edges involving EH pads.  Instead, move the necessary safety check to
LoopUnswitch.

N.B. The safety check for loop unswitch is covered by an existing test
which fails without it.

llvm-svn: 268357
2016-05-03 03:57:40 +00:00
Mehdi Amini 5b85d8d67b ThinLTO: do not import function whose linkage prevents inlining.
There is not point in importing a "weak" or a "linkonce" function
since we won't be able to inline it anyway.
We already had a targeted check for WeakAny, this is using the
same check on GlobalValue as the inline, i.e.
isMayBeOverriddenLinkage()

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268341
2016-05-03 00:27:28 +00:00
Mehdi Amini 1e918c9cb3 Revert "ThinLTO: do not import function whose linkage prevents inlining."
This reverts commit r268315, the tests are not passing.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268317
2016-05-02 22:26:04 +00:00
Mehdi Amini bda9b2ae9e ThinLTO: do not import function whose linkage prevents inlining.
There is not point in importing a "weak" or a "linkonce" function
since we won't be able to inline it anyway.
We already had a targeted check for WeakAny, this is using the
same check on GlobalValue as the inline, i.e.
isMayBeOverriddenLinkage()

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268315
2016-05-02 22:11:27 +00:00
Xinliang David Li 5ad7c820fc Code refactoring -- preparation for new PM porting /NFC
llvm-svn: 268301
2016-05-02 20:33:59 +00:00
Reid Kleckner bca59d2a43 Revert "[SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics"
This reverts commit r268254.

This change causes assertion failures while building Chromium. Reduced
test case coming soon.

llvm-svn: 268288
2016-05-02 19:43:22 +00:00
Chad Rosier fcb2210812 Typo. NFC.
llvm-svn: 268280
2016-05-02 19:06:04 +00:00
Chad Rosier 4466ff50eb Use false rather than 0 for a boolean value. NFC.
llvm-svn: 268279
2016-05-02 19:06:02 +00:00
Mehdi Amini 0ddf404cf4 ReversePostOrderFunctionAttrs is not modifying the call graph, let's preserve it.
When running cc1 with -flto=thin, it is followed by GlobalOpt, which
requires the callgraph. This saves rebuilding one.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268266
2016-05-02 18:03:33 +00:00
Hans Wennborg b7599329fc [SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics
Make it possible that TryToSimplifyUncondBranchFromEmptyBlock merges empty
basic block including lifetime intrinsics as well as phi nodes and
unconditional branch into its successor or predecessor(s).

If successor of empty block has single predecessor, all contents including
lifetime intrinsics are sinked into the successor. Otherwise, they are
hoisted into its predecessor(s) and then merged into the predecessor(s).

Patch by Josh Yoon <josh.yoon@samsung.com>!

Differential Revision: http://reviews.llvm.org/D19257

llvm-svn: 268254
2016-05-02 17:22:54 +00:00
Mehdi Amini 45c7b3ecb5 Move createReversePostOrderFunctionAttrsPass right after the inliner is done
This is where it was originally, until LoopVersioningLICM was
inserted before in r259986, I don't believe it was on purpose.

Differential Revision: http://reviews.llvm.org/D19809

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268252
2016-05-02 16:53:16 +00:00
Adam Nemet d02872c7b4 [LLE] Fix typo from r263058
This was meant to check unit stride for both the load and the store.

Thanks to Roman Shirokiy for noticing this.

llvm-svn: 268251
2016-05-02 16:52:00 +00:00
Simon Pilgrim ca140b17cb [InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements.
llvm-svn: 268206
2016-05-01 20:43:02 +00:00
Simon Pilgrim eeacc40e27 [InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements.
llvm-svn: 268204
2016-05-01 20:22:42 +00:00
Simon Pilgrim e5e8c2fde0 [InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements.
llvm-svn: 268202
2016-05-01 19:26:21 +00:00
Simon Pilgrim 8cddf8b3c6 [InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector.
llvm-svn: 268199
2016-05-01 16:41:22 +00:00
Marcin Koscielnicki 57290f934a [ASan] Add shadow offset for SystemZ.
SystemZ on Linux currently has 53-bit address space.  In theory, the hardware
could support a full 64-bit address space, but that's not supported due to
kernel limitations (it'd require 5-level page tables), and there are no plans
for that.  The default process layout stays within first 4TB of address space
(to avoid creating 4-level page tables), so any offset >= (1 << 42) is fine.
Let's use 1 << 52 here, ie. exactly half the address space.

I've originally used 7 << 50 (uses top 1/8th of the address space), but ASan
runtime assumes there's some space after the shadow area.  While this is
fixable, it's simpler to avoid the issue entirely.

Also, I've originally wanted to have the shadow aligned to 1/8th the address
space, so that we can use OR like X86 to assemble the offset.  I no longer
think it's a good idea, since using ADD enables us to load the constant just
once and use it with register + register indexed addressing.

Differential Revision: http://reviews.llvm.org/D19650

llvm-svn: 268161
2016-04-30 09:57:34 +00:00
Simon Pilgrim 640f9964c7 [InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

llvm-svn: 268158
2016-04-30 07:23:30 +00:00
Sanjoy Das 47cf2affbd [LowerGuardIntrinsics] Keep track of !make.implicit metadata
If a guard call being lowered by LowerGuardIntrinsics has the
`!make.implicit` metadata attached, then reattach the metadata to the
branch in the resulting expanded form of the intrinsic.  This allows us
to implement null checks as guards and still get the benefit of implicit
null checks.

llvm-svn: 268148
2016-04-30 00:55:59 +00:00
Lawrence Hu 1befea2bdc Reroll loops with multiple IV and negative step part 3
support multiple induction variables

    This patch enable loop reroll for the following case:
        for(int i=0;  i<N; i += 2) {
           S += *a++;
           S += *a++;
        };

Differential Revision: http://reviews.llvm.org/D16550

llvm-svn: 268147
2016-04-30 00:51:22 +00:00
Sanjoy Das 52c68bb0f5 [LowerGuardIntrinsics] Preserve calling conv when lowering
llvm-svn: 268142
2016-04-30 00:17:47 +00:00
Xinliang David Li 4b2fdccad9 Reapply r268107 after fixing a bug breaks debug build.
Makes the new method to set data needed by debug dump.

llvm-svn: 268130
2016-04-29 22:59:36 +00:00
Sanjoy Das 107aefc2fc Mark guards on true as "trivially dead"
This moves some logic added to EarlyCSE in rL268120 into
`llvm::isInstructionTriviallyDead`.  Adds a test case for DCE to
demonstrate that passes other than EarlyCSE can now pick up on the new
information.

llvm-svn: 268126
2016-04-29 22:23:16 +00:00
Sanjoy Das ee81b23fe7 [EarlyCSE] Simplify guard intrinsics
Summary:
This change teaches EarlyCSE some basic properties of guard intrinsics:

 - Guard intrinsics read all memory, but don't write to any memory
 - After a guard has executed, the condition it was guarding on can be
   assumed to be true
 - Guard intrinsics on a constant `true` are no-ops

Reviewers: reames, hfinkel

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19578

llvm-svn: 268120
2016-04-29 21:52:58 +00:00
Xinliang David Li 0552521b03 Revert r268107 -- debug build failure
llvm-svn: 268116
2016-04-29 21:43:28 +00:00
Simon Pilgrim bf60cc492c [InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

llvm-svn: 268115
2016-04-29 21:34:54 +00:00
Xinliang David Li 1ffa28a3f1 [inliner]: Refactor inline deferring logic into its own method /NFC
The implemented heuristic has a large body of code which better sits
in its own function for better readability. It also allows adding more
heuristics easier in the future.

llvm-svn: 268107
2016-04-29 21:21:44 +00:00
Chad Rosier cd62bf5821 [InstCombine] Determine the result of a select based on a dominating condition.
Differential Revision: http://reviews.llvm.org/D19550

llvm-svn: 268104
2016-04-29 21:12:31 +00:00
Sanjay Patel 9190b4add8 [InstCombine] clean up; NFC
llvm-svn: 268099
2016-04-29 20:54:56 +00:00
George Burgess IV 1b1fef30d0 [MemorySSA] Fix bugs in walker; refactor unittests a bit.
This patch fixes two somewhat related bugs in MemorySSA's caching
walker. These bugs were found because D19695 brought up the problem
that we'd have defs cached to themselves, which is incorrect.

The bugs this fixes are:

- We would sometimes skip the nearest clobber of a MemoryAccess, because
  we would query our cache for a given potential clobber before
  checking if the potential clobber is the clobber we're looking for.
  The cache entry for the potential clobber would point to the nearest
  clobber *of the potential clobber*, so if that was a cache hit, we'd
  ignore the potential clobber entirely.

- There are times (sometimes in DFS, sometimes in the getClobbering...
  functions) where we would insert cache entries that say a def
  clobbers itself.

There's a bit of common code between the fixes for the bugs, so they
aren't split out into multiple commits.

This patch also adds a few unit tests, and refactors existing tests a
bit to reduce the duplication of setup code.

llvm-svn: 268087
2016-04-29 18:42:55 +00:00
Dehao Chen 21aefaec97 Do not read callee name when matching IR to profile as it is not used.
Summary: Callee name is not used to identify a callsite now, so do not read it during annotation.

Reviewers: davidxl, dnovillo

Subscribers: dnovillo, danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D19704

llvm-svn: 268069
2016-04-29 17:19:10 +00:00
Sanjay Patel d5b0e54b49 [InstCombine] add helper function for ICmp with constant canonicalization; NFCI
As suggested in http://reviews.llvm.org/D17859 , we should enhance this
to support vectors.

llvm-svn: 268059
2016-04-29 16:22:25 +00:00
Filipe Cabecinhas 0da9937517 Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them.
Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.

This will also make it easier to turn it on in buildbots.

Reviewers: chandlerc

Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D19723

llvm-svn: 268050
2016-04-29 15:22:48 +00:00
David Majnemer fadc6db036 [GlobalOpt] Propagate operand bundles
We neglected to transfer operand bundles for some transforms.  These
were found via inspection, I'll try to come up with some test cases.

llvm-svn: 268011
2016-04-29 08:07:22 +00:00
David Majnemer 231a68cc22 [InstCombine] Propagate operand bundles
We neglected to transfer operand bundles for some transforms.  These
were found via inspection, I'll try to come up with some test cases.

llvm-svn: 268010
2016-04-29 08:07:20 +00:00
David Majnemer 1a5799fe3e [DeadArgumentElimination] Propagate operand bundles to promoted call sites
We neglected to transfer operand bundles when performing argument
promotion.

llvm-svn: 268008
2016-04-29 07:22:36 +00:00
Adam Nemet 88ec491830 [LoopDist] Also emit optimization remark on success (-Rpass=)
The option -Rpass=loop-distribute now reports the loops that were
distributed.

llvm-svn: 268006
2016-04-29 07:10:46 +00:00
Adam Nemet 4338d6769e [LoopDist] Pass 'Function' to main class. NFC
Next patch will add another use for 'Function' inside the class.

llvm-svn: 268005
2016-04-29 07:10:39 +00:00
David Majnemer 13d5526392 [SLPVectorizer] Add operand bundles to vectorized functions
SLPVectorizing a call site should result in further propagation of its
bundles.

llvm-svn: 268004
2016-04-29 07:09:51 +00:00
David Majnemer 50ddc0e1b6 [LoopVectorize] Add operand bundles to vectorized functions
Also, do not crash when calculating a cost model for loop-invariant
token values.

llvm-svn: 268003
2016-04-29 07:09:48 +00:00
David Majnemer cd24bb1d3a [ArgumentPromotion] Propagate operand bundles to promoted call sites
We neglected to transfer operand bundles when performing argument
promotion.

This fixes PR27568.

llvm-svn: 267986
2016-04-29 04:56:12 +00:00
Michael Zolotukhin 1816d03b7d [PR25281] Remove AAResultsWrapper from preserved analyses of loop vectorizer.
We don't preserve AAResults, because, for one, we don't preserve SCEV-AA.
That fixes PR25281.

llvm-svn: 267980
2016-04-29 03:31:25 +00:00
Ivan Krasin 8dafa2da8e Fix build by casting to the proper int type.
Reviewers: eugenis

Differential Revision: http://reviews.llvm.org/D19706

llvm-svn: 267974
2016-04-29 02:09:57 +00:00
Hal Finkel 1b66f7e3c8 [LoopVectorize] Keep hints from original loop on the vector loop
We need to keep loop hints from the original loop on the new vector loop.
Failure to do this meant that, for example:

  void foo(int *b) {
  #pragma clang loop unroll(disable)
    for (int i = 0; i < 16; ++i)
      b[i] = 1;
  }

this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the
hints that unrolling should be disabled, and then we'd unroll it.

llvm-svn: 267970
2016-04-29 01:27:40 +00:00
Evgeniy Stepanov 35f3e5e4e7 [msan] Handle vector compare x86 intrinsics.
This handles SSE and SSE2 cmp_* and comiXX_* intrinsics.

llvm-svn: 267966
2016-04-29 01:19:52 +00:00
Adam Nemet 0ba164bbcb [LoopDist] Emit optimization remarks (-Rpass*)
I closely followed the precedents set by the vectorizer:

* With -Rpass-missed, the loop is reported with further details pointing
to -Rpass--analysis.

* -Rpass-analysis reports the details why distribution has failed.

* Regardless of -Rpass*, when distribution fails for a loop where
distribution was forced with the pragma, a warning is produced according
to -Wpass-failed.  In this case the analysis info is also printed even
without -Rpass-analysis.

llvm-svn: 267952
2016-04-28 23:08:32 +00:00
Adam Nemet adeccf7658 [LoopDist] Improve debug messages
The next patch will start using these for -Rpass-analysis so they won't
be internal-only anymore.

Move the 'Skipping; ' prefix that some of the message are using into the
'fail' function.  We don't want to include this prefix in
the -Rpass-analysis report.

llvm-svn: 267951
2016-04-28 23:08:30 +00:00
Adam Nemet 7f38e1199a [LoopDist] Add helper to print debug message when distribution fails. NFC
This will form the basis to emit optimization remarks (-Rpass*).

llvm-svn: 267950
2016-04-28 23:08:27 +00:00
Hal Finkel 50316d95a9 [Inliner] Preserve llvm.mem.parallel_loop_access metadata
When inlining a call site with llvm.mem.parallel_loop_access metadata, this
metadata needs to be propagated to all cloned memory-accessing instructions.
Otherwise, inlining parts of the loop body will invalidate the annotation.

With this functionality, we now vectorize the following as expected:

  void Body(int *res, int *c, int *d, int *p, int i) {
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
  }

  void Test(int *res, int *c, int *d, int *p, int n) {
    int i;

  #pragma clang loop vectorize(assume_safety)
    for (i = 0; i < 1600; i++) {
      Body(res, c, d, p, i);
    }
  }

llvm-svn: 267949
2016-04-28 23:00:04 +00:00
Rong Xu 62d5e473ce [PGO] Fix incorrect Twine usage in emitting optimization remarks.
Should not store Twine objects to local variables. This is fixed the test
failures with r267815 in VS2015 X64 build.

llvm-svn: 267908
2016-04-28 17:49:56 +00:00
Rong Xu 08afb05491 Minor format change and fixing typos in the comments. NFC.
llvm-svn: 267905
2016-04-28 17:31:22 +00:00
Arch D. Robison 0e61034018 [SLPVectorizer] Extend SLP Vectorizer to deal with aggregates.
The refactoring portion part was done as r267748.

http://reviews.llvm.org/D14185

llvm-svn: 267899
2016-04-28 16:11:45 +00:00
Chad Rosier 712b7d7630 [GVN] Minor code cleanup. NFC.
Differential Revision: http://reviews.llvm.org/D18828
Patch by Aditya Kumar!

llvm-svn: 267898
2016-04-28 16:00:15 +00:00
Geoff Berry 5ae272c2c1 [EarlyCSE] Change LoadValue field Value *Data to Instruction *Inst. NFC.
Made in preparation for adding MemorySSA support to EarlyCSE.

llvm-svn: 267893
2016-04-28 15:22:37 +00:00
Geoff Berry 354fac2a69 [EarlyCSE] Sort includes. NFC.
Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19617

llvm-svn: 267890
2016-04-28 14:59:27 +00:00
Ahmed Bougacha 17482a5696 [InstCombine] Remove trailing whitespace. NFC.
r267873.

llvm-svn: 267887
2016-04-28 14:36:07 +00:00
Simon Pilgrim bd4a3be7d2 [InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits
The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits.

This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.

Differential Revision: http://reviews.llvm.org/D19614

llvm-svn: 267873
2016-04-28 12:22:53 +00:00
Rong Xu 6e34c490ff [PGO] Promote indirect calls to conditional direct calls with value-profile
This patch implements the transformation that promotes indirect calls to
conditional direct calls when the indirect-call value profile meta-data is
available.

Differential Revision: http://reviews.llvm.org/D17864

llvm-svn: 267815
2016-04-27 23:20:27 +00:00
Sanjay Patel facf45a82f [SimplifyCFG] propagate branch metadata when creating select
There's no existing test for this path, and I don't know how to expose
it in a regression test, but I'm assuming there's some reason this
path exists. 

llvm-svn: 267813
2016-04-27 23:14:12 +00:00
Rong Xu af5aebaa32 [PGO] Prohibit address recording if the function is both internal and COMDAT
Differential Revision: http://reviews.llvm.org/D19515

llvm-svn: 267792
2016-04-27 21:17:30 +00:00
Ahmed Bougacha ace97c1f7d [LIR] Set attributes on memset_pattern16.
"inferattrs" will deduce the attribute, but it will be too late for
many optimizations. Set it ourselves when creating the call.

Differential Revision: http://reviews.llvm.org/D17598

llvm-svn: 267762
2016-04-27 19:04:50 +00:00
Ahmed Bougacha 7f97193dd7 [LIR] Reuse variable. NFCI.
llvm-svn: 267761
2016-04-27 19:04:46 +00:00
Ahmed Bougacha 44c19876c7 [InferAttrs] Mark memset_pattern16 params nocapture.
Differential Revision: http://reviews.llvm.org/D19471

llvm-svn: 267760
2016-04-27 19:04:43 +00:00
Ahmed Bougacha b0624a2cb4 [TLI] Unify LibFunc attribute inference. NFCI.
Now the pass is just a tiny wrapper around the util. This lets us reuse
the logic elsewhere (done here for BuildLibCalls) instead of duplicating
it.

The next step is to have something like getOrInsertLibFunc that also
sets the attributes.

Differential Revision: http://reviews.llvm.org/D19470

llvm-svn: 267759
2016-04-27 19:04:40 +00:00
Ahmed Bougacha d765a82b54 [TLI] Unify LibFunc signature checking. NFCI.
I tried to be as close as possible to the strongest check that
existed before; cleaning these up properly is left for future work.

Differential Revision: http://reviews.llvm.org/D19469

llvm-svn: 267758
2016-04-27 19:04:35 +00:00
Matthew Simpson 622b95be7b [LV] Reallow positive-stride interleaved load groups with gaps
We previously disallowed interleaved load groups that may cause us to
speculatively access memory out-of-bounds (r261331). We did this by ensuring
each load group had an access corresponding to the first and last member.
Instead of bailing out for these interleaved groups, this patch enables us to
peel off the last vector iteration, ensuring that we execute at least one
iteration of the scalar remainder loop. This solution was proposed in the
review of the previous patch.

Differential Revision: http://reviews.llvm.org/D19487

llvm-svn: 267751
2016-04-27 18:21:36 +00:00
Arch D. Robison aca7c412b4 [SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live.
This is the first of two commits for extending SLP Vectorizer to deal with aggregates.
This commit merely refactors existing logic.

http://reviews.llvm.org/D14185

llvm-svn: 267748
2016-04-27 17:46:25 +00:00
Matthew Simpson e5dfb08fcb [TTI] Add hook for vector extract with extension
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.

Differential Revision: http://reviews.llvm.org/D18523

llvm-svn: 267725
2016-04-27 15:20:21 +00:00
Teresa Johnson df5ef8711f [ThinLTO] Refine fix to avoid renaming of uses in inline assembly.
Summary:
Refine the workaround from r266877 that attempts to prevent
renaming of locals in inline assembly, so that in addition to looking
for a llvm.used local value, that there is at least one inline assembly
call in the module. Otherwise, debug functions added to the llvm.used
can block importing/exporting unnecessarily.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D19573

llvm-svn: 267717
2016-04-27 14:19:38 +00:00
Artur Pilipenko 9bb6beabf4 isSafeToLoadUnconditionally support queries without a context
This is required to use this function from isSafeToSpeculativelyExecute

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16231

llvm-svn: 267692
2016-04-27 11:00:48 +00:00
Adam Nemet d2fa414718 [LoopDist] Add llvm.loop.distribute.enable loop metadata
Summary:
D19403 adds a new pragma for loop distribution.  This change adds
support for the corresponding metadata that the pragma is translated to
by the FE.

As part of this I had to rethink the flag -enable-loop-distribute.  My
goal was to be backward compatible with the existing behavior:

  A1. pass is off by default from the optimization pipeline
  unless -enable-loop-distribute is specified

  A2. pass is on when invoked directly from opt (e.g. for unit-testing)

The new pragma/metadata overrides these defaults so the new behavior is:

  B1. A1 + enable distribution for individual loop with the pragma/metadata

  B2. A2 + disable distribution for individual loop with the pragma/metadata

The default value whether the pass is on or off comes from the initiator
of the pass.  From the PassManagerBuilder the default is off, from opt
it's on.

I moved -enable-loop-distribute under the pass.  If the flag is
specified it overrides the default from above.

Then the pragma/metadata can further modifies this per loop.

As a side-effect, we can now also use -enable-loop-distribute=0 from opt
to emulate the default from the optimization pipeline.  So to be precise
this is the new behavior:

  C1. pass is off by default from the optimization pipeline
  unless -enable-loop-distribute or the pragma/metadata enables it

  C2. pass is on when invoked directly from opt
  unless -enable-loop-distribute=0 or the pragma/metadata disables it

Reviewers: hfinkel

Subscribers: joker.eph, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19431

llvm-svn: 267672
2016-04-27 05:28:18 +00:00
Vaivaswatha Nagaraj 08efb0efcd [Cloning] cloneLoopWithPreheader(): add assert to ensure no sub-loops
Summary:
cloneLoopWithPreheader() does not update LoopInfo for sub-loop of
the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned.

Reviewers: anemet, ashutosh.nema, hfinkel

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D15922

llvm-svn: 267671
2016-04-27 05:25:09 +00:00
Evgeny Stupachenko 23ce61b663 The patch fixes PR27392.
Summary:
 It is incorrect to compare TripCount (which is BECount + 1)
  with extraiters (or Count) to check if we should enter unrolled
  loop or not, because TripCount can potentially overflow
  (when BECount is max unsigned integer).
 While comparing BECount with (Count - 1) is overflow safe and
  therefore correct.

Reviewer: hfinkel

Differential Revision: http://reviews.llvm.org/D19256

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 267662
2016-04-27 03:04:54 +00:00
Sanjoy Das 5253a089ba Fix typo in comment; NFC
llvm-svn: 267653
2016-04-27 01:44:31 +00:00
Mehdi Amini b4e1e8297b ThinLTO: do not promote GlobalVariable that have a specific section.
Differential Revision: http://reviews.llvm.org/D18298

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267646
2016-04-27 00:32:13 +00:00
Matt Arsenault ba437c67d2 SLSR: Use UnknownAddressSpace instead of 0 for pure arithmetic.
In the case where isLegalAddressingMode is used for cases
not related to addressing modes, such as pure adds and muls,
it should not be using address space 0. LSR already passes -1
as the address space in these cases.

llvm-svn: 267645
2016-04-27 00:32:09 +00:00
Adam Nemet 61399ac424 [LoopDist] Split main class. NFC
This splits out the per-loop functionality from the Pass class.

With this the fact whether the loop is forced-distribute with the new
metadata/pragma can be cached in the per-loop class rather than passed
around.

llvm-svn: 267643
2016-04-27 00:31:03 +00:00
Justin Bogner c2bf63d29d PM: Port Reassociate to the new pass manager
llvm-svn: 267631
2016-04-26 23:39:29 +00:00
Justin Bogner cb8a21c88e Reassociate: Convert another functor into a lambda. NFC
Also move the explanatory comment with it.

llvm-svn: 267628
2016-04-26 23:32:00 +00:00
Sanjay Patel 29dea0d230 [SimplifyCFG] propagate branch metadata when creating select
llvm-svn: 267624
2016-04-26 23:15:48 +00:00
Sanjay Patel d2d2aa52cd [LowerExpectIntrinsic] make default likely/unlikely ratio bigger
We need the default ratio to be sufficiently large that it triggers transforms 
based on block frequency info (BFI) and plays well with the recently introduced
BranchProbability used by CGP.

Differential Revision: http://reviews.llvm.org/D19435

llvm-svn: 267615
2016-04-26 22:23:38 +00:00
Justin Bogner 90744d215b Reassociate: Simplify using lambdas. NFC
llvm-svn: 267614
2016-04-26 22:22:18 +00:00
David Majnemer abb9f55c80 Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes"
The destination buffer that sprintf uses is restrict qualified, we do
not need to worry about derived pointers referenced via format
specifiers.

This reverts commit r267580.

llvm-svn: 267605
2016-04-26 21:04:47 +00:00
Elena Demikhovsky 308a7eb0d2 Masked Store in Loop Vectorizer - bugfix
Fixed a bug in loop vectorization with conditional store.

Differential Revision: http://reviews.llvm.org/D19532

llvm-svn: 267597
2016-04-26 20:18:04 +00:00
Justin Bogner 4563a06cee PM: Port Internalize to the new pass manager
llvm-svn: 267596
2016-04-26 20:15:52 +00:00
David Majnemer 8cd77baebc [SimplifyLibCalls] sprintf doesn't copy null bytes
sprintf doesn't read or copy the terminating null byte from it's string
operands.  sprintf will append it's own after processing all of the
format specifiers.

This fixes PR27526.

llvm-svn: 267580
2016-04-26 18:16:49 +00:00
Dehao Chen 5d6d4841ed Tune basic block annotation algorithm.
Summary:
Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary.

This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19301

llvm-svn: 267519
2016-04-26 04:59:11 +00:00
Hal Finkel e4c0c1679b [SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.

llvm-svn: 267515
2016-04-26 02:06:06 +00:00
Hal Finkel 411d31ad72 [LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops
I really thought we were doing this already, but we were not. Given this input:

void Test(int *res, int *c, int *d, int *p) {
  for (int i = 0; i < 16; i++)
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}

we did not vectorize the loop. Even with "assume_safety" the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent deferenceability) was not elided.

One subtlety: As implemented, it will still prefer to use a masked-load
instrinsic (given target support) over the speculated load. The choice here
seems architecture specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.

The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if conversion is okay.

Differential Revision: http://reviews.llvm.org/D19512

llvm-svn: 267514
2016-04-26 02:00:36 +00:00
David Majnemer 30ffc4ce45 [SROA] Don't falsely report that changes have occured
We would report that the function changed despite creating no new
allocas or performing any promotion.

This fixes PR27316.

llvm-svn: 267507
2016-04-26 01:05:00 +00:00
Justin Bogner 1a07501379 PM: Port GlobalOpt to the new pass manager
llvm-svn: 267499
2016-04-26 00:28:01 +00:00
Justin Bogner d2f3d0a79d PM: Convert the logic for GlobalOpt into static functions. NFC
Pass all of the state we need around as arguments, so that these
functions are easier to reuse. There is one part of this that is
unusual: we pass around a functor to look up a DomTree for a function.
This will be a necessary abstraction when we try to use this code in
both the legacy and the new pass manager.

llvm-svn: 267498
2016-04-26 00:27:56 +00:00
Arch D. Robison be0490a6e8 Optimize store of "bitcast" from vector to aggregate.
This patch is what was the "instcombine" portion of D14185, with an additional 
test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). 
The patch causes instcombine to replace sequences of extractelement-insertvalue-store 
that act essentially like a bitcast followed by a store.

Differential review: http://reviews.llvm.org/D14260

llvm-svn: 267482
2016-04-25 22:22:39 +00:00
Teresa Johnson c851d216e2 [ThinLTO] Introduce typedef for commonly-used map type (NFC)
Add a typedef for the std::map<GlobalValue::GUID, GlobalValueSummary *>
map that is passed around to identify summaries for values defined in a
particular module. This shortens up declarations in a variety of places.

llvm-svn: 267471
2016-04-25 21:09:51 +00:00
Etienne Bergeron 50f02aa3fa Cleanup redundant expression in InstCombineAndOrXor.
Summary:
The expression is redundant on both side of operator |.

detected by : http://reviews.llvm.org/D19451

Reviewers: rnk, majnemer

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D19459

llvm-svn: 267458
2016-04-25 20:15:33 +00:00
Chad Rosier e2cbd13e56 [ValueTracking] Improve isImpliedCondition when the dominating cond is false.
llvm-svn: 267430
2016-04-25 17:23:36 +00:00
Anna Thomas 95f68aa7eb Test commit: modified comment. NFC
llvm-svn: 267406
2016-04-25 13:58:05 +00:00
James Molloy eb040cc55f [GlobalOpt] Allow constant globals to be SRA'd
The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!).

There seems to be no reason why we can't SRA constants too, so let's do it.

llvm-svn: 267393
2016-04-25 10:48:29 +00:00
Mehdi Amini bf4513b9aa Run GlobalOpt before emitting the bitcode for ThinLTO
This is motivated by reducing the size of the IR and thus reduce
compile time.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267385
2016-04-25 08:47:49 +00:00
Mehdi Amini f72ca86b71 ThinLTO: Move createNameAnonFunctionPass insertion in PassManagerBuilder (NFC)
It is just code motion, but makes more sense this way.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267384
2016-04-25 08:47:37 +00:00
Simon Pilgrim 4c564ad4dd Tweak comments to make it clear that these combines are for SSE scalar instructions.
llvm-svn: 267360
2016-04-24 19:31:56 +00:00
Simon Pilgrim 4b5462f119 [InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required
As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.

llvm-svn: 267359
2016-04-24 18:35:59 +00:00
Simon Pilgrim 83020942d3 [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2)
Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).

2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements

3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly

We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).

Differential Revision: http://reviews.llvm.org/D19318

llvm-svn: 267357
2016-04-24 18:23:14 +00:00
Simon Pilgrim 424da1637a [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2)
This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)

2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input

Differential Revision: http://reviews.llvm.org/D17490

llvm-svn: 267356
2016-04-24 18:12:42 +00:00
Simon Pilgrim 1c9a9f255c [InstCombine] Avoid updating argument demanded elements in separate passes.
As discussed on D17490, we should attempt to update an intrinsic's arguments demanded elements in one pass if we can.

llvm-svn: 267355
2016-04-24 17:57:27 +00:00
Simon Pilgrim 2f6097d113 [X86][InstCombine] Tidyup VPERMILVAR -> shufflevector conversion to helper function. NFCI.
llvm-svn: 267352
2016-04-24 17:23:46 +00:00
Simon Pilgrim c0c56e747a [X86][InstCombine] Tidyup PSHUFB -> shufflevector conversion to helper function. NFCI.
llvm-svn: 267351
2016-04-24 17:00:34 +00:00
Teresa Johnson 28e457bccd [ThinLTO] Remove GlobalValueInfo class from index
Summary:
Remove the GlobalValueInfo and change the ModuleSummaryIndex to directly
reference summary objects. The info structure was there to support lazy
parsing of the combined index summary objects, which is no longer
needed and not supported.

Reviewers: joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19462

llvm-svn: 267344
2016-04-24 14:57:11 +00:00
Mehdi Amini cb87494f4c Always traverse GlobalVariable initializer when computing the export list
Summary:
We are always importing the initializer for a GlobalVariable.
So if a GlobalVariable is in the export-list, we pull in any
refs as well.

Reviewers: tejohnson

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19102

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267303
2016-04-23 23:29:24 +00:00
Sanjay Patel dc88bd6e1f replace duplicated static functions for profile metadata access with BranchInst member function; NFCI
llvm-svn: 267295
2016-04-23 20:01:22 +00:00
Sanjay Patel 85ce0f1f1f improve documentation comments; NFC
llvm-svn: 267292
2016-04-23 16:31:48 +00:00
Nico Weber 0aa9845d15 Revert r267210, it makes clang assert (PR27490).
llvm-svn: 267232
2016-04-22 22:08:42 +00:00
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Rong Xu f8f051cbf5 [PGO] change the interface for createPGOFuncNameMetadata()
This patch changes the interface for createPGOFuncNameMetadata() where we add
another PGOFuncName argument.

Differential Revision: http://reviews.llvm.org/D19433

llvm-svn: 267216
2016-04-22 21:00:17 +00:00
Philip Reames 5f0e36947b [unordered] sink unordered stores at end of blocks
The existing code turned out to be completely correct when auditted.  Thus, only minor code changes and adding a couple of tests.

llvm-svn: 267215
2016-04-22 20:53:32 +00:00
Sanjoy Das f97229d6ba Fold compares for distinct allocations
Summary:
We can fold compares to false when two distinct allocations within a
function are compared for equality.

Patch by Anna Thomas!

Reviewers: majnemer, reames, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19390

llvm-svn: 267214
2016-04-22 20:52:25 +00:00
Philip Reames eedef73b63 [unordered] Extend load/store type canonicalization to handle unordered operations
Extend the type canonicalization logic to work for unordered atomic loads and stores.  Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before.  Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered.  If you see problems, feel free to revert this change, but please make sure you collect a test case.  

llvm-svn: 267210
2016-04-22 20:33:48 +00:00
Justin Bogner b93949089e PM: Port SinkingPass to the new pass manager
llvm-svn: 267199
2016-04-22 19:54:10 +00:00
Justin Bogner 82077c4ab0 PM: Reorder the functions used for SinkingPass. NFC
This will make the port to the new PM easier to follow.

llvm-svn: 267198
2016-04-22 19:54:04 +00:00
Jun Bum Lim d29a24e4fd [DeadStoreElimination] Shorten beginning of memset overwritten by later stores
Summary: This change will shorten memset if the beginning of memset is overwritten by later stores.

Reviewers: hfinkel, eeckstein, dberlin, mcrosier

Subscribers: mgrang, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18906

llvm-svn: 267197
2016-04-22 19:51:29 +00:00
Justin Bogner 395c2127ed PM: Port DCE to the new pass manager
Also add a very basic test, since apparently there aren't any tests
for DCE whatsoever to add the new pass version to.

llvm-svn: 267196
2016-04-22 19:40:41 +00:00
Adam Nemet fe3def7c2a [LoopUtils] Extend findStringMetadataForLoop to return the value for metadata
E.g. for:

  !1 = {"llvm.distribute", i32 1}

it now returns the MDOperand for 1.

I will use this in LoopDistribution to check the value of the metadata.

Note that the change is backward-compatible with its current use in
LoopVersioningLICM.  An Optional implicitly converts to a bool depending
whether it contains a value or not.

llvm-svn: 267190
2016-04-22 19:10:05 +00:00
Chad Rosier 1a4bc110f5 [EarlyCSE/CVP] Add stats for CVPs and make sure to account for any Changes.
llvm-svn: 267187
2016-04-22 18:47:21 +00:00
Geoff Berry 9fe26e6dc9 [MemorySSA] Fix bug in CachingMemorySSAWalker::invalidateInfo
Summary:
CachingMemorySSAWalker::invalidateInfo was using IsCall to determine
which cache map needed to be cleared of entries referring to the invalidated
MemoryAccess, but there could also be entries referring to it in the
other cache map (value entries, not key entries).  This change just
clears both tables to be conservatively correct.

Also add a verifyRemoved() function, called when expensive
checks (i.e. XDEBUG) are enabled to verify that the invalidated
MemoryAccess object is not referenced in any of the caches.

Reviewers: dberlin, george.burgess.iv

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19388

llvm-svn: 267157
2016-04-22 14:44:10 +00:00
David Majnemer bfd695d591 [EarlyCSE] Don't add the overflow flags to the hash
We take the intersection of overflow flags while CSE'ing.
This permits us to consider two instructions with different overflow
behavior to be replaceable.

llvm-svn: 267153
2016-04-22 14:12:50 +00:00
Silviu Baranga e985c76b90 [InstCombine] Preserve fast math flags when combining PHIs
Summary:
When optimizing PHIs which have inputs floating point binary
operators, we preserve all IR flags except the fast math
flags.

This change removes the logic which tracked some of the IR flags
(no wrap, exact) and replaces it by doing an and on the IR flags of
all inputs to the PHI - which will also handle the fast math
flags.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19370

llvm-svn: 267139
2016-04-22 11:21:36 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
David Majnemer d0ce8f1485 [GVN] Respect fast-math-flags on fcmps
We assumed that flags were only present on binary operators.  This is
not true, they may also be present on calls and fcmps.

llvm-svn: 267113
2016-04-22 06:37:51 +00:00
David Majnemer 9554c1339c [EarlyCSE] Take the intersection of flags on instructions
EarlyCSE had inconsistent behavior with regards to flag'd instructions:
- In some cases, it would pessimize if the available instruction had
  different flags by not performing CSE.
- In other cases, it would miscompile if it replaced an instruction
  which had no flags with an instruction which has flags.

Fix this by being more consistent with our flag handling by utilizing
andIRFlags.

llvm-svn: 267111
2016-04-22 06:37:45 +00:00
Duncan P. N. Exon Smith 71480bd0c7 ValueMapper/Enumerator: Clean up code in post-order traversals, NFC
Re-layer the functions in the new (i.e., newly correct) post-order
traversals in ValueEnumerator (r266947) and ValueMapper (r266949).
Instead of adding a node to the worklist in a helper function and
returning a flag to say what happened, return the node itself.  This
makes the code way cleaner: the worklist is local to the main function,
there is no flag for an early loop exit (since we can cleanly bury the
loop), and it's perfectly clear when pointers into the worklist might be
invalidated.

I'm fixing both algorithms in the same commit to avoid repeating the
commit message; if you take the time to understand one the other should
be easy.  The diff itself isn't entirely obvious since the traversals
have some noise (i.e., things to do), but here's the high-level change:

    auto helper = [&WL](T *Op) {     auto helper = [](T **&I, T **E) {
                                 =>    while (I != E) {
      if (shouldVisit(Op)) {             T *Op = *I++;
        WL.push(Op, Op->begin());        if (shouldVisit(Op)) {
        return true;                       return Op;
      }                                }
      return false;                    return nullptr;
    };                               };
                                 =>
    WL.push(S, S->begin());          WL.push(S, S->begin());
    while (!empty()) {               while (!empty()) {
      auto *N = WL.top().N;            auto *N = WL.top().N;
      auto *&I = WL.top().I;           auto *&I = WL.top().I;
      bool DidChange = false;
      while (I != N->end())
        if (helper(*I++)) {      =>    if (T *Op = helper(I, N->end()) {
          DidChange = true;              WL.push(Op, Op->begin());
          break;                         continue;
        }                              }
      if (DidChange)
        continue;

      POT.push(WL.pop());        =>    POT.push(WL.pop());
    }                                }

Thanks to Mehdi for helping me find a better way to layer this.

llvm-svn: 267099
2016-04-22 02:33:06 +00:00
Mike Aizatsky 243b71fd8b Fixed flag description
Summary:
asan-use-after-return control feature we call use-after-return or
stack-use-after-return.

Reviewers: kcc, aizatsky, eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19284

llvm-svn: 267064
2016-04-21 22:00:13 +00:00
Derek Bruening d862c178b0 [esan] EfficiencySanitizer instrumentation pass
Summary:
Adds an instrumentation pass for the new EfficiencySanitizer ("esan")
performance tuning family of tools.  Multiple tools will be supported
within the same framework.  Preliminary support for a cache fragmentation
tool is included here.

The shared instrumentation includes:
+ Turn mem{set,cpy,move} instrinsics into library calls.
+ Slowpath instrumentation of loads and stores via callouts to
  the runtime library.
+ Fastpath instrumentation will be per-tool.
+ Which memory accesses to ignore will be per-tool.

Reviewers: eugenis, vitalybuka, aizatsky, filcab

Subscribers: filcab, vkalintiris, pcc, silvas, llvm-commits, zhaoqin, kcc

Differential Revision: http://reviews.llvm.org/D19167

llvm-svn: 267058
2016-04-21 21:30:22 +00:00
JF Bastien c22d29982b NFC: fix copy / paste comment
llvm-svn: 267039
2016-04-21 19:53:39 +00:00
JF Bastien 3e2e69f607 NFC: fix nonsensical comment
llvm-svn: 267036
2016-04-21 19:41:48 +00:00
Sanjoy Das a085cfc150 Folding compares with unescaped allocations
Summary:
If we know that the pointer allocated within a function does not escape,
we can fold away comparisons that are done with global pointers

Patch by Anna Thomas!

Reviewers: reames, majnemer, sanjoy

Subscribers: mgrang, mcrosier, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D19276

llvm-svn: 267035
2016-04-21 19:26:45 +00:00