Commit Graph

17728 Commits

Author SHA1 Message Date
Jonas Paulsson 22776892c9 [SLPVectorizer] Pass the right type argument to getCmpSelInstrCost()
In getEntryCost(), make the scalar type for a compare instruction that of the
operands, not i1. This is needed in order to call getCmpSelInstrCost() for a
compare in a sensible way, the same way as the LoopVectorizer does.

New test: test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll

Review: Matthew Simpson
https://reviews.llvm.org/D31601

llvm-svn: 300061
2017-04-12 13:29:25 +00:00
Jonas Paulsson 592dbea779 [LoopVectorizer] Improve handling of branches during cost estimation.
The cost for a branch after vectorization is very different depending on if
the vectorizer will if-convert the block (branch is eliminated), or if
scalarized and predicated blocks will be produced (branch duplicated before
each block). There is also the case of remaining scalar branches, such as the
back-edge branch.

This patch handles these cases differently with TTI based cost estimates.

Review: Matthew Simpson
https://reviews.llvm.org/D31175

llvm-svn: 300058
2017-04-12 13:13:15 +00:00
Jonas Paulsson da74ed42da [LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore()
Since SystemZ supports vector element load/store instructions, there is no
need for extracts/inserts if a vector load/store gets scalarized.

This patch lets Target specify that it supports such instructions by means of
a new TTI hook that defaults to false.

The use for this is in the LoopVectorizer getScalarizationOverhead() method,
which will with this patch produce a smaller sum for a vector load/store on
SystemZ.

New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll

Review: Adam Nemet
https://reviews.llvm.org/D30680

llvm-svn: 300056
2017-04-12 12:41:37 +00:00
Jonas Paulsson fccc7d66c3 [SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

llvm-svn: 300052
2017-04-12 11:49:08 +00:00
Bjorn Pettersson 4af0593ecc [LoadCombine] Avoid analysing dead basic blocks
Summary:
Dead basic blocks may be forming a loop, for which SSA form is
fulfilled, but with a circular def-use chain. LoadCombine could
enter an infinite loop when analysing such dead code. This patch
solves the problem by simply avoiding to analyse all basic blocks
that aren't forward reachable, from function entry, in LoadCombine.

Fixes https://bugs.llvm.org/show_bug.cgi?id=27065

Reviewers: mehdi_amini, chandlerc, grosser, Bigcheese, davide

Reviewed By: davide

Subscribers: dberlin, zzheng, bjope, grandinj, Ka-Ka, materi, jholewinski, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D31032

llvm-svn: 300034
2017-04-12 08:07:55 +00:00
Chandler Carruth 927d8e610a [IR] Redesign the case iterator in SwitchInst to actually be an iterator
and to expose a handle to represent the actual case rather than having
the iterator return a reference to itself.

All of this allows the iterator to be used with common STL facilities,
standard algorithms, etc.

Doing this exposed some missing facilities in the iterator facade that
I've fixed and required some work to the actual iterator to fully
support the necessary API.

Differential Revision: https://reviews.llvm.org/D31548

llvm-svn: 300032
2017-04-12 07:27:28 +00:00
Craig Topper b5194eeebf [InstCombine][IR] Add a commutable BinOp matcher. Use it to reduce some code. NFC
llvm-svn: 300030
2017-04-12 05:49:28 +00:00
Bob Haarman 4075ccc717 ThinLTOBitcodeWriter: keep comdats together, rename if leader is renamed
Summary:
COFF requires that every comdat contain a symbol with the same name as
the comdat. ThinLTOBitcodeWriter renames symbols, which may cause this
requirement to be violated. This change avoids such violations by
renaming comdats if their leaders are renamed. It also keeps comdats
together when splitting modules.

Reviewers: pcc, mehdi_amini, tejohnson

Reviewed By: pcc

Subscribers: rnk, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31963

llvm-svn: 300019
2017-04-12 01:43:07 +00:00
Reid Kleckner c2cb560045 [IR] Add AttributeSet to hide AttributeSetNode* again, NFC
Summary:
For now, it just wraps AttributeSetNode*. Eventually, it will hold
AvailableAttrs as an inline bitset, and adding and removing enum
attributes will be super cheap.

This sinks AttributeSetNode back down to lib/IR/AttributeImpl.h.

Reviewers: pete, chandlerc

Subscribers: llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D31940

llvm-svn: 300014
2017-04-12 00:38:00 +00:00
Evgeniy Stepanov 90fd87303c [asan] Give global metadata private linkage.
Internal linkage preserves names like "__asan_global_foo" which may
account to 2% of unstripped binary size.

llvm-svn: 299995
2017-04-11 22:28:13 +00:00
Anna Thomas 00dc1b74b7 [LV] Avoid vectorizing first order recurrence when phi uses are outside loop
In the vectorization of first order recurrence, we vectorize such
that the last element in the vector will be the one extracted to pass into the
scalar remainder loop. However, this is not true when there is a phi (other
than the primary induction variable) is used outside the loop.
In such a case, we need the value from the second last iteration (i.e.
the phi value), not the last iteration (which would be the phi update).
I've added a test case for this. Also see PR32396.

A follow up patch would generate the correct code gen for such cases,
and turn this vectorization on.

Differential Revision: https://reviews.llvm.org/D31910

Reviewers: mssimpso
llvm-svn: 299985
2017-04-11 21:02:00 +00:00
Daniel Berlin 554dcd8c89 MemorySSA: Move to Analysis, from Transforms/Utils. It's used as
Analysis, it has Analysis passes, and once NewGVN is made an Analysis,
this removes the cross dependency from Analysis to Transform/Utils.
NFC.

llvm-svn: 299980
2017-04-11 20:06:36 +00:00
Andrea Di Biagio 8e26936bfd [AddDiscriminators] Assign discriminators to MemIntrinsic calls.
Before this patch, pass AddDiscriminators always avoided to assign
discriminators to intrinsic calls. This was done mainly for two reasons:
 1) We wanted to minimize the number of based discriminators used.
 2) We wanted to avoid non-deterministic discriminator assignment for
    different debug levels.

Unfortunately, that approach was problematic for MemIntrinsic calls.
MemIntrinsic calls can be split by SROA into loads and stores, and each new
load/store instruction would obtain the debug location from the original
intrinsic call.
If we don't assign a discriminator to MemIntrinsic calls, then we cannot
correctly set the discriminator for the newly created loads and stores.
This may have a negative impact on the basic block weight computation
performed by the SampleLoader.

This patch fixes the issue by letting MemIntrinsic calls have a discriminator.

Differential Revision: https://reviews.llvm.org/D31900

llvm-svn: 299972
2017-04-11 19:07:30 +00:00
Craig Topper 957a94cc03 Fix spelling compliment->complement. Mostly refering to 2s complement. NFC
llvm-svn: 299970
2017-04-11 18:47:58 +00:00
Craig Topper 271b2245f4 [InstCombine] Use ConstantExpr::getBinOpIdentity to implement getIdentityValue.
This removes a TODO in getIdentityValue and may allow some transforms to occur earlier. But I was unable to find any transforms we didn't already handle.

llvm-svn: 299966
2017-04-11 17:42:40 +00:00
Sanjay Patel 28611acef9 revert r299851 - [InstCombine] fix matching of or-of-icmps constants (PR32524)
This is a candidate culprit for multiple bot fails, so reverting pending investigation.

llvm-svn: 299955
2017-04-11 15:57:32 +00:00
Serge Guelton 59a2d7b909 Module::getOrInsertFunction is using C-style vararg instead of variadic templates.
From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments.
The variadic template is an obvious solution to both issues.

Differential Revision: https://reviews.llvm.org/D31070

llvm-svn: 299949
2017-04-11 15:01:18 +00:00
Geoff Berry 9d597adde4 [GVNHoist] Re-enable GVNHoist by default
Turn GVNHoist back on by default now that PR32153 has been fixed.

llvm-svn: 299944
2017-04-11 14:36:30 +00:00
Keno Fischer 30779772cf [StripDeadDebug/DIFinder] Track inlined SPs
Summary:
In rL299692 I improved strip-dead-debug-info's ability to drop CUs that are not
referenced from the current module. However, in doing so I neglected to realize
that some SPs could be referenced entirely from inlined functions. It appears
I was not the only one to make this mistake, because DebugInfoFinder, doesn't
find those SPs either. Fix this in DebugInfoFinder and then use that to make
sure not to drop those CUs in strip-dead-debug-info.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31904

llvm-svn: 299936
2017-04-11 13:32:11 +00:00
Diana Picus b050c7fbe0 Revert "Turn some C-style vararg into variadic templates"
This reverts commit r299925 because it broke the buildbots. See e.g.
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008

llvm-svn: 299928
2017-04-11 10:07:12 +00:00
Serge Guelton 5fd75fb72e Turn some C-style vararg into variadic templates
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user prospective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's not type checking on the
arguments. The variadic template is an obvious solution to both
issues.

llvm-svn: 299925
2017-04-11 08:36:52 +00:00
Sylvestre Ledru 06faa9bf32 Simplify the code and remove dead code
Summary: Fix coverity cid 1374240

Reviewers: dberlin

Reviewed By: dberlin

Differential Revision: https://reviews.llvm.org/D31928

llvm-svn: 299924
2017-04-11 08:21:27 +00:00
Craig Topper 8c75adf95b [InstCombine] Refinement of r299915. Only consider a ConstantVector for Neg if all the elements are Undef or ConstantInt.
llvm-svn: 299917
2017-04-11 06:32:48 +00:00
Craig Topper 18f9e424e7 [InstCombine] Support weird size element types in dyn_castNegVal.
llvm-svn: 299915
2017-04-11 05:42:47 +00:00
Hal Finkel b63ed91549 [LICM] Hoist fp division from the loops and replace by a reciprocal
When allowed, we can hoist a division out of a loop in favor of a
multiplication by the reciprocal. Fixes PR32157.

Patch by vit9696!

Differential Revision: https://reviews.llvm.org/D30819

llvm-svn: 299911
2017-04-11 02:22:54 +00:00
Daniel Berlin bf80cfe6b6 Revert "NewGVN: Don't propagate over phi backedges where undef causes us to have >1 value."
It's not ready yet this was an accidental commit :(

This reverts r299903

llvm-svn: 299904
2017-04-11 00:07:26 +00:00
Daniel Berlin 3938111fe7 NewGVN: Don't propagate over phi backedges where undef causes us to have >1 value.
Fixes PR 32607.

llvm-svn: 299903
2017-04-11 00:02:38 +00:00
Reid Kleckner eb9dd5b87f Reland "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies"
This re-lands r299875.

I introduced a bug in Clang code responsible for replacing K&R, no
prototype declarations with a real function definition with a prototype.
The bug was here:

       // Collect any return attributes from the call.
  -    if (oldAttrs.hasAttributes(llvm::AttributeList::ReturnIndex))
  -      newAttrs.push_back(llvm::AttributeList::get(newFn->getContext(),
  -                                                  oldAttrs.getRetAttributes()));
  +    newAttrs.push_back(oldAttrs.getRetAttributes());

Previously getRetAttributes() carried AttributeList::ReturnIndex in its
AttributeList. Now that we return the AttributeSetNode* directly, it no
longer carries that index, and we call this overload with a single node:
  AttributeList::get(LLVMContext&, ArrayRef<AttributeSetNode*>)

That aborted with an assertion on x86_32 targets. I added an explicit
triple to the test and added CHECKs to help find issues like this in the
future sooner.

llvm-svn: 299899
2017-04-10 23:31:05 +00:00
Davide Italiano f58a30236b [NewGVN] Surround with parens to clarify allegedly ambiguous precedence.
This Placates GCC7 with -Werror. Also, clang-format the assertions
while I'm here.

llvm-svn: 299895
2017-04-10 23:08:35 +00:00
Davide Italiano fa6a0a819d [MemorySSA] We don't need to compute dominator levels anymore.
Differential Revision:  https://reviews.llvm.org/D31818

llvm-svn: 299893
2017-04-10 22:44:46 +00:00
Matt Arsenault 3c1fc768ed Allow DataLayout to specify addrspace for allocas.
LLVM makes several assumptions about address space 0. However,
alloca is presently constrained to always return this address space.
There's no real way to avoid using alloca, so without this
there is no way to opt out of these assumptions.

The problematic assumptions include:
- That the pointer size used for the stack is the same size as
  the code size pointer, which is also the maximum sized pointer.

- That 0 is an invalid, non-dereferencable pointer value.

These are problems for AMDGPU because alloca is used to
implement the private address space, which uses a 32-bit
index as the pointer value. Other pointers are 64-bit
and behave more like LLVM's notion of generic address
space. By changing the address space used for allocas,
we can change our generic pointer type to be LLVM's generic
pointer type which does have similar properties.

llvm-svn: 299888
2017-04-10 22:27:50 +00:00
Dehao Chen d4a3397861 Emit less compiler optimization remarks in samplepgo to reduce a call to findCalleeFunctionSamples which is going to be refactored.
Summary: Now the SamplePGO support is more stable, we do not need so many verbose optimization remarks emitted.

Reviewers: dnovillo, davidxl

Reviewed By: davidxl

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D31826

llvm-svn: 299883
2017-04-10 20:49:16 +00:00
Geoff Berry 635e505675 [GVNHoist] Call isGuaranteedToTransferExecutionToSuccessor on each instruction
w.r.t. https://bugs.llvm.org/show_bug.cgi?id=32153
The consensus seems to be isGuaranteedToTransferExecutionToSuccessor should be called for each function.

Patch by Aditya Kumar

Differential Revision: https://reviews.llvm.org/D31035

llvm-svn: 299882
2017-04-10 20:45:17 +00:00
Evgeniy Stepanov ed7fce7c84 Revert "[asan] Put ctor/dtor in comdat."
This reverts commit r299696, which is causing mysterious test failures.

llvm-svn: 299880
2017-04-10 20:36:36 +00:00
Evgeniy Stepanov ba7c2e9661 Revert "[asan] Fix dead stripping of globals on Linux."
This reverts commit r299697, which caused a big increase in object file size.

llvm-svn: 299879
2017-04-10 20:36:30 +00:00
Reid Kleckner 211b1f324f Revert "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies"
This reverts r299875. A Linux bot came back with a test failure:
http://bb.pgr.jp/builders/test-clang-i686-linux-RA/builds/741/steps/test_clang/logs/Clang%20%3A%3A%20CodeGen__2006-05-19-SingleEltReturn.c

llvm-svn: 299878
2017-04-10 20:34:19 +00:00
Reid Kleckner 324c99dee5 [IR] Make AttributeSetNode public, avoid temporary AttributeList copies
Summary:
AttributeList::get(Fn|Ret|Param)Attributes no longer creates a temporary
AttributeList just to hide the AttributeSetNode type.

I've also added a factory method to create AttributeLists from a
parallel array of AttributeSetNodes. I think this simplifies
construction of AttributeLists when rewriting function prototypes.
Previously we would test if a particular index had attributes, and
conditionally add a temporary attribute list to a vector. Now the
attribute set vector is parallel to the argument vector already that
these passes already construct.

My long term vision is to wrap AttributeSetNode* inside an AttributeSet
type that holds the enum attributes, but that will come in a follow up
change.

I haven't done any performance measurements for this change because
profiling hasn't shown that any of the affected code is hot.

Reviewers: pete, chandlerc, sanjoy, hfinkel

Reviewed By: pete

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31198

llvm-svn: 299875
2017-04-10 20:18:10 +00:00
Sanjay Patel e4159d2238 [InstCombine] improve variable names; NFCI
llvm-svn: 299871
2017-04-10 19:38:36 +00:00
Matt Arsenault daa08875b3 [MemCpyOpt] Only replace memcpy with bitcast if address spaces match
Patch by James Price

llvm-svn: 299866
2017-04-10 19:00:25 +00:00
Daniel Berlin 74603a68ef MemorySSA: Make lifetime starts defs for mustaliased pointers
Summary:
While we don't want them aliasing with other pointers, there seems to
be no point in not having them clobber must-aliased'd pointers.

If some day, we split the aliasing and ordering chains, we'd make this
not aliasing but an ordering barrier (IE it doesn't affect it's
memory, but we can't hoist it above it).

Reviewers: hfinkel, george.burgess.iv

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31865

llvm-svn: 299865
2017-04-10 18:46:00 +00:00
Craig Topper 0d830ff7bf [InstCombine] Use commutable matchers and m_OneUse in visitSub to shorten code. Add missing test cases.
In one case I removed commute handling for a multiply with a constant since we'll eventually get the constant on the right hand side.

llvm-svn: 299863
2017-04-10 18:09:25 +00:00
Craig Topper 98851adc2a [InstCombine] Use m_c_Add to shorten some code. Add testcases for this fold since they were missing. NFC
llvm-svn: 299853
2017-04-10 16:59:40 +00:00
Sanjay Patel 570e35c157 [InstCombine] fix matching of or-of-icmps constants (PR32524)
Also, make the same change in and-of-icmps and remove a hack for detecting that case.

Finally, add some FIXME comments because the code duplication here is awful.

This should fix the remaining IR problem noted in:
https://bugs.llvm.org/show_bug.cgi?id=32524

llvm-svn: 299851
2017-04-10 16:55:57 +00:00
Craig Topper 3eec73e20b [InstCombine] Support folding of add instructions with vector constants into select operations
We currently only fold scalar add of constants into selects. This improves this to support vectors too.

Differential Revision: https://reviews.llvm.org/D31683

llvm-svn: 299847
2017-04-10 16:40:00 +00:00
Craig Topper 31cc143b51 [InstCombine] Use commutable and/or/xor matchers to simplify some code
Summary:
This is my first time using the commutable matchers so wanted to make sure I was doing it right.

Are there any other matcher tricks to further shrink this? Can we commute the whole match so we don't have to LHS and RHS separately?

Reviewers: davide, spatel

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31680

llvm-svn: 299840
2017-04-10 07:13:40 +00:00
Craig Topper 838d13e7ee [InstCombine] Make sure we preserve fast math flags when folding fp instructions into phi nodes
Summary: I noticed in the select folding code that we copied fast math flags, but did not do the same for the similar handling in phi nodes. This patch fixes that to do the same thing as select

Reviewers: spatel, davide, majnemer, hfinkel

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31690

llvm-svn: 299838
2017-04-10 07:00:10 +00:00
Craig Topper d8840d7b10 [InstCombine] use m_c_And and m_c_Xor to handle commuted versions of a transform.
llvm-svn: 299837
2017-04-10 06:53:28 +00:00
Craig Topper 7639460367 [InstCombine] Remove unnecessary dyn_cast to BinaryOperator around some matcher checks in visitXor.
The matchers themselves should be enough.

llvm-svn: 299835
2017-04-10 06:53:23 +00:00
Craig Topper 4738321f0c [InstCombine] Make the (A|B)^B -> A & ~B transform code consistent with the very similar (A&B)^B -> ~A & B code. This should be NFC except for the addition of hasOneUse check.
I think this code is still overly complicated and should use matchers, but first I wanted to make it consistent.

llvm-svn: 299834
2017-04-10 06:53:21 +00:00
Craig Topper 4f16d82d6b [InstCombine] Use m_OneUse to shorten some code. NFC
llvm-svn: 299833
2017-04-10 06:53:19 +00:00
Xin Tong 34888c08bc [SCCP] Resolve indirect branch target when possible.
Summary:
Resolve indirect branch target when possible.
This potentially eliminates more basicblocks and result in better evaluation for phi and other things.

Reviewers: davide, efriedma, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30322

llvm-svn: 299830
2017-04-10 00:33:25 +00:00
Sanjay Patel 16a054d5c7 [InstCombine] remove dead cases from icmp pair switches; NFCI
"PredicatesFoldable" returns false for signed/unsigned mismatched pairs,
so these cases should never exist. We'll default to 'unreachable' on those 
predicate combos instead.

Most of what's left in these switches belongs in InstSimplify (and may 
already be there), so there's probably more that can be done to reduce
this code.

llvm-svn: 299829
2017-04-09 21:51:34 +00:00
Davide Italiano 612d5a9c5c [Mem2Reg] Remove AliasSetTracker updating logic from the pass.
No caller has been passing it for a long time.

llvm-svn: 299827
2017-04-09 20:47:14 +00:00
Hal Finkel a9d67cf601 [MemorySSA] Fix use of pointsToConstantMemory in isUseTriviallyOptimizableToLiveOnEntry
In isUseTriviallyOptimizableToLiveOnEntry, pointsToConstantMemory needs to be
called on the load's pointer operand, not on the result of the load (which
might not even be a pointer).

llvm-svn: 299823
2017-04-09 12:57:50 +00:00
Craig Topper afa07c5ef6 [InstCombine] Extend some OR combines to support vectors.
This adds support for these combines for vectors
(X^C)|Y -> (X|Y)^C iff Y&C == 0
Y|(X^C) -> (X|Y)^C iff Y&C == 0

llvm-svn: 299822
2017-04-09 06:12:41 +00:00
Craig Topper e63c21b1ba [InstCombine] Extend a canonicalization check to apply to vector constants too.
llvm-svn: 299821
2017-04-09 06:12:39 +00:00
Craig Topper 437c97622b [InstCombine] Use the SubOne helper function to shorten some code. NFC
llvm-svn: 299819
2017-04-09 06:12:34 +00:00
Craig Topper 9d1821b262 [InstCombine] rename variable for easier reading; NFC
We usually give constants a 'C' somewhere in the name...

llvm-svn: 299818
2017-04-09 06:12:31 +00:00
Gor Nishanov bfb2a9db31 [coroutines] Make CoroSplit pass deterministic
coro-split-after-phi.ll test was flaky due to non-determinism in
the coroutine frame construction that was sorting the spill
vector using a pointer to a def as a part of the key.

The sorting was intended to make sure that spills for the same def
are kept together, however, we populate the vector by processing
defs in order, so the spill entires will end up together anyways.

This change removes spill sorting and restores the determinism
in the test.

llvm-svn: 299809
2017-04-08 00:49:46 +00:00
Evgeniy Stepanov 349adbacca [cfi] Take over existing __cfi_check in CrossDSOCFI.
https://reviews.llvm.org/D31796 will emit a dummy __cfi_check in the
frontend.

llvm-svn: 299805
2017-04-07 23:00:20 +00:00
Daniel Berlin a823656ce7 NewGVN: Make CongruenceClass a real class in preparation for splitting
NewGVN into analysis and eliminator.

llvm-svn: 299792
2017-04-07 18:38:09 +00:00
Gor Nishanov 138ad6c9c0 [coroutines] Insert spills of PHI instructions correctly
Summary:
Fix a bug where we were inserting a spill in between the PHIs in the beginning of the block.
Consider this fragment:

```
begin:
  %phi1 = phi i32 [ 0, %entry ], [ 2, %alt ]
  %phi2 = phi i32 [ 1, %entry ], [ 3, %alt ]
  %sp1 = call i8 @llvm.coro.suspend(token none, i1 false)
  switch i8 %sp1, label %suspend [i8 0, label %resume
                                  i8 1, label %cleanup]
resume:
  call i32 @print(i32 %phi1)
```
Unless we are spilling the argument or result of the invoke, we were always inserting the spill immediately following the instruction.
The fix adds a check that if the spilled instruction is a PHI Node, select an appropriate insert point with `getFirstInsertionPt()` that
skips all the PHI Nodes and EH pads.

Reviewers: majnemer, rnk

Reviewed By: rnk

Subscribers: qcolombet, EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D31799

llvm-svn: 299771
2017-04-07 14:16:49 +00:00
Matthew Simpson 11fe2e9f2b Reapply r298620: [LV] Vectorize GEPs
This patch reapplies r298620. The original patch was reverted because of two
issues. First, the patch exposed a bug in InstCombine that caused the Chromium
builds to fail (PR32414). This issue was fixed in r299017. Second, the patch
introduced a bug in the vectorizer's scalars analysis that caused test suite
builds to fail on SystemZ. The scalars analysis was too aggressive and marked a
memory instruction scalar, even though it was going to be vectorized. This
issue has been fixed in the current patch and several new test cases for the
scalars analysis have been added.

llvm-svn: 299770
2017-04-07 14:15:34 +00:00
Craig Topper 33e0dbcc58 [InstCombine] Handle more commuted cases of ((A & B) | ~A) -> (~A | B)
llvm-svn: 299747
2017-04-07 07:32:00 +00:00
Daniel Berlin d952ceae2f AliasAnalysis: Be less conservative about volatile than atomic.
Summary:
getModRefInfo is meant to answer the question "what impact does this
instruction have on a given memory location" (not even another
instruction).

Long debate on this on IRC comes to the conclusion the answer should be "nothing special".

That is, a noalias volatile store does not affect a memory location
just by being volatile.  Note: DSE and GVN and memdep currently
believe this, because memdep just goes behind AA's back after it says
"modref" right now.

see line 635 of memdep. Prior to this patch we would get modref there, then check aliasing,
and if it said noalias, we would continue.

getModRefInfo *already* has this same AA check, it just wasn't being used because volatile was
lumped in with ordering.

(I am separately testing whether this code in memdep is now dead except for the invariant load case)

Reviewers: jyknight, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31726

llvm-svn: 299741
2017-04-07 01:28:36 +00:00
Craig Topper 72a622cac7 [InstCombine] Add more commuted patterns to support folding ((~A & B) | A) -> (A | B).
llvm-svn: 299737
2017-04-07 00:29:47 +00:00
Craig Topper a521c30dc6 [InstCombine] Remove testing assert I accidentally left in r299710.
llvm-svn: 299715
2017-04-06 21:29:43 +00:00
Craig Topper b4da6840d8 [InstCombine] When checking to see if we can turn subtracts of 2^n - 1 into xor, we only need to call computeKnownBits on the RHS not the whole subtract. While there use isMask instead of isPowerOf2(C+1)
Calling computeKnownBits on the RHS should allows us to recurse one step further. isMask is equivalent to the isPowerOf2(C+1) except in the case where C is all ones. But that was already handled earlier by creating a not which is an Xor with all ones. So this should be fine.

llvm-svn: 299710
2017-04-06 21:06:03 +00:00
Rong Xu 2bf4c59025 [PGO] Preserve GlobalsAA in pgo-memop-opt pass.
Preserve GlobalsAA analysis in memory intrinsic calls optimization based on
profiled size.

llvm-svn: 299707
2017-04-06 20:56:00 +00:00
Craig Topper 7226d796aa [InstCombine] Remove redundant combine from visitAnd
This combine is fully handled by SimplifyDemandedInstructionBits as of r299658 where I fixed this code to ensure the Add/Sub had only a single user. Otherwise it would fire and create additional instructions. That fix resulted in an improvement to code generated for tsan which is why I committed it before deleting.

Differential Revision: https://reviews.llvm.org/D31543

llvm-svn: 299704
2017-04-06 20:41:48 +00:00
Mehdi Amini db11fdfda5 Revert "Turn some C-style vararg into variadic templates"
This reverts commit r299699, the examples needs to be updated.

llvm-svn: 299702
2017-04-06 20:23:57 +00:00
Mehdi Amini 579540a8f7 Turn some C-style vararg into variadic templates
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user prospective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's not type checking on the
arguments. The variadic template is an obvious solution to both
issues.

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D31070

llvm-svn: 299699
2017-04-06 20:09:31 +00:00
Evgeniy Stepanov 6c3a8cbc4d [asan] Fix dead stripping of globals on Linux.
Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.

Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.

This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.

At this moment it does not work on Gold (as in the globals are never
stripped).

This is a re-land of r298158 rebased on D31358. This time,
asan.module_ctor is put in a comdat as well to avoid quadratic
behavior in Gold.

llvm-svn: 299697
2017-04-06 19:55:17 +00:00
Evgeniy Stepanov 5dfe420d10 [asan] Put ctor/dtor in comdat.
When possible, put ASan ctor/dtor in comdat.

The only reason not to is global registration, which can be
TU-specific. This is not the case when there are no instrumented
globals. This is also limited to ELF targets, because MachO does
not have comdat, and COFF linkers may GC comdat constructors.

The benefit of this is a lot less __asan_init() calls: one per DSO
instead of one per TU. It's also necessary for the upcoming
gc-sections-for-globals change on Linux, where multiple references to
section start symbols trigger quadratic behaviour in gold linker.

This is a rebase of r298756.

llvm-svn: 299696
2017-04-06 19:55:13 +00:00
Evgeniy Stepanov 039af609f1 [asan] Delay creation of asan ctor.
Create the constructor in the module pass.
This in needed for the GC-friendly globals change, where the constructor can be
put in a comdat  in some cases, but we don't know about that in the function
pass.

This is a rebase of r298731 which was reverted due to a false alarm.

llvm-svn: 299695
2017-04-06 19:55:09 +00:00
Keno Fischer bacc64b5fa [StripDeadDebugInfo] Drop dead CUs entirely
Summary:
Prior to this while it would delete the dead DIGlobalVariables, it would
leave dead DICompileUnits and everything referenced therefrom. For a bit
bitcode file with thousands of compile units those dead nodes easily
outnumbered the real ones. Clean that up.

Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D31720

llvm-svn: 299692
2017-04-06 19:26:22 +00:00
Daniel Berlin 21279bd37a NewGVN: Rename some functions for consistency
llvm-svn: 299685
2017-04-06 18:52:58 +00:00
Daniel Berlin 08fe6e0f74 NewGVN: Fixup some small issues
llvm-svn: 299684
2017-04-06 18:52:55 +00:00
Daniel Berlin 5845e0549e NewGVN: Fix a small formatting issue in performSymbolicLoadEvaluation.
llvm-svn: 299683
2017-04-06 18:52:53 +00:00
Daniel Berlin 1316a94ebc NewGVN: This patch makes memory congruence work for all types of
memorydefs, not just stores.  Along the way, we audit and fixup issues
about how we were tracking memory leaders, and improve the verifier
to notice more memory congruency issues.

llvm-svn: 299682
2017-04-06 18:52:50 +00:00
Craig Topper 3fc1225c18 [InstCombine] Fix a case where we weren't checking that an instruction had a single use resulting in extra instructions being created.
llvm-svn: 299658
2017-04-06 16:42:46 +00:00
Daniel Berlin d7a7ae061f MemorySSA: Remove MemorySSA walker caching.
Summary:
Remove all the caching the clobber walker does, and that the
caching walker does.  With the patch to enable storing clobbering
access results for stores, i can find no improvement with the cache
turned on (and a number of degradations, both time and memory, from
the cost of caching.  For a large program i have, we do millions of
lookups and inserts with zero hits).

I haven't tried to rename or simplify the walker otherwise yet.

(Appreciate some perf testing on this past my own testing)

Reviewers: george.burgess.iv, davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31576

llvm-svn: 299578
2017-04-05 19:01:58 +00:00
Sanjay Patel 50c82c4395 [InstCombine] add fold for icmp with or mask of low bits (PR32542)
We already have these 'and' folds:

// X & -C == -C -> X >  u ~C
// X & -C != -C -> X <= u ~C
//   iff C is a power of 2

...but we were missing the 'or' siblings.

http://rise4fun.com/Alive/n6

This should improve:
https://bugs.llvm.org/show_bug.cgi?id=32524
...but there are 2 or more other pieces to fix still.

Differential Revision: https://reviews.llvm.org/D31712

llvm-svn: 299570
2017-04-05 17:57:05 +00:00
Sanjay Patel 519a87a468 [InstCombine] fix formatting and variable names; NFCI
There must be some opportunity to refactor big chunks of nearly duplicated code in FoldOrOfICmps / FoldAndOfICmps.
Also, none of this works with vectors, but it should.

llvm-svn: 299568
2017-04-05 17:38:34 +00:00
Daniel Berlin 3082b8e062 MemorySSA: Fix and use optimized_def_chain
llvm-svn: 299566
2017-04-05 17:26:25 +00:00
Akira Hatanaka 75be84f3c2 [ObjCArc] Do not dereference an invalidated iterator.
Fix a bug in ARC contract pass where an iterator that pointed to a
deleted instruction was dereferenced.

It appears that tryToContractReleaseIntoStoreStrong was incorrectly
assuming that a call to objc_retain would not immediately follow a call
to objc_release.

rdar://problem/25276306

llvm-svn: 299507
2017-04-05 03:44:09 +00:00
Bob Haarman 6de8134784 ThinLTOBitcodeWriter: handle aliases first in filterModule
Summary: This change fixes a "local linkage requires default visibility" assert when attempting to build LLVM with ThinLTO on Windows.

Reviewers: pcc, tejohnson, mehdi_amini

Reviewed By: pcc

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D31632

llvm-svn: 299491
2017-04-05 00:42:07 +00:00
Daniel Berlin e33bc31df4 Re-apply MemorySSA: Add support for caching clobbering access in
stores with some fixes.

Summary:
This enables us to cache the clobbering access for stores, despite the
fact that we can't rewrite the use-def chains themselves.

Early testing shows that, after this change, for larger testcases, it
will be a significant net positive (memory and time) to remove the
walker caching.

Reviewers: george.burgess.iv, davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31567

llvm-svn: 299486
2017-04-04 23:43:10 +00:00
Daniel Berlin f49d4c45a1 Revert "MemorySSA: Add support for caching clobbering access in stores"
This reverts revision r299322.

llvm-svn: 299485
2017-04-04 23:43:04 +00:00
Sanjay Patel 0bf0abedf6 [InstCombine] rename variable for easier reading; NFC
We usually give constants a 'C' somewhere in the name...

llvm-svn: 299474
2017-04-04 22:06:03 +00:00
Craig Topper c745b6a1f6 [InstCombine] Turn subtract of vectors of i1 into xor like we do for scalar i1. Matches what we already do for add.
llvm-svn: 299472
2017-04-04 21:44:56 +00:00
Craig Topper 86173600ec [InstCombine] Support folding and/or/xor with a constant vector RHS into selects and phis
Currently we only fold with ConstantInt RHS. This generalizes to any Constant RHS.

Differential Revision: https://reviews.llvm.org/D31610

llvm-svn: 299466
2017-04-04 20:26:25 +00:00
Rong Xu 48596b6f7a [PGO] Memory intrinsic calls optimization based on profiled size
This patch optimizes two memory intrinsic operations: memset and memcpy based
on the profiled size of the operation. The high level transformation is like:
  mem_op(..., size)
  ==>
  switch (size) {
    case s1:
       mem_op(..., s1);
       goto merge_bb;
    case s2:
       mem_op(..., s2);
       goto merge_bb;
    ...
    default:
       mem_op(..., size);
       goto merge_bb;
    }
  merge_bb:

Differential Revision: http://reviews.llvm.org/D28966

llvm-svn: 299446
2017-04-04 16:42:20 +00:00
Craig Topper e06b6bcfa1 [InstCombine] Use setAllBits in place of getAllOnesValue since we know the bitwidths are the same. NFCI
llvm-svn: 299413
2017-04-04 05:03:02 +00:00
Zvi Rackover 82bf48d8b9 InstCombine: Use the InstSimplify hook for shufflevector
Summary: Start using the recently added InstSimplify hook for shuffles in the respective InstCombine visitor.

Reviewers: spatel, RKSimon, craig.topper, majnemer

Reviewed By: majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D31526

llvm-svn: 299412
2017-04-04 04:47:57 +00:00
Craig Topper 1604f0773b [InstCombine] Remove canonicalization for (X & C1) | C2 --> (X | C2) & (C1|C2) when C1 & C2 have common bits.
It turns out that SimplifyDemandedInstructionBits will get called earlier and remove bits from C1 first. Effectively doing (X & (C1&C2)) | C2. So by the time it got to this check there could be no common bits.

I think the DAGCombiner has the same check but its check can be executed because it handles demanded bits later. I'll look at it next.

llvm-svn: 299384
2017-04-03 20:41:47 +00:00
Craig Topper 3882613956 [DAGCombine][InstCombine] Fix inverted if condition in equivalent comments in DAGCombine and InstCombine. NFC
llvm-svn: 299378
2017-04-03 19:18:48 +00:00
Craig Topper 79120e80b8 Revert r299337 "[InstCombine] Remove redundant combine from visitAnd"
One of the tsan bots started failing at this commit. I don't see anything obviously wrong with the commit so trying this to see if it recovers.

Failing log: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/6792

llvm-svn: 299366
2017-04-03 17:22:23 +00:00
Sanjay Patel 77bf622db6 [InstCombine] fix formatting for foldLogOpOfMaskedICmps and related bits; NFCI
1. Improve enum, function, and variable names.
2. Improve comments.
3. Fix variable capitalization.
4. Run clang-format.

As an existing code comment suggests, this should work with vector types / splat constants too,
so making this look right first will reduce the diffs needed for that change.

llvm-svn: 299365
2017-04-03 16:53:12 +00:00
Craig Topper d33ee1b960 [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword
This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation.

Differential Revision: https://reviews.llvm.org/D31565

llvm-svn: 299362
2017-04-03 16:34:59 +00:00
Craig Topper d0b053d229 [InstCombine] Make foldOpWithConstantIntoOperand take a BinaryOperator instead of a generic Instruction.
It blindly assumes there are two operands so make it explicit.

llvm-svn: 299351
2017-04-03 07:08:08 +00:00
Craig Topper 07944f891c [InstCombine] Remove a And transform that should be handled by SimplifyDemandedInstructionBits. NFCI
llvm-svn: 299349
2017-04-03 06:02:09 +00:00
Craig Topper 70e4f434ae [InstCombine] Make InstCombiner::OptAndOp take a BinaryOperator instead of an Instruction.
The callers have already performed the necessary cast before calling. This allows us to remove a comment that says the instruction must be a BinaryOperator and make it explicit in the argument type.

Had to add a default case to the switch because BinaryOperator::getOpcode() returns a BinaryOps enum.

llvm-svn: 299339
2017-04-02 17:57:30 +00:00
Craig Topper d133591a7e [InstCombine] Remove redundant combine from visitAnd
As far as I can tell this combine is fully handled by SimplifyDemandedInstructionBits.

I was only looking at this because it is the only user of APIntOps::isShiftedMask which is itself broken. As demonstrated by r299187. I was going to fix isShiftedMask and needed to make sure we had coverage for the new cases it would expose to this combine. But looks like we can nuke it instead.

Differential Revision: https://reviews.llvm.org/D31543

llvm-svn: 299337
2017-04-02 17:34:30 +00:00
Daniel Berlin 07daac8a36 NewGVN: Handle coercion of constant stores, loads, memory insts.
Summary:
Depends on D30928.

This adds support for coercion of stores and memory instructions that do not require insertion to process.
Another few tests down.
I added the relevant tests from rle.ll

Reviewers: davide

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30929

llvm-svn: 299330
2017-04-02 13:23:44 +00:00
Nikolai Bozhenov fca527af5c [BypassSlowDivision] Do not bypass division of hash-like values
Disable bypassing if one of the operands looks like a hash value. Slow
division often occurs in hashtable implementations and fast division is
never taken there because a hash value is extremely unlikely to have
enough upper bits set to zero.

A value is considered to be hash-like if it is produced by

1) XOR operation
2) Multiplication by a constant wider than the shorter type
3) PHI node with all incoming values being hash-like

Differential Revision: https://reviews.llvm.org/D28200

llvm-svn: 299329
2017-04-02 13:14:30 +00:00
Daniel Berlin 8a00270838 MemorySSA: Add support for caching clobbering access in stores
Summary:
This enables us to cache the clobbering access for stores, despite the
fact that we can't rewrite the use-def chains themselves.

Early testing shows that, after this change, for larger testcases, it will be a significant net positive (memory and time) to remove the walker caching.

Reviewers: george.burgess.iv, davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31567

llvm-svn: 299322
2017-04-02 05:09:15 +00:00
Daniel Berlin 9a9c9ff260 NewGVN: Don't try to kill off the stored value of stores when
processing the congruence class of the store.
Because we use the stored value of a store as the def, it isn't dead
just because it appears as a def when it comes from a store.

Note: I have not hit any cases with the memory code as it is where
this breaks anything, just because of what memory congruences we
actually allow.  In a followup that improves memory congruence,
this bug actually breaks real stuff (but the verifier catches it).

llvm-svn: 299300
2017-04-01 09:44:33 +00:00
Daniel Berlin 9b4984926c NewGVN: Clean up GVNExpression memory hierarchy, restructure hash computation a bit so we don't have to redefine it for loads, stores, and calls
llvm-svn: 299299
2017-04-01 09:44:29 +00:00
Daniel Berlin 871ecd90ca NewGVN: Use def_chain iterator in singleReachablePhiPath instead of recursion
llvm-svn: 299298
2017-04-01 09:44:24 +00:00
Daniel Berlin 07275c3065 Move def_chain iterator to MemorySSA.h so it can be reused
llvm-svn: 299297
2017-04-01 09:44:19 +00:00
Daniel Berlin d042031f0f MemorySSA: Push const correctness further.
llvm-svn: 299295
2017-04-01 09:01:12 +00:00
Daniel Berlin 7500c5641e MemorySSA: Kill the WalkTargetCache now that we have getBlockDefs.
llvm-svn: 299294
2017-04-01 08:59:45 +00:00
Craig Topper 47fd2de304 [APInt] Fix bugs in isShiftedMask to match behavior of the similar function in MathExtras.h
This removes a parameter from the routine that was responsible for a lot of the issue. It was a bit count that had to be set to the BitWidth of the APInt and would get passed to getLowBitsSet. This guaranteed the call to getLowBitsSet would create an all ones value. This was then compared to (V | (V-1)). So the only shifted masks we detected had to have the MSB set.

The one in tree user is a transform in InstCombine that never fires due to earlier transforms covering the case better. I've submitted a patch to remove it completely, but for now I've just adapted it to the new interface for isShiftedMask.

llvm-svn: 299273
2017-03-31 22:23:42 +00:00
Craig Topper e625d74271 [InstCombine] When adding an Instruction and its Users to the worklist at the same time, make sure we put the Users in first. Then put in the instruction.
This way we ensure we immediately revisit the instruction and do any additional optimizations before visiting the users. Otherwise we might visit the users, then the instruction, then users again, then instruction again.

llvm-svn: 299267
2017-03-31 21:35:30 +00:00
Craig Topper 885fa12e8a [APInt] Remove shift functions from APIntOps namespace. Replace the few users with the APInt class methods. NFCI
llvm-svn: 299248
2017-03-31 20:01:16 +00:00
Joerg Sonnenberger 28bed106e0 Do not translate rint into nearbyint, but truncate it like nearbyint.
A common way to implement nearbyint is by fiddling with the floating
point environment and calling rint. This is used at least by the BSD
libm and musl. As such, canonicalizing the latter to the former will
create infinite loops for libm and generally pessimize performance, at
least when the generic C versions are used.

This change preserves the rint in the libcall translation and also
handles the domain truncation logic, so that rint with float argument
will be reduced to rintf etc.

llvm-svn: 299247
2017-03-31 19:58:07 +00:00
Dehao Chen fed890ea3a Fix the InstCombine to reserve the VP metadata and sets correct call count.
Summary: Currently the VP metadata was dropped when InstCombine converts a call to direct call. This patch converts the VP metadata to branch_weights so that its hotness is recorded.

Reviewers: eraman, davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31344

llvm-svn: 299228
2017-03-31 15:59:52 +00:00
Mikael Holmen 79235bd4d8 [Scalarizer] Handle scalar arguments in vector GEP
Summary:
Triggered by commit r298620: "[LV] Vectorize GEPs".

If we encounter a vector GEP with scalar arguments, we splat the scalar
into a vector of appropriate size before we scatter the argument.

Reviewers: arsenm, mehdi_amini, bkramer

Reviewed By: arsenm

Subscribers: bjope, mssimpso, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D31416

llvm-svn: 299186
2017-03-31 06:29:49 +00:00
Peter Collingbourne 6b193966ac ThinLTOBitcodeWriter: Use Module::global_values(). NFCI.
llvm-svn: 299132
2017-03-30 23:43:08 +00:00
Craig Topper 79e5bc528d [InstCombine] Fix typo last->least. NFC
llvm-svn: 299123
2017-03-30 22:28:55 +00:00
Matt Arsenault 79f837c254 AMDGPU: Add all atomicrmw fields to atomic.inc/dec
Add scope, order, isVolatile

llvm-svn: 299122
2017-03-30 22:21:40 +00:00
Hongbin Zheng bfd7c38de7 [SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its operands are non-negative
Since there is no sdiv in SCEV, an 'udiv' is a better canonical form than an 'sdiv' as the user of induction variable

Differential Revision: https://reviews.llvm.org/D31488

llvm-svn: 299118
2017-03-30 21:56:56 +00:00
Simon Pilgrim 68168d17b9 Spelling mistakes in comments. NFCI.
Based on corrections mentioned in patch for clang for PR27635

llvm-svn: 299072
2017-03-30 12:59:53 +00:00
Matthew Simpson c8f0aeccda [InstCombine] Correct the check for vector GEPs
Some of the GEP combines (e.g., descaling) can't handle vector GEPs. We have an
existing check that attempts to bail out if given a vector GEP. However, the
check only tests the GEP's pointer operand. A GEP results in a vector of
pointers if at least one of its operands is vector-typed (e.g., its pointer
operand could be a scalar, but its index could be a vector). We should just
check the type of the GEP itself. This should fix PR32414.

Reference: https://bugs.llvm.org/show_bug.cgi?id=32414
Differential Revision: https://reviews.llvm.org/D31470

llvm-svn: 299017
2017-03-29 18:23:08 +00:00
Filipe Cabecinhas 8b94273fe6 Cleanup in preparation for D30703. NFCI
Make the enumerators follow the coding convention and start with OW_...

llvm-svn: 298996
2017-03-29 14:42:27 +00:00
Anna Thomas 923e574bff [InstCombine] For select rule, use positive check of constant int for select operand. NFCI
llvm-svn: 298906
2017-03-28 09:32:24 +00:00
Alex Shlyapnikov bbd5cc63d7 Revert "[asan] Delay creation of asan ctor."
Speculative revert. Some libfuzzer tests are affected.

This reverts commit r298731.

llvm-svn: 298890
2017-03-27 23:11:50 +00:00
Alex Shlyapnikov 09171aa31f Revert "[asan] Put ctor/dtor in comdat."
Speculative revert, some libfuzzer tests are affected.

This reverts commit r298756.

llvm-svn: 298889
2017-03-27 23:11:47 +00:00
Matthew Simpson b8ff4a4a70 [LV] Transform truncations of non-primary induction variables
The vectorizer tries to replace truncations of induction variables with new
induction variables having the smaller type. After r295063, this optimization
was applied to all integer induction variables, including non-primary ones.
When optimizing the truncation of a non-primary induction variable, we still
need to transform the new induction so that it has the correct start value.
This should fix PR32419.

Reference: https://bugs.llvm.org/show_bug.cgi?id=32419
llvm-svn: 298882
2017-03-27 20:07:38 +00:00
Anna Thomas f57ae33381 [InstCombine] Avoid incorrect folding of select into phi nodes when incoming element is a vector type
Summary:
We are incorrectly folding selects into phi nodes when the incoming value of a phi
node is a constant vector. This optimization is done in `FoldOpIntoPhi` when the
select condition is a phi node with constant incoming values.
Without the fix, we are miscompiling (i.e. incorrectly folding the
select into the phi node) when the vector contains non-zero
elements.
This patch fixes the miscompile and we will correctly fold based on the
select vector operand (see added test cases).

Reviewers: majnemer, sanjoy, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31189

llvm-svn: 298845
2017-03-27 13:52:51 +00:00
Serge Pavlov b71bb80c2d [LoopUnroll] Remap references in peeled iteration
References in cloned blocks must be remapped prior to dominator
calculation.

Differential Revision: https://reviews.llvm.org/D31281

llvm-svn: 298811
2017-03-26 16:46:53 +00:00
Joerg Sonnenberger fa7367428a Split the SimplifyCFG pass into two variants.
The first variant contains all current transformations except
transforming switches into lookup tables. The second variant
contains all current transformations.

The switch-to-lookup-table conversion results in code that is more
difficult to analyze and optimize by other passes. Most importantly,
it can inhibit Dead Code Elimination. As such it is often beneficial to
only apply this transformation very late. A common example is inlining,
which can often result in range restrictions for the switch expression.

Changes in execution time according to LNT:
SingleSource/Benchmarks/Misc/fp-convert +3.03%
MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -11.20%
MultiSource/Benchmarks/Olden/perimeter/perimeter -10.43%
and a couple of smaller changes. For perimeter it also results 2.6%
a smaller binary.

Differential Revision: https://reviews.llvm.org/D30333

llvm-svn: 298799
2017-03-26 06:44:08 +00:00
Chandler Carruth 0d256c0f5d [IR] Make SwitchInst::CaseIt almost a normal iterator.
This moves it to the iterator facade utilities giving it full random
access semantics, etc. It can also now be used with standard algorithms
like std::all_of and std::any_of and range adaptors like llvm::reverse.

Also make the semantics of iterating match what every other iterator
uses and forbid decrementing past the begin iterator. This was used as
a hacky way to work around iterator invalidation. However, every
instance trying to do this failed to actually avoid touching invalid
iterators despite the clear documentation that the removed and all
subsequent iterators become invalid including the end iterator. So I've
added a return of the next iterator to removeCase and rewritten the
loops that were doing this to correctly follow the iterator pattern of
either incremneting or removing and assigning fresh values to the
iterator and the end.

In one case we were trying to go backwards to make this cleaner but it
doesn't actually work. I've made that code match the code we use
everywhere else to remove cases as we iterate. This changes the order of
cases in one test output and I moved that test to CHECK-DAG so it
wouldn't care -- the order isn't semantically meaningful anyways.

llvm-svn: 298791
2017-03-26 02:49:23 +00:00
Craig Topper 47596dd4cc [InstCombine] Change the interface of SimplifyDemandedBits so that it takes the instruction and operand instead of the Use.
The first thing it did was get the User for the Use to get the instruction back. This requires looking through the Uses for the User using the waymarking walk. That's pretty fast, but its probably still better to just pass the Instruction we already had.

llvm-svn: 298772
2017-03-25 06:52:52 +00:00
Davide Italiano e9781e7b2f [NewGVN] Adjust NDEBUG markers.
This avoids 'used but not defined' warnings in Release builds
with GCC.

llvm-svn: 298760
2017-03-25 02:40:02 +00:00
Evgeniy Stepanov 71bb8f1ad0 [asan] Put ctor/dtor in comdat.
When possible, put ASan ctor/dtor in comdat.

The only reason not to is global registration, which can be
TU-specific. This is not the case when there are no instrumented
globals. This is also limited to ELF targets, because MachO does
not have comdat, and COFF linkers may GC comdat constructors.

The benefit of this is a lot less __asan_init() calls: one per DSO
instead of one per TU. It's also necessary for the upcoming
gc-sections-for-globals change on Linux, where multiple references to
section start symbols trigger quadratic behaviour in gold linker.

llvm-svn: 298756
2017-03-25 01:01:11 +00:00
Craig Topper 8fbb74b5b2 Revert r298711 "[InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits"
Tsan bot is failing.

llvm-svn: 298745
2017-03-24 22:12:10 +00:00
Ivan Krasin c2124e185c Revert r298620: [LV] Vectorize GEPs
Reason: breaks linking Chromium with LLD + ThinLTO (a pass crashes)
LLVM bug: https://bugs.llvm.org//show_bug.cgi?id=32413

Original change description:

[LV] Vectorize GEPs

This patch adds support for vectorizing GEPs. Previously, we only generated
vector GEPs on-demand when creating gather or scatter operations. All GEPs from
the original loop were scalarized by default, and if a pointer was to be stored
to memory, we would have to build up the pointer vector with insertelement
instructions.

With this patch, we will vectorize all GEPs that haven't already been marked
for scalarization.

The patch refines collectLoopScalars to more exactly identify the scalar GEPs.
The function now more closely resembles collectLoopUniforms. And the patch
moves vector GEP creation out of vectorizeMemoryInstruction and into the main
vectorization loop. The vector GEPs needed for gather and scatter operations
will have already been generated before vectoring the memory accesses.

Original Differential Revision: https://reviews.llvm.org/D30710

llvm-svn: 298735
2017-03-24 20:49:43 +00:00
Evgeniy Stepanov 64e872a91f [asan] Delay creation of asan ctor.
Create the constructor in the module pass.
This in needed for the GC-friendly globals change, where the constructor can be
put in a comdat  in some cases, but we don't know about that in the function
pass.

llvm-svn: 298731
2017-03-24 20:42:15 +00:00
Matt Arsenault 4c7795dd31 AMDGPU: Fold rcp/rsq of undef to undef
llvm-svn: 298725
2017-03-24 19:04:57 +00:00
Matt Arsenault 18bb24a1be TTI: Split IsSimple in MemIntrinsicInfo
All this did before was assert in EarlyCSE.

llvm-svn: 298724
2017-03-24 18:56:43 +00:00
Teresa Johnson 428b9e0627 [ThinLTO] Correct counting of functions in inliner stats
Summary: Declarations need to be filtered out when counting functions.

Reviewers: eraman

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31336

llvm-svn: 298720
2017-03-24 17:59:06 +00:00
Craig Topper d4521c2fc2 [InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits
SimplifyDemandedUseBits for Add/Sub already recursed down LHS and RHS for simplifying bits. If that didn't provide any simplifications we fall back to calling computeKnownBits which will recurse again. Instead just take the known bits for LHS and RHS we already have and call into a new function in ValueTracking that can calculate the known bits given the LHS/RHS bits.

llvm-svn: 298711
2017-03-24 16:56:51 +00:00
Benjamin Kramer 46f5e2c47b Make GCC happy again.
llvm-svn: 298702
2017-03-24 14:15:35 +00:00
Daniel Berlin ffc30781f4 NewGVN: Small cleanup of two dominance related functions to make
them easier to understand.

llvm-svn: 298692
2017-03-24 06:33:51 +00:00
Daniel Berlin 0e9001131d NewGVN: Small cleanup of useless expression deletion, and don't uselessly create two expressions in symbolic store evaluation.
llvm-svn: 298691
2017-03-24 06:33:48 +00:00
Daniel Berlin 9d0796e5d0 NewGVN: Fix PR32403 - Handling of undef in phis was not quite correct
due to LLVM's view of phi nodes.  It would cause NewGVN not to fixpoint
in some interesting edge cases.

llvm-svn: 298687
2017-03-24 05:30:34 +00:00
Craig Topper 36f2e0eee8 [InstCombine] Use range-based for loop. NFC
llvm-svn: 298680
2017-03-24 02:58:02 +00:00
Craig Topper df73e7c5b7 [InstCombine] Fix 80 column violation I accidentally introduced. NFC
llvm-svn: 298679
2017-03-24 02:57:59 +00:00