Commit Graph

11547 Commits

Author SHA1 Message Date
Alexey Samsonov 1f64750258 Fix typo in variable name
llvm-svn: 209784
2014-05-29 01:10:14 +00:00
Alexey Samsonov 96e239f564 [ASan] Use llvm.global_ctors to insert init-order checking calls into ASan runtime.
Don't assume that dynamically initialized globals are all initialized from
_GLOBAL__<module_name>I_ function. Instead, scan the llvm.global_ctors and
insert poison/unpoison calls to each function there.

Patch by Nico Weber!

llvm-svn: 209780
2014-05-29 00:51:15 +00:00
Rafael Espindola 6196b7430e Revert "Revert "InstCombine: Improvement to check if signed addition overflows.""
This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure

llvm-svn: 209776
2014-05-28 21:43:52 +00:00
Rafael Espindola 910528a3eb Revert "Add support for combining GEPs across PHI nodes"
This reverts commit r209755.

it was the real cause of the libc++ build failure.

llvm-svn: 209775
2014-05-28 21:41:21 +00:00
Rafael Espindola fb59b05ca4 Revert "InstCombine: Improvement to check if signed addition overflows."
This reverts commit r209746.

It looks it is causing a crash while building libcxx. I am trying to get a
reduced testcase.

llvm-svn: 209762
2014-05-28 18:48:10 +00:00
Louis Gerbarg 727f1cbb17 Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

llvm-svn: 209755
2014-05-28 17:38:31 +00:00
Rafael Espindola 085b57941f InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:

1. If we know one number is positive and another is negative, we return true as
   signed addition of two opposite signed numbers will never overflow.

2. Implemented TODO : If one of the operands only has one non-zero bit, and if
   the other operand has a known-zero bit in a more significant place than it
   (not including the sign bit) the ripple may go up to and fill the zero, but
   won't change the sign. e.x -  (x & ~4) + 1

We make sure that we are ignoring 0 at MSB.

Patch by Suyog Sarda.

llvm-svn: 209746
2014-05-28 15:30:40 +00:00
Evgeniy Stepanov 386b58d056 [asancov] Don't emit extra runtime calls when compiling without coverage.
llvm-svn: 209721
2014-05-28 09:26:46 +00:00
Jingyue Wu 80a738dc62 Distribute sext/zext to the operands of and/or/xor
This is an enhancement to SeparateConstOffsetFromGEP. With this patch, we can
extract a constant offset from "s/zext and/or/xor A, B".

Added a new test @ext_or to verify this enhancement.

Refactoring the code, I also extracted some common logic to function
Distributable. 

llvm-svn: 209670
2014-05-27 18:00:00 +00:00
Filipe Cabecinhas e8d6a1e82f Post-commit fixes for r209643
Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio
Fixed the argument order to select (the mask semantics to blendv* are the
inverse of select) and fixed the tests
Added parenthesis to the assert condition
Ran clang-format

llvm-svn: 209667
2014-05-27 16:54:33 +00:00
Evgeniy Stepanov 47b1a95f1c [asancov] Emit an initializer passing number of coverage code locations in each module.
llvm-svn: 209654
2014-05-27 12:39:31 +00:00
Daniel Jasper 73458c95ac Fix bad assert.
llvm-svn: 209648
2014-05-27 09:55:37 +00:00
Filipe Cabecinhas 82ac07c283 Convert some X86 blendv* intrinsics into IR.
Summary:
Implemented an InstCombine transformation that takes a blendv* intrinsic
call and translates it into an IR select, if the mask is constant.

This will eventually get lowered into blends with immediates if possible,
or pblendvb (with an option to further optimize if we can transform the
pblendvb into a blend+immediate instruction, depending on the selector).
It will also enable optimizations by the IR passes, which give up on
sight of the intrinsic.

Both the transformation and the lowering of its result to asm got shiny
new tests.

The transformation is a bit convoluted because of blendvp[sd]'s
definition:

Its mask is a floating point value! This forces us to convert it and get
the highest bit. I suppose this happened because the mask has type
__m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.

I will send an email to llvm-dev to discuss if we want to change this or
not.

Reviewers: grosbach, delena, nadav

Differential Revision: http://reviews.llvm.org/D3859

llvm-svn: 209643
2014-05-27 03:42:20 +00:00
Kostya Serebryany 4d237a8503 [asan] decrease asan-instrumentation-with-call-threshold from 10000 to 7000, see PR17409
llvm-svn: 209623
2014-05-26 11:57:16 +00:00
Owen Anderson 115aa160e6 Make the LoopRotate pass's maximum header size configurable both programmatically
and via the command line, mirroring similar functionality in LoopUnroll.  In
situations where clients used custom unrolling thresholds, their intent could
previously be foiled by LoopRotate having a hardcoded threshold.

llvm-svn: 209617
2014-05-26 08:58:51 +00:00
Peter Collingbourne 0a4376190f Add an extension point for peephole optimizers.
This extension point allows adding passes that perform peephole optimizations
similar to the instruction combiner. These passes will be inserted after
each instance of the instruction combiner pass.

Differential Revision: http://reviews.llvm.org/D3905

llvm-svn: 209595
2014-05-25 10:27:02 +00:00
Tim Northover 3b0846e8f7 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

llvm-svn: 209577
2014-05-24 12:50:23 +00:00
Jingyue Wu bbb6e4a885 Add the extracted constant offset using GEP
Fixed a TODO in r207783.

Add the extracted constant offset using GEP instead of ugly
ptrtoint+add+inttoptr. Using GEP simplifies future optimizations and makes IR
easier to understand. 

Updated all affected tests, and added a new test in split-gep.ll to cover a
corner case where emitting uglygep is necessary.

llvm-svn: 209537
2014-05-23 18:39:40 +00:00
Kostya Serebryany c7895a83d2 [asan] properly instrument memory accesses that have small alignment (smaller than min(8,size)) by making two checks instead of one. This may slowdown some cases, e.g. long long on 32-bit or wide loads produced after loop unrolling. The benefit is higher sencitivity.
llvm-svn: 209508
2014-05-23 11:52:07 +00:00
Diego Novillo 7f8af8bf91 Add support for missed and analysis optimization remarks.
Summary:
This adds two new diagnostics: -pass-remarks-missed and
-pass-remarks-analysis. They take the same values as -pass-remarks but
are intended to be triggered in different contexts.

-pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed,
which passes call when they tried to apply a transformation but
couldn't.

-pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis,
which passes call when they want to inform the user about analysis
results.

The patch also:

1- Adds support in the inliner for the two new remarks and a
   test case.

2- Moves emitOptimizationRemark* functions to the llvm namespace.

3- Adds an LLVMContext argument instead of making them member functions
   of LLVMContext.

Reviewers: qcolombet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3682

llvm-svn: 209442
2014-05-22 14:19:46 +00:00
Quentin Colombet c88baa5c10 [LSR] Canonicalize reg1 + ... + regN into reg1 + ... + 1*regN.
This commit introduces a canonical representation for the formulae.
Basically, as soon as a formula has more that one base register, the scaled
register field is used for one of them. The register put into the scaled
register is preferably a loop variant.
The commit refactors how the formulae are built in order to produce such
representation.
This yields a more accurate, but still perfectible, cost model.

<rdar://problem/16731508>

llvm-svn: 209230
2014-05-20 19:25:04 +00:00
Eric Christopher 650c8f2a06 Clean up language and grammar.
Based on a patch by jfcaron3@gmail.com!
PR19806

llvm-svn: 209216
2014-05-20 17:11:11 +00:00
Zinovy Nis abdf44e7f3 [LV][REFACTOR] One more tiny fix for printing debug locations in loop vectorizer. Now consistent with the remarks emitter.
Differential Revision: http://reviews.llvm.org/D3821

llvm-svn: 209197
2014-05-20 08:26:20 +00:00
Peter Collingbourne 68a889757d Check the alwaysinline attribute on the call as well as on the caller.
Differential Revision: http://reviews.llvm.org/D3815

llvm-svn: 209150
2014-05-19 18:25:54 +00:00
Matt Arsenault 04b67ceeeb Use range for
llvm-svn: 209147
2014-05-19 17:52:48 +00:00
Eric Christopher a5ec92556f Revert "Patch for function cloning to inline all blocks whose address is taken"
as it was causing build failures in ruby.

This reverts commit r207713.

llvm-svn: 209135
2014-05-19 16:04:10 +00:00
Dinesh Dwivedi f82f16e3e6 Added inst-combine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)'
This removes TODO added in r208849 [http://reviews.llvm.org/D3629]

MIN(MIN(A, 97), 23) -> MIN(A, 23)
MAX(MAX(A, 23), 97) -> MAX(A, 97)

Differential Revision: http://reviews.llvm.org/D3785

llvm-svn: 209110
2014-05-19 07:08:32 +00:00
Rafael Espindola f1bedd3747 Use create methods since msvc doesn't handle delegating constructors.
llvm-svn: 209076
2014-05-17 21:29:57 +00:00
Rafael Espindola 8370565820 Reduce abuse of default values in the GlobalAlias constructor.
This is in preparation for adding an optional offset.

llvm-svn: 209073
2014-05-17 19:57:46 +00:00
NAKAMURA Takumi 7ef81a4f98 Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes"
It broke clang selfhosting even after r209065.

llvm-svn: 209067
2014-05-17 14:39:21 +00:00
Louis Gerbarg 455805694e Fix for sanitizer crash introduced in r209049
This patch fixes 3 issues introduced by r209049 that only showed up in on
the sanitizer buildbots. One was a typo in a compare. The other is a check to
confirm that the single differing value in the two incoming GEPs is the same
type. The final issue was the the IRBuilder under some circumstances would
build PHIs in the middle of the block.

llvm-svn: 209065
2014-05-17 06:51:36 +00:00
Louis Gerbarg 8d2a43e9be Add support for combining GEPs across PHI nodes
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:

  GEP1--\
        |-->PHI1-->GEP3
  GEP2--/

This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

  GEP1--\                     --\                           --\
        |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 == >    |-->PHI2->GEP123
  GEP2--/                     --/                           --/

This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.

Tests included.

rdar://15547484

llvm-svn: 209049
2014-05-16 23:47:24 +00:00
Rafael Espindola e0098928c9 Delete getAliasedGlobal.
llvm-svn: 209040
2014-05-16 22:37:03 +00:00
Reid Kleckner fceb76f5f9 Add comdat key field to llvm.global_ctors and llvm.global_dtors
This allows us to put dynamic initializers for weak data into the same
comdat group as the data being initialized.  This is necessary for MSVC
ABI compatibility.  Once we have comdats for guard variables, we can use
the combination to help GlobalOpt fire more often for weak data with
guarded initialization on other platforms.

Reviewers: nlewycky

Differential Revision: http://reviews.llvm.org/D3499

llvm-svn: 209015
2014-05-16 20:39:27 +00:00
Rafael Espindola 6b238633b7 Fix most of PR10367.
This patch changes the design of GlobalAlias so that it doesn't take a
ConstantExpr anymore. It now points directly to a GlobalObject, but its type is
independent of the aliasee type.

To avoid changing all alias related tests in this patches, I kept the common
syntax

@foo = alias i32* @bar

to mean the same as now. The cases that used to use cast now use the more
general syntax

@foo = alias i16, i32* @bar.

Note that GlobalAlias now behaves a bit more like GlobalVariable. We
know that its type is always a pointer, so we omit the '*'.

For the bitcode, a nice surprise is that we were writing both identical types
already, so the format change is minimal. Auto upgrade is handled by looking
through the casts and no new fields are needed for now. New bitcode will
simply have different types for Alias and Aliasee.

One last interesting point in the patch is that replaceAllUsesWith becomes
smart enough to avoid putting a ConstantExpr in the aliasee. This seems better
than checking and updating every caller.

A followup patch will delete getAliasedGlobal now that it is redundant. Another
patch will add support for an explicit offset.

llvm-svn: 209007
2014-05-16 19:35:39 +00:00
Rafael Espindola 4fe0094fd1 Change the GlobalAlias constructor to look a bit more like GlobalVariable.
This is part of the fix for pr10367. A GlobalAlias always has a pointer type,
so just have the constructor build the type.

llvm-svn: 208983
2014-05-16 13:34:04 +00:00
Rafael Espindola 5a52b9f139 Revert "Implement global merge optimization for global variables."
This reverts commit r208934.

The patch depends on aliases to GEPs with non zero offsets. That is not
supported and fairly broken.

The good news is that GlobalAlias is being redesigned and will have support
for offsets, so this patch should be a nice match for it.

llvm-svn: 208978
2014-05-16 13:02:18 +00:00
Stepan Dyatkovskiy 948366ac0b MergeFunctions Pass, introduced total ordering among GEP operations.
Patch replaces old isEquivalentGEP implementation, and changes type of
comparison result from bool (equal or not) to {-1, 0, 1} (less, equal, greater).

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 208976
2014-05-16 11:55:02 +00:00
Stepan Dyatkovskiy fa6820a035 MergeFunctions Pass, introduced total ordering among operations.
Patch replaces old isEquivalentOperation implementation, and changes type of
comparison result from bool (equal or not) to {-1, 0, 1} (less, equal, greater).

This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 208973
2014-05-16 11:02:22 +00:00
Stepan Dyatkovskiy 5c2cc2506d MergeFunctions Pass, introduced total ordering among function attributes.
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

llvm-svn: 208953
2014-05-16 08:55:34 +00:00
Jiangning Liu 932e1c3924 Implement global merge optimization for global variables.
This commit implements two command line switches -global-merge-on-external
and -global-merge-aligned, and both of them are false by default, so this
optimization is disabled by default for all targets.

For ARM64, some back-end behaviors need to be tuned to get this optimization
further enabled.

llvm-svn: 208934
2014-05-15 23:45:42 +00:00
Reid Kleckner 900d46ff39 Don't insert lifetime.end markers between a musttail call and ret
The allocas going out of scope are immediately killed by the return
instruction.

This is a resend of r208912, which was committed accidentally.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3792

llvm-svn: 208920
2014-05-15 21:10:46 +00:00
Reid Kleckner b16564109b Revert "Don't insert lifetime.end markers between a musttail call and ret"
This reverts commit r208912.

It was committed accidentally without review.

llvm-svn: 208914
2014-05-15 20:41:05 +00:00
Reid Kleckner 6af21245eb Remove unused variable in inliner
We have to iterate over all the calls that were inlined to find out if
any were musttail.

Sink another variable down to where its used.

llvm-svn: 208913
2014-05-15 20:39:42 +00:00
Reid Kleckner 26ab7ead45 Don't insert lifetime.end markers between a musttail call and ret
The allocas going out of scope are immediately killed by the return
instruction.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3630

llvm-svn: 208912
2014-05-15 20:39:13 +00:00
Reid Kleckner f0915aa0e6 Teach the inliner how to preserve musttail invariants
The interesting case is what happens when you inline a musttail call
through a musttail call site.  In this case, we can't break perfect
forwarding or allow any stack growth.

Instead of merging control flow from the inlined return instruction
after a musttail call into the body of the caller, leave the inlined
return instruction in the caller so that the musttail call stays in the
tail position.

More work is required in http://reviews.llvm.org/D3630 to handle the
case where the inlined function has dynamic allocas or byval arguments.

Reviewers: chandlerc

Differential Revision: http://reviews.llvm.org/D3491

llvm-svn: 208910
2014-05-15 20:11:28 +00:00
Dinesh Dwivedi 83c11da849 Reverting r208848, reason: build failure: sanitizer-x86_64-linux-bootstrap/builds/3399
llvm-svn: 208852
2014-05-15 08:22:55 +00:00
Dinesh Dwivedi f675f4201b Added instcombine for 'MIN(MIN(A, 27), 93)' and 'MAX(MAX(A, 93), 27)'
MIN(MIN(A, 23), 97) -> MIN(A, 23)
MAX(MAX(A, 97), 23) -> MAX(A, 97)

Differential Revision: http://reviews.llvm.org/D3629

llvm-svn: 208849
2014-05-15 06:13:40 +00:00
Dinesh Dwivedi 837c16097e Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing

Z3 Verifications code for above transform
http://rise4fun.com/Z3/Pmsh

Differential Revision: http://reviews.llvm.org/D3717

llvm-svn: 208848
2014-05-15 06:01:33 +00:00
Alp Toker beaca19c7c Fix typos
llvm-svn: 208839
2014-05-15 01:52:21 +00:00
David Majnemer 186c94244c InstCombine: Optimize -x s< cst
Summary:
This gets rid of a sub instruction by moving the negation to the
constant when valid.

Reviewers: nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3773

llvm-svn: 208827
2014-05-15 00:02:20 +00:00
Jay Foad a0653a3e6c Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Evgeniy Stepanov ed31ca4bdc [asan] Fix compiler warnings.
llvm-svn: 208769
2014-05-14 10:56:19 +00:00
Evgeniy Stepanov aaf4bb2394 [asan] Set debug location in ASan function prologue.
Most importantly, it gives debug location info to the coverage callback.

This change also removes 2 cases of unnecessary setDebugLoc when IRBuilder
is created with the same debug location.

llvm-svn: 208767
2014-05-14 10:30:15 +00:00
Serge Pavlov e6de9e39a8 Fix the case when reordering shuffle and binop produces a constant.
This resolves PR19737.

llvm-svn: 208762
2014-05-14 09:05:09 +00:00
Nick Lewycky f0cf8fa941 Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division. This fires exactly once in a clang bootstrap, but covers a few different results from http://www.cs.utah.edu/~regehr/souper/
llvm-svn: 208750
2014-05-14 03:03:05 +00:00
Benjamin Kramer 3f085ba6d6 GVN: Fix non-determinism in map iteration.
Iterating over a DenseMaop is non-deterministic and results to unpredictable IR
output.

Based on a patch by Daniel Reynaud!

llvm-svn: 208728
2014-05-13 21:06:40 +00:00
Benjamin Kramer d97f95e2f2 GVN: rangify a couple of loops.
No functionality change.

llvm-svn: 208727
2014-05-13 21:06:36 +00:00
Rafael Espindola 99e05cf163 Split GlobalValue into GlobalValue and GlobalObject.
This allows code to statically accept a Function or a GlobalVariable, but
not an alias. This is already a cleanup by itself IMHO, but the main
reason for it is that it gives a lot more confidence that the refactoring to fix
the design of GlobalAlias is correct. That will be a followup patch.

llvm-svn: 208716
2014-05-13 18:45:48 +00:00
Serge Pavlov b575ee8294 Fix type of shuffle resulted from shuffle merge.
This fix resolves PR19730.

llvm-svn: 208666
2014-05-13 06:07:21 +00:00
Serge Pavlov 02ff620c7b Fix type of shuffle obtained from reordering with binary operation
In transformation:
    BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef)
type of the undef argument must be same as type of BinOp.

llvm-svn: 208531
2014-05-12 10:11:27 +00:00
Serge Pavlov 0581109708 Fix reordering of shuffles and binary operations
Do not apply transformation:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))

if operands v1 and v2 are of different size.
This change fixes PR19717, which was caused by r208488.
    

llvm-svn: 208518
2014-05-12 05:44:53 +00:00
Benjamin Kramer 1fef64214c SLPVectorizer: Instead of just performing CSE on dead blocks ignore them completely.
Turns out that there is a very cheap way of testing whether a block is dead,
just look it up in the DomTree. We have to do this anyways so just ignore
unreachable blocks before sorting by domination. This restores a proper
ordering for std::stable_sort when dead code is present.

Covered by existing tests & buildbots running in STL debug mode (MSVC).

llvm-svn: 208492
2014-05-11 10:28:58 +00:00
Serge Pavlov 9ef66a8266 Reorder shuffle and binary operation.
This patch enables transformations:

    BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
    BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2)

They allow to eliminate extra shuffles in some cases.

Differential Revision: http://reviews.llvm.org/D3525

llvm-svn: 208488
2014-05-11 08:46:12 +00:00
Benjamin Kramer 8722aa5754 SLPVectorizer: When sorting by domination for CSE don't assert on unreachable code.
There is no total ordering if the CFG is disconnected. We don't care if we
catch all CSE opportunities in dead code either so just exclude ignore them in
the assert.

PR19646

llvm-svn: 208461
2014-05-09 23:28:49 +00:00
Louis Gerbarg 1f54b82164 Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCost
Since ExtractValue is not included in ComputeSpeculationCost CFGs containing
ExtractValueInsts cannot be simplified. In particular this interacts with
InstCombineCompare's tendency to insert add.with.overflow intrinsics for
certain idiomatic math operations, preventing optimization.

This patch adds ExtractValue to the ComputeSpeculationCost. Test case included

rdar://14853450

llvm-svn: 208434
2014-05-09 17:02:46 +00:00
Rafael Espindola fc13db477a Use auto and clang-format this snippet.
llvm-svn: 208421
2014-05-09 16:01:06 +00:00
Nick Lewycky 54a824b758 Improve wording to make it sounds more like a change than an analysis.
llvm-svn: 208370
2014-05-08 23:04:46 +00:00
Michael Zolotukhin 292d3caa15 [InstCombine] Some cleanup in optimization of redundant insertvalue instructions.
And one more test added.

llvm-svn: 208355
2014-05-08 19:50:24 +00:00
Richard Smith c45f3f7433 Simplify and fix incorrect comment. No functionality change.
llvm-svn: 208272
2014-05-08 01:08:43 +00:00
Duncan P. N. Exon Smith e60adfdbd0 GlobalValue: Assert symbols with local linkage have default visibility
The change to ExtractGV.cpp has no functionality change except to avoid
the asserts.  Existing testcases already cover this, so I didn't add a
new one.

llvm-svn: 208264
2014-05-07 23:00:22 +00:00
Chandler Carruth d70cc604af Tidy up whitespace with clang-format prior to making significant
changes.

llvm-svn: 208229
2014-05-07 17:36:59 +00:00
Michael Zolotukhin 7d6293a0d3 [InstCombine] Add optimization of redundant insertvalue instructions.
rdar://problem/11861387

llvm-svn: 208214
2014-05-07 14:30:18 +00:00
Evgeniy Stepanov c14fc42137 [msan] Fix -fsanitize=memory -fno-integrated-as.
llvm-svn: 208211
2014-05-07 14:10:51 +00:00
Stepan Dyatkovskiy cfd641f123 MergeFunctions Pass, introduced total ordering among values.
This is a third patch of patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).

This patch description:
Being comparing functions we need to compare values we meet at left and
right sides.
Its easy to sort things out for external values. It just should be
the same value at left and right.
But for local values (those were introduced inside function body)
we have to ensure they were introduced at exactly the same place,
and plays the same role.

In short, patch introduces values serial numbering and comparison routine.
The last one compares two values by their serial numbers.

llvm-svn: 208189
2014-05-07 11:11:39 +00:00
Zinovy Nis da925c0d7c [BUG][REFACTOR]
1) Fix for printing debug locations for absolute paths.
2) Location printing is moved into public method DebugLoc::print() to avoid re-inventing the wheel.

Differential Revision: http://reviews.llvm.org/D3513

llvm-svn: 208177
2014-05-07 09:51:22 +00:00
Stepan Dyatkovskiy d103130ee0 Second patch of patch series that improves MergeFunctions performance time from O(N*N) to
O(N*log(N)). The idea is to introduce total ordering among functions set.
It allows to build binary tree and perform function look-up procedure in O(log(N)) time. 

This patch description:
Introduced total ordering among constants implemented in cmpConstants method.
Method performs lexicographical comparison between constants represented as
hypothetical numbers of next format:
<bitcastability-trait><raw-bit-contents>

Please, read cmpConstants declaration comments for more details.

llvm-svn: 208173
2014-05-07 09:05:10 +00:00
Nico Weber ba8a99cf77 Fix ASan init function detection after clang r208128.
llvm-svn: 208141
2014-05-06 23:17:26 +00:00
Richard Smith c167d656e7 Re-commit r208025, reverted in r208030, with a fix for a conformance issue
which GCC detects and Clang does not!

llvm-svn: 208033
2014-05-06 01:44:26 +00:00
Richard Smith 09bf116939 Revert r208025, which made buildbots unhappy for unknown reasons.
llvm-svn: 208030
2014-05-06 01:26:00 +00:00
Richard Smith 6cf1d744d8 Add llvm::function_ref (and a couple of uses of it), representing a type-erased reference to a callable object.
llvm-svn: 208025
2014-05-06 01:01:29 +00:00
Nick Lewycky 7185b5d60c Detabify.
llvm-svn: 208019
2014-05-06 00:46:20 +00:00
Nick Lewycky 5ef6bc8815 Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'.
The number of tail call to loop conversions remains the same (1618 by my count).

The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly.

llvm-svn: 208017
2014-05-05 23:59:03 +00:00
Yi Jiang 79eb0aa8cb Reapply: Add slp vectorization to LTO passes. The bug it exposed has been fixed by r207983. <radar://16641956>
llvm-svn: 208013
2014-05-05 23:14:46 +00:00
Yi Jiang a4821fc9fb Always set alignment of vectorized LD/ST in SLP-Vectorizer. <rdar://problem/16812145>
llvm-svn: 207983
2014-05-05 17:59:14 +00:00
Duncan P. N. Exon Smith 1789fb6493 LTO: -internalize sets visibility to default
Visibility is meaningless when the linkage is local.  Change
`-internalize` to reset the visibility to `default`.

<rdar://problem/16141113>

llvm-svn: 207979
2014-05-05 17:40:44 +00:00
Timur Iskhodzhanov 9dbc206303 [ASan/Win] Fix issue 305 -- don't instrument .CRT initializer/terminator callbacks
See https://code.google.com/p/address-sanitizer/issues/detail?id=305
Reviewed at http://reviews.llvm.org/D3607

llvm-svn: 207968
2014-05-05 14:28:38 +00:00
Benjamin Kramer 9130cb8547 LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to limit unrolling.
Otherwise we use the same threshold as for complete unrolling, which is
way too high. This made us unroll any loop smaller than 150 instructions
by 8 times, but only if someone specified -march=core2 or better,
which happens to be the default on darwin.

llvm-svn: 207940
2014-05-04 19:12:38 +00:00
Arnold Schwaighofer cd566c423a SLPVectorizer: Bring back the insertelement patch (r205965) with fixes
When can't assume a vectorized tree is rooted in an instruction. The IRBuilder
could have constant folded it. When we rebuild the build_vector (the series of
InsertElement instructions) use the last original InsertElement instruction. The
vectorized tree root is guaranteed to be before it.

Also, we can't assume that the n-th InsertElement inserts the n-th element into
a vector.

This reverts r207746 which reverted the revert of the revert of r205018 or so.

Fixes the test case in PR19621.

llvm-svn: 207939
2014-05-04 17:10:15 +00:00
Benjamin Kramer 64425fe875 SLPVectorizer: Lazily allocate the map for block numbering.
There is no point in creating it if we're not going to vectorize
anything. Creating the map is expensive as it creates large values.
No functionality change.

llvm-svn: 207916
2014-05-03 15:50:37 +00:00
Karthik Bhat ddd0cb5ecf Vectorize intrinsic math function calls in SLPVectorizer.
This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer.
Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559

llvm-svn: 207901
2014-05-03 09:59:54 +00:00
Eric Christopher 6c26beb770 Clean up constructor logic and member access for LoopVectorizeHints.
There are public functions that mutate various members as well as
another private member already, so make all the members private to
avoid the discontinuity and add accessors for the values. Should
be no functional change.

llvm-svn: 207868
2014-05-02 20:40:04 +00:00
Nico Weber 4b2acde21a Teach GlobalDCE how to remove empty global_ctor entries.
This moves most of GlobalOpt's constructor optimization
code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The
public interface is a single function OptimizeGlobalCtorsList() that
takes a predicate returning which constructors to remove.

GlobalOpt calls this with a function that statically evaluates all
constructors, just like it did before. This part of the change is
behavior-preserving.

Also add a call to this from GlobalDCE with a filter that removes global
constructors that contain a "ret" instruction and nothing else – this
fixes PR19590.

llvm-svn: 207856
2014-05-02 18:35:25 +00:00
Akira Hatanaka f76388dd7e [GVN] Pass the phi-translated address of a load instead of the untranslated
address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where
PRE is applied to a load that is not partially redundant.

<rdar://problem/16638765>.

llvm-svn: 207853
2014-05-02 17:59:17 +00:00
Nick Lewycky 718ada97bc Fold strlen(expr ? "str1" : "str2") to x ? len1 : len2. This fires about 330 times in a bootstrap of clang.
llvm-svn: 207828
2014-05-02 04:11:45 +00:00
Benjamin Kramer cd1a98bf74 Update and sort CMakeLists.
llvm-svn: 207785
2014-05-01 18:59:11 +00:00
Eli Bendersky a108a65df2 Add an optimization that does CSE in a group of similar GEPs.
This optimization merges the common part of a group of GEPs, so we can compute
each pointer address by adding a simple offset to the common part.

The optimization is currently only enabled for the NVPTX backend, where it has
a large payoff on some benchmarks.

Review: http://reviews.llvm.org/D3462

Patch by Jingyue Wu.

llvm-svn: 207783
2014-05-01 18:38:36 +00:00
Chandler Carruth 18c2fbb143 Revert r205965, which essentially reverts r205018 for the second time.
=[

Turns out that this was the root cause of PR19621. We found a crasher
only recently (likely due to improvements elsewhere in the SLP
vectorizer) but the reduced test case failed all the way back to here.
I've confirmed that reverting this patch both fixes the reduced test
case in PR19621 and the actual source file that led to it, so it seems
to really be rooted here. I've replied to the commit thread with
discussion of my (feeble) attempts to debug this. Didn't make it very
far, so reverting now that we have a good test case so that things can
get back to healthy while the debugging carries on.

llvm-svn: 207746
2014-05-01 11:24:11 +00:00
Gerolf Hoflehner 3282af13d4 Patch for function cloning to inline all blocks whose address is taken
Not all address taken blocks get inlined. The reason is
that a blocks new address is known only when it is cloned. But e.g.
a branch instruction in a different block could need that address earlier
while it gets cloned. The solution is to collect the set of all
blocks that can potentially get inlined and compute a new block address
up front. Then clone and cleanup.

rdar://16427209

llvm-svn: 207713
2014-04-30 22:05:02 +00:00
Yi Jiang e2d5f29c2f Revert r207571 - Add slp vectorization to LTO passes
llvm-svn: 207693
2014-04-30 19:27:24 +00:00