Commit Graph

13630 Commits

Author SHA1 Message Date
Piotr Padlewski dc9b2cfc50 inariant.group handling in GVN
The most important part required to make clang
devirtualization works ( ͡°͜ʖ ͡°).
The code is able to find non local dependencies, but unfortunatelly
because the caller can only handle local dependencies, I had to add
some restrictions to look for dependencies only in the same BB.

http://reviews.llvm.org/D12992

llvm-svn: 249196
2015-10-02 22:12:22 +00:00
Bruno Cardoso Lopes b491a2d641 [SimplifyLibCalls] Fix instruction misplacement in string/memory libcall optimization
When trying to optimize fortified library functions use the right
location to insert new instructions in order to preserve correct
def-use order.

This fixes an issue where a misplaced instruction definition would
happen to be *after* one of its use after a RAUW, forming invalid IR.
This behavior was introduced by r227250.

Differential Revision: http://reviews.llvm.org/D13301

rdar://problem/22802369

llvm-svn: 249092
2015-10-01 22:43:53 +00:00
Arnaud A. de Grandmaison 849f3bf8c9 [InstCombine] Remove trivially empty lifetime start/end ranges.
Summary:
Some passes may open up opportunities for optimizations, leaving empty
lifetime start/end ranges. For example, with the following code:

    void foo(char *, char *);
    void bar(int Size, bool flag) {
      for (int i = 0; i < Size; ++i) {
        char text[1];
        char buff[1];
        if (flag)
          foo(text, buff); // BBFoo
      }
    }

the loop unswitch pass will create 2 versions of the loop, one with
flag==true, and the other one with flag==false, but always leaving
the BBFoo basic block, with lifetime ranges covering the scope of the for
loop. Simplify CFG will then remove BBFoo in the case where flag==false,
but will leave the lifetime markers.

This patch teaches InstCombine to remove trivially empty lifetime marker
ranges, that is ranges ending right after they were started (ignoring
debug info or other lifetime markers in the range).

This fixes PR24598: excessive compile time after r234581.

Reviewers: reames, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13305

llvm-svn: 249018
2015-10-01 14:54:31 +00:00
Jingyue Wu df1a1b113b [NaryReassociate] SeenExprs records WeakVH
Summary:
The instructions SeenExprs records may be deleted during rewriting.
FindClosestMatchingDominator should ignore these deleted instructions.

Fixes PR24301.

Reviewers: grosser

Subscribers: grosser, llvm-commits

Differential Revision: http://reviews.llvm.org/D13315

llvm-svn: 248983
2015-10-01 03:51:44 +00:00
Dehao Chen 7c41dd6498 Update sample profile propagation algorithm.
http://reviews.llvm.org/D13218

llvm-svn: 248968
2015-10-01 00:26:56 +00:00
Michael Zolotukhin fc783e91e0 [SLP] Don't vectorize loads of non-packed types (like i1, i2).
Summary:
Given an array of i2 elements, 4 consecutive scalar loads will be lowered to
i8-sized loads and thus will access 4 consecutive bytes in memory. If we
vectorize these loads into a single <4 x i2> load, it'll access only 1 byte in
memory. Hence, we should prohibit vectorization in such cases.

PS: Initial patch was proposed by Arnold.

Reviewers: aschwaighofer, nadav, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13277

llvm-svn: 248943
2015-09-30 21:05:43 +00:00
Evgeniy Stepanov f608111d1b Fix debug info with SafeStack.
llvm-svn: 248933
2015-09-30 19:55:43 +00:00
Fiona Glaser b0c6d9174e DeadCodeElimination: rewrite to be faster
Same strategy as simplifyInstructionsInBlock. ~1/3 less time
on my test suite. This pass doesn't have many in-tree users,
but getting rid of an O(N^2) worst case and making it cleaner
should at least make it a viable alternative to ADCE, since
it's now consistently somewhat faster.

llvm-svn: 248927
2015-09-30 17:49:49 +00:00
Erik Eckstein 848c1aa452 SLPVectorizer: limit the scheduling region size per basic block.
Usually large blocks are not a problem. But if a large block (> 10k instructions)
contains many (potential) chains of vector instructions, and those chains are
spread over a wide range of instructions, then scheduling becomes a compile time problem.
This change introduces a limit for the accumulate scheduling region size of a block.
For real-world functions this limit will never be exceeded (it's about 10x larger than
the maximum value seen in the test-suite and external test suite).

llvm-svn: 248917
2015-09-30 17:00:44 +00:00
Andrea Di Biagio 0594e2a1e9 [InstCombine] Teach how to convert SSSE3/AVX2 byte shuffles to builtin shuffles if the shuffle mask is constant.
This patch teaches InstCombiner how to convert a SSSE3/AVX2 byte shuffle to a
builtin shuffle if the mask is constant.

Converting byte shuffle intrinsic calls to builtin shuffles can help finding
more opportunities for combining shuffles later on in selection dag.

We may end up with byte shuffles with constant masks as the result of inlining.

Differential Revision: http://reviews.llvm.org/D13252

llvm-svn: 248913
2015-09-30 16:44:39 +00:00
Dehao Chen 6722688eaa http://reviews.llvm.org/D13145
Support hierarachical sample profile format.

llvm-svn: 248865
2015-09-30 00:42:46 +00:00
Evgeniy Stepanov d3f544f271 [safestack] Fix a stupid mix-up in the direct-tls code path.
llvm-svn: 248863
2015-09-30 00:01:47 +00:00
Dehao Chen 8e7df83e6a http://reviews.llvm.org/D13231
Change lookup functions to const functions.

llvm-svn: 248818
2015-09-29 18:28:15 +00:00
Dehao Chen 028e122ca9 Revert r248810 which breaks tests.
llvm-svn: 248814
2015-09-29 18:18:49 +00:00
Dehao Chen 410a25aa7a http://reviews.llvm.org/D13231
Change lookup functions to const functions.

llvm-svn: 248810
2015-09-29 17:59:58 +00:00
Simon Pilgrim 43f5e0848e [InstCombine] Improve Vector Demanded Bits Through Bitcasts
Currently SimplifyDemandedVectorElts can only peek through bitcasts if the vectors have the same number of elements.

This patch fixes and enables some existing (disabled) code to support bitcasting to vectors with more/fewer elements. It currently only accepts cases when vectors alias cleanly (i.e. number of elements are an exact multiple of the other vector).

This was added to improve the demanded vector elements support for SSE vector shifts which require the __m128i (<2 x i64>) argument type to be bitcast to the vector type for the builtin shift. I've added extra tests for various additional bitcasts.

Differential Revision: http://reviews.llvm.org/D12935

llvm-svn: 248784
2015-09-29 08:19:11 +00:00
Chen Li 9f27fc0599 [LoopUnswitch] Add block frequency analysis to recognize hot/cold regions
Summary: This patch adds block frequency analysis to LoopUnswitch pass to recognize hot/cold regions. For cold regions the pass only performs trivial unswitches since they do not increase code size, and for hot regions everything works as before. This helps to minimize code growth in cold regions and be more aggressive in hot regions. Currently the default cold regions are blocks with frequencies below 20% of function entry frequency, and it can be adjusted via -loop-unswitch-cold-block-frequency flag. The entire feature is controlled via -loop-unswitch-with-block-frequency flag and it is off by default.

Reviewers: broune, silvas, dnovillo, reames

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D11605

llvm-svn: 248777
2015-09-29 05:03:32 +00:00
Evgeniy Stepanov d8b86f7cdc Move dbg.declare intrinsics when merging and replacing allocas.
Place new and update dbg.declare calls immediately after the
corresponding alloca.

Current code in replaceDbgDeclareForAlloca puts the new dbg.declare
at the end of the basic block. LLVM codegen has problems emitting
debug info in a situation when dbg.declare appears after all uses of
the variable. This usually kinda works for inlining and ASan (two
users of this function) but not for SafeStack (see the pending change
in http://reviews.llvm.org/D13178).

llvm-svn: 248769
2015-09-29 00:30:19 +00:00
Sean Silva ace7818ce6 [GlobalOpt] Sort members of llvm.used deterministically
Patch by Jake VanAdrighem!

Summary:
Fix the way we sort the llvm.used and llvm.compiler.used members.

This bug seems to have been introduced in rL183756 through a set of improper casts to GlobalValue*. In subsequent patches this problem was missed and transformed into a getName call on a ConstantExpr.

Reviewers: silvas

Subscribers: silvas, llvm-commits

Differential Revision: http://reviews.llvm.org/D12851

llvm-svn: 248728
2015-09-28 19:02:11 +00:00
Fiona Glaser f74cc40e34 Improve performance of SimplifyInstructionsInBlock
1. Use a worklist, not a recursive approach, to avoid needless
   revisitation and being repeatedly forced to jump back to the
   start of the BB if a handle is invalidated.

2. Only insert operands to the worklist if they become unused
   after a dead instruction is removed, so we don’t have to
   visit them again in most cases.

3. Use a SmallSetVector to track the worklist.

4. Instead of pre-initting the SmallSetVector like in
   DeadCodeEliminationPass, only put things into the worklist
   if they have to be revisited after the first run-through.
   This minimizes how much the actual SmallSetVector gets used,
   which saves a lot of time.

llvm-svn: 248727
2015-09-28 18:56:07 +00:00
Weiming Zhao 310770a90f [LoopReroll] Ignore debug intrinsics
Originally, debug intrinsics and annotation intrinsics may prevent
the loop to be rerolled, now they are ignored.

Differential Revision: http://reviews.llvm.org/D13150

llvm-svn: 248718
2015-09-28 17:03:23 +00:00
Sanjay Patel 9533407566 [InstCombine] fold zexts and constants into a phi (PR24766)
This is one step towards solving PR24766:
https://llvm.org/bugs/show_bug.cgi?id=24766

We were not producing the same IR for these two C functions because the store
to the temp bool causes extra zexts:

#include <stdbool.h>

bool switchy(char x1, char x2, char condition) {
   bool conditionMet = false;
   switch (condition) {
   case 0: conditionMet = (x1 == x2); break;
   case 1: conditionMet = (x1 <= x2); break;
   }
   return conditionMet;
}

bool switchy2(char x1, char x2, char condition) {
   switch (condition) {
   case 0: return (x1 == x2);
   case 1: return (x1 <= x2);
   }
  return false;
}

As noted in the code comments, this test case manages to avoid the more general existing
phi optimizations where there are only 2 phi inputs or where there are no constant phi 
args mixed in with the casts ops. It seems like a corner case, but if we don't catch it, 
then I don't think we can get SimplifyCFG to further optimize towards the canonical form
for this function shown in the bug report.

Differential Revision: http://reviews.llvm.org/D12866

llvm-svn: 248689
2015-09-27 20:34:31 +00:00
Joseph Tremoulet 09af67aba5 [EH] Create removeUnwindEdge utility
Summary:
Factor the code that rewrites invokes to calls and rewrites WinEH
terminators to their "unwind to caller" equivalents into a helper in
Utils/Local, and use it in the three places I'm aware of that need to do
this.


Reviewers: andrew.w.kaylor, majnemer, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13152

llvm-svn: 248677
2015-09-27 01:47:46 +00:00
Sanjay Patel e1b09caaaf [InstCombine] match De Morgan's Law hidden by zext ops (PR22723)
This is a fix for PR22723:
https://llvm.org/bugs/show_bug.cgi?id=22723

My first attempt at this was to change what I thought was the root problem:

xor (zext i1 X to i32), 1 --> zext (xor i1 X, true) to i32

...but we create the opposite pattern in InstCombiner::visitZExt(), so infinite loop!

My next idea was to fix the matchIfNot() implementation in PatternMatch, but that would
mean potentially returning a different size for the match than what was input. I think
this would require all users of m_Not to check the size of the returned match, so I 
abandoned that idea.

I settled on just fixing the exact case presented in the PR. This patch does allow the
2 functions in PR22723 to compile identically (x86):

bool test(bool x, bool y) { return !x | !y; }
bool test(bool x, bool y) { return !x || !y; }
...
andb	%sil, %dil
xorb	$1, %dil
movb	%dil, %al
retq

Differential Revision: http://reviews.llvm.org/D12705

llvm-svn: 248634
2015-09-25 23:21:38 +00:00
Justin Bogner 0638b7ba99 ADCE: Fix typo in file comment. NFC
llvm-svn: 248613
2015-09-25 21:03:46 +00:00
Charlie Turner 2720593ab4 [InstCombine] Recognize another bswap idiom.
Summary:
The byte-swap recognizer can now notice that this

```
uint32_t bswap(uint32_t x)
{
  x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16;
  x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8;
  return x;
}
```
    
is a bswap. Fixes PR23863.

Reviewers: nlewycky, hfinkel, hans, jmolloy, rengolin

Subscribers: majnemer, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D12637

llvm-svn: 248482
2015-09-24 10:24:58 +00:00
Michael Zolotukhin 74621cced7 Add CFG Simplification pass after Loop Unswitching.
Loop unswitching produces conditional branches with constant condition,
and it's beneficial for later passes to clean this up with simplify-cfg.
We do this after the second invocation of loop-unswitch, but not after
the first one. Not doing so might cause problem for passes like
LoopUnroll, whose estimate of loop body size would be less accurate.

Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D13064

llvm-svn: 248460
2015-09-24 03:50:17 +00:00
Evgeniy Stepanov 8685daf23e [safestack] Fix compiler crash in the presence of stack restores.
A use can be emitted before def in a function with stack restore
points but no static allocas.

llvm-svn: 248455
2015-09-24 01:23:51 +00:00
Michael Zolotukhin d56ee06d1f [Unroll] When completely unrolling the loop, replace conditinal branches with unconditional.
Nothing is expected to change, except we do less redundant work in
clean-up.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12951

llvm-svn: 248444
2015-09-23 23:12:43 +00:00
Wei Mi 3cc9204a52 Put profile variables of COMDAT functions to it's own COMDAT group.
In -fprofile-instr-generate compilation, to remove the redundant profile
variables for the COMDAT functions, these variables are placed in the same
COMDAT group as its associated function. This way when the COMDAT function
is not picked by the linker, those profile variables will also not be
output in the final binary. This may cause warning when mix link objects
built w and wo -fprofile-instr-generate.

This patch puts the profile variables for COMDAT functions to its own COMDAT
group to avoid the problem.

Patch by xur.
Differential Revision: http://reviews.llvm.org/D12248

llvm-svn: 248440
2015-09-23 22:40:45 +00:00
Lawrence Hu cac0b89289 Swap loop invariant GEP with loop variant GEP to allow more LICM.
This patch changes the order of GEPs generated by Splitting GEPs
    pass, specially when one of the GEPs has constant and the base is
    loop invariant, then we will generate the GEP with constant first
    when beneficial, to expose more cases for LICM.

    If originally Splitting GEP generate the following:
      do.body.i:
        %idxprom.i = sext i32 %shr.i to i64
        %2 = bitcast %typeD* %s to i8*
        %3 = shl i64 %idxprom.i, 2
        %uglygep = getelementptr i8, i8* %2, i64 %3
        %uglygep7 = getelementptr i8, i8* %uglygep, i64 1032
      ...
    Now it genereates:
      do.body.i:
        %idxprom.i = sext i32 %shr.i to i64
        %2 = bitcast %typeD* %s to i8*
        %3 = shl i64 %idxprom.i, 2
        %uglygep = getelementptr i8, i8* %2, i64 1032
        %uglygep7 = getelementptr i8, i8* %uglygep, i64 %3
      ...

    For no-loop cases, the original way of generating GEPs seems to
    expose more CSE cases, so we don't change the logic for no-loop
    cases, and only limit our change to the specific case we are
    interested in.

llvm-svn: 248420
2015-09-23 19:25:30 +00:00
Akira Hatanaka f6afd11538 [InstCombine] Preserve metadata when merging loads that are phi
arguments.

Make sure InstCombiner::FoldPHIArgLoadIntoPHI doesn't drop the following
metadata:

MD_tbaa
MD_alias_scope
MD_noalias
MD_invariant_load
MD_nonnull
MD_range

rdar://problem/17617709

Differential Revision: http://reviews.llvm.org/D12710

llvm-svn: 248419
2015-09-23 18:40:57 +00:00
Evgeniy Stepanov a2002b08f7 Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

This is a re-commit of a change in r248357 that was reverted in
r248358.

llvm-svn: 248405
2015-09-23 18:07:56 +00:00
Vedant Kumar ff08e926ba [Inline] Use AssumptionCache from the right Function
This changes the behavior of AddAligntmentAssumptions to match its
comment. I.e, prove the asserted alignment in the context of the caller,
not the callee.

Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko
who also saw a fix for the issue.

rdar://22521387

Differential Revision: http://reviews.llvm.org/D12997

llvm-svn: 248390
2015-09-23 15:49:08 +00:00
David Majnemer fa36bde2f6 [DeadArgElim] Split the invoke successor edge
Invoking a function which returns an aggregate can sometimes be
transformed to return a scalar value.  However, this means that we need
to create an insertvalue instruction(s) to recreate the correct
aggregate type.  We achieved this by inserting an insertvalue
instruction at the invoke's normal successor.  However, this is not
feasible if the normal successor uses the invoke's return value inside a
PHI node.

Instead, split the edge between the invoke and the unwind successor and
create the insertvalue instruction in the new basic block.  The new
basic block's successor will be the old invoke successor which leaves
us with IR which is well behaved.

This fixes PR24906.

llvm-svn: 248387
2015-09-23 15:41:09 +00:00
Igor Laevsky 029bd93c5d [DeadStoreElimination] Remove dead zero store to calloc initialized memory
This change allows dead store elimination to remove zero and null stores into memory freshly allocated with calloc-like function.

Differential Revision: http://reviews.llvm.org/D13021

llvm-svn: 248374
2015-09-23 11:38:44 +00:00
Simon Pilgrim 9cb018b6b6 [X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR
This patches removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path and updates relevant tests to sign extend a subvector instead.

LLVM counterpart to D12835

Differential Revision: http://reviews.llvm.org/D13002

llvm-svn: 248368
2015-09-23 08:48:33 +00:00
Sanjoy Das 2aacc0ecca [SCEV] Introduce ScalarEvolution::getOne and getZero.
Summary:
It is fairly common to call SE->getConstant(Ty, 0) or
SE->getConstant(Ty, 1); this change makes such uses a little bit
briefer.

I've refactored the call sites I could find easily to use getZero /
getOne.

Reviewers: hfinkel, majnemer, reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12947

llvm-svn: 248362
2015-09-23 01:59:04 +00:00
Evgeniy Stepanov 8d0e3011d8 Revert "Android support for SafeStack."
test/Transforms/SafeStack/abi.ll breaks when target is not supported;
needs refactoring.

llvm-svn: 248358
2015-09-23 01:23:22 +00:00
Evgeniy Stepanov ce2e16f00c Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

llvm-svn: 248357
2015-09-23 01:03:51 +00:00
Michael Zolotukhin deade19630 [Unroll] Do not crash trying to propagate a value to vector load.
llvm-svn: 248333
2015-09-22 22:27:12 +00:00
Michael Zolotukhin 8bb31dd08a [Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad.
Apart from checking that GlobalVariable is a constant, we should check
that it's not a weak constant, in which case we can't propagate its
value.

llvm-svn: 248327
2015-09-22 21:41:29 +00:00
NAKAMURA Takumi 10c80e7996 Prune trailing whitespaces.
llvm-svn: 248265
2015-09-22 11:19:03 +00:00
NAKAMURA Takumi 0a7d0ad95f Untabify.
llvm-svn: 248264
2015-09-22 11:15:07 +00:00
NAKAMURA Takumi a9cb538a74 Reformat blank lines.
llvm-svn: 248263
2015-09-22 11:14:39 +00:00
NAKAMURA Takumi 84965031a7 Reformat comment lines.
llvm-svn: 248262
2015-09-22 11:14:12 +00:00
NAKAMURA Takumi 70ad98aca4 Reformat.
llvm-svn: 248261
2015-09-22 11:13:55 +00:00
Evgeniy Stepanov 3c9c8338d0 Remove unused TargetTransformInfo dependency from SafeStack pass.
llvm-svn: 248233
2015-09-22 00:44:32 +00:00
Michael Zolotukhin 9f3aea6e1f [LoopUnswitch] Require DominatorTree info.
Summary:
We should either require the DT info to be available, or check if it's
available in every place we use DT (and we already miss such check in
one place, which causes failures in some cases). As other loop passes
preserve DT and it's usually available, it makes sense to just require
it here.

There is no regression test, because the bug only shows up if pass
manager decides to clean DT info right before LoopUnswitch. If
loop-unswitch is run separately, DT is available, so bug isn't exposed.

Reviewers: chandlerc, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13036

llvm-svn: 248230
2015-09-22 00:22:47 +00:00
Philip Reames 5f99423de9 [LICM] Hoist calls to readonly argmemonly functions even with stores in the loop
We know that an argmemonly function can only access memory pointed to by it's pointer arguments. Rather than needing to consider all possible stores as aliasing (as we do for a readonly function), we can only consider the aliasing of the pointer arguments.

Note that this change only addresses hoisting. I'm thinking about how to address speculation safety as well, but that will be a different change.

FYI, argmemonly disallows accessing memory through non-pointer typed arguments.  

Differential Revision: http://reviews.llvm.org/D12771

llvm-svn: 248220
2015-09-21 22:27:59 +00:00