Commit Graph

20619 Commits

Author SHA1 Message Date
Sanjay Patel c1416b60f2 [InstCombine] narrow vector select with padded condition and extracted result (PR38691)
shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) -->
sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)

The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38691
...is the last regression test. In that case, we're just left with the narrow select.

Note that if we do create new shuffles, they use the existing extraction identity mask, 
so there's no danger that this transform creates arbitrary shuffles.

Differential Revision: https://reviews.llvm.org/D51496

llvm-svn: 341708
2018-09-07 21:03:34 +00:00
Fangrui Song b3b61de09a [PGO] Fix some style issue of ControlHeightReduction
Reviewers: yamauchi

Reviewed By: yamauchi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51811

llvm-svn: 341702
2018-09-07 20:23:15 +00:00
Hiroshi Yamauchi 06650941a2 [PGO][CHR] Build/warning fix
llvm-svn: 341692
2018-09-07 18:44:53 +00:00
JF Bastien 7e2dd2d2fa NFC: remove magic bool in LoopIdiomRecognize
Use an enum class instead.

llvm-svn: 341684
2018-09-07 18:17:59 +00:00
Hiroshi Yamauchi 5fb509b763 [PGO][CHR] Small cleanup.
Summary:
Do away with demangling. It wasn't really necessary.
Declared some local functions to be static.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51740

llvm-svn: 341681
2018-09-07 18:00:58 +00:00
Craig Topper 040c2b0acf [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible
If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min.

I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not.

Differential Revision: https://reviews.llvm.org/D51398

llvm-svn: 341674
2018-09-07 16:19:50 +00:00
Anna Thomas 110df11a1a [LV] Fix code gen for conditionally executed loads and stores
Fix a latent bug in loop vectorizer which generates incorrect code for
memory accesses that are executed conditionally. As pointed in review,
this bug definitely affects uniform loads and may affect conditional
stores that should have turned into scatters as well).

The code gen for conditionally executed uniform loads on architectures
that support masked gather instructions is broken.

Without this patch, we were unconditionally executing the *conditional*
load in the vectorized version.

This patch does the following:
1. Uniform conditional loads on architectures with gather support will
   have correct code generated. In particular, the cost model
   (setCostBasedWideningDecision) is fixed.
2. For the recipes which are handled after the widening decision is set,
   we use the isScalarWithPredication(I, VF) form which is added in the
   patch.

3. Fix the vectorization cost model for scalarization
   (getMemInstScalarizationCost): implement and use isPredicatedInst to
   identify *all* predicated instructions, not just scalar+predicated. So,
   now the cost for scalarization will be increased for maskedloads/stores
   and gather/scatter operations. In short, we should be choosing the
   gather/scatter in place of scalarization on archs where it is
   profitable.
4. We needed to weaken the assert in useEmulatedMaskMemRefHack.

Reviewers: Ayal, hsaito, mkuper

Differential Revision: https://reviews.llvm.org/D51313

llvm-svn: 341673
2018-09-07 15:53:48 +00:00
Aditya Kumar 801394a3d7 Hot cold splitting pass
Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.

Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658

llvm-svn: 341669
2018-09-07 15:03:49 +00:00
Florian Hahn e32ff4b28a [InstCombine] Do not fold scalar ops over select with vector condition.
If OtherOpT or OtherOpF have scalar types and the condition is a vector,
we would create an invalid select.

Reviewers: spatel, john.brawn, mssimpso, craig.topper

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51781

llvm-svn: 341666
2018-09-07 14:40:06 +00:00
Florian Hahn b30f7aeeeb [NewGVN] Mark function as changed if we erase instructions.
Currently eliminateInstructions only returns true if any instruction got
replaced. In the test case for this patch, we eliminate the trivially
dead calls, for which eliminateInstructions not do a replacement and the
function is not marked as changed, which is why the inliner crashes
while traversing the call graph.

Alternatively we could also change eliminateInstructions to return true
in case we mark instructions for deletion, but that's slightly more code
and doing it at the place where the replacement happens seems safer.

Fixes PR37517.

Reviewers: davide, mcrosier, efriedma, bjope

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D51169

llvm-svn: 341651
2018-09-07 11:41:34 +00:00
Alexander Potapenko 6301574cb4 [MSan] don't access MsanCtorFunction when using KMSAN
MSan has found a use of uninitialized memory in MSan, fix it.

llvm-svn: 341646
2018-09-07 09:56:36 +00:00
Alexander Potapenko 8fe99a0ef2 [MSan] Add KMSAN instrumentation to MSan pass
Introduce the -msan-kernel flag, which enables the kernel instrumentation.

The main differences between KMSAN and MSan instrumentations are:

- KMSAN implies msan-track-origins=2, msan-keep-going=true;
- there're no explicit accesses to shadow and origin memory.
  Shadow and origin values for a particular X-byte memory location are
  read and written via pointers returned by
  __msan_metadata_ptr_for_load_X(u8 *addr) and
  __msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
- TLS variables are stored in a single struct in per-task storage. A call
  to a function returning that struct is inserted into every instrumented
  function before the entry block;
- __msan_warning() takes a 32-bit origin parameter;
- local variables are poisoned with __msan_poison_alloca() upon function
  entry and unpoisoned with __msan_unpoison_alloca() before leaving the
  function;
- the pass doesn't declare any global variables or add global constructors
  to the translation unit.

llvm-svn: 341637
2018-09-07 09:10:30 +00:00
Max Kazantsev 9e6845d8e1 [IndVars] Set Changed when we delete dead instructions. PR38855
IndVars does not set `Changed` flag when it eliminates dead instructions. As result,
it may make IR modifications and report that it has done nothing. It leads to inconsistent
preserved analyzes results.

Differential Revision: https://reviews.llvm.org/D51770
Reviewed By: skatkov

llvm-svn: 341633
2018-09-07 07:23:39 +00:00
Wei Mi 94d44c97bc [SampleFDO] Make sample profile loader unaware of compact format change.
The patch tries to make sample profile loader independent of profile format
change. It moves compact format related code into FunctionSamples and
SampleProfileReader classes, and sample profile loader only has to interact
with those two classes and will be unaware of profile format changes.

The cleanup also contain some fixes to further remove the difference between
compactbinary format and binary format. After the cleanup using different
formats originated from the same profile will generate the same binaries,
which we verified by compiling two large server benchmarks w/wo thinlto.

Differential Revision: https://reviews.llvm.org/D51643

llvm-svn: 341591
2018-09-06 22:03:37 +00:00
Sanjay Patel 93bd15a005 [InstCombine] add xor+not folds
This fold is needed to avoid a regression when we try
to recommit rL300977. 
We can't see the most basic win currently because 
demanded bits changes the patterns:
https://rise4fun.com/Alive/plpp

llvm-svn: 341559
2018-09-06 16:23:40 +00:00
Alexander Potapenko 7f270fcf0a [MSan] store origins for variadic function parameters in __msan_va_arg_origin_tls
Add the __msan_va_arg_origin_tls TLS array to keep the origins for variadic function parameters.
Change the instrumentation pass to store parameter origins in this array.

This is a reland of r341528.

test/msan/vararg.cc doesn't work on Mips, PPC and AArch64 (because this
patch doesn't touch them), XFAIL these arches.
Also turned out Clang crashed on i80 vararg arguments because of
incorrect origin type returned by getOriginPtrForVAArgument() - fixed it
and added a test.

llvm-svn: 341554
2018-09-06 15:14:36 +00:00
Sanjay Patel 1a00ffd656 [InstCombine] fix formatting in SimplifyDemandedVectorElts->Select; NFCI
I'm preparing to add the same functionality both here and to the DAG 
version of this code in D51696 / D51433, so try to make those cases 
as similar as possible to avoid bugs.

llvm-svn: 341545
2018-09-06 13:19:22 +00:00
Alexander Potapenko ac6595bd53 [MSan] revert r341528 to unbreak the bots
llvm-svn: 341541
2018-09-06 12:19:27 +00:00
Florian Hahn c51d08825f [LoopInterchange] Cleanup unused variables.
llvm-svn: 341537
2018-09-06 10:41:01 +00:00
Florian Hahn 236f6feb8f [LoopInterchange] Move preheader creation to transform stage and simplify.
There is no need to create preheaders in the analysis stage, we only
need them when adjusting the branches. Also, the only cases we need to
create our own preheaders is when they have more than 1 predecessors or
PHI nodes (even with only 1 predecessor, we could have an LCSSA phi
node). I have simplified the conditions and added some assertions to be
sure. Because we know the inner and outer loop need to be tightly
nested, it is sufficient to check if the inner loop preheader is the
outer loop header to check if we need to create a new preheader.

Reviewers: efriedma, mcrosier, karthikthecool

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D51703

llvm-svn: 341533
2018-09-06 09:57:27 +00:00
Alexander Potapenko 1a10ae0def [MSan] store origins for variadic function parameters in __msan_va_arg_origin_tls
Add the __msan_va_arg_origin_tls TLS array to keep the origins for
variadic function parameters.
Change the instrumentation pass to store parameter origins in this array.

llvm-svn: 341528
2018-09-06 08:50:11 +00:00
Alexander Potapenko d518c5fc87 [MSan] Make sure variadic function arguments do not overflow __msan_va_arg_tls
Turns out that calling a variadic function with too many (e.g. >100 i64's)
arguments overflows __msan_va_arg_tls, which leads to smashing other TLS
data with function argument shadow values.

getShadow() already checks for kParamTLSSize and returns clean shadow if
the argument does not fit, so just skip storing argument shadow for such
arguments.

llvm-svn: 341525
2018-09-06 08:21:54 +00:00
Max Kazantsev f90154069c Revert "[IndVars] Turn isValidRewrite into an assertion" because it seems wrong
llvm-svn: 341517
2018-09-06 05:52:47 +00:00
Max Kazantsev 51690c4f52 [IndVars] Turn isValidRewrite into an assertion
Function rewriteLoopExitValues contains a check on isValidRewrite which
is needed to make sure that SCEV does not convert the pattern
`gep Base, (&p[n] - &p[0])` into `gep &p[n], Base - &p[0]`. This problem
has been fixed in SCEV long ago, so this check is just obsolete.

This patch converts it into an assertion to make sure that the SCEV will
not mess up this case in the future.

Differential Revision: https://reviews.llvm.org/D51582
Reviewed By: atrick

llvm-svn: 341516
2018-09-06 05:21:25 +00:00
Benjamin Kramer 9abad4814d [ControlHeightReduction] Remove unused includes
Also clang-format them.

llvm-svn: 341468
2018-09-05 13:51:05 +00:00
Benjamin Kramer 837f93af7d [Aggressive InstCombine] Move C bindings to their own header file.
llvm-svn: 341461
2018-09-05 11:41:12 +00:00
Richard Trieu 47c2bc58b3 Prevent unsigned overflow.
The sum of the weights is caculated in an APInt, which has a width smaller than
64.  In certain cases, the sum of the widths would overflow when calculations
are done inside an APInt, but would not if done with uint64_t.  Since the
values will be passed as uint64_t in the function call anyways, do all the math
in 64 bits.  Also added an assert in case the probabilities overflow 64 bits.

llvm-svn: 341444
2018-09-05 04:19:15 +00:00
Fangrui Song c8f348cba7 Fix -Wunused-function in release build after rL341386
llvm-svn: 341443
2018-09-05 03:10:20 +00:00
Sanjay Patel 63cf26cf01 [InstCombine] fix xor-or-xor fold to check uses and handle commutes
I'm probably missing some way to use m_Deferred to remove the code
duplication, but that can be a follow-up.

The improvement in demand_shrink_nsw.ll is an example of missing
the fold because the pattern matching was deficient. I didn't try
to follow the bits in that test, but Alive says it's correct:
https://rise4fun.com/Alive/ugc

llvm-svn: 341426
2018-09-04 23:22:13 +00:00
Zhaoshi Zheng a0aa41d793 Revert "Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions"
Reland r341269. Use std::stable_sort when sorting constant condidates.

Reverting commit, r341365:

  Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions

  One of the tests is failing 50% of the time when expensive checks are
  enabled. Not sure how deep the problem is so just reverting while the
  author can investigate so that the bots stop repeatedly failing and
  blaming things incorrectly. Will respond with details on the original
  commit.

Original commit, r341269:

  [Constant Hoisting] Hoisting Constant GEP Expressions

  Leverage existing logic in constant hoisting pass to transform constant GEP
  expressions sharing the same base global variable. Multi-dimensional GEPs are
  rewritten into single-dimensional GEPs.

  https://reviews.llvm.org/D51396

Differential Revision: https://reviews.llvm.org/D51654

llvm-svn: 341417
2018-09-04 22:17:03 +00:00
Anna Thomas dbacea188b [LV] First order recurrence phis should not be treated as uniform
This is fix for PR38786.
First order recurrence phis were incorrectly treated as uniform,
which caused them to be vectorized as uniform instructions.

Patch by Ayal Zaks and Orivej Desh!

Reviewed by: Anna

Differential Revision: https://reviews.llvm.org/D51639

llvm-svn: 341416
2018-09-04 22:12:23 +00:00
Hiroshi Yamauchi bd897a02a0 Fix a memory leak after rL341386.
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51658

llvm-svn: 341412
2018-09-04 21:28:22 +00:00
Sanjay Patel 0f70f86ce0 [InstCombine] make ((X & C) ^ C) form consistent for vectors
It would be better to create a 'not' here, but that's not possible yet.

llvm-svn: 341410
2018-09-04 21:17:14 +00:00
Sanjay Patel a89f183253 [InstCombine] simplify code for xor folds; NFCI
This is just a cleanup step. The TODO comments show
what is wrong with the 'and' version of the fold.
Fixing this should be part of recommitting:
rL300977

llvm-svn: 341405
2018-09-04 21:00:13 +00:00
Reid Kleckner 792a4f8a21 Fix unused variable warning
llvm-svn: 341400
2018-09-04 20:34:47 +00:00
Fedor Sergeev 8b6effd969 [SimpleLoopUnswitch] remove a chain of dead blocks at once
Recent change to deleteDeadBlocksFromLoop was not enough to
fix all the problems related to dead blocks after nontrivial
unswitching of switches.

We need to delete all the dead blocks that were created during
unswitching, otherwise we will keep having problems with phi's
or dead blocks.

This change removes all the dead blocks that are reachable from the loop,
not trying to track whether these blocks are newly created by unswitching
or not. While not completely correct, we are unlikely to get loose but
reachable dead blocks that do not belong to our loop nest.

It does fix all the failures currently known, in particular PR38778.

Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D51519

llvm-svn: 341398
2018-09-04 20:19:41 +00:00
Hiroshi Yamauchi 72ee6d6000 Fix build failures after rL341386.
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51647

llvm-svn: 341391
2018-09-04 18:10:54 +00:00
Hiroshi Yamauchi 9775a620b0 [PGO] Control Height Reduction
Summary:
Control height reduction merges conditional blocks of code and reduces the
number of conditional branches in the hot path based on profiles.

if (hot_cond1) { // Likely true.
  do_stg_hot1();
}
if (hot_cond2) { // Likely true.
  do_stg_hot2();
}

->

if (hot_cond1 && hot_cond2) { // Hot path.
  do_stg_hot1();
  do_stg_hot2();
} else { // Cold path.
  if (hot_cond1) {
    do_stg_hot1();
  }
  if (hot_cond2) {
    do_stg_hot2();
  }
}

This speeds up some internal benchmarks up to ~30%.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D50591

llvm-svn: 341386
2018-09-04 17:19:13 +00:00
Chandler Carruth 6cb12444cc Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions
One of the tests is failing 50% of the time when expensive checks are
enabled. Not sure how deep the problem is so just reverting while the
author can investigate so that the bots stop repeatedly failing and
blaming things incorrectly. Will respond with details on the original
commit.

llvm-svn: 341365
2018-09-04 13:36:44 +00:00
Chandler Carruth 664aa868f5 [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative
Load Hardening.

Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.

Most of the churn here is adding the IR attribute. I talked about this
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.

While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.

This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.

Differential Revision: https://reviews.llvm.org/D51157

llvm-svn: 341363
2018-09-04 12:38:00 +00:00
Nicola Zaghen 9588ad9611 [InstCombine] Fold icmp ugt/ult (add nuw X, C2), C --> icmp ugt/ult X, (C - C2)
Support for sgt/slt was added in rL294898, this adds the same cases also for unsigned compares.

This is the Alive proof: https://rise4fun.com/Alive/nyY

Differential Revision: https://reviews.llvm.org/D50972

llvm-svn: 341353
2018-09-04 10:29:48 +00:00
Max Kazantsev f34115c627 [NFC] Add assert to detect LCSSA breaches early
llvm-svn: 341347
2018-09-04 06:34:40 +00:00
Max Kazantsev 2cbba56337 [IndVars] Fix usage of SCEVExpander to not mess with SCEVConstant. PR38674
This patch removes the function `expandSCEVIfNeeded` which behaves not as
it was intended. This function tries to make a lookup for exact existing expansion
and only goes to normal expansion via `expandCodeFor` if this lookup hasn't found
anything. As a result of this, if some instruction above the loop has a `SCEVConstant`
SCEV, this logic will return this instruction when asked for this `SCEVConstant` rather
than return a constant value. This is both non-profitable and in some cases leads to
breach of LCSSA form (as in PR38674).

Whether or not it is possible to break LCSSA with this algorithm and with some
non-constant SCEVs is still in question, this is still being investigated. I wasn't
able to construct such a test so far, so maybe this situation is impossible. If it is,
it will go as a separate fix.

Rather than do it, it is always correct to just invoke `expandCodeFor` unconditionally:
it behaves smarter about insertion points, and as side effect of this it will choose a
constant value for SCEVConstants. For other SCEVs it may end up finding a better insertion
point. So it should not be worse in any case.

NOTE: So far the only known case for which this transform may break LCSSA is mapping
of SCEVConstant to an instruction. However there is a suspicion that the entire algorithm
can compromise LCSSA form for other cases as well (yet not proved).

Differential Revision: https://reviews.llvm.org/D51286
Reviewed By: etherzhhb

llvm-svn: 341345
2018-09-04 05:01:35 +00:00
Sanjay Patel 2fe1f62c88 [InstCombine] simplify xor/not folds; NFCI
llvm-svn: 341336
2018-09-03 18:40:56 +00:00
Sanjay Patel d75064e6d5 [InstCombine] allow add+not --> sub for arbitrary vector constants.
llvm-svn: 341335
2018-09-03 18:21:59 +00:00
Florian Hahn cc9dc599ba [SLC] Support expanding pow(x, n+0.5) to x * x * ... * sqrt(x)
Reviewers: evandro, efriedma, spatel

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51435

llvm-svn: 341330
2018-09-03 17:37:39 +00:00
Argyrios Kyrtzidis c30340b207 Add header guards to some headers that are missing them
Also adjust some of dsymutil's headers to put the header guards at the top,
otherwise the compiler will not recognize them as header guards.

llvm-svn: 341323
2018-09-03 16:22:05 +00:00
Sanjay Patel 17e709b66a [InstCombine] allow not+sub fold for arbitrary vector constants
The fold was implemented for the general case but use-limitation,
but the later constant version which didn't check uses was only
matching splat constants.

llvm-svn: 341292
2018-09-02 19:31:45 +00:00
Sanjay Patel ca36eb4e33 [Reassociate] swap binop operands to increase factoring potential
If we have a pair of binops feeding another pair of binops, rearrange the operands so 
the matching pair are together because that allows easy factorization folds to happen 
in instcombine:
((X << S) & Y) & (Z << S) --> ((X << S) & (Z << S)) & Y (reassociation)

--> ((X & Z) << S) & Y (factorize shift from 'and' ops optimization)

This is part of solving PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098

Note that there's an instcombine version of this patch attached there, but we're trying
to make instcombine have less responsibility to improve compile-time efficiency.

For reasons I still don't completely understand, reassociate does this kind of transform
sometimes, but misses everything in my motivating cases.

This patch on its own is gluing an independent cleanup chunk to the end of the existing 
RewriteExprTree() loop. We can build on it and do something stronger to better order the 
full expression tree like D40049. That might be an alternative to the proposal to add a 
separate reassociation pass like D41574.

Differential Revision: https://reviews.llvm.org/D45842

llvm-svn: 341288
2018-09-02 14:22:54 +00:00
Sanjay Patel 099b1a4b0c [InstCombine] simplify code for 'or' fold
This is no-outwardly-visible-change intended, so no test.
But the code is smaller and more efficient. The check for
a 'not' op is intended to avoid the expensive value tracking
call when it should not be necessary, and it might prevent
infinite looping when we resurrect:
rL300977

llvm-svn: 341280
2018-09-01 15:08:59 +00:00
Zhaoshi Zheng f5297fb24b [Constant Hoisting] Hoisting Constant GEP Expressions
Leverage existing logic in constant hoisting pass to transform constant GEP
expressions sharing the same base global variable. Multi-dimensional GEPs are
rewritten into single-dimensional GEPs.

Differential Revision: https://reviews.llvm.org/D51396

llvm-svn: 341269
2018-09-01 00:04:56 +00:00
Matt Arsenault c807ce0ee4 SLPVectorizer: Fix assert with different sized address spaces
llvm-svn: 341215
2018-08-31 14:34:53 +00:00
Evandro Menezes 2123ea7d5c [InstCombine] Expand the simplification of pow() into exp2()
Generalize the simplification of `pow(2.0, y)` to `pow(2.0 ** n, y)` for all
scalar and vector types.

This improvement helps some benchmarks in SPEC CPU2000 and CPU2006, such as
252.eon, 447.dealII, 453.povray.  Otherwise, no significant regressions on
x86-64 or A64.

Differential revision: https://reviews.llvm.org/D49273

llvm-svn: 341095
2018-08-30 19:04:51 +00:00
Eli Friedman 94d3e4dd77 [SROA] Fix alignment for uses of PHI nodes.
Splitting an alloca can decrease the alignment of GEPs into the
partition.  Normally, rewriting accounts for this, but the code was
missing for uses of PHI nodes and select instructions.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38707 .

Differential Revision: https://reviews.llvm.org/D51335

llvm-svn: 341094
2018-08-30 18:59:24 +00:00
Matt Morehouse 7e042bb1d1 [libFuzzer] Port to Windows
Summary:
Port libFuzzer to windows-msvc.
This patch allows libFuzzer targets to be built and run on Windows, using -fsanitize=fuzzer and/or fsanitize=fuzzer-no-link. It allows these forms of coverage instrumentation to work on Windows as well.
It does not fix all issues, such as those with -fsanitize-coverage=stack-depth, which is not usable on Windows as of this patch.
It also does not fix any libFuzzer integration tests. Nearly all of them fail to compile, fixing them will come in a later patch, so libFuzzer tests are disabled on Windows until them.

Patch By: metzman

Reviewers: morehouse, rnk

Reviewed By: morehouse, rnk

Subscribers: #sanitizers, delcypher, morehouse, kcc, eraman

Differential Revision: https://reviews.llvm.org/D51022

llvm-svn: 341082
2018-08-30 15:54:44 +00:00
Nicolai Haehnle 35617ed4cb [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary:
This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).

The purpose of this patch is to free up the name DivergenceAnalysis for the new generic
implementation. The generic implementation class will be shared by specialized
divergence analysis classes.

Patch by: Simon Moll

Reviewed By: nhaehnle

Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50434

Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb
llvm-svn: 341071
2018-08-30 14:21:36 +00:00
Martin Storsjo 22dcddf651 Revert "[SimplifyCFG] Common debug handling [NFC]"
This reverts commit r340997.

This change turned out not to be NFC after all, but e.g. causes
clang to crash when building the linux kernel for aarch64.

llvm-svn: 341031
2018-08-30 08:06:50 +00:00
Max Kazantsev d3a4cbe153 [NFC] Move OrderedInstructions and InstructionPrecedenceTracking to Analysis
These classes don't make any changes to IR and have no reason to be in
Transform/Utils. This patch moves them to Analysis folder. This will allow
us reusing these classes in some analyzes, like MustExecute.

llvm-svn: 341015
2018-08-30 04:49:03 +00:00
Max Kazantsev 3c284bde3f Re-enable "[NFC] Unify guards detection"
rL340921 has been reverted by rL340923 due to linkage dependency
from Transform/Utils to Analysis which is not allowed. In this patch
this has been fixed, a new utility function moved to Analysis.

Differential Revision: https://reviews.llvm.org/D51152

llvm-svn: 341014
2018-08-30 03:39:16 +00:00
Philip Reames bed556133d [SimplifyCFG] Rename a variable for readibility of a future change [NFC]
llvm-svn: 341004
2018-08-30 00:12:29 +00:00
Philip Reames 6bd16b5850 [SimplifyCFG] Fix a cost modeling oversight in branch commoning
The cost modeling was not accounting for the fact we were duplicating the instruction once per predecessor.  With a default threshold of 1, this meant we were actually creating #pred copies.

Adding to the fun, there is *absolutely no* test coverage for this.  Simply bailing for more than one predecessor passes all checked in tests.

llvm-svn: 341001
2018-08-30 00:03:02 +00:00
Philip Reames 7c57dac955 [SimplifyCFG] Common debug handling [NFC]
llvm-svn: 340997
2018-08-29 23:22:07 +00:00
Reid Kleckner 9397c2a23b Revert r340947 "[InstCombine] Expand the simplification of pow() into exp2()"
It broke the clang-cl self-host.

llvm-svn: 340991
2018-08-29 22:58:33 +00:00
Philip Reames 1887c40b22 Add a todo and tests to Address a review commnt from D50925 [NFC]
llvm-svn: 340978
2018-08-29 22:09:21 +00:00
Philip Reames f562fc8dbf [LICM] Hoist stores of invariant values to invariant addresses out of loops
Teach LICM to hoist stores out of loops when the store writes to a location otherwise unused in the loop, writes a value which is invariant, and is guaranteed to execute if the loop is entered.

Worth noting is that this transformation is partially overlapping with the existing promotion transformation. Reasons this is worthwhile anyway include:
 * For multi-exit loops, this doesn't require duplication of the store.
 * It kicks in for case where we can't prove we exit through a normal exit (i.e. we may throw), but can prove the store executes before that possible side exit.

Differential Revision: https://reviews.llvm.org/D50925

llvm-svn: 340974
2018-08-29 21:49:30 +00:00
Fedor Sergeev 7b49aa03af [SimpleLoopUnswitch] After unswitch delete dead blocks in parent loops
Summary:
Assert from PR38737 happens on the dead block inside the parent loop
after unswitching nontrivial switch in the inner loop.

deleteDeadBlocksFromLoop now takes extra care to detect/remove dead
blocks in all the parent loops in addition to the blocks from original
loop being unswitched.

Reviewers: asbirlea, chandlerc

Reviewed By: asbirlea

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51415

llvm-svn: 340955
2018-08-29 19:10:44 +00:00
Matt Morehouse cf311cfc20 Revert "[libFuzzer] Port to Windows"
This reverts r340949 due to bot breakage again.

llvm-svn: 340954
2018-08-29 18:40:41 +00:00
Sanjay Patel 0f29e953b7 [InstCombine] canonicalize fneg with llvm.sin
This is a follow-up to rL339604 which did the same transform
for a sin libcall. The handling of intrinsics vs. libcalls
is unfortunately scattered, so I'm just adding this next to
the existing transform for llvm.cos for now.

This should resolve PR38458:
https://bugs.llvm.org/show_bug.cgi?id=38458
If the call was already negated, the negates will cancel
each other out.

llvm-svn: 340952
2018-08-29 18:27:49 +00:00
Matt Morehouse 245ebd71ef [libFuzzer] Port to Windows
Summary:
Port libFuzzer to windows-msvc.
This patch allows libFuzzer targets to be built and run on Windows, using -fsanitize=fuzzer and/or fsanitize=fuzzer-no-link. It allows these forms of coverage instrumentation to work on Windows as well.
It does not fix all issues, such as those with -fsanitize-coverage=stack-depth, which is not usable on Windows as of this patch.
It also does not fix any libFuzzer integration tests. Nearly all of them fail to compile, fixing them will come in a later patch, so libFuzzer tests are disabled on Windows until them.

Reviewers: morehouse, rnk

Reviewed By: morehouse, rnk

Subscribers: #sanitizers, delcypher, morehouse, kcc, eraman

Differential Revision: https://reviews.llvm.org/D51022

llvm-svn: 340949
2018-08-29 18:08:34 +00:00
Evandro Menezes 22e0bdf4ed [InstCombine] Expand the simplification of pow() with nested exp{,2}()
Expand the simplification of `pow(exp{,2}(x), y)` to all FP types.

This improvement helps some benchmarks in SPEC CPU2000 and CPU2006, such as
252.eon, 447.dealII, 453.povray.  Otherwise, no significant regressions on
x86-64 or A64.

Differential revision: https://reviews.llvm.org/D51195

llvm-svn: 340948
2018-08-29 17:59:48 +00:00
Evandro Menezes a3a7b53571 [InstCombine] Expand the simplification of pow() into exp2()
Generalize the simplification of `pow(2.0, y)` to `pow(2.0 ** n, y)` for all
scalar and vector types.

This improvement helps some benchmarks in SPEC CPU2000 and CPU2006, such as
252.eon, 447.dealII, 453.povray.  Otherwise, no significant regressions on
x86-64 or A64.

Differential revision: https://reviews.llvm.org/D49273

llvm-svn: 340947
2018-08-29 17:59:34 +00:00
Craig Topper 2bcb1eeee1 [InstCombine] Replace two calls to getNumUses() with !hasNUsesOrMore
We were calling getNumUses to check for 1 or 2 uses. But getNumUses is linear in the number of uses. We can instead use !hasNUsesOrMore(3) which will stop the linear scan as soon as it determines there are at least 3 uses even if there are more.

llvm-svn: 340939
2018-08-29 17:09:21 +00:00
Sanjay Patel d4e19d272a [InstCombine] move declarations closer to uses; NFC
llvm-svn: 340930
2018-08-29 14:42:12 +00:00
Sanjay Patel 7a05641fa8 [InstCombine] remove unnecessary shuffle undef folding
Add a test for constant folding to show that 
(shuffle undef, undef, mask)
should already be handled via instsimplify.

llvm-svn: 340926
2018-08-29 13:24:34 +00:00
Alexandros Lamprineas f6db5bcd38 Revert r340922 "[GVNHoist] Re-enable GVNHoist by default"
Another sanitizer buildbot failed this time at bootstrap when
compiling SemaTemplateInstantiate.cpp with this assertion:
`dominates(MD, U) && "Memory Def does not dominate it's uses"'.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/15047

llvm-svn: 340925
2018-08-29 13:00:55 +00:00
Hans Wennborg 2c390c54f6 Revert r340921 "[NFC] Unify guards detection"
This broke the build, see e.g.

http://lab.llvm.org:8011/builders/clang-cmake-armv8-lnt/builds/4626/
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/18647/
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/5856/
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/22800/

> We have multiple places in code where we try to identify whether or not
> some instruction is a guard. This patch factors out this logic into a separate
> utility function which works uniformly in all places.
>
> Differential Revision: https://reviews.llvm.org/D51152
> Reviewed By: fedor.sergeev

llvm-svn: 340923
2018-08-29 12:21:32 +00:00
Alexandros Lamprineas c03b9b8854 [GVNHoist] Re-enable GVNHoist by default
Rebase rL338240 since the excessive memory usage observed when using
GVNHoist with UBSan has been fixed by rL340818.

Differential Revision: https://reviews.llvm.org/D49858

llvm-svn: 340922
2018-08-29 11:58:34 +00:00
Max Kazantsev 1dafaa87d9 [NFC] Unify guards detection
We have multiple places in code where we try to identify whether or not
some instruction is a guard. This patch factors out this logic into a separate
utility function which works uniformly in all places.

Differential Revision: https://reviews.llvm.org/D51152
Reviewed By: fedor.sergeev

llvm-svn: 340921
2018-08-29 11:37:34 +00:00
Max Kazantsev 8b4ffe66d6 [NFC] Factor out guard utility methods into a separate file
This patch creates file GuardUtils which will contain logic for work with guards
that can be shared across different passes.

Differential Revision: https://reviews.llvm.org/D51151
Reviewed By: fedor.sergeev

llvm-svn: 340914
2018-08-29 10:51:59 +00:00
Hans Wennborg e0f3e9283f LoopSink: Don't sink into blocks without an insertion point (PR38462)
In the PR, LoopSink was trying to sink into a catchswitch block, which
doesn't have a valid insertion point.

Differential Revision: https://reviews.llvm.org/D51307

llvm-svn: 340900
2018-08-29 06:55:27 +00:00
Zhaoshi Zheng 35818e2789 [QTOOL-37352] Consider isLegalAddressingImm in Constant Hoisting
In Thumb1, legal imm range is [0, 255] for ADD/SUB instructions. However, the
legal imm range for LD/ST in (R+Imm) addressing mode is [0, 127]. Imms in
[128, 255] are materialized by mov R, #imm, and LD/STs use them in (R+R)
addressing mode.

This patch checks if a constant is used as offset in (R+Imm), if so, it checks
isLegalAddressingMode passing the constant value as BaseOffset.

Differential Revision: https://reviews.llvm.org/D50931

llvm-svn: 340882
2018-08-28 23:00:59 +00:00
Alina Sbirlea 52e97a28d4 [SimpleLoopUnswitch] Form dedicated exits after trivial unswitches.
Summary:
Form dedicated exits after trivial unswitches.
Fixes PR38737, PR38283.

Reviewers: chandlerc, fedor.sergeev

Subscribers: sanjoy, jlebar, uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D51375

llvm-svn: 340871
2018-08-28 20:41:05 +00:00
Matt Morehouse bab8556f01 Revert "[libFuzzer] Port to Windows"
This reverts commit r340860 due to failing tests.

llvm-svn: 340867
2018-08-28 19:07:24 +00:00
Matt Morehouse c6fff3b6f5 [libFuzzer] Port to Windows
Summary:
Port libFuzzer to windows-msvc.
This patch allows libFuzzer targets to be built and run on Windows, using -fsanitize=fuzzer and/or fsanitize=fuzzer-no-link. It allows these forms of coverage instrumentation to work on Windows as well.
It does not fix all issues, such as those with -fsanitize-coverage=stack-depth, which is not usable on Windows as of this patch.
It also does not fix any libFuzzer integration tests. Nearly all of them fail to compile, fixing them will come in a later patch, so libFuzzer tests are disabled on Windows until them.

Patch By: metzman

Reviewers: morehouse, rnk

Reviewed By: morehouse, rnk

Subscribers: morehouse, kcc, eraman

Differential Revision: https://reviews.llvm.org/D51022

llvm-svn: 340860
2018-08-28 18:34:32 +00:00
Matt Arsenault 10de2775bd AMDGPU: Remove nan tests in class if src is nnan
llvm-svn: 340850
2018-08-28 18:10:02 +00:00
David Bolvansky c1b27b562b [Inliner] Attribute callsites with inline remarks
Summary:
Sometimes reading an output *.ll file it is not easy to understand why some callsites are not inlined. We can read output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites.

An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//.

For example in the provided test the result method //@test1// has two callsites //@bar// and inline remarks report different inlining missed reasons:
  remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6)
  remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive

It is not clear which remark correspond to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites:
  define void @test1() {
    call void @bar(i1 true) #0
    call void @bar(i1 false) #2
    ret void
  }
  attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" }
  ..
  attributes #2 = { "inline-remark"="(cost=never): recursive" }

Patch by: yrouban (Yevgeny Rouban)

Reviewers: xbolva00, tejohnson, apilipenko

Reviewed By: xbolva00, tejohnson

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D50435

llvm-svn: 340834
2018-08-28 15:27:25 +00:00
Mikael Holmen 4d652c4ce7 [CloneFunction] Constant fold terminators before checking single predecessor
Summary:
This fixes PR31105.

There is code trying to delete dead code that does so by e.g. checking if
the single predecessor of a block is the block itself.

That check fails on a block like this
 bb:
   br i1 undef, label %bb, label %bb
since that has two (identical) predecessors.

However, after the check for dead blocks there is a call to
ConstantFoldTerminator on the basic block, and that call simplifies the
block to
 bb:
   br label %bb

Therefore we now do the call to ConstantFoldTerminator before the check if
the block is dead, so it can realize that it really is.

The original behavior lead to the block not being removed, but it was
simplified as above, and then we did a call to
    Dest->replaceAllUsesWith(&*I);
with old and new being equal, and an assertion triggered.

Reviewers: chandlerc, fhahn

Reviewed By: fhahn

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D51280

llvm-svn: 340820
2018-08-28 12:40:11 +00:00
Alexandros Lamprineas 484bd13e2d [GVNHoist] Prune out useless CHI insertions
Fix for the out-of-memory error when compiling SemaChecking.cpp
with GVNHoist and ubsan enabled. I've used a cache for inserted
CHIs to avoid excessive memory usage.

Differential Revision: https://reviews.llvm.org/D50323

llvm-svn: 340818
2018-08-28 11:07:54 +00:00
Max Kazantsev 0c4b84e2df [NFC] A loop can never contain Ret instruction
llvm-svn: 340808
2018-08-28 09:26:28 +00:00
Craig Topper a6cd4b9bce [InstCombine] Extend (add (sext x), cst) --> (sext (add x, cst')) and (add (zext x), cst) --> (zext (add x, cst')) to work for vectors
Differential Revision: https://reviews.llvm.org/D51236

llvm-svn: 340796
2018-08-28 02:02:29 +00:00
Sanjay Patel c615910be5 [InstCombine] fix formatting; NFC
llvm-svn: 340790
2018-08-27 23:01:10 +00:00
Sanjay Patel 42d31c20a8 [InstCombine] allow shuffle+binop canonicalization with widening shuffles
This lines up with the behavior of an existing transform where if both 
operands of the binop are shuffled, we allow moving the binop before the 
shuffle regardless of whether the shuffle changes the size of the vector.

llvm-svn: 340787
2018-08-27 22:41:44 +00:00
Evandro Menezes 253991cfaf [PATCH] [InstCombine] Fix issue in the simplification of pow() with nested exp{,2}()
Fix the issue of duplicating the call to `exp{,2}()` when it's nested in
`pow()`, as exposed by rL340462.

Differential revision: https://reviews.llvm.org/D51194

llvm-svn: 340784
2018-08-27 22:11:15 +00:00
George Burgess IV aa09a82b4b s/std::set/DenseSet/; NFC
We only use this set for `insert` and `count`, so a hashing container
seems better here.

llvm-svn: 340783
2018-08-27 22:10:59 +00:00
Nico Weber e75fd1b184 fix comment typo
llvm-svn: 340744
2018-08-27 14:25:22 +00:00
Max Kazantsev b5dd092051 [NFC] Split logic of ImplicitControlFlowTracking to allow generalization
We have a class `ImplicitControlFlowTracking` which allows us to keep track of
instructions that can abnormally exit and answer queries like "whether or not
there is side-exiting instruction above this instruction in its block".

We may want to have the similar tracking for other types of "special" instructions,
for example instructions that write memory.

This patch separates ImplicitControlFlowTracking into two classes, isolating all
general logic not related to implicit control flow into its parent class. We can
later make another child of this class to keep track of instructions that write
memory.

The motivation for that is that we want to make these checks efficiently in the
patch https://reviews.llvm.org/D50891.

NOTE: The naming of the parent class is not super cool, but the other options we
have are hardly better. Please feel free to rename it as NFC if you think you've
found a more informative name for it.

Differential Revision: https://reviews.llvm.org/D50954
Reviewed By: fedor.sergeev

llvm-svn: 340728
2018-08-27 09:43:16 +00:00
Chandler Carruth 9ae926b973 [IR] Replace `isa<TerminatorInst>` with `isTerminator()`.
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.

All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.

llvm-svn: 340701
2018-08-26 09:51:22 +00:00
Chandler Carruth 698fbe7b59 [IR] Sink `isExceptional` predicate to `Instruction`, rename it to
`isExceptionalTermiantor` and implement it for opcodes as well following
the common pattern in `Instruction`.

Part of removing `TerminatorInst` from the `Instruction` type hierarchy
to make it easier to share logic and interfaces between instructions
that are both terminators and not terminators.

llvm-svn: 340699
2018-08-26 08:56:42 +00:00
Chandler Carruth 96fc1de77d [IR] Begin removal of TerminatorInst by removing successor manipulation.
The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.

The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and updates coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.

Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.

This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html

There will be a series of much more mechanical changes following this
one to complete this move.

Differential Revision: https://reviews.llvm.org/D47467

llvm-svn: 340698
2018-08-26 08:41:15 +00:00
Xinliang David Li bcf726a32d [PGO] add target md5sum in warning message for icall
Differential revision: http://reviews.llvm.org/D51193

llvm-svn: 340657
2018-08-24 21:38:24 +00:00