Commit Graph

20504 Commits

Author SHA1 Message Date
Roman Lebedev 28a42c7706 Revert "[InstCombine] Optimize redundant 'signed truncation check pattern'."
At least one buildbot was able to actually trigger that assert
on the top of the function. Will investigate.

This reverts commit r339610.

llvm-svn: 339612
2018-08-13 20:46:22 +00:00
Roman Lebedev 4c4750771f [InstCombine] Optimize redundant 'signed truncation check pattern'.
Summary:
This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250):
```
signed char test(unsigned int x) { return x; }
```
`clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3`
* Old: {F6904292}
* With this patch: {F6904294}

General pattern:
  X & Y

Where `Y` is checking that all the high bits (covered by a mask `4294967168`)
are uniform, i.e.  `%arg & 4294967168`  can be either  `4294967168`  or  `0`
Pattern can be one of:
  %t = add        i32 %arg,    128
  %r = icmp   ult i32 %t,      256
Or
  %t0 = shl       i32 %arg,    24
  %t1 = ashr      i32 %t0,     24
  %r  = icmp  eq  i32 %t1,     %arg
Or
  %t0 = trunc     i32 %arg  to i8
  %t1 = sext      i8  %t0   to i32
  %r  = icmp  eq  i32 %t1,     %arg
This pattern is a signed truncation check.

And `X` is checking that some bit in that same mask is zero.
I.e. can be one of:
  %r = icmp sgt i32   %arg,    -1
Or
  %t = and      i32   %arg,    2147483648
  %r = icmp eq  i32   %t,      0

Since we are checking that all the bits in that mask are the same,
and a particular bit is zero, what we are really checking is that all the
masked bits are zero.
So this should be transformed to:
  %r = icmp ult i32 %arg, 128

https://rise4fun.com/Alive/3Ou

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: RKSimon, erichkeane, vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50465

llvm-svn: 339610
2018-08-13 20:33:08 +00:00
Sanjay Patel 66c6fe6534 revert r339608 - [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds
Can't set the builder flags without knowing this is an FPMathOperator. I'll add a test
for that and try again.

llvm-svn: 339609
2018-08-13 20:20:38 +00:00
Sanjay Patel 981f50919e [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds
llvm-svn: 339608
2018-08-13 20:14:27 +00:00
Sanjay Patel e45a83d447 [SimplifyLibCalls] add reflection fold for -sin(-x) (PR38458)
This is a very partial fix for the reported problem. I suspect
we do not get this fold in most motivating cases because most of
the time, the libcall would have been replaced by an intrinsic,
and that optimization is handled elsewhere...but maybe it should
be handled here?

llvm-svn: 339604
2018-08-13 19:24:41 +00:00
Sanjay Patel ce4ddbe960 [SimplifyLibCalls] reduce code for optimizeCos; NFCI
llvm-svn: 339588
2018-08-13 17:40:49 +00:00
Simon Pilgrim 82edf8d329 [InstCombine] Limit simplifyAllocaArraySize constant folding to values that fit into a uint64_t
Fixes OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5223

llvm-svn: 339584
2018-08-13 16:50:20 +00:00
Evandro Menezes 5ecd6c1a46 [SLC] Expand simplification of pow() for vector types
Also consider vector constants when simplifying `pow()`.

Differential revision: https://reviews.llvm.org/D50035

llvm-svn: 339578
2018-08-13 16:12:37 +00:00
Max Kazantsev 5c490b49c3 [GuardWidening] Widen very likely non-taken br instructions
This is a second part of D49974 that handles widening of conditional branches that
have very likely `false` branch.

Differential Revision: https://reviews.llvm.org/D50040
Reviewed By: reames

llvm-svn: 339537
2018-08-13 07:58:19 +00:00
Craig Topper 8caccc32b5 [InstCombine] Fix typo in comment. NFC
llvm-svn: 339532
2018-08-13 00:54:23 +00:00
Craig Topper 8bb49218bc [InstCombine] Replace call to haveNoCommonBitsSet in visitXor with just the special case that doesn't use computeKnownBits.
Summary: computeKnownBits is expensive. The cases that would be detected by the computeKnownBits portion of haveNoCommonBitsSet were already handled by the earlier call to SimplifyDemandedInstructionBits.

Reviewers: spatel, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50604

llvm-svn: 339531
2018-08-13 00:38:27 +00:00
David Bolvansky 01d98cc03f [InstCombine] Fold Select with binary op - non-commutative opcodes
Summary:
Basic version was merged - https://reviews.llvm.org/D49954

This adds support for FP & non-commutative opcodes

Precommited tests: https://reviews.llvm.org/rL338727

Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: jfb

Differential Revision: https://reviews.llvm.org/D50190

llvm-svn: 339520
2018-08-12 17:30:07 +00:00
Sanjay Patel dc185ee275 [InstCombine] fix/enhance fadd/fsub factorization
(X * Z) + (Y * Z) --> (X + Y) * Z
  (X * Z) - (Y * Z) --> (X - Y) * Z
  (X / Z) + (Y / Z) --> (X + Y) / Z
  (X / Z) - (Y / Z) --> (X - Y) / Z

The existing code that implemented these folds failed to 
optimize vectors, and it transformed code with multiple 
uses when it should not have.

llvm-svn: 339519
2018-08-12 15:48:26 +00:00
David Green f7111d1ece [UnJ] Improve explicit loop count checks
Try to improve the computed counts when it has been explicitly set by a pragma
or command line option. This moves the code around, so that first call to
computeUnrollCount to get a sensible count and override that if explicit unroll
and jam counts are specified.

Also added some extra debug messages for when unroll and jamming is disabled.

Differential Revision: https://reviews.llvm.org/D50075

llvm-svn: 339501
2018-08-11 07:37:31 +00:00
David Green 395b80cd3c [UnJ] Create a hasInvariantIterationCount function. NFC
Pulled out a separate function for some code that calculates
if an inner loop iteration count is invariant to it's outer
loop.

Differential Revision: https://reviews.llvm.org/D50063

llvm-svn: 339500
2018-08-11 06:57:28 +00:00
JF Bastien fe258d9776 Re-commit "[NFC] More ConstantMerge refactoring"
My previous change moved some code upwards which caused an assert in debug mode
because the global value didn't necessarily have an initializer. Don't do that.

llvm-svn: 339485
2018-08-10 22:41:09 +00:00
Philip Reames 85afd1a9a0 [LICM] Hoist assumes out of loops
If we have an assume which is known to execute and whose operand is invariant, we can lift that into the pre-header. So long as we don't change which paths the assume executes on, this is a legal transformation. It's likely to be a useful canonicalization as other transforms only look for dominating assumes.

Differential Revision: https://reviews.llvm.org/D50364

llvm-svn: 339481
2018-08-10 22:21:56 +00:00
JF Bastien b99f131ffd Revert "[NFC] More ConstantMerge refactoring"
Sanitizers seem unhappy.

llvm-svn: 339480
2018-08-10 22:10:20 +00:00
JF Bastien 62fb8ea4e0 [NFC] More ConstantMerge refactoring
This makes my upcoming patch much easier to read.

llvm-svn: 339478
2018-08-10 21:58:00 +00:00
Sanjay Patel 85e17bb195 [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI
This is a retry of rL339439 with a fix for the problem that
caused the original commit to be reverted at rL339446. 

That problem was that the compare can be integer while
the binop is FP or vice-versa, so we need to use the binop 
type when we ask for the identity constant.

A test to guard against the problem was added at rL339453.

llvm-svn: 339469
2018-08-10 20:30:35 +00:00
Matt Arsenault d35f46caf1 AMDGPU: Turn class x, p_zero|n_zero into fcmp oeq x, 0
The library does use this for some reason.

llvm-svn: 339461
2018-08-10 18:58:49 +00:00
Evgeniy Stepanov 453e7ac785 [hwasan] Add -hwasan-with-ifunc flag.
Summary: Similar to asan's flag, it can be used to disable the use of ifunc to access hwasan shadow address.

Reviewers: vitalybuka, kcc

Subscribers: srhines, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50544

llvm-svn: 339447
2018-08-10 16:21:37 +00:00
Sanjay Patel c9cc86a5b3 [InstCombine] revert r339439 - rearrange code for foldSelectBinOpIdentity
That was supposed to be NFC, but it exposed a logic hole somewhere that
caused bots to fail.

llvm-svn: 339446
2018-08-10 16:12:19 +00:00
Sanjay Patel 3b92a17526 [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI
This should make it easier to folow and to add the planned enhancements
such as D50190.

llvm-svn: 339439
2018-08-10 15:11:26 +00:00
Alexander Potapenko 75a954330b [MSan] Shrink the register save area for non-SSE builds
If code is compiled for X86 without SSE support, the register save area
doesn't contain FPU registers, so `AMD64FpEndOffset` should be equal to
`AMD64GpEndOffset`.

llvm-svn: 339414
2018-08-10 08:06:43 +00:00
David Bolvansky 909889b2cb [InstCombine] Transform str(n)cmp to memcmp
Summary:
Motivation examples:
int strcmp_memcmp() {
    char buf[12];
    return strcmp(buf, "key") == 0;
}

int strcmp_memcmp2() {
    char buf[12];
    return strcmp(buf, "key") != 0;
}

int strncmp_memcmp() {
    char buf[12];
    return strncmp(buf, "key", 3) == 0;
}

can be turned to memcmp.

See test file for more cases.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: spatel, llvm-commits

Differential Revision: https://reviews.llvm.org/D50233

llvm-svn: 339410
2018-08-10 04:32:54 +00:00
Matt Arsenault d54b7f0592 ValueTracking: Start enhancing isKnownNeverNaN
llvm-svn: 339399
2018-08-09 22:40:08 +00:00
Sanjay Patel c6944f795d [InstSimplify] move minnum/maxnum with Inf folds from instcombine
llvm-svn: 339396
2018-08-09 22:20:44 +00:00
JF Bastien 42ca9ccb70 [NFC] ConstantMerge: factor out some functions
This makes the code easier to read and will make an upcoming patch I have easier to review because that patch needed this refactoring to reuse some of the functions.

llvm-svn: 339391
2018-08-09 21:56:09 +00:00
JF Bastien ebcaa31768 ConstantMerge: update MadeChange when change is made
It was always false, which is obviously wrong.

llvm-svn: 339390
2018-08-09 21:36:57 +00:00
Philip Reames 7d79433136 [LICM] Suppress a compiler warning noticed by one of the bots
llvm-svn: 339388
2018-08-09 21:15:33 +00:00
Philip Reames ca256d93fb [LICM] hoist fences out of loops w/o memory operations
The motivating case is an otherwise dead loop with a fence in it. At the moment, this goes all the way through the optimizer and we end up emitting an entirely pointless loop on x86. This case may seem a bit contrived, but we've seen it in real code as the result of otherwise reasonable lowering strategies combined w/thread local memory optimizations (such as escape analysis).

To handle this simple case, we can teach LICM to hoist must execute fences when there is no other memory operation within the loop.

Differential Revision: https://reviews.llvm.org/D50489

llvm-svn: 339378
2018-08-09 20:18:42 +00:00
Sanjay Patel 55accd7dd3 [InstCombine] allow fsub+fmul FMF folds for vectors
llvm-svn: 339368
2018-08-09 18:42:12 +00:00
Alina Sbirlea bf9fe79397 SCEV should forget all loops containing a deleted block.
Summary:
LoopSimplifyCFG should update ScEv for all loops after a block is deleted.
If the deleted block "Succ" is part of L, then it is part of all parent loops, so forget topmost loop.

Reviewers: greened, mkazantsev, sanjoy

Subscribers: jlebar, javed.absar, uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D50422

llvm-svn: 339363
2018-08-09 17:53:26 +00:00
Reid Kleckner 80c6ec11d9 [GlobalOpt] Don't apply fastcc if it would break inalloca invariants
The inalloca parameter has to be the only parameter passed in memory.
Changing the convention to fastcc can break that.

At some point we should teach global opt how to optimize ABI attributes
like inalloca and maybe byval. These attributes are mainly used to match
C ABIs. They are harder for LLVM to optimize and they don't always
generate the best code.

Fixes PR38487

llvm-svn: 339360
2018-08-09 17:29:26 +00:00
Sanjay Patel ebec4204da [InstCombine] reduce code duplication; NFC
llvm-svn: 339349
2018-08-09 15:07:13 +00:00
JF Bastien 3f270336e1 [NFC] ConstantMerge: don't insert when find should be used
Summary: DenseMap's operator[] performs an insertion if the entry isn't found. The second phase of ConstantMerge isn't trying to insert anything: it's just looking to see if the first phased performed an insertion. Use find instead, avoiding insertion of every single global initializer in the map of constants. This has the side-effect of making all entries in CMap non-null (because only global declarations would have null initializers, and that would be a bug).

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D50476

llvm-svn: 339309
2018-08-09 04:17:48 +00:00
Philip Reames 22b20a09a0 [LICM] Add an assert to ensure all instruction types needing aliasing are handled [NFC]
llvm-svn: 339308
2018-08-09 03:44:28 +00:00
Sanjay Patel fe839695a8 [InstCombine] fold fadd+fsub with common operand
This is a sibling to the simplify from:
https://reviews.llvm.org/rL339174

llvm-svn: 339267
2018-08-08 16:19:22 +00:00
Sanjay Patel 2054dd79c2 [InstCombine] fold fsub+fsub with common operand
This is a sibling to the simplify from:
rL339171

llvm-svn: 339266
2018-08-08 16:04:48 +00:00
Sanjay Patel a194b2d2ff [InstCombine] fold fneg into constant operand of fmul/fdiv
This accounts for the missing IR fold noted in D50195. We don't need any fast-math to enable the negation transform. 
FP negation can always be folded into an fmul/fdiv constant to eliminate the fneg.

I've limited this to one-use to ensure that we are eliminating an instruction rather than replacing fneg by a 
potentially expensive fdiv or fmul.

Differential Revision: https://reviews.llvm.org/D50417

llvm-svn: 339248
2018-08-08 14:29:08 +00:00
Roman Lebedev a677651a5a [InstCombine] De Morgan: sink 'not' into 'xor' (PR38446)
Summary:
https://rise4fun.com/Alive/IT3

Comes up in the [most ugliest]  `signed int` -> `signed char`  case of
`-fsanitize=implicit-conversion` (https://reviews.llvm.org/D50250)
Previously, we were stuck with `not`: {F6867736}
But now we are able to completely get rid of it: {F6867737}
(FIXME: why are we loosing the metadata? that seems wrong/strange.)

Here, we only want to do that it we will be able to completely
get rid of that 'not'.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: vsk, erichkeane, llvm-commits

Differential Revision: https://reviews.llvm.org/D50301

llvm-svn: 339243
2018-08-08 13:31:19 +00:00
Anastasis Grammenos 52d5283483 [Local] Add dbg location on unreachable inst in changeToUnreachable
As show in https://bugs.llvm.org/show_bug.cgi?id=37960
it would be desirable to have debug location in the unreachable
instruction.

Also adds a unti test for this function.

Differential Revision: https://reviews.llvm.org/D50340

llvm-svn: 339173
2018-08-07 20:21:56 +00:00
Alexey Bataev 0edcd0278d [SLP] Fix insert point for reused extract instructions.
Summary:
Reworked the previously committed patch to insert shuffles for reused
extract element instructions in the correct position. Previous logic was
incorrect, and might lead to the crash with PHIs and EH instructions.

Reviewers: efriedma, javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50143

llvm-svn: 339166
2018-08-07 19:21:05 +00:00
Florian Hahn 950576bdf8 [GVN,NewGVN] Keep nonnull if K does not move.
In combineMetadata, we should be able to preserve K's nonnull metadata,
if K does not move. This condition should hold for all replacements by
NewGVN/GVN, but I added a bunch of assertions to verify that.

Fixes PR35038.

There probably are additional kinds of metadata that could be preserved
using similar reasoning. This is follow-up work.

Reviewers: dberlin, davide, efriedma, nlopes

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D47339

llvm-svn: 339149
2018-08-07 15:36:11 +00:00
Sanjay Patel 948ff87d7d [InstSimplify] move minnum/maxnum with common op fold from instcombine
llvm-svn: 339144
2018-08-07 14:36:27 +00:00
Florian Hahn 39bbe179aa [GVN,NewGVN] Move patchReplacementInstruction to Utils/Local.h
This function is shared between both implementations. I am not sure if
Utils/Local.h is the best place though.

Reviewers: davide, dberlin, efriedma, xbolva00

Reviewed By: efriedma, xbolva00

Differential Revision: https://reviews.llvm.org/D47337

llvm-svn: 339138
2018-08-07 13:27:33 +00:00
Max Kazantsev 640cb00365 [NFC] Factor out implicit control flow logic from GVN
Logic for tracking implicit control flow instructions was added to GVN to
perform PRE optimizations correctly. It appears that GVN is not the only
optimization that sometimes does PRE, so this logic is required in other
places (such as Jump Threading).

This is an NFC patch that encapsulates all ICF-related logic in a dedicated
utility class separated from GVN.

Differential Revision: https://reviews.llvm.org/D40293

llvm-svn: 339086
2018-08-07 01:47:20 +00:00
Philip Reames 3b35aaacb6 [LICM] Extract a helper function for readability [NFC]
llvm-svn: 339069
2018-08-06 22:07:37 +00:00
Evandro Menezes 6e137cb9f0 [SLC] Fix shrinking of pow()
Properly shrink `pow()` to `powf()` as a binary function and, when no other
simplification applies, do not discard it.

Differential revision: https://reviews.llvm.org/D50113

llvm-svn: 339046
2018-08-06 19:40:17 +00:00
David Bolvansky 1e51e6896f [NFC] Fixed unused function warnings
llvm-svn: 339021
2018-08-06 15:09:15 +00:00
David Bolvansky 3d2653bd39 Revert unused function fix
llvm-svn: 339020
2018-08-06 15:05:51 +00:00
David Bolvansky 6bca938bf0 [NFC] Fixed unused function warning
llvm-svn: 339019
2018-08-06 14:42:07 +00:00
Max Kazantsev 778f62bb46 Try to fix buildbot
llvm-svn: 338991
2018-08-06 06:35:21 +00:00
Max Kazantsev eded4abef8 [GuardWidening] Widen guards with conditions of frequently taken dominated branches
If there is a frequently taken branch dominated by a guard, and its condition is available
at the point of the guard, we can widen guard with condition of this branch and convert
the branch into unconditional:

  guard(cond1)
  if (cond2) {
    // taken in 99.9% cases
    // do something
  } else {
    // do something else    
  }

Converts to

  guard(cond1 && cond2)
  // do something

Differential Revision: https://reviews.llvm.org/D49974
Reviewed By: reames

llvm-svn: 338988
2018-08-06 05:49:19 +00:00
David Bolvansky 1a56ac790a [NFC] Fixed unused function warning
llvm-svn: 338986
2018-08-06 04:45:46 +00:00
Hsiangkai Wang ef72e481ea [DebugInfo] Refactor DbgInfoIntrinsic class hierarchy.
In the past, DbgInfoIntrinsic has a strong assumption that these
intrinsics all have variables and expressions attached to them.
However, it is too strong to derive the class for other debug entities.
Now, it has problems for debug labels.

In order to make DbgInfoIntrinsic as a base class for 'debug info', I
create a class for 'variable debug info', DbgVariableIntrinsic.

DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it.

Differential Revision: https://reviews.llvm.org/D50220

llvm-svn: 338984
2018-08-06 03:59:47 +00:00
David Bolvansky c0aa4b75a4 Enrich inline messages
Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.

Patch by: yrouban (Yevgeny Rouban)


Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00

Reviewed By: tejohnson, xbolva00

Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith

Differential Revision: https://reviews.llvm.org/D49412

llvm-svn: 338969
2018-08-05 14:53:08 +00:00
Chijun Sima 8b5de48d62 [TailCallElim] Preserve DT and PDT
Summary:
Previously, in the NewPM pipeline, TailCallElim recalculates the DomTree when it modifies any instruction in the Function.
For example,
```
CallInst *CI = dyn_cast<CallInst>(&I);
...
CI->setTailCall();
Modified = true;
...
if (!Modified || ...)
  return PreservedAnalyses::all();
```
After applying this patch, the DomTree only recalculates if needed (plus an extra insertEdge() + an extra deleteEdge() call).

When optimizing SQLite with `-passes="default<O3>"` pipeline of the newPM, the number of DomTree recalculation decreases by 6.2%, the number of nodes visited by DFS decreases by 2.9%. The time used by DomTree will decrease approximately 1%~2.5% after applying the patch.
 
Statistics:
```
Before the patch:
 23010 dom-tree-stats               - Number of DomTree recalculations
489264 dom-tree-stats               - Number of nodes visited by DFS -- DomTree
After the patch:
 21581 dom-tree-stats               - Number of DomTree recalculations
475088 dom-tree-stats               - Number of nodes visited by DFS -- DomTree
```

Reviewers: kuhar, dmgreen, brzycki, grosser, davide

Reviewed By: kuhar, brzycki

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49982

llvm-svn: 338954
2018-08-04 08:13:47 +00:00
Chijun Sima eacad79777 [ADCE] Remove the need of DomTree
Summary: ADCE doesn't need to query domtree.

Reviewers: kuhar, brzycki, dmgreen, davide, grosser

Reviewed By: kuhar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49988

llvm-svn: 338950
2018-08-04 02:50:12 +00:00
Anastasis Grammenos 4dfe279e00 [TRE][DebugInfo] Preserve Debug Location in new branch instruction
There are two branch instructions created
so the new test covers them both.

Differential Revision: https://reviews.llvm.org/D50263

llvm-svn: 338917
2018-08-03 20:27:13 +00:00
Evandro Menezes 5aa217ac68 [SLC] Refactor shrinking of functions (NFC)
Merge the helper functions for shrinking unary and binary functions into a
single one, while keeping all their functionality.  Otherwise, NFC.

llvm-svn: 338905
2018-08-03 17:50:16 +00:00
Joel Galenson cfe5bc158d Fix crash in bounds checking.
In r337830 I added SCEV checks to enable us to insert fewer bounds checks.  Unfortunately, this sometimes crashes when multiple bounds checks are added due to SCEV caching issues.  This patch splits the bounds checking pass into two phases, one that computes all the conditions (using SCEV checks) and the other that adds the new instructions.

Differential Revision: https://reviews.llvm.org/D49946

llvm-svn: 338902
2018-08-03 17:12:23 +00:00
Graham Yiu 58dbc00559 [Partial Inlining] Fix small bug in detecting if we did something
- It's possible for 'Changed' to return as false even if we did
  partial inline something.  Fixed to accumulate return values

llvm-svn: 338896
2018-08-03 14:42:53 +00:00
Chijun Sima 530484372b [Dominators] Make RemoveUnreachableBlocks return false if the BasicBlock is already awaiting deletion
Summary:
Previously, `removeUnreachableBlocks` still returns true (which indicates the CFG is changed) even when all the unreachable blocks found is awaiting deletion in the DDT class.
This makes code pattern like
```
// Code modified from lib/Transforms/Scalar/SimplifyCFGPass.cpp 
bool EverChanged = removeUnreachableBlocks(F, nullptr, DDT);
...
do {
    EverChanged = someMightHappenModifications();
    EverChanged |= removeUnreachableBlocks(F, nullptr, DDT);
  } while (EverChanged);
```
become a dead loop.
Fix this by detecting whether a BasicBlock is already awaiting deletion.

Reviewers: kuhar, brzycki, dmgreen, grosser, davide

Reviewed By: kuhar, brzycki

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49738

llvm-svn: 338882
2018-08-03 12:45:29 +00:00
Max Kazantsev dcf6706e52 [NFC] Add missing comment
llvm-svn: 338848
2018-08-03 10:41:51 +00:00
Max Kazantsev 65cd4836d2 [NFC] Move some methods into static functions
llvm-svn: 338843
2018-08-03 10:16:40 +00:00
Chijun Sima 21a8b605a1 [Dominators] Convert existing passes and utils to use the DomTreeUpdater class
Summary:
This patch is the second in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].

It converts passes (e.g. adce/jump-threading) and various functions which currently accept DDT in local.cpp and BasicBlockUtils.cpp to use the new DomTreeUpdater class.
These converted functions in utils can accept DomTreeUpdater with either UpdateStrategy and can deal with both DT and PDT held by the DomTreeUpdater.

Reviewers: brzycki, kuhar, dmgreen, grosser, davide

Reviewed By: brzycki

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48967

llvm-svn: 338814
2018-08-03 05:08:17 +00:00
Philip Reames 5937368d4f [LICM] Remove unneccessary safety check to increase sinking effectiveness
This one requires a bit of explaination.  It's not every day you simply delete code to implement an optimization.  :)

The transform in question is sinking an instruction from a loop to the uses in loop exiting blocks.  We know (from LCSSA) that all of the uses outside the loop must be phi nodes, and after predecessor splitting, we know all phi users must have a single operand.  Since the use must be strictly dominated by the def, we know from the definition of dominance/ssa that the exit block must execute along a (non-strict) subset of paths which reach the def.  As a result, duplicating a potentially faulting instruction can not *introduce* a fault that didn't previously exist in the program.  

The full story is that this patch builds on "rL338671: [LICM] Factor out fault legality from canHoistOrSinkInst [NFC]" which pulled this logic out of a common helper routine.  As best I can tell, this check was originally added to the helper function for hoisting legality, later an incorrect fastpath for loads/calls was added, and then the bug was fixed by duplicating the fault safety check in the hoist path.  This left the redundant check in the common code to pessimize sinking for no reason.  I split it out in an NFC, and am not removing the unneccessary check.  I wanted there to be something easy to revert in case I missed something.

Reviewed by: Anna Thomas (in person)

llvm-svn: 338794
2018-08-03 00:21:56 +00:00
Evandro Menezes 84e74362c1 [SLC] Refactor simplification of pow() (NFC)
llvm-svn: 338730
2018-08-02 15:43:57 +00:00
Sanjay Patel 3f6e9a71f7 [InstSimplify] move minnum/maxnum with undef fold from instcombine
llvm-svn: 338719
2018-08-02 14:33:40 +00:00
David Green bc2e1c3a90 [UnJ] Add debug messages for why loops are not unrolled. NFC
Adds some cleaned up debug messages from back when I was writing this.
Hopefully useful to others (and myself) as to why unroll and jam is not
transforming as expected.

Differential Revision: https://reviews.llvm.org/D50062

llvm-svn: 338676
2018-08-02 07:30:53 +00:00
Philip Reames 32cb80b9d3 [LICM] Factor out fault legality from canHoistOrSinkInst [NFC]
This method has three callers, each of which wanted distinct handling:
1) Sinking into a loop is moving an instruction known to execute before a loop into the loop.  We don't need to worry about introducing a fault at all in this case.
2) Hoisting from a loop into a preheader already duplicated the check in the caller.
3) Sinking from the loop into an exit block was the only true user of the code within the routine.  For the moment, this has just been lifted into the caller, but up next is examining the logic more carefully.  Whitelisting of loads and calls - while consistent with the previous code - is rather suspicious.  Either way, a behavior change is worthy of it's own patch.  

llvm-svn: 338671
2018-08-02 04:08:04 +00:00
Philip Reames 09de470e9e [LICM] hoisting/sinking legality - bail early for unsupported instructions
Originally, this was part of a larger refactoring I'd planned, but had to abandoned.  I figured the minor improvement in readability was worthwhile.

llvm-svn: 338663
2018-08-02 00:54:14 +00:00
George Burgess IV 213d1d23ef Reland r338431: "Add DebugCounters to DivRemPairs"
(Previously reverted in r338442)

I'm told that the breakage came from us using an x86 triple on configs
that didn't have x86 enabled. This is remedied by moving the
debugcounter test to an x86 directory (where there's also a
opt-bisect-isel.ll test for similar reasons).

I can't repro the reverse-iteration failure mentioned in the revert with
this patch, so I assume that a misconfiguration on my end is what caused
that.

Original commit message:

    Add DebugCounters to DivRemPairs

    For people who don't use DebugCounters, NFCI.

    Patch by Zhizhou Yang!

    Differential Revision: https://reviews.llvm.org/D50033

llvm-svn: 338653
2018-08-01 23:14:14 +00:00
Sanjay Patel 28c7e41c09 [InstSimplify] move minnum/maxnum with same arg fold from instcombine
llvm-svn: 338652
2018-08-01 23:05:55 +00:00
John Baldwin c5d7e04052 [ASAN] Use the correct shadow offset for ASAN on FreeBSD/mips64.
Reviewed By: atanasyan

Differential Revision: https://reviews.llvm.org/D49939

llvm-svn: 338650
2018-08-01 22:51:13 +00:00
Johannes Doerfert bed4babc56 [NFC][FunctionAttrs] Remove duplication in old/new PM pipeline
This patch just extract code into a separate function to remove some
duplication between the old and new pass manager pipeline. Due to the
different CGSCC iterators used, not all code duplication was eliminated.

llvm-svn: 338585
2018-08-01 16:37:51 +00:00
David Bolvansky fbbb83c782 Revert "Enrich inline messages", tests fail
llvm-svn: 338496
2018-08-01 08:02:40 +00:00
David Bolvansky 7f36cd9d96 Enrich inline messages
Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.

Patch by: yrouban (Yevgeny Rouban)


Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00

Reviewed By: tejohnson, xbolva00

Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith

Differential Revision: https://reviews.llvm.org/D49412

llvm-svn: 338494
2018-08-01 07:37:16 +00:00
Evandro Menezes 61e4e40750 [SLC] Refactor the simplication of pow() (NFC)
Reword comments and minor code reformatting.

llvm-svn: 338446
2018-07-31 22:11:02 +00:00
George Burgess IV 497e8fad51 Revert r338431: "Add DebugCounters to DivRemPairs"
This reverts r338431; the test it added is making buildbots unhappy.
Locally, I can repro the failure on reverse-iteration builds.

llvm-svn: 338442
2018-07-31 21:18:44 +00:00
George Burgess IV 907f4f6a74 Add DebugCounters to DivRemPairs
For people who don't use DebugCounters, NFCI.

Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50033

llvm-svn: 338431
2018-07-31 20:07:46 +00:00
Ewan Crawford d83beb804c Fix InstCombine address space assert
Workaround bug where the InstCombine pass was asserting on the IR added in lit
test, where we have a bitcast instruction after a GEP from an addrspace cast.

The second bitcast in the test was getting combined into
`bitcast <16 x i32>* %0 to <16 x i32> addrspace(3)*`, which looks like it should
be an addrspace cast instruction instead. Otherwise if control flow is allowed
to continue as it is now we create a GEP instruction
`<badref> = getelementptr inbounds <16 x i32>, <16 x i32>* %0, i32 0`. However
because the type of this instruction doesn't match the address space we hit an
assert when replacing the bitcast with that GEP.

```
void llvm::Value::doRAUW(llvm::Value*, bool): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.
```

Differential Revision: https://reviews.llvm.org/D50058

llvm-svn: 338395
2018-07-31 15:53:03 +00:00
Anastasis Grammenos ac3f8028da [DebugInfo][LCSSA] Preserve debug location in lcssa phis
Summary:
When inserting lcssa Phi Nodes in the exit block
mak sure to preserve the original instructions DL.

Reviewers: vsk

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D50009

llvm-svn: 338391
2018-07-31 14:54:52 +00:00
David Bolvansky ab79414f7b Revert Enrich inline messages
llvm-svn: 338389
2018-07-31 14:47:22 +00:00
David Bolvansky b562dbabda Enrich inline messages
Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.

Patch by: yrouban (Yevgeny Rouban)


Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00

Reviewed By: tejohnson, xbolva00

Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith

Differential Revision: https://reviews.llvm.org/D49412

llvm-svn: 338387
2018-07-31 14:25:24 +00:00
Alexey Bataev c0c3a6ed5e [SLP] Fix PR38339: Instruction does not dominate all uses!
Summary:
If the ExtractElement instructions can be optimized out during the
vectorization and we need to reshuffle the parent vector, this
ShuffleInstruction may be inserted in the wrong place causing compiler
to produce incorrect code.

Reviewers: spatel, RKSimon, mkuper, hfinkel, javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49928

llvm-svn: 338380
2018-07-31 14:02:43 +00:00
Sanjay Patel 9a801cb598 [InstCombine] simplify code for A & (A ^ B) --> A & ~B
This fold was written in an odd way and tried to avoid
an endless loop by bailing out on all constants instead
of the supposedly problematic case of -1. But (X & -1) 
should always be simplified before we reach here, so I'm
not sure how that is a problem.

There were no tests for the commuted patterns, so I added
those at rL338364.

llvm-svn: 338367
2018-07-31 13:00:03 +00:00
Max Kazantsev eb8e9c0940 [NFC] Collect statistics in GuardWidening
llvm-svn: 338348
2018-07-31 04:37:11 +00:00
Diego Caballero 3587150fcb [VPlan] Introduce VPLoopInfo analysis.
The patch introduces loop analysis (VPLoopInfo/VPLoop) for VPBlockBases.
This analysis will be necessary to perform some H-CFG transformations and
detect and introduce regions representing a loop in the H-CFG.

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn 

Differential Revision: https://reviews.llvm.org/D48816

llvm-svn: 338346
2018-07-31 01:57:29 +00:00
Diego Caballero 2a34ac86d3 [VPlan] Introduce VPlan-based dominator analysis.
The patch introduces dominator analysis for VPBlockBases and extend
VPlan's GraphTraits specialization with the required interfaces. Dominator
analysis will be necessary to perform some H-CFG transformations and
to introduce VPLoopInfo (LoopInfo analysis on top of the VPlan representation).

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48815

llvm-svn: 338310
2018-07-30 21:33:31 +00:00
David Bolvansky 6737b3a6a1 [InstCombine] Fold Select with binary op
Summary:
Fold
  %A = icmp eq i8 %x, 0
  %B = xor i8 %x, %z
  %C = select i1 %A, i8 %B, i8 %y
To
  %C = select i1 %A, i8 %z, i8 %y

Fixes https://bugs.llvm.org/show_bug.cgi?id=38345
Proof: https://rise4fun.com/Alive/43J

Reviewers: lebedev.ri, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49954

llvm-svn: 338300
2018-07-30 20:38:53 +00:00
Vlad Tsyrklevich 1c7160e85f Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r338240 because it was causing OOMs on the UBSan
buildbot when building clang/lib/Sema/SemaChecking.cpp

llvm-svn: 338297
2018-07-30 20:07:33 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Evandro Menezes a7d48286fb [SLC] Refactor the simplication of pow() (NFC)
Use more meaningful variable names.  Mostly NFC.

llvm-svn: 338266
2018-07-30 16:20:04 +00:00
Alexandros Lamprineas de3ca964c1 [GVNHoist] Re-enable GVNHoist by default
My initial motivation for this came from https://reviews.llvm.org/D48122,
where it was pointed out that my change didn't fit well in SimplifyCFG and
therefore using GVNHoist was a better way to go. GVNHoist has been disabled
for a while as there was a list of bugs related to it.

I have fixed the following bugs:

https://bugs.llvm.org/show_bug.cgi?id=37808 -> https://reviews.llvm.org/D48372 (rL337149)
https://bugs.llvm.org/show_bug.cgi?id=36787 -> https://reviews.llvm.org/D49555 (rL337674)
https://bugs.llvm.org/show_bug.cgi?id=37445 -> https://reviews.llvm.org/D49425 (rL337680)

The next two bugs no longer occur, and it's unclear which commit fixed them:

https://bugs.llvm.org/show_bug.cgi?id=36635
https://bugs.llvm.org/show_bug.cgi?id=37791

I investigated this one and proved to be unrelated to GVNHoist, but a genuine bug in NewGvn:

https://bugs.llvm.org/show_bug.cgi?id=37660

To convince myself GVNHoist is in a good state I made a successful bootstrap build of LLVM.
Merging this change now in order to make it to the LLVM 7.0.0 branch.

Differential Revision: https://reviews.llvm.org/D49858

llvm-svn: 338240
2018-07-30 10:50:18 +00:00
Max Kazantsev 3327bcaeb1 [NFC] Prepare GuardWidening for widening of cond branches
llvm-svn: 338229
2018-07-30 07:07:32 +00:00
Sanjay Patel 577c705752 [InstCombine] try to fold 'add+sub' to 'not+add'
These are reassociated versions of the same pattern and
similar transforms as in rL338200 and rL338118.

The motivation is identical to those commits:
Patterns with add/sub combos can be improved using
'not' ops. This is better for analysis and may lead
to follow-on transforms because 'xor' and 'add' are
commutative/associative. It can also help codegen.

llvm-svn: 338221
2018-07-29 18:13:16 +00:00
Sanjay Patel 818b253d3a [InstCombine] try to fold 'sub' to 'not'
https://rise4fun.com/Alive/jDd

Patterns with add/sub combos can be improved using
'not' ops. This is better for analysis and may lead
to follow-on transforms because 'xor' and 'add' are 
commutative/associative. It can also help codegen.  

llvm-svn: 338200
2018-07-28 16:48:44 +00:00
David Green fc4b0fe0a2 [GlobalOpt] Test array indices inside structs for out-of-bounds accesses
We now, from clang, can turn arrays of
  static short g_data[] = {16, 16, 16, 16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 0, 0};
into structs of the form
  @g_data = internal global <{ [8 x i16], [8 x i16] }> ...

GlobalOpt will incorrectly SROA it, not realising that the access to the first
element may overflow into the second. This fixes it by checking geps more
thoroughly.

I believe this makes the globalsra-partial.ll test case invalid as the %i value
could be out of bounds. I've re-purposed it as a negative test for this case.

Differential Revision: https://reviews.llvm.org/D49816

llvm-svn: 338192
2018-07-28 08:20:10 +00:00
Alina Sbirlea 5666c7e4bd [SimpleLoopUnswitch] Fix DT updates for trivial branch unswitching.
Summary:
Fixing 2 issues with the DT update in trivial branch switching, though I don't have a case where DT update fails.
1. After splitting ParentBB->UnswitchedBB edge, new edges become: ParentBB->LoopExitBB->UnswitchedBB, so remove ParentBB->LoopExitBB edge.
2. AFAIU, for multiple CFG changes, DT should be updated using batch updates, vs consecutive addEdge and removeEdge calls.

Reviewers: chandlerc, kuhar

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49925

llvm-svn: 338180
2018-07-28 00:01:05 +00:00
Reid Kleckner ba82788ff6 [InstrProf] Don't register __llvm_profile_runtime_user
Refactor some FileCheck prefixes while I'm at it.

Fixes PR38340

llvm-svn: 338172
2018-07-27 22:21:35 +00:00
Sanjay Patel 78e4b4d3c4 [InstCombine] not(sub X, Y) --> add (not X), Y
The tests with constants show a missing optimization.
Analysis for adds is better than subs, so this can also
help with other transforms. And codegen is better with 
adds for targets like x86 (destructive ops, no sub-from).

https://rise4fun.com/Alive/llK

llvm-svn: 338118
2018-07-27 10:54:48 +00:00
Max Kazantsev 4d980515d2 [SimplifyIndVar] Canonicalize comparisons to unsigned while eliminating truncs
This is a follow-up for the patch rL335020. When we replace compares against
trunc with compares against wide IV, we can also replace signed predicates with
unsigned where it is legal.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D48763

llvm-svn: 338115
2018-07-27 09:43:39 +00:00
Matt Arsenault d149650760 PatternMatch: Add wrappers for fabs and canonicalize
llvm-svn: 338111
2018-07-27 09:04:35 +00:00
Anastasis Grammenos f6e143e67f Revert "[LV][DebugInfo] Set DL to the middle block Icmp instruction"
This reverts commit r338106.

llvm-svn: 338109
2018-07-27 08:22:54 +00:00
Anastasis Grammenos 03948d0e0f [LV][DebugInfo] Set DL to the middle block Icmp instruction
Reviewers: hsaito

Differential Revision: https://reviews.llvm.org/D49746

llvm-svn: 338106
2018-07-27 07:12:44 +00:00
Chen Zheng 567485a72f [InstCombine] canonicalize abs pattern
Differential Revision: https://reviews.llvm.org/D48754

llvm-svn: 338092
2018-07-27 01:49:51 +00:00
Vedant Kumar b572f64212 [DebugInfo] LowerDbgDeclare: Add derefs when handling CallInst users
LowerDbgDeclare inserts a dbg.value before each use of an address
described by a dbg.declare. When inserting a dbg.value before a CallInst
use, however, it fails to append DW_OP_deref to the DIExpression.

The DW_OP_deref is needed to reflect the fact that a dbg.value describes
a source variable directly (as opposed to a dbg.declare, which relies on
pointer indirection).

This patch adds in the DW_OP_deref where needed. This results in the
correct values being shown during a debug session for a program compiled
with ASan and optimizations (see https://reviews.llvm.org/D49520). Note
that ConvertDebugDeclareToDebugValue is already correct -- no changes
there were needed.

One complication is that SelectionDAG is unable to distinguish between
direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also
fixes this long-standing issue in order to not regress integration tests
relying on the incorrect assumption that all frame-index SDDbgValues are
indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot
be lowered properly otherwise. Basically the fix prevents a direct
SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice
by a debugger. There were a handful of tests relying on this incorrect
"FRAMEIX => indirect" assumption which actually had incorrect
DW_AT_locations: these are all fixed up in this patch.

Testing:

- check-llvm, and an end-to-end test using lldb to debug an optimized
  program.
- Existing unit tests for DIExpression::appendToStack fully cover the
  new DIExpression::append utility.
- check-debuginfo (the debug info integration tests)

Differential Revision: https://reviews.llvm.org/D49454

llvm-svn: 338069
2018-07-26 20:56:53 +00:00
Sanjay Patel 6d6eab66e0 [InstCombine] fold udiv with common factor from muls with nuw
Unfortunately, sdiv isn't as simple because of UB due to overflow.

This fold is mentioned in PR38239:
https://bugs.llvm.org/show_bug.cgi?id=38239

llvm-svn: 338059
2018-07-26 19:22:41 +00:00
David Green eda3c9efa2 [UnJ] Common some code. NFC
Create a processHeaderPhiOperands for analysing the instructions
in the aft blocks that must be moved before the loop.

Differential Revision: https://reviews.llvm.org/D49061

llvm-svn: 338033
2018-07-26 15:19:07 +00:00
Fangrui Song 984a424c8a [LoadStoreVectorizer] Use const reference
llvm-svn: 337992
2018-07-26 01:11:36 +00:00
Roman Tereshin 4f10a9d3a3 [LSV] Look through selects for consecutive addresses
In some cases LSV sees (load/store _ (select _ <pointer expression>
<pointer expression>)) patterns in input IR, often due to sinking and
other forms of CFG simplification, sometimes interspersed with
bitcasts and all-constant-indices GEPs. With this
patch`areConsecutivePointers` method would attempt to handle select
instructions. This leads to an increased number of successful
vectorizations.

Technically, select instructions could appear in index arithmetic as
well, however, we don't see those in our test suites / benchmarks.
Also, there is a lot more freedom in IR shapes computing integral
indices in general than in what's common in pointer computations, and
it appears that it's quite unreliable to do anything short of making
select instructions first class citizens of Scalar Evolution, which
for the purposes of this patch is most definitely an overkill.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D49428

llvm-svn: 337965
2018-07-25 21:33:00 +00:00
Florian Hahn b6613ac665 Revert r337904: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
I suspect it is causing the clang-stage2-Rthinlto failures.

llvm-svn: 337956
2018-07-25 19:44:19 +00:00
Florian Hahn 6f5c6adbcd Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
r337828 resolves a PredicateInfo issue with unnamed types.

Original message:
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

llvm-svn: 337904
2018-07-25 11:13:40 +00:00
Petr Hosek 47e5fcba57 [profile] Support profiling runtime on Fuchsia
This ports the profiling runtime on Fuchsia and enables the
instrumentation. Unlike on other platforms, Fuchsia doesn't use
files to dump the instrumentation data since on Fuchsia, filesystem
may not be accessible to the instrumented process. We instead use
the data sink to pass the profiling data to the system the same
sanitizer runtimes do.

Differential Revision: https://reviews.llvm.org/D47208

llvm-svn: 337881
2018-07-25 03:01:35 +00:00
Hideki Saito ef380b0fc5 [LV] Fix for PR38110, LV encountered llvm_unreachable()
Summary: truncateToMinimalBitWidths() doesn't handle all Instructions and the worst case is compiler crash via llvm_unreachable(). Fix is to add a case to handle PHINode and changed the worst case to NO-OP (from compiler crash).

Reviewers: sbaranga, mssimpso, hsaito

Reviewed By: hsaito

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49461

llvm-svn: 337861
2018-07-24 22:30:31 +00:00
Joel Galenson 8dbcc58917 Use SCEV to avoid inserting some bounds checks.
This patch uses SCEV to avoid inserting some bounds checks when they are not needed.  This slightly improves the performance of code compiled with the bounds check sanitizer.

Differential Revision: https://reviews.llvm.org/D49602

llvm-svn: 337830
2018-07-24 15:21:54 +00:00
Florian Hahn 36d2e25d5a [PredicateInfo] Use custom mangling to support ssa_copy with unnamed types.
This is a workaround and it would be better to fix this generally, but
doing it generally is quite tricky. See D48541 and PR38117.

Doing it in PredicateInfo directly allows us to use the type address to
differentiate different unnamed types, because neither the created
declarations nor the ssa_copy calls should be visible after
PredicateInfo got destroyed.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D49126

llvm-svn: 337828
2018-07-24 14:49:52 +00:00
Teresa Johnson e214fdeb69 [ThinLTO] Ensure the TargetLibraryInfo is constructed early enough
Summary:
Without this change, the WholeProgramDevirt pass, which requires the
TargetLibraryInfo, will construct one from the default triple.

Fixes PR38139.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49278

llvm-svn: 337750
2018-07-23 21:58:19 +00:00
George Burgess IV b00fb46479 [DebugCounters] Keep track of total counts
This patch makes debug counters keep track of the total number of times
we've called `shouldExecute` for each counter, so it's easier to build
automated tooling on top of these.

A patch to print these counts is coming soon.

Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D49560

llvm-svn: 337748
2018-07-23 21:49:36 +00:00
John Brawn fc18a6ad7d [GVN] Don't use the eliminated load as an available value in phi construction
In ConstructSSAForLoadSet if an available value is actually the load that we're
doing SSA construction to eliminate, then we can omit it as SSAUpdate will add
in the value for the phi that will be replacing it anyway. This can result in
simpler IR which can allow further optimisation.

Differential Revision: https://reviews.llvm.org/D44160

llvm-svn: 337686
2018-07-23 12:14:45 +00:00
Alexandros Lamprineas 592cc78dd8 [GVNHoist] safeToHoistLdSt allows illegal hoisting
Bug fix for PR36787. When reasoning if it's safe to hoist a load we
want to make sure that the defining memory access dominates the new
insertion point of the hoisted instruction. safeToHoistLdSt calls
firstInBB(InsertionPoint,DefiningAccess) which returns false if
InsertionPoint == DefiningAccess, and therefore it falsely thinks
it's safe to hoist.

Differential Revision: https://reviews.llvm.org/D49555

llvm-svn: 337674
2018-07-23 09:42:35 +00:00
Aditya Kumar 373ce7eca5 Early exit with cheaper checks
Reviewers: sebpop,davide,fhahn,trentxintong
Differential Revision: https://reviews.llvm.org/D49617

llvm-svn: 337643
2018-07-21 14:13:44 +00:00
Peter Collingbourne acf005676e Change the cap on the amount of padding for each vtable to 32-byte (previously it was 128-byte)
We tested different cap values with a recent commit of Chromium. Our results show that the 32-byte cap yields the smallest binary and all the caps yield similar performance.
Based on the results, we propose to change the cap value to 32-byte.

Patch by Zhaomo Yang!

Differential Revision: https://reviews.llvm.org/D49405

llvm-svn: 337622
2018-07-20 21:43:20 +00:00
Roman Tereshin 31d52847ef Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size"
This reapplies commit r337489 reverted by r337541
Additionally, this commit contains a speculative fix to the issue reported in r337541
(the report does not contain an actionable reproducer, just a stack trace)

llvm-svn: 337606
2018-07-20 20:10:04 +00:00
Alexander Potapenko 80c6f41581 [MSan] Hotfix compilation
Make sure NewSI is used in materializeStores()

llvm-svn: 337577
2018-07-20 16:52:12 +00:00
Alexander Potapenko 5ff3abbc31 [MSan] run materializeChecks() before materializeStores()
When pointer checking is enabled, it's important that every pointer is
checked before its value is used.
For stores MSan used to generate code that calculates shadow/origin
addresses from a pointer before checking it.
For userspace this isn't a problem, because the shadow calculation code
is quite simple and compiler is able to move it after the check on -O2.
But for KMSAN getShadowOriginPtr() creates a runtime call, so we want the
check to be performed strictly before that call.

Swapping materializeChecks() and materializeStores() resolves the issue:
both functions insert code before the given IR location, so the new
insertion order guarantees that the code calculating shadow address is
between the address check and the memory access.

llvm-svn: 337571
2018-07-20 16:28:49 +00:00
Florian Hahn ec3ca89a17 [IPSCCP] Fix for bot failure caused by r337548
llvm-svn: 337554
2018-07-20 14:37:10 +00:00
Florian Hahn 0a560d5d9c Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters.
This version contains a fix to add values for which the state in ParamState change
to the worklist if the state in ValueState did not change. To avoid adding the
same value multiple times, mergeInValue returns true, if it added the value to
the worklist. The value is added to the worklist depending on its state in
ValueState.

Original message:
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to use
ValueLatticeElement for all values.

Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.

Reviewers: dberlin, mssimpso, davide, efriedma

Reviewed By: mssimpso

Differential Revision: https://reviews.llvm.org/D43762

llvm-svn: 337548
2018-07-20 13:29:12 +00:00
Sam McCall 57743883f1 Revert "[LSV] Refactoring + supporting bitcasts to a type of different size"
This reverts commit r337489.
It causes asserts to fire in some TensorFlow tests, e.g.
tensorflow/compiler/tests/gather_test.py on GPU.

Example stack trace:
Start test case: GatherTest.testHigherRank
assertion failed at third_party/llvm/llvm/lib/Support/APInt.cpp:819 in llvm::APInt llvm::APInt::trunc(unsigned int) const: width && "Can't truncate to 0 bits"
    @     0x5559446ebe10  __assert_fail
    @     0x55593ef32f5e  llvm::APInt::trunc()
    @     0x55593d78f86e  (anonymous namespace)::Vectorizer::lookThroughComplexAddresses()
    @     0x55593d78f2bc  (anonymous namespace)::Vectorizer::areConsecutivePointers()
    @     0x55593d78d128  (anonymous namespace)::Vectorizer::isConsecutiveAccess()
    @     0x55593d78c926  (anonymous namespace)::Vectorizer::vectorizeInstructions()
    @     0x55593d78c221  (anonymous namespace)::Vectorizer::vectorizeChains()
    @     0x55593d78b948  (anonymous namespace)::Vectorizer::run()
    @     0x55593d78b725  (anonymous namespace)::LoadStoreVectorizer::runOnFunction()
    @     0x55593edf4b17  llvm::FPPassManager::runOnFunction()
    @     0x55593edf4e55  llvm::FPPassManager::runOnModule()
    @     0x55593edf563c  (anonymous namespace)::MPPassManager::runOnModule()
    @     0x55593edf5137  llvm::legacy::PassManagerImpl::run()
    @     0x55593edf5b71  llvm::legacy::PassManager::run()
    @     0x55593ced250d  xla::gpu::IrDumpingPassManager::run()
    @     0x55593ced5033  xla::gpu::(anonymous namespace)::EmitModuleToPTX()
    @     0x55593ced40ba  xla::gpu::(anonymous namespace)::CompileModuleToPtx()
    @     0x55593ced33d0  xla::gpu::CompileToPtx()
    @     0x55593b26b2a2  xla::gpu::NVPTXCompiler::RunBackend()
    @     0x55593b21f973  xla::Service::BuildExecutable()
    @     0x555938f44e64  xla::LocalService::CompileExecutable()
    @     0x555938f30a85  xla::LocalClient::Compile()
    @     0x555938de3c29  tensorflow::XlaCompilationCache::BuildExecutable()
    @     0x555938de4e9e  tensorflow::XlaCompilationCache::CompileImpl()
    @     0x555938de3da5  tensorflow::XlaCompilationCache::Compile()
    @     0x555938c5d962  tensorflow::XlaLocalLaunchBase::Compute()
    @     0x555938c68151  tensorflow::XlaDevice::Compute()
    @     0x55593f389e1f  tensorflow::(anonymous namespace)::ExecutorState::Process()
    @     0x55593f38a625  tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady()::$_1::operator()()
*** SIGABRT received by PID 7798 (TID 7837) from PID 7798; ***

llvm-svn: 337541
2018-07-20 12:03:00 +00:00
Eli Friedman a3c78f5981 [SCCP] Don't use markForcedConstant on branch conditions.
It's more aggressive than we need to be, and leads to strange
workarounds in other places like call return value inference. Instead,
just directly mark an edge viable.

Tests by Florian Hahn.

Differential Revision: https://reviews.llvm.org/D49408

llvm-svn: 337507
2018-07-19 23:02:07 +00:00
Roman Tereshin b49b2a601f [LSV] Refactoring + supporting bitcasts to a type of different size
This is mostly a preparation work for adding a limited support for
select instructions. It proved to be difficult to do due to size and
irregularity of Vectorizer::isConsecutiveAccess, this is fixed here I
believe.

It also turned out that these changes make it simpler to finish one of
the TODOs and fix a number of other small issues, namely:

1. Looking through bitcasts to a type of a different size (requires
careful tracking of the original load/store size and some math
converting sizes in bytes to expected differences in indices of GEPs).

2. Reusing partial analysis of pointers done by first attempt in proving
them consecutive instead of starting from scratch. This added limited
support for nested GEPs co-existing with difficult sext/zext
instructions. This also required a careful handling of negative
differences between constant parts of offsets.

3. Handing a case where the first pointer index is not an add, but
something else (a function parameter for instance).

I observe an increased number of successful vectorizations on a large
set of shader programs. Only few shaders are affected, but those that
are affected sport >5% less loads and stores than before the patch.

Reviewed By: rampitec

Differential-Revision: https://reviews.llvm.org/D49342
llvm-svn: 337489
2018-07-19 19:42:43 +00:00
Farhana Aleen 8c7a30baea [LoadStoreVectorizer] Use getMinusScev() to compute the distance between two pointers.
Summary: Currently, isConsecutiveAccess() detects two pointers(PtrA and PtrB) as consecutive by
         comparing PtrB with BaseDelta+PtrA. This works when both pointers are factorized or
         both of them are not factorized. But isConsecutiveAccess() fails if one of the
         pointers is factorized but the other one is not.

         Here is an example:
         PtrA = 4 * (A + B)
         PtrB = 4 + 4A + 4B

         This patch uses getMinusSCEV() to compute the distance between two pointers.
         getMinusSCEV() allows combining the expressions and computing the simplified distance.

Author: FarhanaAleen

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D49516

llvm-svn: 337471
2018-07-19 16:50:27 +00:00
Teresa Johnson 28023dbed7 [ThinLTO] Enable ThinLTO WholeProgramDevirt and LowerTypeTests in new PM
Summary:
Enable these passes for CFI and WPD in ThinLTO and LTO with the new pass
manager. Add a couple of tests for both PMs based on the clang tests
tools/clang/test/CodeGen/thinlto-distributed-cfi*.ll, but just test
through llvm-lto2 and not with distributed ThinLTO.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49429

llvm-svn: 337461
2018-07-19 14:51:32 +00:00
Peter Collingbourne 4a653fa7f1 Rename __asan_gen_* symbols to ___asan_gen_*.
This prevents gold from printing a warning when trying to export
these symbols via the asan dynamic list after ThinLTO promotes them
from private symbols to external symbols with hidden visibility.

Differential Revision: https://reviews.llvm.org/D49498

llvm-svn: 337428
2018-07-18 22:23:14 +00:00
Xin Tong 074ccf32ce Skip debuginfo intrinsic in markLiveBlocks.
Summary:
The optimizer is 10%+ slower with vs without debuginfo. I started checking where
the difference is coming from.

I compiled sqlite3.c with and without debug info from CTMark and compare the time difference.

I use Xcode Instrument to find where time is spent. This brings about 20ms, out of ~20s.

Reviewers: davide, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D49337

llvm-svn: 337416
2018-07-18 18:40:45 +00:00
Simon Pilgrim 2b37ddce4b [SLPVectorizer] Avoid duplicate scalar cost calculations in BoUpSLP::getEntryCost. NFCI.
Pulled out from D49225, we have a lot of repeated scalar cost calculations, often with arguments that don't look the same but turn out to be.

llvm-svn: 337390
2018-07-18 13:53:55 +00:00
Roman Lebedev 3cb87e905c [InstCombine] Re-commit: Fold 'check for [no] signed truncation' pattern
Summary:
[[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]]

As discussed in https://reviews.llvm.org/D49179#1158957 and later,
the IR for 'check for [no] signed truncation' pattern can be improved:
https://rise4fun.com/Alive/gBf
^ that pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530
in signed case, therefore it is probably a good idea to improve it.

The DAGCombine will reverse this transform, see
https://reviews.llvm.org/D49266

This transform is surprisingly frustrating.
This does not deal with non-splat shift amounts, or with undef shift amounts.
I've outlined what i think the solution should be:
```
  // Potential handling of non-splats: for each element:
  //  * if both are undef, replace with constant 0.
  //    Because (1<<0) is OK and is 1, and ((1<<0)>>1) is also OK and is 0.
  //  * if both are not undef, and are different, bailout.
  //  * else, only one is undef, then pick the non-undef one.
```

This is a re-commit, as the original patch, committed in rL337190
was reverted in rL337344 as it broke chromium build:
https://bugs.llvm.org/show_bug.cgi?id=38204 and
https://crbug.com/864832
Proofs that the fixed folds are ok: https://rise4fun.com/Alive/VYM

Differential Revision: https://reviews.llvm.org/D49320

llvm-svn: 337376
2018-07-18 10:55:17 +00:00
Bob Haarman 4ebe5d59b6 Revert "[InstCombine] Fold 'check for [no] signed truncation' pattern"
This reverts r337190 (and a few follow-up commits), which caused the
Chromium build to fail. See
https://bugs.llvm.org/show_bug.cgi?id=38204 and
https://crbug.com/864832

llvm-svn: 337344
2018-07-18 02:18:28 +00:00
Vedant Kumar 9ece818291 [InstCombine] Preserve debug value when simplifying cast-of-select
InstCombine has a cast transform that matches a cast-of-select:

  Orig = cast (Src = select Cond TV FV)

And tries to replace it with a select which has the cast folded in:

  NewSel = select Cond (cast TV) (cast FV)

The combiner does RAUW(Orig, NewSel), so any debug values for Orig would
survive the transform. But debug values for Src would be lost.

This patch teaches InstCombine to replace all debug uses of Src with
NewSel (taking care of doing any necessary DIExpression rewriting).

Differential Revision: https://reviews.llvm.org/D49270

llvm-svn: 337310
2018-07-17 18:08:36 +00:00
Florian Hahn d95761d9d0 [IPSCCP] Run Solve each time we resolved an undef in a function.
Once we resolved an undef in a function we can run Solve, which could
lead to finding a constant return value for the function, which in turn
could turn undefs into constants in other functions that call it, before
resolving undefs there.

Computationally the amount of work we are doing stays the same, just the
order we process things is slightly different and potentially there are
a few less undefs to resolve.

We are still relying on the order of functions in the IR, which means
depending on the order, we are able to resolve the optimal undef first
or not. For example, if @test1 comes before @testf, we find the constant
return value of @testf too late and we cannot use it while solving
@test1.

This on its own does not lead to more constants removed in the
test-suite, probably because currently we have to be very lucky to visit
applicable functions in the right order.

Maybe we manage to come up with a better way of resolving undefs in more
'profitable' functions first.

Reviewers: efriedma, mssimpso, davide

Reviewed By: efriedma, davide

Differential Revision: https://reviews.llvm.org/D49385

llvm-svn: 337283
2018-07-17 14:04:59 +00:00
Simon Pilgrim 1a4f3c93fb [SLPVectorizer] Don't attempt horizontal reduction on pointer types (PR38191)
TTI::getMinMaxReductionCost typically can't handle pointer types - until this is changed its better to limit horizontal reduction to integer/float vector types only.

llvm-svn: 337280
2018-07-17 13:43:33 +00:00
whitequark a41b24f32d [LLVM-C] Fix name mangling on AggressiveInstCombine
Similarly to rL336736, at least one more C API function does not
properly get declared as extern "C" due to a missing header, causing
name mangling and linking errors.

This patch fixes calls to LLVMAddAggressiveInstCombinerPass().

Differential Revision: https://reviews.llvm.org/D49416

Reviewed By: whitequark

llvm-svn: 337264
2018-07-17 11:13:58 +00:00
Simon Pilgrim a0220b0570 Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.
llvm-svn: 337257
2018-07-17 09:39:55 +00:00
Roman Lebedev b79b4f539b [InstCombine] Fold 'check for [no] signed truncation' pattern
Summary:
[[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]]

As discussed in https://reviews.llvm.org/D49179#1158957 and later,
the IR for 'check for [no] signed truncation' pattern can be improved:
https://rise4fun.com/Alive/gBf
^ that pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530
in signed case, therefore it is probably a good idea to improve it.

Proofs for this transform: https://rise4fun.com/Alive/mgu
This transform is surprisingly frustrating.
This does not deal with non-splat shift amounts, or with undef shift amounts.
I've outlined what i think the solution should be:
```
  // Potential handling of non-splats: for each element:
  //  * if both are undef, replace with constant 0.
  //    Because (1<<0) is OK and is 1, and ((1<<0)>>1) is also OK and is 0.
  //  * if both are not undef, and are different, bailout.
  //  * else, only one is undef, then pick the non-undef one.
```

The DAGCombine will reverse this transform, see
https://reviews.llvm.org/D49266

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: JDevlieghere, rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D49320

llvm-svn: 337190
2018-07-16 16:45:42 +00:00
Teresa Johnson d68935c5ac Restore "[ThinLTO] Ensure we always select the same function copy to import"
This reverts commit r337081, therefore restoring r337050 (and fix in
r337059), with test fix for bot failure described after the original
description below.

In order to always import the same copy of a linkonce function,
even when encountering it with different thresholds (a higher one then a
lower one), keep track of the summary we decided to import.
This ensures that the backend only gets a single definition to import
for each GUID, so that it doesn't need to choose one.

Move the largest threshold the GUID was considered for import into the
current module out of the ImportMap (which is part of a larger map
maintained across the whole index), and into a new map just maintained
for the current module we are computing imports for. This saves some
memory since we no longer have the thresholds maintained across the
whole index (and throughout the in-process backends when doing a normal
non-distributed ThinLTO build), at the cost of some additional
information being maintained for each invocation of ComputeImportForModule
(the selected summary pointer for each import).

There is an additional map lookup for each callee being considered for
importing, however, this was able to subsume a map lookup in the
Worklist iteration that invokes computeImportForFunction. We also are
able to avoid calling selectCallee if we already failed to import at the
same or higher threshold.

I compared the run time and peak memory for the SPEC2006 471.omnetpp
benchmark (running in-process ThinLTO backends), as well as for a large
internal benchmark with a distributed ThinLTO build (so just looking at
the thin link time/memory). Across a number of runs with and without
this change there was no significant change in the time and memory.

(I tried a few other variations of the change but they also didn't
improve time or peak memory).

The new commit removes a test that no longer makes sense
(Transforms/FunctionImport/hotness_based_import2.ll), as exposed by the
reverse-iteration bot. The test depends on the order of processing the
summary call edges, and actually depended on the old problematic
behavior of selecting more than one summary for a given GUID when
encountered with different thresholds. There was no guarantee even
before that we would eventually pick the linkonce copy with the hottest
call edges, it just happened to work with the test and the old code, and
there was no guarantee that we would end up importing the selected
version of the copy that had the hottest call edges (since the backend
would effectively import only one of the selected copies).

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D48670

llvm-svn: 337184
2018-07-16 15:30:27 +00:00
Alexander Potapenko d1a381b17a MSan: minor fixes, NFC
- remove an extra space after |ID| declaration
 - drop the unused |FirstInsn| parameter in getShadowOriginPtrUserspace()

llvm-svn: 337159
2018-07-16 10:57:19 +00:00
Alexander Potapenko 725a4ddc9e [MSan] factor userspace-specific declarations into createUserspaceApi(). NFC
This patch introduces createUserspaceApi() that creates function/global
declarations for symbols used by MSan in the userspace.
This is a step towards the upcoming KMSAN implementation patch.

Reviewed at https://reviews.llvm.org/D49292

llvm-svn: 337155
2018-07-16 10:03:30 +00:00