Commit Graph

22423 Commits

Author SHA1 Message Date
Peter Collingbourne c3b8661df5 LowerTypeTests: Teach the pass to respect global alignments.
We were previously ignoring alignment entirely when combining globals
together in this pass. There are two main things that we need to do here:
add additional padding before each global to meet the alignment requirements,
and set the combined global's alignment to the maximum of all of the original
globals' alignments.

Since we now need to calculate layout as we go anyway, use the calculated
layout to produce GlobalLayout instead of using StructLayout.

Differential Revision: https://reviews.llvm.org/D65033

llvm-svn: 366722
2019-07-22 18:47:03 +00:00
Simon Pilgrim 3ebd2fe91a [SLPVectorizer] Fix some MSVC/cppcheck uninitialized variable warnings. NFCI.
llvm-svn: 366712
2019-07-22 17:57:36 +00:00
Christudasan Devadasan 006cf8c03d Added address-space mangling for stack related intrinsics
Modified the following 3 intrinsics:
int_addressofreturnaddress,
int_frameaddress & int_sponentry.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D64561

llvm-svn: 366679
2019-07-22 12:42:48 +00:00
Serguei Katkov c6c31da867 [Loop Peeling] Fix the handling of branch weights of peeled off branches.
Current algorithm to update branch weights of latch block and its copies is
based on the assumption that number of peeling iterations is approximately equal
to trip count.

However it is not correct. According to profitability check in one case we can decide to peel
in case it helps to reduce the number of phi nodes. In this case the number of peeled iteration
can be less then estimated trip count.

This patch introduces another way to set the branch weights to peeled of branches.
Let F is a weight of the edge from latch to header.
Let E is a weight of the edge from latch to exit.
F/(F+E) is a probability to go to loop and E/(F+E) is a probability to go to exit.
Then, Estimated TripCount = F / E.
For I-th (counting from 0) peeled off iteration we set the the weights for
the peeled latch as (TC - I, 1). It gives us reasonable distribution,
The probability to go to exit 1/(TC-I) increases. At the same time
the estimated trip count of remaining loop reduces by I.

As a result after peeling off N iteration the weights will be
(F - N * E, E) and trip count of loop becomes
F / E - N or TC - N.

The idea is taken from the review of the patch D63918 proposed by Philip.

Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64235

llvm-svn: 366665
2019-07-22 05:15:34 +00:00
Craig Topper e6cd20ba53 [InstCombine] Update comment I missed in r366649. NFC
llvm-svn: 366658
2019-07-21 16:15:03 +00:00
Craig Topper 1d149d08d3 [InstCombine] Remove insertRangeTest code that handles the equality case.
For equality, the function called getTrue/getFalse with the VT
of the comparison input. But getTrue/getFalse need the boolean VT.
So if this code ever executed, it would assert.

I believe these cases are removed by InstSimplify so we don't get here.

So this patch just fixes up an assert to exclude the equality
possibility and removes the broken code.

llvm-svn: 366649
2019-07-21 06:43:38 +00:00
Craig Topper 8fabdfe9fc [InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI
AddOne/SubOne create new Constant objects. That seems heavy for
comparing ConstantInts which wrap APInts. Just do the math on
on the APInts and compare them.

llvm-svn: 366648
2019-07-21 05:26:05 +00:00
Florian Hahn 0a7faa4e3d [Local] Zap blockaddress without users in ConstantFoldTerminator.
If the blockaddress is not destoryed, the destination block will still
be marked as having its address taken, limiting further transformations.

I think there are other places where the dead blockaddress constants are kept
around, I'll look into that as follow up.

Reviewers: craig.topper, brzycki, davide

Reviewed By: brzycki, davide

Differential Revision: https://reviews.llvm.org/D64936

llvm-svn: 366633
2019-07-20 12:25:47 +00:00
Hubert Tong 2711e16b35 [sanitizers] Use covering ObjectFormatType switches
Summary:
This patch removes the `default` case from some switches on
`llvm::Triple::ObjectFormatType`, and cases for the missing enumerators
(`UnknownObjectFormat`, `Wasm`, and `XCOFF`) are then added.

For `UnknownObjectFormat`, the effect of the action for the `default`
case is maintained; otherwise, where `llvm_unreachable` is called,
`report_fatal_error` is used instead.

Where the `default` case returns a default value, `report_fatal_error`
is used for XCOFF as a placeholder. For `Wasm`, the effect of the action
for the `default` case in maintained.

The code is structured to avoid strongly implying that the `Wasm` case
is present for any reason other than to make the switch cover all
`ObjectFormatType` enumerator values.

Reviewers: sfertile, jasonliu, daltenty

Reviewed By: sfertile

Subscribers: hiraditya, aheejin, sunfish, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64222

llvm-svn: 366544
2019-07-19 08:46:18 +00:00
Serguei Katkov bde33af85a [Loop Peeling] Enable peeling of multiple exits by default.
Enable loop peeling with multiple exits where all non-latch exits
ends up with deopt by default.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64619

llvm-svn: 366542
2019-07-19 08:35:45 +00:00
Roman Lebedev f2eb403144 [InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

Normally, the inner pattern is sign-extend,
but for our purposes it's no different to other patterns:

alive proofs:
f: https://rise4fun.com/Alive/7U3

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64524

llvm-svn: 366540
2019-07-19 08:26:58 +00:00
Roman Lebedev 441c9d6ca8 [InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
e: https://rise4fun.com/Alive/0FT

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64521

llvm-svn: 366539
2019-07-19 08:26:47 +00:00
Roman Lebedev 3c212ce305 [InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
d: https://rise4fun.com/Alive/I5Y

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64519

llvm-svn: 366538
2019-07-19 08:26:37 +00:00
Roman Lebedev 2ebe57386d [InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
c: https://rise4fun.com/Alive/RgJh

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64517

llvm-svn: 366537
2019-07-19 08:26:25 +00:00
Roman Lebedev 4422a1657c [InstCombine] Dropping redundant masking before left-shift [1/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
b. `(x & (~(-1 << maskNbits))) << shiftNbits`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
b. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)`

alive proof:
b: https://rise4fun.com/Alive/y8M

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64514

llvm-svn: 366536
2019-07-19 08:26:13 +00:00
Roman Lebedev a5f0824eb5 [InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
a. `(x & ((1 << MaskShAmt) - 1)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
a. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)`

alive proof:
a: https://rise4fun.com/Alive/wi9

Indeed, not all of these patterns are canonical.
But since this fold will only produce a single instruction
i'm really interested in handling even uncanonical patterns,
since i have this general kind of pattern in hotpaths,
and it is not totally outlandish for bit-twiddling code.

For now let's start with patterns where both shift amounts are variable,
with trivial constant "offset" between them, since i believe this is
both simplest to handle and i think this is most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Reviewers: spatel, nikic, huihuiz, xbolva00

Reviewed By: xbolva00

Subscribers: efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64512

llvm-svn: 366535
2019-07-19 08:25:43 +00:00
Serguei Katkov 0ffa833d54 [LoopInfo] Use early return in branch weight update functions. NFC.
llvm-svn: 366411
2019-07-18 07:36:20 +00:00
Peter Collingbourne 3b82b92c6b hwasan: Initialize the pass only once.
This will let us instrument globals during initialization. This required
making the new PM pass a module pass, which should still provide access to
analyses via the ModuleAnalysisManager.

Differential Revision: https://reviews.llvm.org/D64843

llvm-svn: 366379
2019-07-17 21:45:19 +00:00
Hideto Ueno 4a09a73fb0 [Attributor][NFC] Remove unnecessary debug output
llvm-svn: 366373
2019-07-17 21:11:02 +00:00
Hideto Ueno 11d3710c1c [Attributor] Deduce "willreturn" function attribute
Summary:
Deduce the "willreturn" attribute for functions.

For now, intrinsics are not willreturn. More annotation will be done in another patch.

Reviewers: jdoerfert

Subscribers: jvesely, nhaehnle, nicholas, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63046

llvm-svn: 366335
2019-07-17 15:15:43 +00:00
Philip Reames 6e1c3bb181 [IndVars] Speculative fix for an assertion failure seen in bots
I don't have an IR sample which is actually failing, but the issue described in the comment is theoretically possible, and should be guarded against even if there's a different root cause for the bot failures.

llvm-svn: 366241
2019-07-16 18:23:49 +00:00
Amara Emerson 228a7b4f2a [ADCE] Fix non-deterministic behaviour due to iterating over a pointer set.
Original patch by Yann Laigle-Chapuy

Differential Revision: https://reviews.llvm.org/D64785

llvm-svn: 366215
2019-07-16 15:23:10 +00:00
Rui Ueyama 49a3ad21d6 Fix parameter name comments using clang-tidy. NFC.
This patch applies clang-tidy's bugprone-argument-comment tool
to LLVM, clang and lld source trees. Here is how I created this
patch:

$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \
    -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm
$ ninja
$ parallel clang-tidy -checks='-*,bugprone-argument-comment' \
    -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \
    ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h}

llvm-svn: 366177
2019-07-16 04:46:31 +00:00
Peter Collingbourne e5c4b468f0 hwasan: Pad arrays with non-1 size correctly.
Spotted by eugenis.

Differential Revision: https://reviews.llvm.org/D64783

llvm-svn: 366171
2019-07-16 03:25:50 +00:00
Eric Christopher 93dfb93ad6 Temporarily Revert "[SLP] Recommit: Look-ahead operand reordering heuristic."
As there are some reported miscompiles with AVX512 and performance regressions
in Eigen. Verified with the original committer and testcases will be forthcoming.

This reverts commit r364964.

llvm-svn: 366154
2019-07-15 23:36:02 +00:00
Leonard Chan bb147aabc6 Revert "[NewPM] Port Sancov"
This reverts commit 5652f35817.

llvm-svn: 366153
2019-07-15 23:18:31 +00:00
Evgeniy Stepanov c5e7f56249 ARM MTE stack sanitizer.
Add "memtag" sanitizer that detects and mitigates stack memory issues
using armv8.5 Memory Tagging Extension.

It is similar in principle to HWASan, which is a software implementation
of the same idea, but there are enough differencies to warrant a new
sanitizer type IMHO. It is also expected to have very different
performance properties.

The new sanitizer does not have a runtime library (it may grow one
later, along with a "debugging" mode). Similar to SafeStack and
StackProtector, the instrumentation pass (in a follow up change) will be
inserted in all cases, but will only affect functions marked with the
new sanitize_memtag attribute.

Reviewers: pcc, hctim, vitalybuka, ostannard

Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64169

llvm-svn: 366123
2019-07-15 20:02:23 +00:00
Johannes Doerfert 3dcd7996f1 [FunctionAttrs] Remove readonly and writeonly assertion
There are scenarios where mutually recursive functions may cause the SCC
to contain both read only and write only functions. This removes an
assertion when adding read attributes which caused a crash with a the
provided test case, and instead just doesn't add the attributes.

Patch by Luke Lau <luke.lau@intel.com>

Differential Revision: https://reviews.llvm.org/D60761

llvm-svn: 366090
2019-07-15 17:31:26 +00:00
Serguei Katkov d021ad9fbe [Loop Peeling] Fix the bug with IDom setting for exit loops
It is possible that loop exit has two predecessors in a loop body.
In this case after the peeling the iDom of the exit should be a clone of
iDom of original exit but no a clone of a block coming to this exit.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64618

llvm-svn: 366050
2019-07-15 09:13:11 +00:00
Florian Hahn 1d554b7441 [LoopVectorize] Pass unfiltered list of arguments to getIntrinsicInstCost.
We do not compute the scalarization overhead in getVectorIntrinsicCost
and TTI::getIntrinsicInstrCost requires the full arguments list.

llvm-svn: 366049
2019-07-15 08:48:47 +00:00
Serguei Katkov 3ed93b4673 [Loop Peeling] Enable peeling for loops with multiple exits
This CL enables peeling of the loop with multiple exits where
one exit should be from latch and others are basic blocks with
call to deopt.

The peeling is enabled under the flag which is false by default.

Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames
Subscribers: xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D63923

llvm-svn: 366048
2019-07-15 08:26:45 +00:00
Hideto Ueno 54869ec907 [Attributor] Deduce "nonnull" attribute
Summary:
Porting nonnull attribute to attributor.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63604

llvm-svn: 366043
2019-07-15 06:49:04 +00:00
Serguei Katkov 45c43e7d04 [LoopUtils] Extend the scope of getLoopEstimatedTripCount
With this patch the getLoopEstimatedTripCount function will
accept also the loops where there are more than one exit but
all exits except latch block should ends up with a call to deopt.

This side exits should not impact the estimated trip count.

Reviewers: reames, mkuper, danielcdh
Reviewed By: reames
Subscribers: fhahn, lebedev.ri, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64553

llvm-svn: 366042
2019-07-15 06:42:39 +00:00
Serguei Katkov f1ee04c42a [LoopInfo] Introduce getUniqueNonLatchExitBlocks utility function
Extract the code from LoopUnrollRuntime into utility function to
re-use it in D63923.

Reviewers: reames, mkuper
Reviewed By: reames
Subscribers: fhahn, hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64548

llvm-svn: 366040
2019-07-15 05:51:10 +00:00
Florian Hahn 9428d95ce7 [LV] Exclude loop-invariant inputs from scalar cost computation.
Loop invariant operands do not need to be scalarized, as we are using
the values outside the loop. We should ignore them when computing the
scalarization overhead.

Fixes PR41294

Reviewers: hsaito, rengolin, dcaballe, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D59995

llvm-svn: 366030
2019-07-14 20:12:36 +00:00
Johannes Doerfert c7a1db3298 [Attributor][NFC] Run clang-format on the attributor files (.h/.cpp)
The Attributor files are kept formatted with clang-format, we should try
to keep this state.

llvm-svn: 365984
2019-07-13 01:09:27 +00:00
Johannes Doerfert 0a7f4cdce9 [Attributor] Only return attributes with a valid state
Attributor::getAAFor will now only return AbstractAttributes with a
valid AbstractState. This simplifies call sites as they only need to
check if the returned pointer is non-null. It also reduces the potential
for accidental misuse.

llvm-svn: 365983
2019-07-13 01:09:21 +00:00
Alina Sbirlea db101864bd [MemorySSA] Use SetVector to avoid nondeterminism.
Summary:
Use a SetVector for DeadBlockSet.
Resolves PR42574.

Reviewers: george.burgess.iv, uabelho, dblaikie

Subscribers: jlebar, Prazek, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64601

llvm-svn: 365970
2019-07-12 22:30:30 +00:00
David Bolvansky 9f0d718c66 [InstCombine] Disable fold from D64285 for non-integer types
llvm-svn: 365959
2019-07-12 21:14:21 +00:00
Sterling Augustine 6d75a9e873 The variable "Latch" is only used in an assert, which makes builds that use "-DNDEBUG" fail with unused variable messages.
Summary: Move the logic into the assert itself.

Subscribers: hiraditya, sanjoy, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64654

llvm-svn: 365943
2019-07-12 18:51:08 +00:00
Stefan Stipanovic cb5ecae1f6 Addition to rL365925, removing remaining virtuals
llvm-svn: 365938
2019-07-12 18:34:06 +00:00
Leonard Chan 223573c8ba Remove unused methods in Sancov.
llvm-svn: 365931
2019-07-12 18:09:09 +00:00
Stefan Stipanovic 15e86f707b [Attributor] Removing unnecessary `virtual` keywords.
Some function in the Attributor framework are unnecessarily
marked virtual. This patch removes virtual keyword

Reviewers: jdoerfert

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D64637

llvm-svn: 365925
2019-07-12 17:42:14 +00:00
Hideto Ueno 65bbaf9ece [Attributor] Deduce "nofree" function attribute
Summary: Deduce "nofree" function attribute. A more concise description of "nofree" is on D49165.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: homerdin, hfinkel, lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62687

llvm-svn: 365924
2019-07-12 17:38:51 +00:00
Philip Reames 34495b5533 [IndVars] Use exit count reasoning to discharge obviously untaken exits
Continue in the spirit of D63618, and use exit count reasoning to prove away loop exits which can not be taken since the backedge taken count of the loop as a whole is provably less than the minimal BE count required to take this particular loop exit.

As demonstrated in the newly added tests, this triggers in a number of cases where IndVars was previously unable to discharge obviously redundant exit tests. And some not so obvious ones.

Differential Revision: https://reviews.llvm.org/D63733

llvm-svn: 365920
2019-07-12 17:05:35 +00:00
Fangrui Song b251cc0d91 Delete dead stores
llvm-svn: 365903
2019-07-12 14:58:15 +00:00
David Bolvansky af1b3185f5 [InstCombine] Fold select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y) to ashr (X, Y))
Summary:
(select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y)) -> ashr (X, Y))
(select (icmp slt x, 1), ashr (X, Y), lshr (X, Y)) -> ashr (X, Y))

Fixes PR41173

Alive proof by @lebedev.ri (thanks)
Name: PR41173
  %cmp = icmp slt i32 %x, 1
  %shr = lshr i32 %x, %y
  %shr1 = ashr i32 %x, %y
  %retval.0 = select i1 %cmp, i32 %shr1, i32 %shr
  =>
  %retval.0 = ashr i32 %x, %y

Optimization: PR41173
Done: 1
Optimization is correct!

Reviewers: lebedev.ri, spatel

Reviewed By: lebedev.ri

Subscribers: nikic, craig.topper, llvm-commits, lebedev.ri

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64285

llvm-svn: 365893
2019-07-12 11:31:16 +00:00
Evandro Menezes b21692672e [InstCombine] Reorder pow() transformations (NFC)
Move the transformation from `powf(x, itofp(y))` to `powi(x, y)` to the
group of transformations related to the exponent.

llvm-svn: 365851
2019-07-12 00:33:49 +00:00
Leonard Chan 5652f35817 [NewPM] Port Sancov
This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over
  functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
  functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.

Differential Revision: https://reviews.llvm.org/D62888

llvm-svn: 365838
2019-07-11 22:35:40 +00:00
Stefan Stipanovic 0626367202 [Attributor] Deduce "nosync" function attribute.
Introduce and deduce "nosync" function attribute to indicate that a function
does not synchronize with another thread in a way that other thread might free memory.

Reviewers: jdoerfert, jfb, nhaehnle, arsenm

Subscribers: wdng, hfinkel, nhaenhle, mehdi_amini, steven_wu,
dexonsmith, arsenm, uenoku, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D62766

llvm-svn: 365830
2019-07-11 21:37:40 +00:00
Sanjay Patel 3487791fea [InstCombine] don't move FP negation out of a constant expression
-(X * ConstExpr) becomes X * (-ConstExpr), so don't reverse that
and infinite loop.

llvm-svn: 365774
2019-07-11 13:44:29 +00:00
Tim Northover 27658ed512 OpaquePtr: use load instruction directly for type. NFC.
llvm-svn: 365768
2019-07-11 13:12:08 +00:00
David Bolvansky e23be09e66 [InstCombine] Reorder recently added/improved pow transformations
Changed cases are now faster with exp2.

llvm-svn: 365758
2019-07-11 10:55:04 +00:00
Vitaly Buka d03bd1db59 NFC: Pass DataLayout into isBytewiseValue
Summary:
We will need to handle IntToPtr which I will submit in a separate patch as it's
not going to be NFC.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D63940

llvm-svn: 365709
2019-07-10 22:53:52 +00:00
Alina Sbirlea 58a37754bb [LoopRotate + MemorySSA] Keep an <instruction-cloned instruction> map.
Summary:
The map kept in loop rotate is used for instruction remapping, in order
to simplify the clones of instructions. Thus, if an instruction can be
simplified, its simplified value is placed in the map, even when the
clone is added to the IR. MemorySSA in contrast needs to know about that
clone, so it can add an access for it.
To resolve this: keep a different map for MemorySSA.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63680

llvm-svn: 365672
2019-07-10 17:36:56 +00:00
Vedant Kumar 5eb6ba060a [CodeExtractor] Fix sinking of allocas with multiple bitcast uses (PR42451)
An alloca which can be sunk into the extraction region may have more
than one bitcast use. Move these uses along with the alloca to prevent
use-before-def.

Testing: check-llvm, stage2 build of clang

Fixes llvm.org/PR42451.

Differential Revision: https://reviews.llvm.org/D64463

llvm-svn: 365660
2019-07-10 16:32:20 +00:00
Vedant Kumar f65f302cc7 [CodeExtractor] Simplify findAllocas, NFC
Split getLifetimeMarkers out into its own method and have it return a
struct.

Differential Revision: https://reviews.llvm.org/D64467

llvm-svn: 365659
2019-07-10 16:32:16 +00:00
Roman Lebedev c5f92bd67b [PatternMatch] Generalize m_SpecificInt_ULT() to take ICmpInst::Predicate
As discussed in the original review, this may be useful,
so let's just do it.

llvm-svn: 365652
2019-07-10 16:07:35 +00:00
David Bolvansky 0735cc1954 [InstCombine] pow(C,x) -> exp2(log2(C)*x)
Summary:
Transform
pow(C,x) 

To
exp2(log2(C)*x) 

if C > 0, C != inf, C != NaN (and C is not power of 2, since we have some fold for such case already).

log(C) is folded by the compiler and exp2 is much faster to compute than pow.

Reviewers: spatel, efriedma, evandro

Reviewed By: evandro

Subscribers: lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64099

llvm-svn: 365637
2019-07-10 14:43:27 +00:00
Serguei Katkov d000f8b69f [SimpleLoopUnswitch] Don't consider unswitching `switch` insructions with one unique successor
Only instructions with two or more unique successors should be considered for unswitching.

Patch Author: Daniil Suchkov.

Reviewers: reames, asbirlea, skatkov
Reviewed By: skatkov
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64404

llvm-svn: 365611
2019-07-10 10:25:22 +00:00
Nikita Popov 5ca39e828c [SLP] Optimize getSpillCost(); NFCI
For a given set of live values, the spill cost will always be the
same for each call. Compute the cost once and multiply it by the
number of calls.

(I'm not sure this spill cost modeling makes sense if there are
multiple calls, as the spill cost will likely be shared across
calls in that case. But that's how it currently works.)

llvm-svn: 365552
2019-07-09 20:24:44 +00:00
Peter Collingbourne 1366262b74 hwasan: Improve precision of checks using short granule tags.
A short granule is a granule of size between 1 and `TG-1` bytes. The size
of a short granule is stored at the location in shadow memory where the
granule's tag is normally stored, while the granule's actual tag is stored
in the last byte of the granule. This means that in order to verify that a
pointer tag matches a memory tag, HWASAN must check for two possibilities:

* the pointer tag is equal to the memory tag in shadow memory, or
* the shadow memory tag is actually a short granule size, the value being loaded
  is in bounds of the granule and the pointer tag is equal to the last byte of
  the granule.

Pointer tags between 1 to `TG-1` are possible and are as likely as any other
tag. This means that these tags in memory have two interpretations: the full
tag interpretation (where the pointer tag is between 1 and `TG-1` and the
last byte of the granule is ordinary data) and the short tag interpretation
(where the pointer tag is stored in the granule).

When HWASAN detects an error near a memory tag between 1 and `TG-1`, it
will show both the memory tag and the last byte of the granule. Currently,
it is up to the user to disambiguate the two possibilities.

Because this functionality obsoletes the right aligned heap feature of
the HWASAN memory allocator (and because we can no longer easily test
it), the feature is removed.

Also update the documentation to cover both short granule tags and
outlined checks.

Differential Revision: https://reviews.llvm.org/D63908

llvm-svn: 365551
2019-07-09 20:22:36 +00:00
Philip Reames a6548d0437 [PoisonChecking] Flesh out complete todo list for full coverage
Note: I don't actually plan to implement all of the cases at the moment, I'm just documenting them for completeness.  There's a couple of cases left which are practically useful for me in debugging loop transforms, and I'll probably stop there for the moment.
llvm-svn: 365550
2019-07-09 19:59:39 +00:00
Philip Reames 3dbd7e98d8 [PoisonCheker] Support for out of bounds operands on shifts + insert/extractelement
These are sources of poison which don't come from flags, but are clearly documented in the LangRef.  Left off support for scalable vectors for the moment, but should be easy to add if anyone is interested.  

llvm-svn: 365543
2019-07-09 19:26:12 +00:00
Philip Reames 3b38b92541 [PoisonChecking] Add validation rules for "exact" on sdiv/udiv
As directly stated in the LangRef, no ambiguity here...

llvm-svn: 365538
2019-07-09 18:56:41 +00:00
Philip Reames f47a313e71 Add a transform pass to make the executable semantics of poison explicit in the IR
Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language.

The target audience for this tool is developers working on or targetting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program.

At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there's a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs involved UB being triggered by integer induction variables with nsw/nuw flags would be reported by the current code.

(See comment in PoisonChecking.cpp for full explanation and context)

Differential Revision: https://reviews.llvm.org/D64215

llvm-svn: 365536
2019-07-09 18:49:29 +00:00
Tim Northover 60afa49abe OpaquePtr: add Type parameter to Loads analysis API.
This makes the functions in Loads.h require a type to be specified
independently of the pointer Value so that when pointers have no structure
other than address-space, it can still do its job.

Most callers had an obvious memory operation handy to provide this type, but a
SROA and ArgumentPromotion were doing more complicated analysis. They get
updated to merge the properties of the various instructions they were
considering.

llvm-svn: 365468
2019-07-09 11:35:35 +00:00
Serguei Katkov 77bb3a486f [Loop Peeling] Add support for peeling of loops with multiple exits
This patch modifies the loop peeling transformation so that
it does not expect that there is only one loop exit from latch.

It modifies only transformation. Update of branch weights remains
only for exit from latch.

The motivation is that in follow-up patch I plan to enable loop peeling for
loops with multiple exits but only if other exits then from latch one goes to
block with call to deopt.

For now this patch is NFC.

Reviewers: reames, mkuper, iajbar, fhahn	
Reviewed By: reames, fhahn
Subscribers: zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D63921

llvm-svn: 365441
2019-07-09 06:07:25 +00:00
Serguei Katkov c6caddb73d [LoopInfo] Update getExitEdges to accept vector of pairs for non const BasicBlock
D63921 requires getExitEdges fills a vector of Edge pairs where
BasicBlocks are not constant.

The rest Loop API mostly returns non-const BasicBlocks, so to be more consistent with
other Loop API getExitEdges is modified to return non-const BasicBlocks as well.

This is an alternative solution to D64060. 

Reviewers: reames, fhahn
Reviewed By: reames, fhahn
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64309

llvm-svn: 365437
2019-07-09 04:20:43 +00:00
Philip Reames 0e344e9dc5 [LoopPred] Stylistic improvement to recently added NE/EQ normalization [NFC]
llvm-svn: 365425
2019-07-09 02:03:31 +00:00
Philip Reames 5a637cbdc7 [LoopPred] Extend LFTR normalization to the inverse EQ case
A while back, I added support for NE latches formed by LFTR.  I didn't think that quite through, as LFTR will also produce the inverse EQ form for some loops and I hadn't handled that.  This change just adds handling for that case as well.

llvm-svn: 365419
2019-07-09 01:27:45 +00:00
Johannes Doerfert accd3e8747 [Attributor] Deduce the "returned" argument attribute
Deduce the "returned" argument attribute by collecting all potentially
returned values.

Not only the unique return value, if any, can be used by subsequent
attributes but also the set of all potentially returned values as well
as the mapping from returned values to return instructions that they
originate from (see AAReturnedValues::checkForallReturnedValues).

Change in statistics (-stats) for LLVM-TS + Spec2006, totaling ~19% more "returned" arguments.

  ADDED: attributor                   NumAttributesManifested                  n/a ->        637
  ADDED: attributor                   NumAttributesValidFixpoint               n/a ->      25545
  ADDED: attributor                   NumFnArgumentReturned                    n/a ->        637
  ADDED: attributor                   NumFnKnownReturns                        n/a ->      25545
  ADDED: attributor                   NumFnUniqueReturned                      n/a ->      14118
CHANGED: deadargelim                  NumRetValsEliminated                     470 ->        449 (    -4.468%)
REMOVED: functionattrs                NumReturned                              535 ->        n/a
CHANGED: indvars                      NumElimIdentity                          138 ->        164 (   +18.841%)

Reviewers: homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames, efriedma, chandlerc

Subscribers: hiraditya, bollu, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D59919

llvm-svn: 365407
2019-07-08 23:27:20 +00:00
Sanjay Patel 3dee113ebc [InstCombine] fold insertelement into splat of same scalar
Forming the canonical splat shuffle improves analysis and
may allow follow-on transforms (although some possibilities
are missing as shown in the test diffs).

The backend generically turns these patterns into build_vector,
so there should be no codegen regressions. All targets are
expected to be able to lower splats efficiently.

llvm-svn: 365379
2019-07-08 19:48:52 +00:00
Whitney Tsang 7d8f30e6b2 Keep the order of the basic blocks in the cloned loop as the original
loop
Summary:
Do the cloning in two steps, first allocate all the new loops, then
clone the basic blocks in the same order as the original loop.
Reviewer: Meinersbur, fhahn, kbarton, hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, hiraditya, llvm-commits
Tag: https://reviews.llvm.org/D64224
Differential Revision:

llvm-svn: 365366
2019-07-08 18:30:35 +00:00
Brian Homerding 498687bff2 Add, and infer, a nofree function attribute
Removing dead code leftover from refactor.

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D49165

llvm-svn: 365345
2019-07-08 16:33:32 +00:00
Sanjay Patel 0b59103a73 [InstCombine] canonicalize insert+splat to/from element 0 of vector
We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue()
and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts
to that form for better matching.

The backend generically turns these patterns into build_vector,
so there should be no codegen difference.

llvm-svn: 365342
2019-07-08 16:26:48 +00:00
Brian Homerding b4b21d807e Add, and infer, a nofree function attribute
This patch adds a function attribute, nofree, to indicate that a function does
not, directly or indirectly, call a memory-deallocation function (e.g., free,
C++'s operator delete).

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D49165

llvm-svn: 365336
2019-07-08 15:57:56 +00:00
Cameron McInally 771769be90 [Float2Int] Add support for unary FNeg to Float2Int
Differential Revision: https://reviews.llvm.org/D63941

llvm-svn: 365324
2019-07-08 14:46:07 +00:00
Philip Reames 9e62c86408 [IRBuilder] Introduce helpers for and/or of multiple values at once
We had versions of this code scattered around, so consolidate into one location.

Not strictly NFC since the order of intermediate results may change in some places, but since these operations are associatives, should not change results.

llvm-svn: 365259
2019-07-06 03:46:18 +00:00
Eugene Leviant 3aef35288b [ThinLTO] Attempt to recommit r365188 after alignment fix
llvm-svn: 365215
2019-07-05 15:25:05 +00:00
Eugene Leviant e91f86f0ac Reverted r365188 due to alignment problems on i686-android
llvm-svn: 365206
2019-07-05 13:26:05 +00:00
Eugene Leviant 820cc01d1e [ThinLTO] Attempt to recommit r365040 after caching fix
It's possible that some function can load and store the same
variable using the same constant expression:

store %Derived* @foo, %Derived** bitcast (%Base** @bar to %Derived**)
%42 = load %Derived*, %Derived** bitcast (%Base** @bar to %Derived**)

The bitcast expression was mistakenly cached while processing loads,
and never examined later when processing store. This caused @bar to
be mistakenly treated as read-only variable. See load-store-caching.ll.

llvm-svn: 365188
2019-07-05 12:00:10 +00:00
Sanjay Patel 75b5edf6a1 [InstCombine] allow undef elements when forming splat from chain of insertelements
We allow forming a splat (broadcast) shuffle, but we were conservatively limiting
that to cases where all elements of the vector are specified. It should be safe
from a codegen perspective to allow undefined lanes of the vector because the
expansion of a splat shuffle would become the chain of inserts again.

Forming splat shuffles can reduce IR and help enable further IR transforms.
Motivating bugs:
https://bugs.llvm.org/show_bug.cgi?id=42174
https://bugs.llvm.org/show_bug.cgi?id=16739

Differential Revision: https://reviews.llvm.org/D63848

llvm-svn: 365147
2019-07-04 16:45:34 +00:00
Serguei Katkov 6d8813a391 [LoopPeel] Some small comment update. NFC.
Follow-up change of comment after
https://reviews.llvm.org/D63917 is landed.

llvm-svn: 365107
2019-07-04 05:10:14 +00:00
Chen Zheng 469f30abab [PowerPC] Hardware Loop branch instruction's condition may not be icmp.
This fixes pr42492.
Differential Revision: https://reviews.llvm.org/D64124

llvm-svn: 365104
2019-07-04 01:51:47 +00:00
Reid Kleckner f7e52fbdb5 Revert [ThinLTO] Optimize writeonly globals out
This reverts r365040 (git commit 5cacb91475)

Speculatively reverting, since this appears to have broken check-lld on
Linux. Partial analysis in https://crbug.com/981168.

llvm-svn: 365097
2019-07-04 00:03:30 +00:00
Eli Friedman 41ee3977c4 [JumpThreading] Fix threading with unusual PHI nodes.
If the block being cloned contains a PHI node, in general, we need to
clone that PHI node, even though it's trivial. If the operand of the PHI
is an instruction in the block being cloned, the correct value for the
operand doesn't exist until SSAUpdater constructs it.

We usually don't hit this issue because we try to avoid threading across
loop headers, but it's possible to hit this in some cases involving
irreducible CFGs.  I added a flag to allow threading across loop headers
to make the testcase easier to understand.

Thanks to Brian Rzycki for reducing the testcase.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42085.

Differential Revision: https://reviews.llvm.org/D63913

llvm-svn: 365094
2019-07-03 23:12:39 +00:00
Philip Reames ea06d63c35 [LFTR] Use SCEVExpander for the pointer limit case instead of manual IR gen
As noted in the test change, this is not trivially NFC, but all of the changes in output are cases where the SCEVExpander form is more canonical/optimal than the hand generation.  

llvm-svn: 365075
2019-07-03 20:03:46 +00:00
Philip Reames 14f1543425 [LFTR] Remove a stray variable shadow *of the same value* [NFC]
llvm-svn: 365072
2019-07-03 19:08:43 +00:00
Philip Reames e7a258c6d9 [LFTR] Style and comment changes to clarify the narrow vs wide bitwidth evaluation behavior [NFC]
llvm-svn: 365071
2019-07-03 19:03:37 +00:00
Philip Reames abc8f344d6 [LFTR] Sink the decision not use truncate scheme for constants into genLoopLimit [NFC]
We might as well just evaluate the constants using SCEV, and having the cases grouped makes the logic slightly easier to read anyway.

llvm-svn: 365070
2019-07-03 18:41:03 +00:00
Philip Reames 4c80281c96 [LFTR] Remove falsely generalized (dead) code [NFC]
llvm-svn: 365067
2019-07-03 18:24:06 +00:00
Philip Reames 83cca94194 [LFTR] Hoist extend expressions outside of loops w/o waiting for LICM
The motivation for this is two fold:
1) Make the output (and thus tests)  a bit more readable to a human trying to understand the result of the transform
2) Reduce spurious diffs in a potential future change to restructure all of this logic to use SCEVExpander (which hoists by default)

llvm-svn: 365066
2019-07-03 18:18:36 +00:00
Eugene Leviant 5cacb91475 [ThinLTO] Optimize writeonly globals out
Differential revision: https://reviews.llvm.org/D63444

llvm-svn: 365040
2019-07-03 14:14:52 +00:00
Roman Lebedev 9f0c83902d [InstCombine] Y - ~X --> X + Y + 1 fold (PR42457)
Summary:
I *think* we'd want this new variant, because we obviously
have better handling for `add` as compared to `sub`/`not`.

https://rise4fun.com/Alive/WMn

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]

Reviewers: spatel, nikic, huihuiz, efriedma

Reviewed By: spatel

Subscribers: RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63992

llvm-svn: 365011
2019-07-03 09:41:50 +00:00
Alexander Potapenko f82672873a MSan: handle callbr instructions
Summary:
Handling callbr is very similar to handling an inline assembly call:
MSan must checks the instruction's inputs.
callbr doesn't (yet) have outputs, so there's nothing to unpoison,
and conservative assembly handling doesn't apply either.

Fixes PR42479.

Reviewers: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64072

llvm-svn: 365008
2019-07-03 09:28:50 +00:00
Serguei Katkov c22e772a28 [LoopPeel] Re-factor llvm::peelLoop method. NFC.
Extract code dealing with branch weights in separate functions.

Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames, fhahn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D63917

llvm-svn: 365002
2019-07-03 05:59:23 +00:00
Chen Zheng dfdccbb26b [PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop.
Differential Revision: https://reviews.llvm.org/D63477

llvm-svn: 364993
2019-07-03 01:49:03 +00:00
David Bolvansky 10ee3ac396 [NFC] Strenghten isInteger condition for rL364940
llvm-svn: 364969
2019-07-02 21:16:34 +00:00
Vasileios Porpodas cf47ff5ffb [SLP] Recommit: Look-ahead operand reordering heuristic.
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk

Reviewed By: RKSimon, dtemirbulatov

Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60897

llvm-svn: 364964
2019-07-02 20:20:28 +00:00
Teresa Johnson a700436323 [ThinLTO] Add summary entries for index-based WPD
Summary:
If LTOUnit splitting is disabled, the module summary analysis computes
the summary information necessary to perform single implementation
devirtualization during the thin link with the index and no IR. The
information collected from the regular LTO IR in the current hybrid WPD
algorithm is summarized, including:
1) For vtable definitions, record the function pointers and their offset
within the vtable initializer (subsumes the information collected from
IR by tryFindVirtualCallTargets).
2) A record for each type metadata summarizing the vtable definitions
decorated with that metadata (subsumes the TypeIdentiferMap collected
from IR).

Also added are the necessary bitcode records, and the corresponding
assembly support.

The follow-on index-based WPD patch is D55153.

Depends on D53890.

Reviewers: pcc

Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54815

llvm-svn: 364960
2019-07-02 19:38:02 +00:00
David Bolvansky cb1a5a705c [SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n)
Summary:
Partially solves https://bugs.llvm.org/show_bug.cgi?id=42190



Reviewers: spatel, nikic, efriedma

Reviewed By: efriedma

Subscribers: efriedma, nikic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63038

llvm-svn: 364940
2019-07-02 15:58:45 +00:00
Serge Guelton 4137aeb4bf Provide basic Full LTO extension points
Differential Revision: https://reviews.llvm.org/D61738

llvm-svn: 364937
2019-07-02 15:52:39 +00:00
Roman Lebedev 0bde7c6527 [InstCombine] Shift amount reassociation: fixup constantexpr handling (PR42484)
I was actually wondering if there was some nicer way than m_Value()+cast,
but apparently what i was really "subconsciously" thinking about
was correctness issue.

hasNoUnsignedWrap()/hasNoUnsignedWrap() exist for Instruction,
not for BinaryOperator, so let's just use m_Instruction(),
thus both avoiding a cast, and a crash.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42484,
      https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587

llvm-svn: 364915
2019-07-02 12:54:48 +00:00
Reid Kleckner d72163947a [PGO] Update ICP pass for recent byval type changes
Fixes verifier errors encountered in PR42413.

Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D63842

llvm-svn: 364861
2019-07-01 22:43:39 +00:00
Sanjay Patel ddc1b40f26 [InstCombine] reduce more checks for power-of-2-or-zero using ctpop
Extends the transform from:
rL364341
...to include another (more common?) pattern that tests whether a
value is a power-of-2 (including or excluding zero).

llvm-svn: 364856
2019-07-01 22:00:00 +00:00
Jordan Rupprecht a7972dc04a Revert [SLP] Look-ahead operand reordering heuristic.
This reverts r364478 (git commit 574cb0eb3a)

The patch is causing compilation timeouts.

llvm-svn: 364846
2019-07-01 21:10:43 +00:00
Roman Lebedev 04d3d3bbff [InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)
Summary:
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

This does address the regression being exposed in D63992.

https://godbolt.org/z/7DGltU
https://rise4fun.com/Alive/EPO0

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]

Reviewers: spatel, nikic, huihuiz

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63993

llvm-svn: 364792
2019-07-01 15:55:24 +00:00
Roman Lebedev 72b8d41ce8 [InstCombine] Shift amount reassociation in bittest (PR42399)
Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0`  iff `(Q+K) u< bitwidth(x)`

It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but i'm not sure what rules should there be
to avoid endless combine loops.

We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63829

llvm-svn: 364791
2019-07-01 15:55:15 +00:00
Roman Lebedev f55818e3a7 [InstCombine] Omit 'urem' where possible
This was added in D63390 / rL364286 to backend,
but it makes sense to also handle it in middle-end.
https://rise4fun.com/Alive/Zsln

llvm-svn: 364738
2019-07-01 09:41:43 +00:00
Yevgeny Rouban d4097b4a93 [SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst
Differential Revision: https://reviews.llvm.org/D60606

llvm-svn: 364734
2019-07-01 08:43:53 +00:00
Sanjay Patel 706b48251f [InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics
This is the opposite direction of D62158 (we have to choose 1 form or the other).
Now that we have FMF on the select, this becomes more palatable. And the benefits
of having a single IR instruction for this operation (less chances of missing folds
based on extra uses, etc) overcome my previous comments about the potential advantage
of larger pattern matching/analysis.

Differential Revision: https://reviews.llvm.org/D62414

llvm-svn: 364721
2019-06-30 13:40:31 +00:00
Fangrui Song 78ee2fbf98 Cleanup: llvm::bsearch -> llvm::partition_point after r364719
llvm-svn: 364720
2019-06-30 11:19:56 +00:00
Nikita Popov 8023c84433 [LFTR] Rephrase getLoopTest into "based-on" check; NFCI
What we want to know here is whether we're already using this value
for the loop condition, so make the query about that. We can extend
this to a more general "based-on" relationship, rather than a direct
icmp use later.

llvm-svn: 364715
2019-06-29 15:12:59 +00:00
Sanjay Patel 77dc1e8568 [InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum
This transform came up in D62414, but we should deal with it first.
We have LLVM intrinsics that correspond exactly to libm calls (unlike
most libm calls, these libm calls never set errno).
This holds without any fast-math-flags, so we should always canonicalize
to those intrinsics directly for better optimization.
Currently, we convert to fcmp+select only when we have FMF (nnan) because
fcmp+select does not preserve the semantics of the call in the general case.

Differential Revision: https://reviews.llvm.org/D63214

llvm-svn: 364714
2019-06-29 14:28:54 +00:00
Nikita Popov 61a8b62b4c [LFTR] Remove unnecessary latch check; NFCI
The whole indvars pass works on loops in simplified form, so there
is always a unique latch. Convert the condition into an assertion
in needsLFTR (though we also assert this in later LFTR functions).

Additionally update the comment on getLoopTest() now that we are
dealing with multiple exits.

llvm-svn: 364713
2019-06-29 12:41:02 +00:00
Roman Lebedev e3a94ba4a9 [InstCombine] Shift amount reassociation (PR42391)
Summary:
Given pattern:
`(x shiftopcode Q) shiftopcode K`
we should rewrite it as
`x shiftopcode (Q+K)`  iff `(Q+K) u< bitwidth(x)`
This is valid for any shift, but they must be identical.

* https://rise4fun.com/Alive/9E2
* exact on both lshr => exact https://rise4fun.com/Alive/plHk
* exact on both ashr => exact https://rise4fun.com/Alive/QDAA
* nuw on both shl => nuw https://rise4fun.com/Alive/5Uk
* nsw on both shl => nsw https://rise4fun.com/Alive/0plg

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42391 | PR42391]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: nikic

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63812

llvm-svn: 364712
2019-06-29 11:51:50 +00:00
Nikita Popov 2d756c4feb [LFTR] Fix post-inc pointer IV with truncated exit count (PR41998)
Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we
have a truncated exit count we'll truncate the IV when comparing
against the limit, in which case exit count overflow in post-inc
form doesn't matter. However, for pointer IVs we don't do that, so
we have to be careful about incrementing the IV in the wide type.

I'm fixing this by removing the IVCount variable (which was
ExitCount or ExitCount+1) and replacing it with a UsePostInc flag,
and then moving the actual limit adjustment to the individual cases
(which are: pointer IV where we add to the wide type, integer IV
where we add to the narrow type, and constant integer IV where we
add to the wide type).

Differential Revision: https://reviews.llvm.org/D63686

llvm-svn: 364709
2019-06-29 09:24:12 +00:00
Philip Reames 1504b6ee7e [IndVars] Remove a bit of manual constant folding [NFC]
SCEV is more than capable of folding (add x, trunc(0)) to x.  

llvm-svn: 364693
2019-06-29 00:19:31 +00:00
Cameron McInally 30e5cf1d8f [NewGVN] Add unary FNeg support to NewGVN pass
Differential Revision: https://reviews.llvm.org/D63933

llvm-svn: 364680
2019-06-28 20:09:32 +00:00
Cameron McInally ab4b2364e5 [GVNSink] Add unary FNeg support to GVNSink pass
Differential Revision: https://reviews.llvm.org/D63900

llvm-svn: 364678
2019-06-28 19:57:31 +00:00
Peter Collingbourne 7108df964a hwasan: Remove the old frame descriptor mechanism.
Differential Revision: https://reviews.llvm.org/D63470

llvm-svn: 364665
2019-06-28 17:53:26 +00:00
Peter Collingbourne 5378afc02a hwasan: Use llvm.read_register intrinsic to read the PC on aarch64 instead of taking the function's address.
This shaves an instruction (and a GOT entry in PIC code) off prologues of
functions with stack variables.

Differential Revision: https://reviews.llvm.org/D63472

llvm-svn: 364608
2019-06-27 23:24:07 +00:00
Cameron McInally 6e62a796d5 [GVN] Add support for unary FNeg to GVN pass
Differential Revision: https://reviews.llvm.org/D63896

llvm-svn: 364592
2019-06-27 21:05:02 +00:00
Johannes Doerfert 3b77583e95 [Attr] Add "willreturn" function attribute
This patch introduces a new function attribute, willreturn, to indicate
that a call of this function will either exhibit undefined behavior or
comes back and continues execution at a point in the existing call stack
that includes the current invocation.

This attribute guarantees that the function does not have any endless
loops, endless recursion, or terminating functions like abort or exit.

Patch by Hideto Ueno (@uenoku)

Reviewers: jdoerfert

Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62801

llvm-svn: 364555
2019-06-27 15:51:40 +00:00
Tim Northover 22c96a966b IR: compare type attributes deeply when looking into functions.
FunctionComparator attempts to produce a stable comparison of two Function
instances by looking at all available properties. Since ByVal attributes now
contain a Type pointer, they are not trivially ordered and FunctionComparator
should use its own Type comparison logic to sort them.

llvm-svn: 364523
2019-06-27 11:44:45 +00:00
Stefan Stipanovic 5360589b7d [Attributor] Deducing existing nounwind attribute.
Adding nounwind deduction in new attributor framework.

Reviewers: jdoerfert, uenoku

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D63379

llvm-svn: 364521
2019-06-27 11:27:54 +00:00
Gerolf Hoflehner e311a4d5c4 [SCCP] Fix non-deterministic uselists of return values (DenseMap -> MapVector)
llvm-svn: 364482
2019-06-26 21:44:37 +00:00
Vasileios Porpodas 574cb0eb3a [SLP] Look-ahead operand reordering heuristic.
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk

Reviewed By: RKSimon, dtemirbulatov

Subscribers: rnk, rcorcs, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60897

llvm-svn: 364478
2019-06-26 21:25:24 +00:00
Guanzhong Chen 9aad997a5a [WebAssembly] Implement Address Sanitizer for Emscripten
Summary:
This diff enables address sanitizer on Emscripten.

On Emscripten, real memory starts at the value passed to --global-base.

All memory before this is used as shadow memory, and thus the shadow mapping
function is simply dividing by 8.

Reviewers: tlively, aheejin, sbc100

Reviewed By: sbc100

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D63742

llvm-svn: 364468
2019-06-26 20:16:14 +00:00
Philip Reames 03b2e2d986 [IndVars] Kill a redundant bit of debug output
llvm-svn: 364449
2019-06-26 17:19:09 +00:00
Sanjay Patel 71ad22707c [InstCombine] simplify code for inserts -> splat; NFC
llvm-svn: 364441
2019-06-26 15:52:59 +00:00
Clement Courbet 2851248fa1 Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
Breaks sanitizers:
    libFuzzer :: cxxstring.test
    libFuzzer :: memcmp.test
    libFuzzer :: recommended-dictionary.test
    libFuzzer :: strcmp.test
    libFuzzer :: value-profile-mem.test
    libFuzzer :: value-profile-strcmp.test

llvm-svn: 364416
2019-06-26 12:13:13 +00:00
Clement Courbet 7b3a5f0e6d [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.
This allows later passes (in particular InstCombine) to optimize more
cases.

One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0.

llvm-svn: 364412
2019-06-26 11:50:18 +00:00
Florian Hahn 4c11b5268c [LoopUnroll] Add support for loops with exiting headers and uncond latches.
This patch generalizes the UnrollLoop utility to support loops that exit
from the header instead of the latch. Usually, LoopRotate would take care
of must of those cases, but in some cases (e.g. -Oz), LoopRotate does
not kick in.

Codesize impact looks relatively neutral on ARM64 with -Oz + LTO.

Program                                         master     patch     diff
 External/S.../CFP2006/447.dealII/447.dealII   629060.00  627676.00  -0.2%
 External/SPEC/CINT2000/176.gcc/176.gcc        1245916.00 1244932.00 -0.1%
 MultiSourc...Prolangs-C/simulator/simulator   86100.00   86156.00    0.1%
 MultiSourc...arks/Rodinia/backprop/backprop   66212.00   66252.00    0.1%
 MultiSourc...chmarks/Prolangs-C++/life/life   67276.00   67312.00    0.1%
 MultiSourc...s/Prolangs-C/compiler/compiler   69824.00   69788.00   -0.1%
 MultiSourc...Prolangs-C/assembler/assembler   86672.00   86696.00    0.0%

Reviewers: efriedma, vsk, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D61962

llvm-svn: 364398
2019-06-26 09:16:57 +00:00
Huihui Zhang b90cb57b63 [InstCombine] Simplify icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0.
Simplify 'shl' inequality test into 'and' equality test.

This pattern happens in the middle-end while simplifying bitfield access,
Exposed in https://reviews.llvm.org/D63505

https://rise4fun.com/Alive/6uz

Reviewers: lebedev.ri, efriedma

Reviewed By: lebedev.ri

Subscribers: spatel, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63675

llvm-svn: 364348
2019-06-25 20:44:52 +00:00
Philip Reames c42a357178 [LFTR] Adjust debug output to include extensions (if any)
llvm-svn: 364346
2019-06-25 20:14:08 +00:00
Sanjay Patel fcfa056ceb [InstCombine] reduce checks for power-of-2-or-zero using ctpop
This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero.

This is matching the isPowerOf2() patterns used in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

But there's at least 1 instcombine follow-up needed to match the alternate form:

(v & (v - 1)) == 0;

We should have all of the backend expansions handled with:
rL364319
(x86-specific changes still needed for optimal code based on subtarget)

And the larger patterns to exclude zero as a power-of-2 are joining with this change after:
rL364153 ( D63660 )
rL364246

Differential Revision: https://reviews.llvm.org/D63777

llvm-svn: 364341
2019-06-25 18:51:44 +00:00
Whitney Tsang 7c1deeff4a Expand cloneLoopWithPreheader() to support cloning loop nest
Summary: cloneLoopWithPreheader() currently only support innermost loop,
and assert otherwise.
Reviewers: Meinersbur, fhahn, kbarton
Reviewed By: Meinersbur
Subscribers: hiraditya, jsji, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63446

llvm-svn: 364310
2019-06-25 13:23:13 +00:00
Clement Courbet 3bc5ad551a [ExpandMemCmp] Move all options to TargetTransformInfo.
Split off from D60318.

llvm-svn: 364281
2019-06-25 08:04:13 +00:00
Huihui Zhang 4626613ffe [InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier.
Summary:
To generate simplified IR, make sure fold
  (X & ~C) ==/!= 0 --> X u</u>= C+1

is scheduled before fold
  ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.

https://rise4fun.com/Alive/7ZN

Reviewers: lebedev.ri, efriedma, spatel, craig.topper

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63505

llvm-svn: 364255
2019-06-25 00:09:10 +00:00
Sanjay Patel 2675b0c8ab [InstCombine] squash is-not-power-of-2 using ctpop
This is the Demorgan'd 'not' of the pattern handled in:
D63660 / rL364153

This is another intermediate IR step towards solving PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

We can test if a value is not a power-of-2 using ctpop(X) > 1,
so combining that with an is-zero check of the input is the
same as testing if not exactly 1 bit is set:

(X == 0) || (ctpop(X) u> 1) --> ctpop(X) != 1

llvm-svn: 364246
2019-06-24 22:35:26 +00:00
Vasileios Porpodas 3081f78776 [SLP] NFC: Fixed typo in comment
llvm-svn: 364237
2019-06-24 21:40:48 +00:00
Matt Arsenault 8025842599 InstCombine: Preserve nuw when reassociating nuw ops [3/3]
Alive says this is OK.

llvm-svn: 364235
2019-06-24 21:37:03 +00:00
Matt Arsenault 5d82ecd5d9 InstCombine: Preserve nuw when reassociating nuw ops [2/3]
Alive says this is OK.

llvm-svn: 364234
2019-06-24 21:37:02 +00:00
Matt Arsenault 5a89ba7343 InstCombine: Preserve nuw when reassociating nuw ops [1/3]
Alive says this is OK.

llvm-svn: 364233
2019-06-24 21:36:59 +00:00
Nikita Popov f1ffc4305d [CVP] Reenable nowrap flag inference
Inference of nowrap flags in CVP has been disabled, because it
triggered a bug in LFTR (https://bugs.llvm.org/show_bug.cgi?id=31181).
This issue has been fixed in D60935, so we should be able to reenable
nowrap flag inference now.

Differential Revision: https://reviews.llvm.org/D62776

llvm-svn: 364228
2019-06-24 20:13:13 +00:00
Cameron McInally fe3f15cf90 [SLP] Support unary FNeg vectorization
Differential Revision: https://reviews.llvm.org/D63609

llvm-svn: 364219
2019-06-24 19:24:23 +00:00
Sanjay Patel 89efefb170 [InstCombine] reduce funnel-shift i16 X, X, 8 to bswap X
Prefer the more exact intrinsic to remove a use of the input value
and possibly make further transforms easier (we will still need
to match patterns with funnel-shift of wider types as pieces of
bswap, especially if we want to canonicalize to funnel-shift with
constant shift amount). Discussed in D46760.

llvm-svn: 364187
2019-06-24 15:20:49 +00:00
Simon Pilgrim b617b0808d [InstCombine] SliceUpIllegalIntegerPHI - bail on out of range shifts
trunc(lshr) handling - if the shift is out of range (undefined) then bail like we do for non-constant shifts.

Fixes OSS Fuzz #15217

llvm-svn: 364181
2019-06-24 13:13:36 +00:00
Sanjoy Das e2291f5af9 Fix typo in comment; NFC
llvm-svn: 364159
2019-06-23 19:22:13 +00:00
Philip Reames d22a2a9a72 [IndVars] Remove dead instructions after folding trivial loop exit
In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration).  This extends that to eliminate the dead comparison left around.  

llvm-svn: 364155
2019-06-23 17:06:57 +00:00
Sanjay Patel 13a5ae58fc [InstCombine] squash is-power-of-2 that uses ctpop
This is another intermediate IR step towards solving PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

We can test if a value is power-of-2-or-0 using ctpop(X) < 2,
so combining that with a non-zero check of the input is the
same as testing if exactly 1 bit is set:

(X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1

Differential Revision: https://reviews.llvm.org/D63660

llvm-svn: 364153
2019-06-23 14:22:37 +00:00
Philip Reames 8deb84c8ef Exploit a zero LoopExit count to eliminate loop exits
This turned out to be surprisingly effective. I was originally doing this just for completeness sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than it's isKnownPredicate reasoning.

Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here.

Differential Revision: https://reviews.llvm.org/D63618

llvm-svn: 364135
2019-06-22 17:54:25 +00:00
Nikita Popov 8c8e40f763 [NewGVN] Fix copy/paste mistake in cast
llvm-svn: 364130
2019-06-22 10:20:13 +00:00
Nikita Popov e96fda726e [NewGVN] Remove dead SwitchEdges variable; NFC
llvm-svn: 364129
2019-06-22 10:20:07 +00:00
Reid Kleckner 592a193285 Revert [SLP] Look-ahead operand reordering heuristic.
This reverts r364084 (git commit 5698921be2)

It caused crashes while compiling a file in Chrome. Reduction
forthcoming.

llvm-svn: 364111
2019-06-21 23:10:25 +00:00
Julian Lettner 19c4d660f4 [ASan] Use dynamic shadow on 32-bit iOS and simulators
The VM layout on iOS is not stable between releases. On 64-bit iOS and
its derivatives we use a dynamic shadow offset that enables ASan to
search for a valid location for the shadow heap on process launch rather
than hardcode it.

This commit extends that approach for 32-bit iOS plus derivatives and
their simulators.

rdar://50645192
rdar://51200372
rdar://51767702

Reviewed By: delcypher

Differential Revision: https://reviews.llvm.org/D63586

llvm-svn: 364105
2019-06-21 21:01:39 +00:00
Simon Pilgrim 5698921be2 [SLP] Look-ahead operand reordering heuristic.
This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D60897

llvm-svn: 364084
2019-06-21 17:57:01 +00:00
David Bolvansky dbcdad51ff [InstCombine] (1 << (C - x)) -> ((1 << C) >> x) if C is bitwidth - 1
Summary:
```
%a = sub i32 31, %x
%r = shl i32 1, %a
  =>
%d = shl i32 1, 31
%r = lshr i32 %d, %x

Done: 1
Optimization is correct!
```

https://rise4fun.com/Alive/btZm

Reviewers: spatel, lebedev.ri, nikic

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63652

llvm-svn: 364073
2019-06-21 16:25:32 +00:00
David Bolvansky 4b28478389 [InstCombine] cttz(abs(x)) -> cttz(x)
Summary: Signedness does not change number of trailing zeros.

Reviewers: spatel, lebedev.ri, nikic

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D63546

llvm-svn: 364064
2019-06-21 15:26:22 +00:00
Sanjay Patel ddb9093684 [GVNSink] prevent crashing on mismatched instructions (PR42346)
Patch based on suggestion by James Molloy (@jmolloy) in:
https://bugs.llvm.org/show_bug.cgi?id=42346

llvm-svn: 364062
2019-06-21 15:17:24 +00:00
Jay Foad d9d3c91b48 [Scalarizer] Propagate IR flags
Summary:
The motivation for this was to propagate fast-math flags like nnan and
ninf on vector floating point operations to the corresponding scalar
operations to take advantage of follow-on optimizations. But I think
the same argument applies to all of our IR flags: if they apply to the
vector operation then they also apply to all the individual scalar
operations, and they might enable follow-on optimizations.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63593

llvm-svn: 364051
2019-06-21 14:10:18 +00:00
Fangrui Song dc8de6037c Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC
llvm-svn: 364006
2019-06-21 05:40:31 +00:00
Cameron McInally 1c0bd6dd2c [Reassociate] Remove bogus assert reported in PR42349.
Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete.

The bogus assert with introduced in D63445.

llvm-svn: 363998
2019-06-20 23:03:55 +00:00
Rainer Orth 6fde832b82 [profile] Solaris ld supports __start___llvm_prof_data etc. labels
Currently, many profiling tests on Solaris FAIL like

  Command Output (stderr):
  --
  Undefined                       first referenced
   symbol                             in file
  __llvm_profile_register_names_function /tmp/lit_tmp_Nqu4eh/infinite_loop-9dc638.o
  __llvm_profile_register_function    /tmp/lit_tmp_Nqu4eh/infinite_loop-9dc638.o

Solaris 11.4 ld supports the non-standard GNU ld extension of adding
__start_SECNAME and __stop_SECNAME labels to sections whose names are valid
as C identifiers.  Given that we already use Solaris 11.4-only features
like ld -z gnu-version-script-compat and fully working .preinit_array
support in compiler-rt, we don't need to worry about older versions of
Solaris ld.

The patch documents that support (although the comment in
lib/Transforms/Instrumentation/InstrProfiling.cpp
(needsRuntimeRegistrationOfSectionRange) is quite cryptic what it's
actually about), and adapts the affected testcase not to expect the
alternativeq __llvm_profile_register_functions and __llvm_profile_init.
It fixes all affected tests.

Tested on amd64-pc-solaris2.11.

Differential Revision: https://reviews.llvm.org/D41111

llvm-svn: 363984
2019-06-20 21:27:06 +00:00
Alina Sbirlea d0b11698cd [LICM & MSSA] Limit unsafe sinking and hoisting.
Summary:
The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being
clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch).
If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink.
Given:
```
for i
  load a[i]
  store a[i]
```
The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it.
In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either
Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs.

Caught by PR42294.

This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict
hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63582

llvm-svn: 363982
2019-06-20 21:09:09 +00:00
Sanjay Patel 273d97e6bf [InstCombine] fix typo in comment; NFC
llvm-svn: 363974
2019-06-20 20:23:32 +00:00
Philip Reames a7fd8a806f [LFTR] Fix a (latent?) bug related to nested loops
I can't actually come up with a test case this triggers on without an out of tree change, but in theory, it's a bug in the recently added multiple exit LFTR support.  The root issue is that an exiting block common to two loops can (in theory) have computable exit counts for both loops.  Rewriting the exit of an inner loop in terms of the outer loops IV would cause the inner loop to either a) run forever, or b) terminate on the first iteration.

In practice, we appear to get lucky and not have the exit count computable for the outer loop, except when it's trivially zero.  Given we bail on zero exit counts, we don't appear to ever trigger this.  But I can't come up with a reason we *can't* compute an exit count for the outer loop on the common exiting block, so this may very well be triggering in some cases.

llvm-svn: 363964
2019-06-20 18:45:06 +00:00
Sanjay Patel 63311bfb83 [InstCombine] canonicalize check for power-of-2
The form that compares against 0 is better because:
1. It removes a use of the input value.
2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS).

This is a root cause for PR42314, but probably doesn't completely answer the codegen request:
https://bugs.llvm.org/show_bug.cgi?id=42314

Alive proof:
https://rise4fun.com/Alive/9kG

  Name: is power-of-2
  %neg = sub i32 0, %x
  %a = and i32 %neg, %x
  %r = icmp eq i32 %a, %x
  =>
  %dec = add i32 %x, -1
  %a2 = and i32 %dec, %x
  %r = icmp eq i32 %a2, 0

  Name: is not power-of-2
  %neg = sub i32 0, %x
  %a = and i32 %neg, %x
  %r = icmp ne i32 %a, %x
  =>
  %dec = add i32 %x, -1
  %a2 = and i32 %dec, %x
  %r = icmp ne i32 %a2, 0

llvm-svn: 363956
2019-06-20 17:41:15 +00:00
David Bolvansky 01511192b2 [InstCombine] cttz(-x) -> cttz(x)
Summary: Signedness does not change number of trailing zeros.

Reviewers: spatel, lebedev.ri, nikic

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63534

llvm-svn: 363951
2019-06-20 17:04:14 +00:00
Philip Reames eda1ba65ca LFTR for multiple exit loops
Teach IndVarSimply's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop.

This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.)

I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred.

I do want to note that I added one bit after the review.  When running tests, I saw a new failure (no idea why didn't see previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one.  This was theoretically possible with a single exit, but the zero case covered it in practice.  With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute a ExitCount for one of the exits which was guaranteed to never actually be reached.  Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplication work we'd just done.  The solution seemed obvious, so I didn't bother with another round of review.

Differential Revision: https://reviews.llvm.org/D62625

llvm-svn: 363883
2019-06-19 21:58:25 +00:00
Philip Reames ce53e2226c [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]
(Resumbit of r363292 which was reverted along w/an earlier patch)

llvm-svn: 363877
2019-06-19 20:45:57 +00:00
Philip Reames f8104f01e6 [LFTR] Rename variable to minimize confusion [NFC]
(Recommit of r363293 which was reverted when a dependent patch was.)

As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop. A conditional backedge taken count - one which only applies if a particular exit is taken - is called a ExitCount in SCEV code, so be consistent here.

llvm-svn: 363875
2019-06-19 20:41:28 +00:00
Huihui Zhang 670778c762 [InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier
Summary:
To generate simplified IR, make sure fold
```
  (X & signbit) ==/!= 0) -> X s>=/s< 0;
```
is scheduled before fold
```
  ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.
```

https://rise4fun.com/Alive/fbdh

Reviewers: lebedev.ri, efriedma, spatel, craig.topper

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63026

llvm-svn: 363845
2019-06-19 17:31:39 +00:00
Cameron McInally 7aa898e61e [DFSan] Add UnaryOperator visitor to DataFlowSanitizer
Differential Revision: https://reviews.llvm.org/D62815

llvm-svn: 363814
2019-06-19 15:11:41 +00:00
Cameron McInally a027cf4764 [Reassociate] Handle unary FNeg in the Reassociate pass
Differential Revision: https://reviews.llvm.org/D63445

llvm-svn: 363813
2019-06-19 14:59:14 +00:00
Orlando Cazalet-Hyams 1251cac62a [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

> llvm-svn: 363046

llvm-svn: 363786
2019-06-19 10:50:47 +00:00
Michael Liao 4f7f70e262 Recommit [SROA] Enhance SROA to handle `addrspacecast`ed allocas
[SROA] Enhance SROA to handle `addrspacecast`ed allocas

- Fix typo in original change
- Add additional handling to ensure all return pointers are properly
  casted.

Summary:
- After `addrspacecast` is allowed to be eliminated in SROA, the
  adjusting of storage pointer (from `alloca) needs to handle the
  potential different address spaces between the storage pointer (from
  alloca) and the pointer being used.

Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63501

llvm-svn: 363743
2019-06-18 21:41:13 +00:00
Gor Nishanov 3fcad775c0 [coroutines] Add missing pass dependency.
Summary:
CoroSplit depends on CallGraphWrapperPass, but it was not explicitly adding it as a pass dependency.

This missing dependency can trigger errors / assertions / crashes in PMTopLevelManager::schedulePass() under certain configurations.

Author: ben-clayton

Reviewers: GorNishanov

Reviewed By: GorNishanov

Subscribers: capn, EricWF, modocache, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63144

llvm-svn: 363727
2019-06-18 19:49:48 +00:00
Jordan Rupprecht 33e85ad956 Revert [SROA] Enhance SROA to handle `addrspacecast`ed allocas
This reverts r363711 (git commit 76a149ef81)

This causes stage2 build failures, e.g.:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/132/steps/stage%202%20build/logs/stdio
http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/87/steps/build-stage2-unified-tree/logs/stdio

llvm-svn: 363718
2019-06-18 18:40:04 +00:00
Michael Liao 76a149ef81 [SROA] Enhance SROA to handle `addrspacecast`ed allocas
Summary:
- After `addrspacecast` is allowed to be eliminated in SROA, the
  adjusting of storage pointer (from `alloca) needs to handle the
  potential different address spaces between the storage pointer (from
  alloca) and the pointer being used.

Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63501

llvm-svn: 363711
2019-06-18 17:58:49 +00:00
Yevgeny Rouban 69daf4a72d [SimplifyCFG] NFC, prof branch_weighs handling is simplified
Using the new SwitchInstProfUpdateWrapper this patch
simplifies 3 places of prof branch_weights handling.

Differential Revision: https://reviews.llvm.org/D62123

llvm-svn: 363652
2019-06-18 06:50:52 +00:00
Peter Collingbourne d57f7cc15e hwasan: Use bits [3..11) of the ring buffer entry address as the base stack tag.
This saves roughly 32 bytes of instructions per function with stack objects
and causes us to preserve enough information that we can recover the original
tags of all stack variables.

Now that stack tags are deterministic, we no longer need to pass
-hwasan-generate-tags-with-calls during check-hwasan. This also means that
the new stack tag generation mechanism is exercised by check-hwasan.

Differential Revision: https://reviews.llvm.org/D63360

llvm-svn: 363636
2019-06-17 23:39:51 +00:00
Peter Collingbourne fb9ce100d1 hwasan: Add a tag_offset DWARF attribute to instrumented stack variables.
The goal is to improve hwasan's error reporting for stack use-after-return by
recording enough information to allow the specific variable that was accessed
to be identified based on the pointer's tag. Currently we record the PC and
lower bits of SP for each stack frame we create (which will eventually be
enough to derive the base tag used by the stack frame) but that's not enough
to determine the specific tag for each variable, which is the stack frame's
base tag XOR a value (the "tag offset") that is unique for each variable in
a function.

In IR, the tag offset is most naturally represented as part of a location
expression on the llvm.dbg.declare instruction. However, the presence of the
tag offset in the variable's actual location expression is likely to confuse
debuggers which won't know about tag offsets, and moreover the tag offset
is not required for a debugger to determine the location of the variable on
the stack, so at the DWARF level it is represented as an attribute so that
it will be ignored by debuggers that don't know about it.

Differential Revision: https://reviews.llvm.org/D63119

llvm-svn: 363635
2019-06-17 23:39:41 +00:00
Philip Reames 44475363e8 Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges
This patch really contains two pieces:
    Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi.
    Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses.

Differential Revision: https://reviews.llvm.org/D63224

llvm-svn: 363619
2019-06-17 21:06:17 +00:00
Philip Reames fe8bd96ebd Fix a bug w/inbounds invalidation in LFTR (recommit)
Recommit r363289 with a bug fix for crash identified in pr42279.  Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case.  Test case previously committed in r363601.

Original commit comment follows:

This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363613
2019-06-17 20:32:22 +00:00
Joseph Tremoulet daa1ae6142 [EarlyCSE] Fix hashing of self-compares
Summary:
Update compare normalization in SimpleValue hashing to break ties (when
the same value is being compared to itself) by switching to the swapped
predicate if it has a lower numerical value.  This brings the hashing in
line with isEqual, which already recognizes the self-compares with
swapped predicates as equal.

Fixes PR 42280.

Reviewers: spatel, efriedma, nikic, fhahn, uabelho

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63349

llvm-svn: 363598
2019-06-17 19:11:28 +00:00
Matt Arsenault 6d741f29ec AMDGPU: Fold readlane/readfirstlane calls
llvm-svn: 363587
2019-06-17 17:52:35 +00:00
Warren Ristow 6452bdd29b [LV] Suppress vectorization in some nontemporal cases
When considering a loop containing nontemporal stores or loads for
vectorization, suppress the vectorization if the corresponding
vectorized store or load with the aligment of the original scaler
memory op is not supported with the nontemporal hint on the target.

This adds two new functions:
  bool isLegalNTStore(Type *DataType, unsigned Alignment) const;
  bool isLegalNTLoad(Type *DataType, unsigned Alignment) const;

to TTI, leaving the target independent default implementation as
returning true, but with overriding implementations for X86 that
check the legality based on available Subtarget features.

This fixes https://llvm.org/PR40759

Differential Revision: https://reviews.llvm.org/D61764

llvm-svn: 363581
2019-06-17 17:20:08 +00:00
Whitney Tsang 15b7f5b72d PHINode: introduce setIncomingValueForBlock() function, and use it.
Summary:
There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue()
but no function to replace incoming value for a specified BasicBlock*
predecessor.
Clearly, there are a lot of places that could use that functionality.

Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn
Reviewed By: Meinersbur, fhahn
Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63338

llvm-svn: 363566
2019-06-17 14:38:56 +00:00
Matt Arsenault 1df203d78e InferAddressSpaces: Fix cloning original addrspacecast
If an addrspacecast needed to be inserted again, this was creating a
clone of the original cast for each user. Just use the original, which
also saves losing the value name.

llvm-svn: 363562
2019-06-17 14:13:29 +00:00
Bjorn Pettersson 83773b77a5 [LV] Deny irregular types in interleavedAccessCanBeWidened
Summary:
Avoid that loop vectorizer creates loads/stores of vectors
with "irregular" types when interleaving. An example of
an irregular type is x86_fp80 that is 80 bits, but that
may have an allocation size that is 96 bits. So an array
of x86_fp80 is not bitcast compatible with a vector
of the same type.

Not sure if interleavedAccessCanBeWidened is the best
place for this check, but it solves the problem seen
in the added test case. And it is the same kind of check
that already exists in memoryInstructionCanBeWidened.

Reviewers: fhahn, Ayal, craig.topper

Reviewed By: fhahn

Subscribers: hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63386

llvm-svn: 363547
2019-06-17 12:02:24 +00:00
Hans Wennborg a9e5d2f35d Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
Third time's the charm.

This was reverted in r363220 due to being suspected of an internal benchmark
regression and a test failure, none of which turned out to be caused by this.

llvm-svn: 363529
2019-06-17 07:47:28 +00:00
Yevgeny Rouban ee62c40eae [SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases
SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata
if unreachable switch cases are removed. This patch fixes this bug by making use
of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122).
A new test is created.

Differential Revision: https://reviews.llvm.org/D62186

llvm-svn: 363527
2019-06-17 05:55:12 +00:00
Nikita Popov 9145562b48 [SimplifyIndVar] Simplify non-overflowing saturating add/sub
If we can detect that saturating math that depends on an IV cannot
overflow, replace it with simple math. This is similar to the CVP
optimization from D62703, just based on a different underlying
analysis (SCEV vs LVI) that catches different cases.

Differential Revision: https://reviews.llvm.org/D62792

llvm-svn: 363489
2019-06-15 08:48:52 +00:00
Akira Hatanaka a704a8f28c [ObjC][ARC] Delete ObjC runtime calls on global variables annotated
with 'objc_arc_inert'

Those calls are no-ops, so they can be safely deleted.

rdar://problem/49839633

Differential Revision: https://reviews.llvm.org/D62433

llvm-svn: 363468
2019-06-14 22:06:32 +00:00
Matt Arsenault 282dac717e SROA: Allow eliminating addrspacecasted allocas
There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.

This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.

For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.

llvm-svn: 363462
2019-06-14 21:38:31 +00:00
Florian Hahn dcdd12b68c Revert Fix a bug w/inbounds invalidation in LFTR
Reverting because it breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363289 (git commit eb88badff9)

llvm-svn: 363427
2019-06-14 17:23:09 +00:00
Florian Hahn a19809045c Revert [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]
Reverting because it depends on r363289, which breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363292 (git commit 42a3fc133d)

llvm-svn: 363426
2019-06-14 17:22:56 +00:00
Florian Hahn e1b4b1b46e Revert [LFTR] Rename variable to minimize confusion [NFC]
Reverting because it depends on r363289, which breaks a green dragon
build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363293 (git commit c37be29634)

llvm-svn: 363425
2019-06-14 17:22:49 +00:00
Shawn Landden f2e60fc4e8 [SimpligyCFG] NFC intended, remove GCD that was only used for powers of two
and replace with an equilivent countTrailingZeros.

GCD is much more expensive than this, with repeated division.

This depends on D60823

Differential Revision: https://reviews.llvm.org/D61151

llvm-svn: 363422
2019-06-14 16:56:49 +00:00
Johannes Doerfert 282d34ee78 [Attributor] Disable the Attributor by default and fix a comment
llvm-svn: 363408
2019-06-14 14:53:41 +00:00
Matt Arsenault 492d71cc99 AMDGPU: Fold readlane intrinsics of constants
I'm not 100% sure about this, since I'm worried about IR transforms
that might end up introducing divergence downstream once replaced with
a constant, but I haven't come up with an example yet.

llvm-svn: 363406
2019-06-14 14:51:26 +00:00
Stanislav Mekhanoshin 68a2fef9ae [AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32
Differential Revision: https://reviews.llvm.org/D63301

llvm-svn: 363339
2019-06-13 23:47:36 +00:00
Philip Reames c37be29634 [LFTR] Rename variable to minimize confusion [NFC]
As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop.  A conditional backedge taken count - one which only applies if a particular exit is taken - is called a ExitCount in SCEV code, so be consistent here.

llvm-svn: 363293
2019-06-13 18:40:15 +00:00
Philip Reames 42a3fc133d [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]
llvm-svn: 363292
2019-06-13 18:32:55 +00:00
Philip Reames eb88badff9 Fix a bug w/inbounds invalidation in LFTR
This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying.  As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363289
2019-06-13 18:23:13 +00:00
Leonard Chan 09f56b51ec [clang][NewPM] Fix broken -O0 test from missing assumptions
Add an AssumptionCache callback to the InlineFuntionInfo used for the
AlwaysInlinerPass to match codegen of the AlwaysInlinerLegacyPass to generate
llvm.assume. This fixes CodeGen/builtin-movdir.c when new PM is enabled by
default.

Differential Revision: https://reviews.llvm.org/D63170

llvm-svn: 363287
2019-06-13 18:18:40 +00:00
Joseph Tremoulet 3bc6e2a7aa [EarlyCSE] Ensure equal keys have the same hash value
Summary:
The logic in EarlyCSE that looks through 'not' operations in the
predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is
equivalent to `select (cmp sgt X, Y), Y, X`.  Without this change,
however, only the latter is recognized as a form of `smin X, Y`, so the
two expressions receive different hash codes.  This leads to missed
optimization opportunities when the quadratic probing for the two hashes
doesn't happen to collide, and assertion failures when probing doesn't
collide on insertion but does collide on a subsequent table grow
operation.

This change inverts the order of some of the pattern matching, checking
first for the optional `not` and then for the min/max/abs patterns, so
that e.g. both expressions above are recognized as a form of `smin X, Y`.

It also adds an assertion to isEqual verifying that it implies equal
hash codes; this fires when there's a collision during insertion, not
just grow, and so will make it easier to notice if these functions fall
out of sync again.  A new flag --earlycse-debug-hash is added which can
be used when changing the hash function; it forces hash collisions so
that any pair of values inserted which compare as equal but hash
differently will be caught by the isEqual assertion.

Reviewers: spatel, nikic

Reviewed By: spatel, nikic

Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62644

llvm-svn: 363274
2019-06-13 15:24:11 +00:00
Sander de Smalen 7957fc6547 [IntrinsicEmitter] Extend argument overloading with forward references.
Extend the mechanism to overload intrinsic arguments by using either
backward or forward references to the overloadable arguments.

In for example:

  def int_something : Intrinsic<[LLVMPointerToElt<0>],
                                [llvm_anyvector_ty], []>;

LLVMPointerToElt<0> is a forward reference to the overloadable operand
of type 'llvm_anyvector_ty' and would allow intrinsics such as:

  declare i32* @llvm.something.v4i32(<4 x i32>);
  declare i64* @llvm.something.v2i64(<2 x i64>);

where the result pointer type is deduced from the element type of the
first argument.

If the returned pointer is not a pointer to the element type, LLVM will
give an error:

  Intrinsic has incorrect return type!
  i64* (<4 x i32>)* @llvm.something.v4i32

Reviewers: RKSimon, arsenm, rnk, greened

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D62995

llvm-svn: 363233
2019-06-13 08:19:33 +00:00
Shawn Landden 8b142bcc3f [SimplifyCFG] reverting preliminary Switch patches again
This reverts 363226 and 363227, both NFC intended

I swear I fixed the test case that is failing, and ran
the tests, but I will look into it again.

llvm-svn: 363229
2019-06-13 05:26:17 +00:00
Shawn Landden 636220e83c [SimpligyCFG] NFC intended, remove GCD that was only used for powers of two
and replace with an equilivent countTrailingZeros.

GCD is much more expensive than this, with repeated division.

This depends on D60823

Differential Revision: https://reviews.llvm.org/D61151

llvm-svn: 363227
2019-06-13 05:01:44 +00:00
David L. Jones c73fadaa84 Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors ...'
We have observed some failures with internal builds with this revision.

- Performance regressions:
  - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings).
  - Benchmarks for Abseil's SwissTable.
- Correctness:
  - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP).

hwennborg has already been notified, and is aware of reproducer failures.

llvm-svn: 363220
2019-06-13 02:04:45 +00:00
Philip Reames ae2581cef3 [IndVars] Extend diagnostic -replexitval flag w/ability to bypass hard use hueristic
Note: This does mean that "always" is now more powerful than it was. 
llvm-svn: 363196
2019-06-12 19:52:05 +00:00
Matt Arsenault aa6bdf9dcd LoopVersioning: Respect convergent
This changes the standalone pass only. Arguably the utility class
itself should assert there are no convergent calls. However, a target
pass with additional context may still be able to version a loop if
all of the dynamic conditions are sufficiently uniform.

llvm-svn: 363165
2019-06-12 14:05:58 +00:00
Matt Arsenault 86325be3d7 LoopLoadElim: Respect convergent
llvm-svn: 363162
2019-06-12 13:50:47 +00:00
Matt Arsenault 2466ba97bc LoopDistribute/LAA: Respect convergent
This case is slightly tricky, because loop distribution should be
allowed in some cases, and not others. As long as runtime dependency
checks don't need to be introduced, this should be OK. This is further
complicated by the fact that LoopDistribute partially ignores if LAA
says that vectorization is safe, and then does its own runtime pointer
legality checks.

Note this pass still does not handle noduplicate correctly, as this
should always be forbidden with it. I'm not going to bother trying to
fix it, as it would require more effort and I think noduplicate should
be removed.

https://reviews.llvm.org/D62607

llvm-svn: 363160
2019-06-12 13:34:19 +00:00
Orlando Cazalet-Hyams a947156396 Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion"
This reverts commit 1a0f7a2077.
See phabricator thread for D60831.

llvm-svn: 363132
2019-06-12 08:34:51 +00:00
Philip Reames 082cd30327 Generalize icmp matching in IndVars' eliminateTrunc
We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases.

Differential Revision: https://reviews.llvm.org/D63112

llvm-svn: 363108
2019-06-11 22:43:25 +00:00
Alina Sbirlea 3cef1f7d64 Only passes that preserve MemorySSA must mark it as preserved.
Summary:
The method `getLoopPassPreservedAnalyses` should not mark MemorySSA as
preserved, because it's being called in a lot of passes that do not
preserve MemorySSA.
Instead, mark the MemorySSA analysis as preserved by each pass that does
preserve it.
These changes only affect the new pass mananger.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62536

llvm-svn: 363091
2019-06-11 18:27:49 +00:00
Cameron McInally 08200d6d26 [InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ
Differential Revision: https://reviews.llvm.org/D62612

llvm-svn: 363082
2019-06-11 16:21:21 +00:00
Cameron McInally 796de11331 [InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg
Differential Revision: https://reviews.llvm.org/D62629

llvm-svn: 363080
2019-06-11 15:45:41 +00:00
Orlando Cazalet-Hyams 1a0f7a2077 [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

llvm-svn: 363046
2019-06-11 10:37:20 +00:00
Sander de Smalen cbeb563cfb Change semantics of fadd/fmul vector reductions.
This patch changes how LLVM handles the accumulator/start value
in the reduction, by never ignoring it regardless of the presence of
fast-math flags on callsites. This change introduces the following
new intrinsics to replace the existing ones:

  llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
  llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul

and adds functionality to auto-upgrade existing LLVM IR and bitcode.

Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D60261

llvm-svn: 363035
2019-06-11 08:22:10 +00:00
Rong Xu 7ea131c20c [PGO] Fix the buildbot failure in r362995
Fixed one unused variable warning.

llvm-svn: 363004
2019-06-10 23:20:04 +00:00
Rong Xu e44fa83c37 [PGO] Handle cases of non-instrument BBs
As shown in PR41279, some basic blocks (such as catchswitch) cannot be
instrumented. This patch filters out these BBs in PGO instrumentation.
It also sets the profile count to the fail-to-instrument edge, so that we
can propagate the counts in the CFG.

Differential Revision: https://reviews.llvm.org/D62700

llvm-svn: 362995
2019-06-10 22:36:27 +00:00
Philip Reames a9633d5f0b [LFTR] Use recomputed BE count
This was discussed as part of D62880.  The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening.  Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion.  This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable.

For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

Note that before is an i32 type, and the after is an i64.  Truncating the i64 produces the i32. 

llvm-svn: 362975
2019-06-10 19:18:53 +00:00
Philip Reames 5d84ccb230 Prepare for multi-exit LFTR [NFC]
This change does the plumbing to wire an ExitingBB parameter through the LFTR implementation, and reorganizes the code to work in terms of a set of individual loop exits. Most of it is fairly obvious, but there's one key complexity which makes it worthy of consideration. The actual multi-exit LFTR patch is in D62625 for context.

Specifically, it turns out the existing code uses the backedge taken count from before a IV is widened. Oddly, we can end up with a different (more expensive, but semantically equivelent) BE count for the loop when requerying after widening.  For the nestedIV example from elim-extend, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

This is the only test in tree which seems sensitive to this difference. The actual result of using the wider BETC on this example is that we actually produce slightly better code. :)

In review, we decided to accept that test change.  This patch is structured to preserve the old behavior, but a separate change will immediate follow with the behavior change.  (I wanted it separate for problem attribution purposes.)

Differential Revision: https://reviews.llvm.org/D62880

llvm-svn: 362971
2019-06-10 17:51:13 +00:00
Sanjay Patel 9650c95b7e [InstCombine] allow unordered preds when canonicalizing to fabs()
We have a known-never-nan value via 'nnan', so an unordered predicate
is the same as its ordered sibling.

Similar to:
rL362937

llvm-svn: 362954
2019-06-10 15:39:00 +00:00
Sanjay Patel 85de9634e6 [InstCombine] fix bug in canonicalization to fabs()
Forgot to translate the predicate clauses in rL362943.

llvm-svn: 362945
2019-06-10 14:57:45 +00:00
Sanjay Patel 8b6d9f60ed [InstCombine] change canonicalization to fabs() to use FMF on fsub
Similar to rL362909:
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fsub because they have the
same operand.

llvm-svn: 362943
2019-06-10 14:46:36 +00:00
Sanjay Patel 8cd8c5784b [InstCombine] allow unordered preds when canonicalizing to fabs()
PR42179:
https://bugs.llvm.org/show_bug.cgi?id=42179

llvm-svn: 362937
2019-06-10 14:14:51 +00:00
Vivek Pandya 11cb15f8ed Do not derive no-recurse attribute if function does not have exact definition.
This is fix for https://bugs.llvm.org/show_bug.cgi?id=41336

Reviewers: jdoerfert
Reviewed by: jdoerfert

Differential Revision: https://reviews.llvm.org/D63045

llvm-svn: 362918
2019-06-10 04:16:04 +00:00
Roman Lebedev d669758d84 [InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles
A precondition 'x != 0' was forgotten by me:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL

These 4 folds with non-constants could be re-enabled,
but for now let's go for the simplest solution.

https://bugs.llvm.org/show_bug.cgi?id=42198

llvm-svn: 362911
2019-06-09 16:30:42 +00:00
Sanjay Patel 87cd16a86e [InstCombine] change canonicalization to fabs() to use FMF on fneg
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fneg (fsub) because they
have the same operand.

This works around the most glaring FMF logical inconsistency cited
in PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086

llvm-svn: 362909
2019-06-09 16:22:01 +00:00
Keno Fischer eb4a561fa3 [GVN] non-functional code movement
Summary: Move some code around, in preparation for later fixes
to the non-integral addrspace handling (D59661)

Patch By Jameson Nash <jameson@juliacomputing.com>

Reviewed By: reames, loladiro
Differential Revision: https://reviews.llvm.org/D59729

llvm-svn: 362853
2019-06-07 23:08:38 +00:00
Alina Sbirlea eaea538d18 [DomTreeUpdater] Add all insert before all delete updates to reduce compile time.
Summary:
The cleanup in D62751 introduced a compile-time regression due to the way DT updates are performed.
Add all insert edges then all delete edges in DTU to match the previous compile time.
Compile time on the test provided by @mstorsjo before and after this patch on my machine:
113.046s vs 35.649s
Repro: clang -target x86_64-w64-mingw32 -c -O3 glew-preproc.c; on https://martin.st/temp/glew-preproc.c.

Reviewers: kuhar, NutshellySima, mstorsjo

Subscribers: jlebar, mstorsjo, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62981

llvm-svn: 362839
2019-06-07 20:43:55 +00:00
Stefan Stipanovic 128e8e8fb9 test-commit
llvm-svn: 362802
2019-06-07 14:18:02 +00:00
Fangrui Song 19189993c9 [LV] Fix -Wunused-function after r362736
llvm-svn: 362762
2019-06-07 01:48:26 +00:00
Renato Golin 9e97caf594 [LV] Wrap LV illegality reporting in a function. NFC.
A function for loop vectorization illegality reporting has been
introduced:

void LoopVectorizationLegality::reportVectorizationFailure(
    const StringRef DebugMsg, const StringRef OREMsg,
    const StringRef ORETag, Instruction * const I) const;

The function prints a debug message when the debug for the compilation
unit is enabled as well as invokes the optimization report emitter to
generate a message with a specified tag. The function doesn't cover any
complicated logic when a custom lambda should be passed to the emitter,
only generating a message with a tag is supported.

The function always prints the instruction `I` after the debug message
whenever the instruction is specified, otherwise the debug message
ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.'

Patch by Pavel Samolysov <samolisov@gmail.com>

llvm-svn: 362736
2019-06-06 19:15:52 +00:00
Philip Reames 101915cfda [LoopPred] Fix a bug in unconditional latch bailout introduced in r362284
This is a really silly bug that even a simple test w/an unconditional latch would have caught.  I tried to guard against the case, but put it in the wrong if check.  Oops.

llvm-svn: 362727
2019-06-06 18:02:36 +00:00
Cameron McInally c72fbe5dc1 [MSAN] Add unary FNeg visitor to the MemorySanitizer
Differential Revision: https://reviews.llvm.org/D62909

llvm-svn: 362664
2019-06-05 22:37:05 +00:00
Mircea Trofin e3eeacd70a [CallSite removal] Refactoring llvm::InlineFunction APIs
Summary:
This change only unifies the API previous API pair accepting
CallInst and InvokeInst, thus making it easier to refactor
inliner pass ode to CallBase. The implementation of the unified
API still relies on the CallSite implementation.

Reviewers: eraman, chandlerc, jdoerfert

Reviewed By: jdoerfert

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62283

llvm-svn: 362656
2019-06-05 21:28:13 +00:00
Sanjay Patel ac111e526d [InstCombine] simplify code for bitcast of insertelement; NFC
llvm-svn: 362655
2019-06-05 21:26:52 +00:00
Matt Arsenault 663d762c9a NewGVN: Handle addrspacecast
The AllConstant check needs to be moved out of the if/else if chain to
avoid a test regression. The "there is no SimplifyZExt" comment
puzzles me, since there is SimplifyCastInst. Additionally, the
Simplify* calls seem to not see the operand as constant, so this needs
to be tried if the simplify failed.

llvm-svn: 362653
2019-06-05 21:15:52 +00:00
Tim Northover 8d7f118ab2 InstCombine: correctly change byval type attribute alongside call args.
When the byval attribute has a type, it must match the pointee type of
any parameter; but InstCombine was not updating the attribute when
folding casts of various kinds away.

llvm-svn: 362643
2019-06-05 20:38:17 +00:00
Dinar Temirbulatov 15c657d13d [SLP] Fix regression in broadcasts caused by operand reordering patch D59973.
This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 .
The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found.
Please see the lit test for some examples.

Committed on behalf of @vporpo (Vasileios Porpodas)
    
Differential Revision: https://reviews.llvm.org/D62427

llvm-svn: 362613
2019-06-05 15:26:28 +00:00
Sanjay Patel ad62a3a299 [LoopUtils][SLPVectorizer] clean up management of fast-math-flags
Instead of passing around fast-math-flags as a parameter, we can set those
using an IRBuilder guard object. This is no-functional-change-intended.

The motivation is to eventually fix the vectorizers to use and set the
correct fast-math-flags for reductions. Examples of that not behaving as
expected are:
https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast')
https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0)
D61802 (should be able to reduce with IR-level FMF)

Differential Revision: https://reviews.llvm.org/D62272

llvm-svn: 362612
2019-06-05 14:58:04 +00:00
Simon Pilgrim db134aaec2 [IPO] Disabled 'default only' switch statements to fix MSVC warnings.
@jdoerfert Looks like these are placeholders for incoming abstract attributes patches so I've just commented the code out, even though this is usually frowned upon.

llvm-svn: 362592
2019-06-05 10:04:05 +00:00
Yevgeny Rouban a3e16719c4 Resubmit "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst"
This reverts commit 5b32f60ec3.
The fix is in commit 4f9e68148b.

This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126

llvm-svn: 362583
2019-06-05 05:46:40 +00:00
Michael Liao fa449a9bb2 Suppress false-positive GCC -Wreturn-type warning.
llvm-svn: 362582
2019-06-05 04:18:12 +00:00
Johannes Doerfert aade782a98 [Attributor] Pass infrastructure and fixpoint framework
NOTE: Note that no attributes are derived yet. This patch will not go in
      alone but only with others that derive attributes. The framework is
      split for review purposes.

This commit introduces the Attributor pass infrastructure and fixpoint
iteration framework. Further patches will introduce abstract attributes
into this framework.

In a nutshell, the Attributor will update instances of abstract
arguments until a fixpoint, or a "timeout", is reached. Communication
between the Attributor and the abstract attributes that are derived is
restricted to the AbstractState and AbstractAttribute interfaces.

Please see the file comment in Attributor.h for detailed information
including design decisions and typical use case. Also consider the class
documentation for Attributor, AbstractState, and AbstractAttribute.

Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames

Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59918

llvm-svn: 362578
2019-06-05 03:02:24 +00:00
Cameron McInally 5c7245b830 [Scalarizer] Add UnaryOperator visitor to scalarization pass
Differential Revision: https://reviews.llvm.org/D62858

llvm-svn: 362558
2019-06-04 23:01:36 +00:00
Alina Sbirlea bfceed49ce [Utils] Clean another duplicated util method.
Summary:
Following the cleanup in D48202, method foldBlockIntoPredecessor has the
same behavior. Replace its uses with MergeBlockIntoPredecessor.
Remove foldBlockIntoPredecessor.

Reviewers: chandlerc, dmgreen

Subscribers: jlebar, javed.absar, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62751

llvm-svn: 362538
2019-06-04 18:45:15 +00:00
Cameron McInally 89f9af5487 [SCCP] Add UnaryOperator visitor to SCCP for unary FNeg
Differential Revision: https://reviews.llvm.org/D62819

llvm-svn: 362449
2019-06-03 21:53:56 +00:00
Andrew Kaylor 4172dbab5d Fix a crash when the default of a switch is removed
This patch fixes a problem that occurs in LowerSwitch when a switch statement has a PHI node as its condition, and the PHI node only has two incoming blocks, and one of those incoming blocks is through an unreachable default in the switch statement. When this condition occurs, LowerSwitch holds a pointer to the condition value, but removes the switch block as a predecessor of the PHI block, causing the PHI node to be replaced. LowerSwitch then tries to use its stale pointer to the original condition value, causing a crash.

Differential Revision: https://reviews.llvm.org/D62560

llvm-svn: 362427
2019-06-03 17:54:15 +00:00
Philip Reames 9ed1673703 [LoopPred] Convert a second member function to a static helper [NFC]
(And remember to actually mark the first one static.)

llvm-svn: 362415
2019-06-03 16:23:20 +00:00
Philip Reames 0912b06f78 [LoopPred] Convert member function to free helper function [NFC]
llvm-svn: 362411
2019-06-03 16:17:14 +00:00
Nikita Popov 900578d1c1 [SimplifyIndVar] Refactor overflow check elimination code; NFC
Extract a willNotOverflow() helper function that is shared between
eliminateOverflowIntrinsic() and strengthenOverflowingOperation().
Use WithOverflowInst for the former.

We'll be able to reuse the same code for saturating intrinsics as
well.

llvm-svn: 362305
2019-06-01 20:21:53 +00:00
Nikita Popov 46d4dba6e6 [IndVarSimplify] Fixup nowrap flags during LFTR (PR31181)
Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix
for LFTR poison handling issues in general.

When LFTR moves a condition from pre-inc to post-inc, it may now
depend on value that is poison due to nowrap flags. To avoid this,
we clear any nowrap flag that SCEV cannot prove for the post-inc
addrec.

Additionally, LFTR may switch to a different IV that is dynamically
dead and as such may be arbitrarily poison. This patch will correct
nowrap flags in some but not all cases where this happens. This is
related to the adoption of IR nowrap flags for the pre-inc addrec.
(See some of the switch_to_different_iv tests, where flags are not
dropped or insufficiently dropped.)

Finally, there are likely similar issues with the handling of GEP
inbounds, but we don't have a test case for this yet.

Differential Revision: https://reviews.llvm.org/D60935

llvm-svn: 362292
2019-06-01 09:40:18 +00:00
Richard Trieu 4e875464df Inline variable into assert to fix unused variable warning.
llvm-svn: 362285
2019-06-01 03:32:20 +00:00
Philip Reames 19afdf74bb [LoopPred] Eliminate a redundant/confusing cover function [NFC]
llvm-svn: 362284
2019-06-01 03:09:28 +00:00
Philip Reames 099eca832e [LoopPred] Handle a subset of NE comparison based latches
At the moment, LoopPredication completely bails out if it sees a latch of the form:
%cmp = icmp ne %iv, %N
br i1 %cmp, label %loop, label %exit
OR
%cmp = icmp ne %iv.next, %NPlus1
br i1 %cmp, label %loop, label %exit

This is unfortunate since this is exactly the form that LFTR likes to produce. So, go ahead and recognize simple cases where we can.

For pre-increment loops, we leverage the fact that LFTR likes canonical counters (i.e. those starting at zero) and a (presumed) range fact on RHS to discharge the check trivially.

For post-increment forms, the key insight is in remembering that LFTR had to insert a (N+1) for the RHS. CVP can hopefully prove that add nsw/nuw (if there's appropriate range on N to start with). This leaves us both with the post-inc IV and the RHS involving an nsw/nuw add, and SCEV can discharge that with no problem.

This does still need to be extended to handle non-one steps, or other harder patterns of variable (but range restricted) starting values. That'll come later.

Differential Revision: https://reviews.llvm.org/D62748

llvm-svn: 362282
2019-06-01 00:31:58 +00:00
Erik Pilkington abb2a93c53 [SimplifyLibCalls] Fold more fortified functions into non-fortified variants
When the object size argument is -1, no checking can be done, so calling the
_chk variant is unnecessary. We already did this for a bunch of these
functions.

rdar://50797197

Differential revision: https://reviews.llvm.org/D62358

llvm-svn: 362272
2019-05-31 22:41:36 +00:00
Erik Pilkington 5234921119 NFC: Pull out a function to reduce some duplication
Part of https://reviews.llvm.org/D62358

llvm-svn: 362271
2019-05-31 22:41:31 +00:00
Nikita Popov 7bafae55c0 Reapply [CVP] Simplify non-overflowing saturating add/sub
If we can determine that a saturating add/sub will not overflow based
on range analysis, convert it into a simple binary operation. This is
a sibling transform to the existing with.overflow handling.

Reapplying this with an additional check that the saturating intrinsic
has integer type, as LVI currently does not support vector types.

Differential Revision: https://reviews.llvm.org/D62703

llvm-svn: 362263
2019-05-31 20:48:26 +00:00
Nikita Popov 23a02f6a5f [CVP] Fix assertion failure on vector with.overflow
Noticed on D62703. LVI only handles plain integers, not vectors of
integers. This was previously not an issue, because vector support
for with.overflow is only a relatively recent addition.

llvm-svn: 362261
2019-05-31 20:42:07 +00:00
Nikita Popov ccb63e0bfe Revert "[CVP] Simplify non-overflowing saturating add/sub"
This reverts commit 1e692d1777.

Causes assertion failure in builtins-wasm.c clang test.

llvm-svn: 362254
2019-05-31 19:04:47 +00:00
Nikita Popov 1e692d1777 [CVP] Simplify non-overflowing saturating add/sub
If we can determine that a saturating add/sub will not overflow
based on range analysis, convert it into a simple binary operation.
This is a sibling transform to the existing with.overflow handling.

Differential Revision: https://reviews.llvm.org/D62703

llvm-svn: 362242
2019-05-31 16:46:05 +00:00
Roman Lebedev 39390d8317 [InstCombine] 'C-(C2-X) --> X+(C-C2)' constant-fold
It looks this fold was already partially happening, indirectly
via some other folds, but with one-use limitation.
No other fold here has that restriction.

https://rise4fun.com/Alive/ftR

llvm-svn: 362217
2019-05-31 09:47:16 +00:00
Roman Lebedev 886c4ef35a [InstCombine] 'add (sub C1, X), C2 --> sub (add C1, C2), X' constant-fold
https://rise4fun.com/Alive/qJQ

llvm-svn: 362216
2019-05-31 09:47:04 +00:00
Nikita Popov e906f2a370 [CVP] Generalize willNotOverflow(); NFC
Change argument from WithOverflowInst to BinaryOpIntrinsic, so this
function can also be used for saturating math intrinsics.

llvm-svn: 362152
2019-05-30 21:03:10 +00:00
Martin Storsjo 0fe645c086 [InstCombine] Avoid use after free in DenseMap, when built with GCC
Previously, this used a statement like this:
    Map[A] = Map[B];

This is equivalent to the following:
    const auto &Src = Map[B];
    auto &Dest = Map[A];
    Dest = Src;

The second statement, "auto &Dest = Map[A];" can insert a new
element into the DenseMap, which can potentially grow and reallocate
the DenseMap's internal storage, which will invalidate the existing
reference to the source. When doing the actual assignment,
the Src reference is dereferenced, accessing memory that was
freed when the DenseMap grew.

This issue hasn't shown up when LLVM was built with Clang, because
the right hand side ended up dereferenced before evaulating the
left hand side. (If the value type is a larger data type, Clang doesn't
do this but behaves like GCC.)

With GCC, a cast to Value* isn't enough to make it dereference the
right hand side reference before invoking operator[] (while that is
enough to make Clang/LLVM do the right thing for larger types), but
storing it in an intermediate variable in a separate statement works.

This fixes PR42065.

Differential Revision: https://reviews.llvm.org/D62624

llvm-svn: 362150
2019-05-30 20:53:21 +00:00
Tim Northover b7141207a4 Reapply: IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe
how many bytes a 'byval' parameter should occupy on the stack. This adds
a (for now) optional extra type parameter.

If present, the type must match the pointee type of the argument.

The original commit did not remap byval types when linking modules, which broke
LTO. This version fixes that.

Note to front-end maintainers: if this causes test failures, it's probably
because the "byval" attribute is printed after attributes without any parameter
after this change.

llvm-svn: 362128
2019-05-30 18:48:23 +00:00
Florian Hahn 9bbdde2598 [LV] Remove the redundant using LoopVectorizationPlanner:VPlanPtr
VPlan.h already contains the declaration of VPlanPtr type alias:

using VPlanPtr = std::unique_ptr<VPlan>;

The LoopVectorizationPlanner class also contains the same declaration
of VPlanPtr and therefore LoopVectorize requires a long wording when
its methods return VPlanPtr:

    LoopVectorizationPlanner::VPlanPtr
    LoopVectorizationPlanner::buildVPlanWithVPRecipes(...)

but LoopVectorize.cpp includes VPlan.h (via LoopVectorizationPlanner.h)
and can use VPlanPtr from that header.

Patch by Pavel Samolysov.

Reviewers: hsaito, rengolin, fhahn

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D62576

llvm-svn: 362126
2019-05-30 18:46:13 +00:00
Craig Topper 778e445c58 [LoopVectorize] Add FNeg instruction support
Differential Revision: https://reviews.llvm.org/D62510

llvm-svn: 362124
2019-05-30 18:19:35 +00:00
Roman Lebedev e8578953ac [LoopIdiom] Basic OptimizationRemarkEmitter handling
Summary:
I'm adding ORE to memset/memcpy formation, with tests,
but mainly this is split off from D61144.

Reviewers: reames, anemet, thegameg, craig.topper

Reviewed By: thegameg

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62631

llvm-svn: 362092
2019-05-30 13:02:06 +00:00
Roman Lebedev fae2e46766 [LoopIdiomRecognize][NFC] Sort includes
Split off from D61144

llvm-svn: 362091
2019-05-30 13:01:53 +00:00
Florian Hahn e4cfa89915 [LV] Inform about exactly reason of loop illegality
Currently, only the following information is provided by LoopVectorizer
in the case when the CF of the loop is not legal for vectorization:

 LV: Can't vectorize the instructions or CFG
    LV: Not vectorizing: Cannot prove legality.

But this information is not enough for the root cause analysis; what is
exactly wrong with the loop should also be printed:

 LV: Not vectorizing: The exiting block is not the loop latch.

Patch by Pavel Samolysov.

Reviewers: mkuper, hsaito, rengolin, fhahn

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D62311

llvm-svn: 362056
2019-05-30 05:03:12 +00:00
Matt Arsenault 79b3ea701c LoopVersioningLICM: Respect convergent and noduplicate
llvm-svn: 362031
2019-05-29 20:47:59 +00:00
Roman Lebedev 95dec50a35 [LoopIdiomRecognize][NFC] Use DEBUG_TYPE, add LLVM_DEBUG() to runOnNoncountableLoop()
Split off from D61144

llvm-svn: 362022
2019-05-29 20:11:53 +00:00
Nikita Popov 5382803b04 [InstCombine] Optimize always overflowing signed saturating add/sub
Based on the overflow direction information added in D62463, we can
now fold always overflowing signed saturating add/sub to signed min/max.

Differential Revision: https://reviews.llvm.org/D62544

llvm-svn: 362006
2019-05-29 18:37:13 +00:00
Matt Arsenault f80c4241b3 CallSiteSplitting: Respect convergent and noduplicate
llvm-svn: 361990
2019-05-29 16:59:48 +00:00
Teresa Johnson 5b2088d1fa [ThinLTO] Use original alias visibility when importing
Summary:
When we import an alias, we do so by making a clone of the aliasee. Just
as this clone uses the original alias name and linkage, it should also
use the same visibility (not the aliasee's visibility). Otherwise,
linker behavior is affected (e.g. if the aliasee was hidden, but the
alias is not, the resulting imported clone should not be hidden,
otherwise the linker will make the final symbol hidden which is
incorrect).

Reviewers: wmi

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62535

llvm-svn: 361989
2019-05-29 16:50:46 +00:00
Matt Arsenault 36e7254441 SpeculateAroundPHIs: Respect convergent
llvm-svn: 361957
2019-05-29 13:14:39 +00:00
Rong Xu e88173abc0 [PGO] Handle cases of failing to split critical edges
Fix PR41279 where critical edges to EHPad are not split.
The fix is to not instrument those critical edges. We used to be able to know
the size of counters right after MST is computed. With this, we have to
pre-collect the instrument BBs to know the size, and then instrument them.

Differential Revision: https://reviews.llvm.org/D62439

llvm-svn: 361882
2019-05-28 21:45:56 +00:00
Nikita Popov 5b32f60ec3 Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst"
This reverts commit 53f2f32865.

As reported on D62126, this causes assertion failures if the switch
has incorrect branch_weights metadata, which may happen as a result
of other transforms not handling it correctly yet.

llvm-svn: 361881
2019-05-28 21:28:24 +00:00
Nikita Popov c51cdacab9 [InstCombine] Clean up saturing math overflow optimizations; NFC
Reduce duplication and make it easier to handle signed
always-overflows conditions in the future.

llvm-svn: 361863
2019-05-28 18:59:21 +00:00
Nikita Popov 332c100562 [ValueTracking][ConstantRange] Distinguish low/high always overflow
In order to fold an always overflowing signed saturating add/sub,
we need to know in which direction the always overflow occurs.
This patch splits up AlwaysOverflows into AlwaysOverflowsLow and
AlwaysOverflowsHigh to pass through this information (but it is
not used yet).

Differential Revision: https://reviews.llvm.org/D62463

llvm-svn: 361858
2019-05-28 18:08:31 +00:00
Hans Wennborg d936e40575 Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
This was reverted in r360086 as it was supected of causing mysterious test
failures internally. However, it was never concluded that this patch was the
root cause.

> The code was previously checking that candidates for sinking had exactly
> one use or were a store instruction (which can't have uses). This meant
> we could sink call instructions only if they had a use.
>
> That limitation seemed a bit arbitrary, so this patch changes it to
> "instruction has zero or one use" which seems more natural and removes
> the need to special-case stores.
>
> Differential revision: https://reviews.llvm.org/D59936

llvm-svn: 361811
2019-05-28 12:19:38 +00:00
Yevgeny Rouban 53f2f32865 [CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126

llvm-svn: 361808
2019-05-28 11:33:50 +00:00
Florian Hahn 11b2f4fe50 [LoopInterchange] Fix handling of LCSSA nodes defined in headers and latches.
The code to preserve LCSSA PHIs currently only properly supports
reduction PHIs and PHIs for values defined outside the latches.

This patch improves the LCSSA PHI handling to cover PHIs for values
defined in the latches.

Fixes PR41725.

Reviewers: efriedma, mcrosier, davide, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D61576

llvm-svn: 361743
2019-05-26 23:38:25 +00:00
Shawn Landden 343578759e [SimplifyCFG] back out all SwitchInst commits
They caused the sanitizer builds to fail.

My suspicion is the change the countLeadingZeros().

llvm-svn: 361736
2019-05-26 18:15:51 +00:00
Sanjay Patel 9317963920 [InstCombine] prevent crashing with invalid extractelement index
This was found/reduced from a fuzzer report:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956

llvm-svn: 361729
2019-05-26 14:03:50 +00:00
Shawn Landden fa91ab85d9 [SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold doesn't trigger
llvm-svn: 361728
2019-05-26 13:55:52 +00:00
Shawn Landden 30111c786f [SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize
Rather than gating on "isSwitchDense" (resulting in necessesarily
sparse lookup tables even when they were generated), always run
this quite cheap transform.

This transform is useful not just for generating tables.
LowerSwitch also wants this: read LowerSwitch.cpp:257.

Be careful to not generate worse code, by introducing a
SubThreshold heuristic.

Instead of just sorting by signed, generalize the finding of the
best base.

And now that it is run unconditionally, do not replicate its
functionality in SwitchToLookupTable (which could use a Sub
when having a hole is smaller, hence the SubThreshold
heuristic located in a single place).
This simplifies SwitchToLookupTable, and fixes
some ugly corner cases due to the use of signed numbers,
such as a table containing i16 32768 and 32769, of which
32769 would be interpreted as -32768, and now the code thinks
the table is size 65536.

(We still use unconditional subtraction when building a single-register mask,
but I think this whole block should go when the more general sparse
map is added, which doesn't leave empty holes in the table.)

And the reason test4 and test5 did not trigger was documented wrong:
it was because they were not considered sufficiently "dense".

Also, fix generation of invalid LLVM-IR: shl by bit-width.

llvm-svn: 361727
2019-05-26 13:55:14 +00:00
Shawn Landden 444eaaf1cc [SimpligyCFG] NFC, remove GCD that was only used for powers of two
and replace with an equilivent countTrailingZeros.

GCD is much more expensive than this, with repeated division.

This depends on D60823

llvm-svn: 361726
2019-05-26 13:54:04 +00:00
Shawn Landden b7cc093db2 [Support] make countLeadingZeros() and countTrailingZeros() return unsigned
This matches countLeadingOnes() and countTrailingOnes(), and
APInt's countLeadingZeros() and countTrailingZeros().

(as well as __builtin_clzll())

llvm-svn: 361724
2019-05-26 13:49:58 +00:00
Nikita Popov 39f2bebf41 [InstCombine] Refactor OptimizeOverflowCheck; NFCI
Extract method to compute overflow based on binop and signedness,
and then make the result handling code generic. This extends the
always-overflow handling to signed muls, but has currently no effect,
as we don't compute always overflow for them (thus NFC).

llvm-svn: 361721
2019-05-26 11:43:37 +00:00
Nikita Popov 352f598795 [InstCombine] Remove OverflowCheckFlavor; NFC
Instead pass binary op and signedness. The extra enum only makes
things more complicated in this case.

llvm-svn: 361720
2019-05-26 11:43:31 +00:00
David Bolvansky 0290a77aa8 [SimplifyCFG] Added condition assumption for unreachable blocks
Summary: PR41688

Reviewers: spatel, efriedma, craig.topper, hfinkel, reames

Reviewed By: hfinkel

Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61409

llvm-svn: 361707
2019-05-25 22:34:27 +00:00
Nikita Popov 8b1fa07639 [CVP] Remove unnecessary checks for empty GNWR; NFC
The guaranteed no-wrap region is never empty, it always contains at
least zero, so these optimizations don't ever apply.

To make this more obviously true, replace the conversative return
in makeGNWR with an assertion.

llvm-svn: 361698
2019-05-25 14:11:55 +00:00
Bjorn Pettersson b4771425f5 Use the DataLayout::typeSizeEqualsStoreSize helper. NFC
Just a minor refactoring to use the new helper method
DataLayout::typeSizeEqualsStoreSize(). This is done when
checking if getTypeSizeInBits is equal/non-equal to
getTypeStoreSizeInBits.

llvm-svn: 361613
2019-05-24 09:20:20 +00:00
Neil Henning 119c31ad93 StructurizeCFG: Relax uniformity checks.
This change relaxes the checks for hasOnlyUniformBranches such that our
region is uniform if:

1. All conditional branches that are direct children are uniform.
2. And either:
  a. All sub-regions are uniform.
  b. There is one or less conditional branches among the direct
     children.

Differential Revision: https://reviews.llvm.org/D62198

llvm-svn: 361610
2019-05-24 08:59:17 +00:00
Bjorn Pettersson d63a2bb35f [DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores
Summary:
The DeadStoreElimination pass now skips doing
PartialStoreMerging when stores overlap according to
OW_PartialEarlierWithFullLater and at least one of
the stores is having a store size that is different
from the size of the type being stored.

This solves problems seen in
  https://bugs.llvm.org/show_bug.cgi?id=41949
for which we in the past could end up with
mis-compiles or assertions.

The content and location of the padding bits is not
formally described (or undefined) in the LangRef
at the moment. So the solution is chosen based on
that we cannot assume anything about the padding bits
when having a store that clobbers more memory than
indicated by the type of the value that is stored
(such as storing an i6 using an 8-bit store instruction).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949

Reviewers: spatel, efriedma, fhahn

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62250

llvm-svn: 361605
2019-05-24 08:32:02 +00:00
Eli Friedman 052f87ae36 Revert r361460
It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented
by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll).

llvm-svn: 361581
2019-05-24 01:03:51 +00:00
Sanjay Patel 8869a98e82 [InstSimplify] fold insertelement-of-extractelement
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.

llvm-svn: 361576
2019-05-24 00:13:58 +00:00
Sanjay Patel 093c922205 [InstCombine] remove redundant fold for extractelement; NFC
The out-of-bounds index pattern is handled by InstSimplify,
so the extractelement should be eliminated next time it is
visited.

llvm-svn: 361570
2019-05-23 23:33:38 +00:00
Sanjay Patel 4d4df6f144 [InstCombine] remove redundant fold for insertelement; NFC
The out-of-bounds index pattern is handled by InstSimplify.

llvm-svn: 361569
2019-05-23 23:33:34 +00:00
Alina Sbirlea d82ddfa7c3 [NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61612

llvm-svn: 361560
2019-05-23 21:52:59 +00:00
Sanjay Patel e60cb7d1be [InstSimplify] insertelement V, undef, ? --> V
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.

llvm-svn: 361559
2019-05-23 21:49:47 +00:00
Kit Barton 987fdfd9a7 Revert [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch.
This reverts r361517 (git commit 2049e4dd8f)

llvm-svn: 361553
2019-05-23 20:53:05 +00:00
Alina Sbirlea 63729b0c49 [SLPVectorizer] Set flag to previous default.
Summary:
The refactoring in r360276 moved the `RunSLPVectorization` flag and added the default explicitly. The default should have been `false`, as before.

The new pass manager used to have SLPVectorization on by default, now it's off in opt, and needs D61617 checked in to enable it in clang.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61955

llvm-svn: 361537
2019-05-23 19:07:41 +00:00
Sanjay Patel 3249be1e03 [InstCombine] be more careful when transforming a shuffle mask
This is reduced from a fuzzer test:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890

Usually, demanded elements should be able to simplify shuffle
mask elements that are pointing to undef elements of its source
operands, but that doesn't happen in the test case.

llvm-svn: 361533
2019-05-23 18:46:03 +00:00
Kit Barton 2049e4dd8f [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch.
Summary:
    This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard.

      /// Example:
      /// for (int i = lb; i < ub; i+=step)
      ///   <loop body>
      /// --- pseudo LLVMIR ---
      /// beforeloop:
      ///   guardcmp = (lb < ub)
      ///   if (guardcmp) goto preheader; else goto afterloop
      /// preheader:
      /// loop:
      ///   i1 = phi[{lb, preheader}, {i2, latch}]
      ///   <loop body>
      ///   i2 = i1 + step
      /// latch:
      ///   cmp = (i2 < ub)
      ///   if (cmp) goto loop
      /// exit:
      /// afterloop:
      ///
      /// getBounds
      ///   getInitialIVValue      --> lb
      ///   getStepInst            --> i2 = i1 + step
      ///   getStepValue           --> step
      ///   getFinalIVValue        --> ub
      ///   getCanonicalPredicate  --> '<'
      ///   getDirection           --> Increasing
      /// getGuard             --> if (guardcmp) goto loop; else goto afterloop
      /// getInductionVariable          --> i1
      /// getAuxiliaryInductionVariable --> {i1}
      /// isCanonical                   --> false

    Committed on behalf of @Whitney (Whitney Tsang).

    Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn

    Reviewed By: kbarton

    Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D60565

llvm-svn: 361517
2019-05-23 17:56:35 +00:00
Saleem Abdulrasool 7bbefb13ee Transforms: lower fadd and fsub atomicrmw instructions
`fadd` and `fsub` have recently (r351850) been added as `atomicrmw`
operations. This diff adds lowering cases for them to the LowerAtomic
transform.

Patch by Josh Berdine!

llvm-svn: 361512
2019-05-23 17:03:43 +00:00
Clement Courbet 43882b16a3 [MergeICmps] Make the pass compatible with the new pass manager.
Reviewers: gchatelet, spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62287

llvm-svn: 361490
2019-05-23 12:35:26 +00:00
Christian Bruel 4a7da98bd9 [GlobalOpt] recognize dead struct fields and propagate values
Summary:
Allow struct fields SRA and dead stores. This works by considering fields accesses from getElementPtr to be considered as a possible pointer root that can be cleaned up.
We check that the variable can be SRA by recursively checking the sub expressions with the new isSafeSubSROAGEP function.

basically this allows the array in following C code  to be optimized out 

struct Expr {
  int a[2];
  int b;
};

static struct Expr e;

int foo (int i)
{
  e.b = 2;
  e.a[i] = 1;
  return e.b;
}


Reviewers: greened, bkramer, nicholas, jmolloy

Reviewed By: jmolloy

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61911

llvm-svn: 361460
2019-05-23 05:53:10 +00:00
Craig Topper 9816d55776 [X86][InstCombine] Remove InstCombine code that turns X86 round intrinsics into llvm.ceil/floor. Remove some isel patterns that existed because that was happening.
We were turning roundss/sd/ps/pd intrinsics with immediates of 1 or 2 into
llvm.floor/ceil.  The llvm.ceil/floor intrinsics are supposed to correspond
to the libm functions.  For the libm functions we need to disable the
precision exception so the llvm.floor/ceil functions should always map to
encodings 0x9 and 0xA.

We had a mix of isel patterns where some used 0x9 and 0xA and others used
0x1 and 0x2. We need to be consistent and always use 0x9 and 0xA.

Since we have no way in isel of knowing where the llvm.ceil/floor came
from, we can't map X86 specific intrinsics with encodings 1 or 2 to it.
We could map 0x9 and 0xA to llvm.ceil/floor instead, but I'd really like
to see a use case and optimization advantage first.

I've left the backend test cases to show the blend we now emit without
the extra isel patterns. But I've removed the InstCombine tests completely.

llvm-svn: 361425
2019-05-22 20:04:55 +00:00
Hiroshi Yamauchi dfeb797455 [PGO][CHR] Speed up following long use-def chains.
Summary: Avoid visiting an instruction more than once by using a map.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62262

llvm-svn: 361416
2019-05-22 18:37:34 +00:00
Simon Pilgrim 3c05cad03e LoopVectorizationCostModel::selectInterleaveCount - assert we have a non-zero loop cost. NFCI.
The input LoopCost value can be zero, but if so it should be recalculated with the current VF. After that it should always be non-zero.

llvm-svn: 361387
2019-05-22 14:18:17 +00:00
Clement Courbet f8f93ba90d Re-land r361257 "[MergeICmps][NFC] Make BCEAtom move-only.""
llvm-svn: 361366
2019-05-22 09:45:40 +00:00
Sanjay Patel 6a554188aa [InstCombine] fold shuffles of insert_subvectors
This should be a valid exception to the general rule of not creating new shuffle masks in IR...
because we already do it. :)
Also, DAG combining/legalization will undo this by widening the shuffle back out if needed.

Explanation for how we already do this: SLP or vector source can create chains of insert/extract
as shown in 1 of the examples from PR16739:
https://godbolt.org/z/NlK7rA
https://bugs.llvm.org/show_bug.cgi?id=16739

And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles.

Differential Revision: https://reviews.llvm.org/D62024

llvm-svn: 361338
2019-05-22 00:32:25 +00:00
Clement Courbet 122c6e6f36 [MergeICmps] Make sorting strongly stable on the rhs.
Summary:
Because the sort order was not strongly stable on the RHS, whether the
chain could merge would depend on the order of the blocks in the Phi.

EXPENSIVE_CHECKS would shuffle the blocks before sorting, resulting in
non-deterministic merging.

Reviewers: gchatelet

Subscribers: hiraditya, llvm-commits, RKSimon

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62193

llvm-svn: 361281
2019-05-21 17:58:42 +00:00
Clement Courbet 8361a10493 Revert r361257 "[MergeICmps][NFC] Make BCEAtom move-only."
Broke some bots.

llvm-svn: 361263
2019-05-21 14:24:46 +00:00
Clement Courbet 8fa970c2d8 [MergeICmps][NFC] Make BCEAtom move-only.
And handle for self-move. This is required so that llvm::sort can work
with EXPENSIVE_CHECKS, as it will do a random shuffle of the input
which can result in self-moves.

llvm-svn: 361257
2019-05-21 13:34:12 +00:00
Bob Haarman 032f87bbb3 Revert r360902 "Resubmit: [Salvage] Change salvage debug info ..."
This reverts commit rr360902. It caused an assertion failure in
lib/IR/DebugInfoMetadata.cpp: Assertion `(OffsetInBits + SizeInBits <=
FragmentSizeInBits) && "new fragment outside of original fragment"'
failed.

PR41931.

llvm-svn: 361246
2019-05-21 11:53:41 +00:00
Clement Courbet a95d95d392 [MergeICmps] Preserve the dominator tree.
Summary: In preparation for D60318 .

Reviewers: gchatelet, efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62068

llvm-svn: 361239
2019-05-21 11:02:23 +00:00
Cameron McInally 8bec58d5f7 [NFC][InstCombine] Add FIXME for one-use check on constant negation transforms.
llvm-svn: 361197
2019-05-20 21:00:42 +00:00
Cameron McInally 2557ca296a [InstCombine] Add visitFNeg(...) visitor for unary Fneg
Also, break out a helper function, namely foldFNegIntoConstant(...), which performs transforms common between visitFNeg(...) and visitFSub(...).

Differential Revision: https://reviews.llvm.org/D61693

llvm-svn: 361188
2019-05-20 19:10:30 +00:00
Orlando Cazalet-Hyams ed67bf8d2f Resubmit "[DebugInfo] Update loop metadata for inlined loops"
This reverts commit 95805bc425.
I've squashed the test fix into this commit.

[DebugInfo] Update loop metadata for inlined loops

Currently, when a loop is cloned while inlining function (A) into function (B)
the loop metadata is copied and then not modified at all. The loop metadata can
encode the loop's start and end DILocations. Therefore, the new inlined loop in
function (B) may have loop metadata which shows start and end locations residing
in function (A).

This patch ensures loop metadata is updated while inlining so that the start and
end DILocations are given the "inlinedAt" operand. I've also added a regression
test for this.

This fix is required for D60831 because that patch uses loop metadata to
determine the DILocation for the branches of new loop preheaders.

Reviewers: aprantl, dblaikie, anemet

Reviewed By: aprantl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D61933

llvm-svn: 361149
2019-05-20 13:02:30 +00:00
Orlando Cazalet-Hyams 95805bc425 Revert "[DebugInfo] Update loop metadata for inlined loops"
This reverts commit 6e8f1a80cd.
Reverting patch while investigating build bot failure.

llvm-svn: 361143
2019-05-20 11:24:39 +00:00
Petar Jovanovic e85bbf564d [DebugInfoMetadata] Refactor DIExpression::prepend constants (NFC)
Refactor DIExpression::With* into a flag enum in order to be less
error-prone to use (as discussed on D60866).

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D61943

llvm-svn: 361137
2019-05-20 10:35:57 +00:00
Orlando Cazalet-Hyams 6e8f1a80cd [DebugInfo] Update loop metadata for inlined loops
Summary:
Currently, when a loop is cloned while inlining function (A) into function (B) the loop metadata is copied and then not modified at all. The loop metadata can encode the loop's start and end DILocations. Therefore, the new inlined loop in function (B) may have loop metadata which shows start and end locations residing in function (A).

This patch ensures loop metadata is updated while inlining so that the start and end DILocations are given the "inlinedAt" operand. I've also added a regression test for this.

This fix is required for D60831 because that patch uses loop metadata to determine the DILocation for the branches of new loop preheaders.

Reviewers: aprantl, dblaikie, anemet

Reviewed By: aprantl

Subscribers: eraman, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D61933

llvm-svn: 361132
2019-05-20 09:40:44 +00:00
Dinar Temirbulatov 2ff72f6654 [SLP] Refactoring of EdgeInfo and UserTreeIdx in buildTree_rec().
This is a follow-up refactoring patch after the introduction of usable TreeEntry pointers in D61706.
The EdgeInfo struct can now use a TreeEntry pointer instead of an index in VectorizableTree.

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D61795

llvm-svn: 361110
2019-05-19 01:30:41 +00:00
Matt Arsenault b04f3258dd GVN: Handle addrspacecast
llvm-svn: 361103
2019-05-18 14:36:06 +00:00
Sanjay Patel 926e47751b [InstCombine] move bitcast after insertelement-with-bitcasted-operands
llvm-svn: 361058
2019-05-17 18:06:12 +00:00
Roman Lebedev 3275060fe8 [InstCombine] canShiftBinOpWithConstantRHS(): drop bogus signbit check
Summary:
In D61918 i was looking at dropping it in DAGCombiner `visitShiftByConstant()`,
but as @craig.topper pointed out, it was copied from here.

That check claims that the transform is illegal otherwise.
That isn't true:
1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter
   https://rise4fun.com/Alive/K4A
2. For `ISD::AND`, there is no restriction on constants:
   https://rise4fun.com/Alive/Wy3
3. For `ISD::OR`, there is no restriction on constants:
   https://rise4fun.com/Alive/GOH
3. For `ISD::XOR`, there is no restriction on constants:
   https://rise4fun.com/Alive/ml6

So, why is it there then?
As far as i can tell, it dates all the way back to original check-in rL7793.
I think we should just drop it.

Reviewers: spatel, craig.topper, efriedma, majnemer

Reviewed By: spatel

Subscribers: llvm-commits, craig.topper

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61938

llvm-svn: 361043
2019-05-17 15:52:49 +00:00
Clement Courbet 90900fbc9f [MergeICmps][NFC] Add more debug.
llvm-svn: 361024
2019-05-17 12:07:51 +00:00
Clement Courbet 632dfdda16 Re-land r360859: "[MergeICmps] Simplify the code."
With a fix for PR41917: The predecessor list was changing under our feet.

-  for (BasicBlock *Pred : predecessors(EntryBlock_)) {
+  while (!pred_empty(EntryBlock_)) {
+    BasicBlock* const Pred = *pred_begin(EntryBlock_);

llvm-svn: 361009
2019-05-17 09:43:45 +00:00
Philip Reames a74d654374 [LFTR] Strengthen assertions in genLoopLimit [NFCI]
llvm-svn: 360978
2019-05-17 02:18:03 +00:00
Philip Reames 45e7690796 [IndVars] Don't reimplement Loop::isLoopInvariant [NFC]
Using dominance vs a set membership check is indistinguishable from a compile time perspective, and the two queries return equivelent results.  Simplify code by using the existing function.

llvm-svn: 360976
2019-05-17 02:09:03 +00:00
Philip Reames 8e169cd266 [LFTR] Factor out a helper function for readability purpose [NFC]
llvm-svn: 360972
2019-05-17 01:39:58 +00:00
Philip Reames 9283f1847c Clarify comments on helpers used by LFTR [NFC]
I'm slowly wrapping my head around this code, and am making comment improvements where I can.

llvm-svn: 360968
2019-05-17 01:12:02 +00:00
Nico Weber d764e7c660 Revert r360859: "Reland r360771 "[MergeICmps] Simplify the code.""
It caused PR41917.

llvm-svn: 360963
2019-05-17 00:43:53 +00:00
Evgeniy Stepanov 7f281b2c06 HWASan exception support.
Summary:
Adds a call to __hwasan_handle_vfork(SP) at each landingpad entry.

Reusing __hwasan_handle_vfork instead of introducing a new runtime call
in order to be ABI-compatible with old runtime library.

Reviewers: pcc

Subscribers: kubamracek, hiraditya, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D61968

llvm-svn: 360959
2019-05-16 23:54:41 +00:00
Stephen Tozer 6f59b4b6d9 Resubmit: [Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed
Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=40645

Previously, LLVM had no functional way of performing casts inside of a
DIExpression(), which made salvaging cast instructions other than Noop casts
impossible. With the recent addition of DW_OP_LLVM_convert this salvaging is
now possible, and so can be used to fix the attached bug as well as any cases
where SExt instruction results are lost in the debugging metadata. This patch
introduces this fix by expanding the salvage debug info method to cover these
cases using the new operator.

Differential revision: https://reviews.llvm.org/D61184

llvm-svn: 360902
2019-05-16 14:41:01 +00:00
Clement Courbet c4fdd717ef Reland r360771 "[MergeICmps] Simplify the code."
This revision does not seem to be the culprit.

llvm-svn: 360859
2019-05-16 06:18:02 +00:00
Taewook Oh 9d020de3e8 [PredicateInfo] Do not process unreachable operands.
Summary: We should excluded unreachable operands from processing as their DFS visitation order is undefined. When `renameUses` function sorts `OpsToRename` (https://fburl.com/d2wubn60), the comparator assumes that the parent block of the operand has a corresponding dominator tree node. This is not the case for unreachable operands and crashes the compiler.

Reviewers: dberlin, mgrang, davide

Subscribers: efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61154

llvm-svn: 360796
2019-05-15 19:35:38 +00:00
Hiroshi Yamauchi 7dfd087a9a [JumpThreading] A bug fix for stale loop info after unfold select
Summary:
The return value of a TryToUnfoldSelect call was not checked, which led to an
incorrectly preserved loop info and some crash.

The original crash was reported on https://reviews.llvm.org/D59514.

Reviewers: davidxl, amehsan

Reviewed By: davidxl

Subscribers: fhahn, brzycki, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61920

llvm-svn: 360780
2019-05-15 15:15:16 +00:00
Clement Courbet eaf4413d2d Revert r360771 "[MergeICmps] Simplify the code."
Breaks a bunch of builbdots.

llvm-svn: 360776
2019-05-15 14:21:59 +00:00
Clement Courbet 0d071be474 [MergeICmps] Fix r360771.
Twine references a StringRef by reference, not value...

llvm-svn: 360775
2019-05-15 14:00:45 +00:00
Stephen Tozer 0d02f2ff4f Revert "[Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed"
This reverts r360772 due to build issues.
Reverted commit: 17dd4d7403.

llvm-svn: 360773
2019-05-15 13:41:44 +00:00
Stephen Tozer 17dd4d7403 [Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed
Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=40645

Previously, LLVM had no functional way of performing casts inside of a
DIExpression(), which made salvaging cast instructions other than Noop
casts impossible. With the recent addition of DW_OP_LLVM_convert this
salvaging is now possible, and so can be used to fix the attached bug as
well as any cases where SExt instruction results are lost in the
debugging metadata. This patch introduces this fix by expanding the
salvage debug info method to cover these cases using the new operator.

Differential revision: https://reviews.llvm.org/D61184

llvm-svn: 360772
2019-05-15 13:15:48 +00:00
Clement Courbet 157ae639fa [MergeICmps] Simplify the code.
Instead of patching the original blocks, we now generate new blocks and
delete the old blocks. This results in simpler code with a less twisted
control flow (see the change in `entry-block-shuffled.ll`).

This will make https://reviews.llvm.org/D60318 simpler by making it more
obvious where control flow created and deleted.

Reviewers: gchatelet

Subscribers: hiraditya, llvm-commits, spatel

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61736

llvm-svn: 360771
2019-05-15 13:04:24 +00:00
Florian Hahn 9e778e6c73 [LV] Move getScalarizationOverhead and vector call cost computations to CM. (NFC)
This reduces the number of parameters we need to pass in and they seem a
natural fit in LoopVectorizationCostModel. Also simplifies things for
D59995.

As a follow up refactoring, we could only expose a expose a
shouldUseVectorIntrinsic() helper in LoopVectorizationCostModel, instead
of calling getVectorCallCost/getVectorIntrinsicCost in
InnerLoopVectorizer/VPRecipeBuilder.

Reviewers: Ayal, hsaito, dcaballe, rengolin

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D61638

llvm-svn: 360758
2019-05-15 10:05:49 +00:00
Fangrui Song f4dfd63c74 [IR] Disallow llvm.global_ctors and llvm.global_dtors of the 2-field form in textual format
The 3-field form was introduced by D3499 in 2014 and the legacy 2-field
form was planned to be removed in LLVM 4.0

For the textual format, this patch migrates the existing 2-field form to
use the 3-field form and deletes the compatibility code.
test/Verifier/global-ctors-2.ll checks we have a friendly error message.

For bitcode, lib/IR/AutoUpgrade UpgradeGlobalVariables will upgrade the
2-field form (add i8* null as the third field).

Reviewed By: rnk, dexonsmith

Differential Revision: https://reviews.llvm.org/D61547

llvm-svn: 360742
2019-05-15 02:35:32 +00:00
Leonard Chan 0cdd3b1d81 [NewPM] Port HWASan and Kernel HWASan
Port hardware assisted address sanitizer to new PM following the same guidelines as msan and tsan.

Changes:
- Separate HWAddressSanitizer into a pass class and a sanitizer class.
- Create new PM wrapper pass for the sanitizer class.
- Use the getOrINsert pattern for some module level initialization declarations.
- Also enable kernel-kwasan in new PM
- Update llvm tests and add clang test.

Differential Revision: https://reviews.llvm.org/D61709

llvm-svn: 360707
2019-05-14 21:17:21 +00:00
Florian Hahn 53c9d585b5 [LICM] Allow AliasSetMap to contain top-level loops.
When an outer loop gets deleted by a different pass, before LICM visits
it, we cannot clean up its sub-loops in AliasSetMap, because at the
point we receive the deleteAnalysisLoop callback for the outer loop, the loop
object is already invalid and we cannot access its sub-loops any longer.

Reviewers: asbirlea, sanjoy, chandlerc

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D61904

llvm-svn: 360704
2019-05-14 19:41:36 +00:00
Alina Sbirlea 80c6e79602 [MemorySSA] LoopSimplify preserves MemorySSA only when flag is flipped.
LoopSimplify can preserve MemorySSA after r360270.
But the MemorySSA analysis is retrieved and preserved only when the
EnableMSSALoopDependency is set to true. Use the same conditional to
mark the pass as preserved, otherwise subsequent passes will get an
invalid analysis.
Resolves PR41853.

llvm-svn: 360697
2019-05-14 18:07:18 +00:00
Philip Reames 75ad8c5d63 Fix a release mode warning introduced in r360694
llvm-svn: 360696
2019-05-14 17:50:06 +00:00
Philip Reames bd8d309111 [IndVars] Extend reasoning about loop invariant exits to non-header blocks
Noticed while glancing through the code for other reasons.  The extension is trivial enough, decided to just do it.

llvm-svn: 360694
2019-05-14 17:20:10 +00:00
Cameron McInally 7c5c0c9fe5 Support FNeg in SpeculativeExecution pass
Differential Revision: https://reviews.llvm.org/D61910

llvm-svn: 360692
2019-05-14 16:51:18 +00:00
Tim Northover ed9117f88d GlobalOpt: do not promote globals used atomically to constants.
Some atomic loads are implemented as cmpxchg (particularly if large or
floating), and that usually requires write access to the memory involved
or it will segfault.

We can still propagate the constant value to users we understand though.

llvm-svn: 360662
2019-05-14 11:03:13 +00:00
Simon Pilgrim 15842132d5 [MemorySanitizer] getMMXVectorTy - assert valid element size. NFCI.
Fixes scan-build warnings

llvm-svn: 360658
2019-05-14 10:29:18 +00:00
Gor Nishanov d64455cd43 [coroutines] Fix spills of static array allocas
Summary:
CoroFrame was not considering static array allocas, and was only ever reserving a single element in the coroutine frame.
This meant that stores to the non-zero'th element would corrupt later frame data.

Store static array allocas as field arrays in the coroutine frame.

Added test.

Committed by Gor Nishanov on behalf of ben-clayton
Reviewers: GorNishanov, modocache

Reviewed By: GorNishanov

Subscribers: Orlando, capn, EricWF, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61372

llvm-svn: 360636
2019-05-13 23:58:24 +00:00
Sanjay Patel 760f61ab36 [InstCombine] try harder to form rotate (funnel shift) (PR20750)
We have a similar match for patterns ending in a truncate. This
should be ok for all targets because the default expansion would
still likely be better from replacing 2 'and' ops with 1.

Attempt to show the logic equivalence in Alive (which doesn't
currently have funnel-shift in its vocabulary AFAICT):

  %shamt = zext i8 %i to i32
  %m = and i32 %shamt, 31
  %neg = sub i32 0, %shamt
  %and4 = and i32 %neg, 31
  %shl = shl i32 %v, %m
  %shr = lshr i32 %v, %and4
  %or = or i32 %shr, %shl
  =>
  %a = and i8 %i, 31
  %shamt2 = zext i8 %a to i32
  %neg2 = sub i32 0, %shamt2
  %and4 = and i32 %neg2, 31
  %shl = shl i32 %v, %shamt2
  %shr = lshr i32 %v, %and4
  %or = or i32 %shr, %shl

https://rise4fun.com/Alive/V9r

llvm-svn: 360605
2019-05-13 17:28:19 +00:00
Amara Emerson e5248e6b41 Revert "[LSR] Tweak setup cost depth threshold to 10."
Changing the threshold might not be the best long term approach. Revert for now.

llvm-svn: 360589
2019-05-13 15:37:18 +00:00
Teresa Johnson 37b80122bd [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).

Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.

Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59709

llvm-svn: 360466
2019-05-10 20:08:24 +00:00
Cameron McInally e75412ab47 Add InstCombine::visitFNeg(...)
Differential Revision: https://reviews.llvm.org/D61784

llvm-svn: 360461
2019-05-10 20:01:04 +00:00
Simon Pilgrim 6c3ae79e9b [SLP] Refactor VectorizableTree to use unique_ptr.
This patch fixes the TreeEntry dangling pointer issue caused by reallocations of VectorizableTree.

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D61706

llvm-svn: 360456
2019-05-10 18:55:17 +00:00
Amara Emerson b6af291772 [LSR] Tweak setup cost depth threshold to 10.
The original change introduced a depth limit of 7 which caused a 22% regression
in the Swift MapReduceLazyCollection & Ackermann benchmarks. This new threshold
still ensures that the original test case doesn't hang.

rdar://50359639

llvm-svn: 360444
2019-05-10 17:29:35 +00:00
Michael Liao b284414a1b [InferAddressSpaces] Enhance the handling of cosntexpr.
Summary:
- Constant expressions may not be added in strict postorder as the
  forward instruction scan order. Thus, for a constant express (CE0), if
  its operand (CE1) is used in an previous instruction, they are not in
  postorder. However, different from
  `cloneInstructionWithNewAddressSpace`,
  `cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred
  instructions for later resolving. That results in failure of inferring
  constant address.
- This patch adds the support to infer constant expression operand
  recursively, since there won't be loop, if that operand is another
  constant expression.

Reviewers: arsenm

Subscribers: jholewinski, jvesely, wdng, nhaehnle, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61760

llvm-svn: 360431
2019-05-10 14:57:42 +00:00
Jeremy Morse a2b780b731 [DebugInfo] Use zero linenos for debug intrinsics when promoting dbg.declare
In certain circumstances, optimizations pick line numbers from debug
intrinsic instructions as the new location for altered instructions. This
is problematic because the line number of a debugging intrinsic is
meaningless (it doesn't produce any machine instruction), only the scope
information is valid. The result can be the line number of a variable
declaration "leaking" into real code from debugging intrinsics, making the
line table un-necessarily jumpy, and potentially different with / without
variable locations.

Fix this by using zero line numbers when promoting dbg.declare intrinsics
into dbg.values: this is safe for debug intrinsics as their line numbers
are meaningless, and reduces the scope for damage / misleading stepping
when optimizations pick locations from the wrong place.

Differential Revision: https://reviews.llvm.org/D59272

llvm-svn: 360415
2019-05-10 10:03:41 +00:00
David Stuttard 411488b11e [CodeGenPrepare] Limit recursion depth for collectBitParts
Summary:
Seeing some issues for windows debug pathological cases with collectBitParts
recursion (1525 levels of recursion!)
Setting the limit to 64 as this should be sufficient - passes all lit cases

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61728

Change-Id: I7f44cdc6c1badf1c2ccbf1b0c4b6afe27ecb39a1
llvm-svn: 360347
2019-05-09 15:02:10 +00:00
Craig Topper 51a17df45d [InstCombine] When turning sext into zext due to known bits, return the new ZExt instead of calling replaceinstuseswith
The worklist loop that we're returning back to should be able to do the repacement itself. This is how we normally do replacements.

My main motivation was that I observed that we weren't preserving the name of the result when we do this transform. The replacement code in the worklist loop will call takeName as part of the replacement.

Differential Revision: https://reviews.llvm.org/D61695

llvm-svn: 360284
2019-05-08 20:59:21 +00:00
Alina Sbirlea 458c7339e1 [NewPassManager] Add tuning option: SLPVectorization [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61616

llvm-svn: 360276
2019-05-08 17:58:35 +00:00
Alina Sbirlea f31eba6494 [MemorySSA] Teach LoopSimplify to preserve MemorySSA.
Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.
Update tests.

Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60833

llvm-svn: 360270
2019-05-08 17:05:36 +00:00
David Greene 6c433713e9 [Reassociation] Place moved instructions after landing pads
Reassociation's NegateValue moved instructions to the beginning of
blocks (after PHIs) without checking for exception handling pads.
It's possible for reassociation to move something into an exception
handling block so we need to make sure we don't move things too early
in the block.  This change advances the insertion point past any
exception handling pads.

If the block we want to move into contains a catchswitch, we cannot
move into it.  In that case just create a new neg as if we had not
found an existing neg to move.

Differential Revision: https://reviews.llvm.org/D61089

llvm-svn: 360262
2019-05-08 15:44:24 +00:00
Simon Pilgrim cced3ecc35 [VPlan] Fix "value never used" static analyzer warning. NFCI.
llvm-svn: 360241
2019-05-08 10:52:26 +00:00
Florian Hahn 3c696b3e7c [SCCP] Fix crash when trying to constant-fold terminators multiple times.
If we fold a branch/switch to an unconditional branch to another dead block we
replace the branch with unreachable, to avoid attempting to fold the
unconditional branch.

Reviewers: davide, efriedma, mssimpso, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D61300

llvm-svn: 360232
2019-05-08 09:09:54 +00:00
Kostya Serebryany b9c5768302 revert r360162 as it breaks most of the buildbots
llvm-svn: 360190
2019-05-07 20:57:11 +00:00
Robert Lougher 8681ef8f41 [InstCombine] Add new combine to add folding
(X | C1) + C2 --> (X | C1) ^ C1 iff (C1 == -C2)

I verified the correctness using Alive:
https://rise4fun.com/Alive/YNV

This transform enables the following transform that already exists in
instcombine:

(X | Y) ^ Y --> X & ~Y

As a result, the full expected transform is:

(X | C1) + C2 --> X & ~C1 iff (C1 == -C2)

There already exists the transform in the sub case:

(X | Y) - Y --> X & ~Y

However this does not trigger in the case where Y is constant due to an earlier
transform:

X - (-C) --> X + C

With this new add fold, both the add and sub constant cases are handled.

Patch by Chris Dawson.

Differential Revision: https://reviews.llvm.org/D61517

llvm-svn: 360185
2019-05-07 19:36:41 +00:00
Sanjay Patel 6a281a7545 [InstCombine] allow sinking fneg operands through an FP min/max
Fundamentally/generally, we should not have to rely on bailouts/crippling of
folds. In this particular case, I think we always recognize the inverted
predicate min/max pattern, so there should not be any loss of optimization.
Codegen looks better because we are eliminating an fneg.

llvm-svn: 360180
2019-05-07 18:58:07 +00:00
Orlando Cazalet-Hyams 78a6062c24 [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel

Reviewed By: hfinkel

Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

llvm-svn: 360162
2019-05-07 15:37:38 +00:00
Orlando Cazalet-Hyams 0d05177337 Test commit access
llvm-svn: 360125
2019-05-07 09:30:55 +00:00
Fangrui Song a400ca3f3d [SanitizerCoverage] Use different module ctor names for trace-pc-guard and inline-8bit-counters
Fixes the main issue in PR41693

When both modes are used, two functions are created:
`sancov.module_ctor`, `sancov.module_ctor.$LastUnique`, where
$LastUnique is the current LastUnique counter that may be different in
another module.

`sancov.module_ctor.$LastUnique` belongs to the comdat group of the same
name (due to the non-null third field of the ctor in llvm.global_ctors).

    COMDAT group section [    9] `.group' [sancov.module_ctor] contains 6 sections:
       [Index]    Name
       [   10]   .text.sancov.module_ctor
       [   11]   .rela.text.sancov.module_ctor
       [   12]   .text.sancov.module_ctor.6
       [   13]   .rela.text.sancov.module_ctor.6
       [   23]   .init_array.2
       [   24]   .rela.init_array.2

    # 2 problems:
    # 1) If sancov.module_ctor in this module is discarded, this group
    # has a relocation to a discarded section. ld.bfd and gold will
    # error. (Another issue: it is silently accepted by lld)
    # 2) The comdat group has an unstable name that may be different in
    # another translation unit. Even if the linker allows the dangling relocation
    # (with --noinhibit-exec), there will be many undesired .init_array entries
    COMDAT group section [   25] `.group' [sancov.module_ctor.6] contains 2 sections:
       [Index]    Name
       [   26]   .init_array.2
       [   27]   .rela.init_array.2

By using different module ctor names, the associated comdat group names
will also be different and thus stable across modules.

Reviewed By: morehouse, phosek

Differential Revision: https://reviews.llvm.org/D61510

llvm-svn: 360107
2019-05-07 01:39:37 +00:00
Jordan Rupprecht 8f14e7cacf Revert "Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
This reverts r357452 (git commit 21eb771dcb).

This was causing strange optimization-related test failures on an internal test. Will followup with more details offline.

llvm-svn: 360086
2019-05-06 21:55:05 +00:00
Sanjay Patel a6019d5164 [InstCombine] sink FP negation of operands through select
We don't always get this:

Cond ? -X : -Y --> -(Cond ? X : Y)

...even with the legacy IR form of fneg in the case with extra uses,
and we miss matching with the newer 'fneg' instruction because we
are expecting binops through the rest of the path.

Differential Revision: https://reviews.llvm.org/D61604

llvm-svn: 360075
2019-05-06 20:34:05 +00:00
Simon Pilgrim 364ef5db2b Pull out repeated CI->getCalledFunction() calls. NFCI.
llvm-svn: 360070
2019-05-06 19:51:54 +00:00
Sanjay Patel a64bd09ec4 [InstCombine] reduce code duplication; NFC
llvm-svn: 360059
2019-05-06 17:39:18 +00:00
Sanjay Patel 62f457b137 [InstCombine] reduce code duplication; NFCI
llvm-svn: 360051
2019-05-06 15:35:02 +00:00
Simon Pilgrim 97fbc2abfe [LoadStoreVectorizer] vectorizeStoreChain - ensure we find a store type.
Properly initialize store type to null then ensure we find a real store type in the chain.

Fixes scan-build null dereference warning and makes the code clearer.

llvm-svn: 360031
2019-05-06 10:25:11 +00:00
Clement Courbet 9e1f2a7fe7 [SimplifyLibCalls] Simplify bcmp too.
Summary: Fixes PR40699.

Reviewers: gchatelet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61585

llvm-svn: 360021
2019-05-06 09:15:22 +00:00
Markus Lavin a778074165 [DebugInfo] GlobalOpt DW_OP_deref_size instead of DW_OP_deref.
Optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert
DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian
targets for same reasons as in D59687.

Differential Revision: https://reviews.llvm.org/D60611

llvm-svn: 360013
2019-05-06 07:20:56 +00:00
Roman Lebedev 1a1b922177 [NFC] BasicBlock: refactor changePhiUses() out of replacePhiUsesWith(), use it
Summary:
It is a common thing to loop over every `PHINode` in some `BasicBlock`
and change old `BasicBlock` incoming block to a new `BasicBlock` incoming block.
`replaceSuccessorsPhiUsesWith()` already had code to do that,
it just wasn't a function.
So outline it into a new function, and use it.

Reviewers: chandlerc, craig.topper, spatel, danielcdh

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61013

llvm-svn: 359996
2019-05-05 18:59:39 +00:00
Roman Lebedev e3b1d82b53 [NFC] PHINode: introduce replaceIncomingBlockWith() function, use it
Summary:
There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()`
and `PHINode::getNumOperands()`, but no function to replace every
specified `BasicBlock*` predecessor with some other specified `BasicBlock*`.
Clearly, there are a lot of places that could use that functionality.

Reviewers: chandlerc, craig.topper, spatel, danielcdh

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61011

llvm-svn: 359995
2019-05-05 18:59:30 +00:00
Roman Lebedev 7ad5d14f3a [NFC] Instruction: introduce replaceSuccessorWith() function, use it
Summary:
There is `Instruction::getNumSuccessors()`, `Instruction::getSuccessor()`
and `Instruction::setSuccessor()`, but no function to replace every
specified `BasicBlock*` successor with some other specified `BasicBlock*`.
I've found one place where it should clearly be used.

Reviewers: chandlerc, craig.topper, spatel, danielcdh

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61010

llvm-svn: 359994
2019-05-05 18:59:22 +00:00
Roman Lebedev e5be660e25 [NFC][Utils] deleteDeadLoop(): add an assert that exit block has some non-PHI instruction
Summary:
If `deleteDeadLoop()` is called on such a loop, that has "bad" exit block,
one that e.g. has no terminator instruction, the `DIBuilder::insertDbgValueIntrinsic()`
will be told to insert the Dbg Value Intrinsic after `nullptr`
(since there is no first non-PHI instruction), which will cause it to not insert
those instructions into any basic block. The instructions will be parent-less,
and IR verifier will complain. It is rather obvious to track down the root cause
when that happens, so let's just assert it never happens.

Reviewers: sanjoy, davide, vsk

Reviewed By: vsk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61008

llvm-svn: 359993
2019-05-05 18:59:12 +00:00
Simon Pilgrim afb0e664e6 [SLPVectorizer] Prefer pre-increments. NFCI.
llvm-svn: 359989
2019-05-05 17:53:09 +00:00
Simon Pilgrim 5b05f20a3a [SLPVectorizer] Make getSpillCost() const. NFCI.
Ideally getTreeCost() should be const as well but non-const Type creation would need to be addressed first.

llvm-svn: 359975
2019-05-05 10:37:38 +00:00
Simon Pilgrim 7a2e855a0f Move Value *RHSCIOp def into the scope where its actually used. NFCI.
llvm-svn: 359973
2019-05-05 10:27:45 +00:00
Bob Haarman a78ab77b6b remove inalloca parameters in globalopt and simplify argpromotion
Summary:
Inalloca parameters require special handling in some optimizations.
This change causes globalopt to strip the inalloca attribute from
function parameters when it is safe to do so, removes the special
handling for inallocas from argpromotion, and replaces it with a
simple check that causes argpromotion to skip functions that receive
inallocas (for when the pass is invoked on code that didn't run
through globalopt first). This also avoids a case where argpromotion
would incorrectly try to pass an inalloca in a register.

Fixes PR41658.

Reviewers: rnk, efriedma

Reviewed By: rnk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61286

llvm-svn: 359743
2019-05-02 00:37:36 +00:00
Hiroshi Yamauchi 1620104034 [PGO][CHR] A bug fix.
Summary: Fix a transformation bug where two scopes share a common instrution to hoist.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61405

llvm-svn: 359736
2019-05-01 22:49:52 +00:00
Philip Reames 84e54eb471 [InstCombine] Limit a vector demanded elts rule which was producing invalid IR.
The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design).  It turns out that the LangRef disallows such cases when indexing structs.  The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes.

This should fix https://bugs.llvm.org/show_bug.cgi?id=41624.

llvm-svn: 359633
2019-04-30 23:09:26 +00:00
Alina Sbirlea 4e1ac95cf5 [PassManagerBuilder] Add option for interleaved loops, for loop vectorize.
Summary:
Match NewPassManager behavior: add option for interleaved loops in the
old pass manager, and use that instead of the flag used to disable loop unroll.
No changes in the defaults.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61030

llvm-svn: 359615
2019-04-30 21:29:20 +00:00
Evandro Menezes ea349f3ef5 [SimplifyLibCalls] Clean up code (NFC)
Fix pointer check after dereferencing (PR41665).

llvm-svn: 359595
2019-04-30 18:35:38 +00:00
Alexander Potapenko 06d00afa61 MSan: handle llvm.lifetime.start intrinsic
Summary:
When a variable goes into scope several times within a single function
or when two variables from different scopes share a stack slot it may
be incorrect to poison such scoped locals at the beginning of the
function.
In the former case it may lead to false negatives (see
https://github.com/google/sanitizers/issues/590), in the latter - to
incorrect reports (because only one origin remains on the stack).

If Clang emits lifetime intrinsics for such scoped variables we insert
code poisoning them after each call to llvm.lifetime.start().
If for a certain intrinsic we fail to find a corresponding alloca, we
fall back to poisoning allocas for the whole function, as it's now
impossible to tell which alloca was missed.

The new instrumentation may slow down hot loops containing local
variables with lifetime intrinsics, so we allow disabling it with
-mllvm -msan-handle-lifetime-intrinsics=false.

Reviewers: eugenis, pcc

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60617

llvm-svn: 359536
2019-04-30 08:35:14 +00:00
Sanjay Patel a706b9a90e [InstCombine] reduce code duplication; NFC
Follow-up to:
rL359482

Avoid this potential problem throughout by giving the type a name
and verifying the assumption that both operands are the same type.

llvm-svn: 359485
2019-04-29 19:23:44 +00:00
Simon Pilgrim e3c8776172 [InstCombine] visitFCmpInst - appease copy+paste pattern warning. NFCI.
PVS Studio's copy+paste recognizer was seeing this as a typo, technically Op0/Op1 in a fcmp should always be the same type, but we might as well avoid the issue.

Reported in https://www.viva64.com/en/b/0629/

llvm-svn: 359482
2019-04-29 18:52:19 +00:00
Quentin Colombet 31ce274207 [BlockExtractor] Expose a constructor for the group extraction
NFC

Differential Revision: https://reviews.llvm.org/D60971

llvm-svn: 359463
2019-04-29 16:14:02 +00:00
Quentin Colombet ae2cbb3400 [BlockExtractor] Change the basic block separator from ',' to ';'
This change aims at making the file format be compatible with the
way LLVM handles command line options.

Differential Revision: https://reviews.llvm.org/D60970

llvm-svn: 359462
2019-04-29 16:14:00 +00:00
Yevgeny Rouban 0822bfc6de [LoopSimplifyCFG] Suppress expensive DomTree verification
This patch makes verification level lower for builds with
inexpensive checks.

Differential Revision: https://reviews.llvm.org/D61055

llvm-svn: 359446
2019-04-29 13:29:55 +00:00
Sven van Haastregt 66f612601d [InferAddressSpaces] Add AS parameter to the pass factory
This enables the pass to be used in the absence of
TargetTransformInfo. When the argument isn't passed, the factory
defaults to UninitializedAddressSpace and the flat address space is
obtained from the TargetTransformInfo as before this change. Existing
users won't have to change.

Patch by Kevin Petit.

Differential Revision: https://reviews.llvm.org/D60602

llvm-svn: 359290
2019-04-26 09:21:25 +00:00
Justin Bogner df5d2b3846 [GlobalOpt] Swap the expensive check for cold calls with the cheap TTI check
isValidCandidateForColdCC is much more expensive than
TTI.useColdCCForColdCall, which by default just returns false. Avoid
doing this work if we're not going to look at the answer anyway.

This change is NFC, but I see significant compile time improvements on
some code with pathologically many functions.

llvm-svn: 359253
2019-04-26 00:12:50 +00:00
David Blaikie 0c4dbf9ecd Assigning to a local object in a return statement prevents copy elision. NFC.
I added a diagnostic along the lines of `-Wpessimizing-move` to detect `return x = y` suppressing copy elision, but I don't know if the diagnostic is really worth it. Anyway, here are the places where my diagnostic reported that copy elision would have been possible if not for the assignment.

P1155R1 in the post-San-Diego WG21 (C++ committee) mailing discusses whether WG21 should fix this pitfall by just changing the core language to permit copy elision in cases like these.

(Kona update: The bulk of P1155 is proceeding to CWG review, but specifically *not* the parts that explored the notion of permitting copy-elision in these specific cases.)

Reviewed By: dblaikie

Author: Arthur O'Dwyer

Differential Revision: https://reviews.llvm.org/D54885

llvm-svn: 359236
2019-04-25 20:09:00 +00:00
Akira Hatanaka 8edf8f317b [ObjC][ARC] Let ARC optimizer bail out if the number of pointer states
it keeps track of becomes too large

ARC optimizer does a top-down and a bottom-up traversal of the whole
function to pair up retain and release instructions and remove them.
This can be expensive if the number of instructions in the function and
pointer states it tracks are large since it has to look at each pointer
state and determine whether the instruction being visited can
potentially use the pointer.

This patch adds a command line option that sets a limit to the number of
pointers it tracks.

rdar://problem/49477063

Differential Revision: https://reviews.llvm.org/D61100

llvm-svn: 359226
2019-04-25 19:42:55 +00:00
Robert Lougher d469133f95 [Evaluator] Walk initial elements when handling load through bitcast
When evaluating a store through a bitcast, the evaluator tries to move the
bitcast from the pointer onto the stored value. If the cast is invalid, it
tries to "introspect" the type to get a valid cast by obtaining a pointer to
the initial element (if the type is nested, this may require walking several
initial elements).

In some situations it is possible to get a bitcast on a load (e.g. with
unions, where the bitcast may not be the same type as the store). However,
equivalent logic to the store to introspect the type is missing. This patch
add this logic.

Note, when developing the patch I was unhappy with adding similar logic
directly to the load case as it could get out of step. Instead, I have
abstracted the "introspection" into a helper function, with the specifics
being handled by a passed-in lambda function.

Differential Revision: https://reviews.llvm.org/D60793

llvm-svn: 359205
2019-04-25 17:00:01 +00:00
Simon Pilgrim 48a3b54572 [InstCombine][X86] Tweak generic expansion of PACKSS/PACKUS to shuffle then truncate. NFCI.
This has no effect on constant folding but will be useful when we expand non-saturating PACKSS/PACKUS intrinsics.

llvm-svn: 359191
2019-04-25 13:51:57 +00:00
Simon Pilgrim 4b7d3c4831 Fix include order. NFCI.
llvm-svn: 359177
2019-04-25 09:49:37 +00:00
Alina Sbirlea 733c8c40c8 Enable LoopVectorization by default.
Summary:
When refactoring vectorization flags, vectorization was disabled by default in the new pass manager.
This patch re-enables is for both managers, and changes the assumptions opt makes, based on the new defaults.
Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization.

Reviewers: chandlerc, jgorbe

Subscribers: jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61091

llvm-svn: 359167
2019-04-25 04:49:48 +00:00
Philip Reames 88cd69b56f Consolidate existing utilities for interpreting vector predicate maskes [NFC]
llvm-svn: 359163
2019-04-25 02:30:17 +00:00
Kit Barton 8e64f0a649 Fix unused variable warning in LoopFusion pass.
Do not wrap the contents of printFusionCandidates in the LLVM_DEBUG macro. This
fixes an unused variable warning generated when compiling without asserts but
with -DENABLE_LLVM_DUMP.

Differential Revision: https://reviews.llvm.org/D61035

llvm-svn: 359161
2019-04-25 02:10:02 +00:00
Philip Reames 7c8647b26f [InstCombine] Be consistent w/handling of masked intrinsics style wise [NFC]
llvm-svn: 359160
2019-04-25 01:18:56 +00:00
Alexey Bataev ef3c1884ec [SLP] Fix crash after r358519, by V. Porpodas.
Summary: The code did not check if operand was undef before casting it to Instruction.

Reviewers: RKSimon, ABataev, dtemirbulatov

Reviewed By: ABataev

Subscribers: uabelho

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61024

llvm-svn: 359136
2019-04-24 20:21:32 +00:00
Simon Pilgrim 55f14dac74 [InstCombine][X86] Use generic expansion of PACKSS/PACKUS for constant folding. NFCI.
This patch rewrites the existing PACKSS/PACKUS constant folding code to expand as a generic expansion.

This is a first NFCI step toward expanding PACKSS/PACKUS intrinsics which are acting as non-saturating truncations (although technically the expansion could be used in all cases - but we'll probably want to be conservative).

llvm-svn: 359111
2019-04-24 16:53:17 +00:00
Bjorn Pettersson 71e8c6f20f Add "const" in GetUnderlyingObjects. NFC
Summary:
Both the input Value pointer and the returned Value
pointers in GetUnderlyingObjects are now declared as
const.

It turned out that all current (in-tree) uses of
GetUnderlyingObjects were trivial to update, being
satisfied with have those Value pointers declared
as const. Actually, in the past several of the users
had to use const_cast, just because of ValueTracking
not providing a version of GetUnderlyingObjects with
"const" Value pointers. With this patch we get rid
of those const casts.

Reviewers: hfinkel, materi, jkorous

Reviewed By: jkorous

Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61038

llvm-svn: 359072
2019-04-24 06:55:50 +00:00
Fangrui Song b5f3984541 [CommandLine] Provide parser<unsigned long> instantiation to allow cl::opt<uint64_t> on LP64 platforms
Summary:
And migrate opt<unsigned long long> to opt<uint64_t>

Fixes PR19665

Differential Revision: https://reviews.llvm.org/D60933

llvm-svn: 359068
2019-04-24 02:40:20 +00:00
Dmitry Mikulin 312b5f86b7 The error message for mismatched value sites is very cryptic.
Make it more readable for an average user.

Differential Revision: https://reviews.llvm.org/D60896

llvm-svn: 359043
2019-04-23 22:26:55 +00:00
Alina Sbirlea 4fd1f266b1 [MemorySSA] LCSSA preserves MemorySSA.
Summary:
Enabling MemorySSA in the old pass manager leads to MemorySSA being run
twice due to the fact that LCSSA and LoopSimplify do not preserve
MemorySSA. This is the first step to address that: target LCSSA.

LCSSA does not make any changes that invalidate MemorySSA, so it
preserves it by design. It must preserve AA as well, for this to hold.

After this patch, MemorySSA is still run twice in the old pass manager.
Step two follows: target LoopSimplify.

Subscribers: mehdi_amini, jlebar, Prazek, llvm-commits, george.burgess.iv, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60832

llvm-svn: 359032
2019-04-23 20:59:44 +00:00
Akira Hatanaka 5c3117b0a9 [ObjC][ARC] Check the basic block size before calling
DominatorTree::dominate.

ARC contract pass has an optimization that replaces the uses of the
argument of an ObjC runtime function call with the call result.

For example:

; Before optimization
%1 = tail call i8* @foo1()
%2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1)
store i8* %1, i8** @g0, align 8

; After optimization
%1 = tail call i8* @foo1()
%2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1)
store i8* %2, i8** @g0, align 8 // %1 is replaced with %2

Before replacing the argument use, DominatorTree::dominate is called to
determine whether the user instruction is dominated by the ObjC runtime
function call instruction. The call to DominatorTree::dominate can be
expensive if the two instructions belong to the same basic block and the
size of the basic block is large. This patch checks the basic block size
and just bails out if the size exceeds the limit set by command line
option "arc-contract-max-bb-size".

rdar://problem/49477063

Differential Revision: https://reviews.llvm.org/D60900

llvm-svn: 359027
2019-04-23 19:49:03 +00:00
Philip Reames 2ce017026a [InstCombine] Convert a masked.load of a dereferenceable address to an unconditional load
If we have a masked.load from a location we know to be dereferenceable, we can simply issue a speculative unconditional load against that address. The key advantage is that it produces IR which is well understood by the optimizer. The select (cnd, load, passthrough) form produced should be pattern matchable back to hardware predication if profitable.

Differential Revision: https://reviews.llvm.org/D59703

llvm-svn: 359000
2019-04-23 15:25:14 +00:00
Fangrui Song efd94c56ba Use llvm::stable_sort
While touching the code, simplify if feasible.

llvm-svn: 358996
2019-04-23 14:51:27 +00:00
Fedor Sergeev 652168a99b [CallSite removal] move InlineCost to CallBase usage
Converting InlineCost interface and its internals into CallBase usage.
Inliners themselves are still not converted.

Reviewed By: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60636

llvm-svn: 358982
2019-04-23 12:43:27 +00:00
David Green 63a2aa715a [LSR] Limit the recursion for setup cost
In some circumstances we can end up with setup costs that are very complex to
compute, even though the scevs are not very complex to create. This can also
lead to setupcosts that are calculated to be exactly -1, which LSR treats as an
invalid cost. This patch puts a limit on the recursion depth for setup cost to
prevent them taking too long.

Thanks to @reames for the report and test case.

Differential Revision: https://reviews.llvm.org/D60944

llvm-svn: 358958
2019-04-23 08:52:21 +00:00
Philip Reames d748689c7f [InstCombine] Eliminate stores to constant memory
If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store.

The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead.

Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store *is undefined*, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date.

Differential Revision: https://reviews.llvm.org/D60659

llvm-svn: 358919
2019-04-22 20:28:19 +00:00
Philip Reames d8d9b7b20e [InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify from InstCombine
In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask.  

llvm-svn: 358913
2019-04-22 19:30:01 +00:00
Justin Bogner e90d5c8db0 [IPSCCP] Add missing `AssumptionCacheTracker` dependency
Back in August, r340525 introduced a dependency on the assumption
cache tracker in the ipsccp pass, but that commit missed a call to
INITIALIZE_PASS_DEPENDENCY, which leaves the assumption cache
improperly registered if SCCP is the only thing that pulls it in.

llvm-svn: 358903
2019-04-22 17:38:29 +00:00
Philip Reames 37104d7189 [LPM/BPI] Preserve BPI through trivial loop pass pipeline (e.g. LCSSA, LoopSimplify)
Currently, we do not expose BPI to loop passes at all. In the old pass manager, we appear to have been ignoring the fact that LCSSA and/or LoopSimplify didn't preserve BPI, and making it available to the following loop passes anyways.  In the new one, it's invalidated before running any loop pass if either LCSSA or LoopSimplify actually make changes. If they don't make changes, then BPI is valid and available.  So, we go ahead and teach LCSSA and LoopSimplify how to preserve BPI for consistency between old and new pass managers.

This patch avoids an invalidation between the two requires in the following trivial pass pipeline:
opt -passes="requires<branch-prob>,loop(no-op-loop),requires<branch-prob>"
(when the input file is one which requires either LCSSA or LoopSimplify to canonicalize the loops)

Differential Revision: https://reviews.llvm.org/D60790

llvm-svn: 358901
2019-04-22 17:13:43 +00:00
Nikita Popov 5aacc7a573 Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC"
This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445.

After thinking about this more, this isn't right, the range is not exact
in the same sense as makeExactICmpRegion(). This needs a separate
function.

llvm-svn: 358876
2019-04-22 09:01:38 +00:00
Nikita Popov 5299e25f50 [ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC
Following D60632 makeGuaranteedNoWrapRegion() always returns an
exact nowrap region. Rename the function accordingly. This is in
line with the naming of makeExactICmpRegion().

llvm-svn: 358875
2019-04-22 08:36:05 +00:00
Luqman Aden 2993661cc0 [CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw.
Summary:
Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative.

Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181)

Reviewers: sanjoy, apilipenko, nikic

Reviewed By: nikic

Subscribers: hiraditya, jfb, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60036

llvm-svn: 358816
2019-04-20 13:14:18 +00:00
Vedant Kumar 282b26ec4d [GVN+LICM] Use line 0 locations for better crash attribution
This is a follow-up to r291037+r291258, which used null debug locations
to prevent jumpy line tables.

Using line 0 locations achieves the same effect, but works better for
crash attribution because it preserves the right inline scope.

Differential Revision: https://reviews.llvm.org/D60913

llvm-svn: 358791
2019-04-19 22:36:40 +00:00
Eric Christopher dfebd84eb3 Remove the EnableEarlyCSEMemSSA set of options from the legacy
and new pass managers. They were default to true and not being
used.

Differential Revision: https://reviews.llvm.org/D60747

llvm-svn: 358789
2019-04-19 22:18:53 +00:00
Alina Sbirlea 43709f7233 [LICM & MemorySSA] Make limit flags pass tuning options.
Summary:
Make the flags in LICM + MemorySSA tuning options in the old and new
pass managers.

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60490

llvm-svn: 358772
2019-04-19 17:46:50 +00:00
Alina Sbirlea 0499a2f961 [NewPassManager] Adding pass tuning options: loop vectorize.
Summary:
Trying to add the plumbing necessary to add tuning options to the new pass manager.
Testing with the flags for loop vectorize.

Reviewers: chandlerc

Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59723

llvm-svn: 358763
2019-04-19 16:11:59 +00:00
Fangrui Song 7137b54a03 [MergeFunc] Delete unused FunctionNode::release()
llvm-svn: 358742
2019-04-19 08:03:20 +00:00
Fangrui Song 884f557bb2 [MergeFunc] removeUsers: call remove() only on direct users
removeUsers uses a work list to collect indirect users and call remove()
on those functions. However it has a bug (`if (!Visited.insert(UU).second)`).

Actually, we don't have to collect indirect users.
After the merge of F and G, G's callers will be considered (added to
Deferred). If G's callers can be merged, G's callers' callers will be
considered.

Update the test unnamed-addr-reprocessing.ll to make it clear we can
still merge indirect callers.

llvm-svn: 358741
2019-04-19 07:57:51 +00:00
Chandler Carruth ce3f75df1f [CallSite removal] Move the legacy PM, call graph, and some inliner
code to `CallBase`.

This patch focuses on the legacy PM, call graph, and some of inliner and legacy
passes interacting with those APIs from `CallSite` to the new `CallBase` class.
No interesting changes.

Differential Revision: https://reviews.llvm.org/D60412

llvm-svn: 358739
2019-04-19 05:59:42 +00:00