Commit Graph

303 Commits

Author SHA1 Message Date
Gulfem Savrun Yeniceri 7c0e3a77bc [clang][IR] Add support for leaf attribute
This patch adds support for leaf attribute as an optimization hint
in Clang/LLVM.

Differential Revision: https://reviews.llvm.org/D90275
2020-12-14 14:48:17 -08:00
Fangrui Song b5ad32ef5c Migrate deprecated DebugLoc::get to DILocation::get
This migrates all LLVM (except Kaleidoscope and
CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get.

The CodeGen/StackProtector.cpp usage may have a nullptr Scope
and can trigger an assertion failure, so I don't migrate it.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D93087
2020-12-11 12:45:22 -08:00
Nick Desaulniers f4c6080ab8 Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch"
This reverts commit b7926ce6d7.

Going with a simpler approach.
2020-11-17 17:27:14 -08:00
Giorgis Georgakoudis 700d2417d8 [CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers
CodeExtractor handles bitcasts in the extracted region that have
lifetime markers users in the outer region as outputs. That
creates unnecessary alloca/reload instructions and extra lifetime
markers. The patch identifies those cases, and replaces uses in
out-of-region lifetime markers with new bitcasts in the outer region.

**Example**
```
define void @foo() {
entry:
  %0 = alloca i32
  br label %extract

extract:
  %1 = bitcast i32* %0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 4, i8* %1)
  call void @use(i32* %0)
  br label %exit

exit:
  call void @use(i32* %0)
  call void @llvm.lifetime.end.p0i8(i64 4, i8* %1)
  ret void
}
```

**Current extraction**
```
define void @foo() {
entry:
  %.loc = alloca i8*, align 8
  %0 = alloca i32, align 4
  br label %codeRepl

codeRepl:                                         ; preds = %entry
  %lt.cast = bitcast i8** %.loc to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast)
  %lt.cast1 = bitcast i32* %0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
  call void @foo.extract(i32* %0, i8** %.loc)
  %.reload = load i8*, i8** %.loc, align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* %lt.cast)
  br label %exit

exit:                                             ; preds = %codeRepl
  call void @use(i32* %0)
  call void @llvm.lifetime.end.p0i8(i64 4, i8* %.reload)
  ret void
}

define internal void @foo.extract(i32* %0, i8** %.out) {
newFuncRoot:
  br label %extract

exit.exitStub:                                    ; preds = %extract
  ret void

extract:                                          ; preds = %newFuncRoot
  %1 = bitcast i32* %0 to i8*
  store i8* %1, i8** %.out, align 8
  call void @use(i32* %0)
  br label %exit.exitStub
}
```

**Extraction with patch**
```
define void @foo() {
entry:
  %0 = alloca i32, align 4
  br label %codeRepl

codeRepl:                                         ; preds = %entry
  %lt.cast1 = bitcast i32* %0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
  call void @foo.extract(i32* %0)
  br label %exit

exit:                                             ; preds = %codeRepl
  call void @use(i32* %0)
  %lt.cast = bitcast i32* %0 to i8*
  call void @llvm.lifetime.end.p0i8(i64 4, i8* %lt.cast)
  ret void
}

define internal void @foo.extract(i32* %0) {
newFuncRoot:
  br label %extract

exit.exitStub:                                    ; preds = %extract
  ret void

extract:                                          ; preds = %newFuncRoot
  %1 = bitcast i32* %0 to i8*
  call void @use(i32* %0)
  br label %exit.exitStub
}
```

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D90689
2020-11-05 17:01:08 -08:00
Arthur Eubanks 5c31b8b94f Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit 10f2a0d662.

More uint64_t overflows.
2020-10-31 00:25:32 -07:00
Arthur Eubanks 10f2a0d662 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-30 10:03:46 -07:00
Nico Weber 2a4e704c92 Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c6.
Makes clang assert when building Chromium, see https://crbug.com/1142813
for a repro.
2020-10-27 09:26:21 -04:00
Arthur Eubanks e5766f25c6 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-26 20:24:04 -07:00
Nick Desaulniers b7926ce6d7 [IR] add fn attr for no_stack_protector; prevent inlining on mismatch
It's currently ambiguous in IR whether the source language explicitly
did not want a stack a stack protector (in C, via function attribute
no_stack_protector) or doesn't care for any given function.

It's common for code that manipulates the stack via inline assembly or
that has to set up its own stack canary (such as the Linux kernel) would
like to avoid stack protectors in certain functions. In this case, we've
been bitten by numerous bugs where a callee with a stack protector is
inlined into an __attribute__((__no_stack_protector__)) caller, which
generally breaks the caller's assumptions about not having a stack
protector. LTO exacerbates the issue.

While developers can avoid this by putting all no_stack_protector
functions in one translation unit together and compiling those with
-fno-stack-protector, it's generally not very ergonomic or as
ergonomic as a function attribute, and still doesn't work for LTO. See also:
https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/
https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u

Typically, when inlining a callee into a caller, the caller will be
upgraded in its level of stack protection (see adjustCallerSSPLevel()).
By adding an explicit attribute in the IR when the function attribute is
used in the source language, we can now identify such cases and prevent
inlining.  Block inlining when the callee and caller differ in the case that one
contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`.

Fixes pr/47479.

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D87956
2020-10-23 11:55:39 -07:00
Vedant Kumar 099bffe7f7 Revert "[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)"
This reverts commit 26ee8aff2b.

It's necessary to insert bitcast the pointer operand of a lifetime
marker if it has an opaque pointer type.

rdar://70560161
2020-10-22 12:25:50 -07:00
Atmn Patel 595c615606 [IR] Adds mustprogress as a LLVM IR attribute
This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D85393
2020-10-20 03:09:57 -04:00
Simon Pilgrim 8f0658ae67 [Transforms] CodeExtractor::verifyAssumptionCache - don't dereference a dyn_cast<>. NFCI.
Use cast<> as we immediately dereference the pointer afterwards - cast<> will assert if we fail.

Prevents clang static analyzer warning that we could deference a null pointer.
2020-10-08 19:04:30 +01:00
Vedant Kumar 26ee8aff2b [CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)
Lifetime marker intrinsics support any pointer type, so CodeExtractor
does not need to bitcast to `i8*` in order to use these markers.
2020-09-29 16:34:36 -07:00
Fangrui Song 6913812abc Fix some clang-tidy bugprone-argument-comment issues 2020-09-19 20:41:25 -07:00
Matt Arsenault 5e999cbe8d IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.

This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.

My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.

This is intended avoid some optimization issues with the current
handling of aggregate arguments, as well as fixes inflexibilty in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.

Background:

There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.

It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These argumets are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.

The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.

This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's real no advantage to an aligment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.

Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
betewen arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.

The immediate plan is to start using this new attribute to handle all
aggregate argumets for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.

Additional context is in D79744.
2020-07-20 10:23:09 -04:00
Gui Andrade ff7900d5de [LLVM] Accept `noundef` attribute in function definitions/calls
The `noundef` attribute indicates an argument or return value which
may never have an undef value representation.

This patch allows LLVM to parse the attribute.

Differential Revision: https://reviews.llvm.org/D83412
2020-07-08 19:02:04 +00:00
Yevgeny Rouban 8138487468 [BrachProbablityInfo] Set edge probabilities at once and fix calcMetadataWeights()
Hide the method that allows setting probability for particular edge
and introduce a public method that sets probabilities for all
outgoing edges at once.
Setting individual edge probability is error prone. More over it is
difficult to check that the total probability is 1.0 because there is
no easy way to know when the user finished setting all
the probabilities.

Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights().
Changing unreachable branch probabilities to raw(1) and distributing
the rest (oldProbability - raw(1)) over the reachable branches could
introduce total probability inaccuracy bigger than 1/numOfBranches.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
2020-05-21 12:52:37 +07:00
Eli Friedman 11aa3707e3 StoreInst should store Align, not MaybeAlign
This is D77454, except for stores.  All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.

Differential Revision: https://reviews.llvm.org/D79968
2020-05-15 12:26:58 -07:00
Nikita Popov f89f7da999 [IR] Convert null-pointer-is-valid into an enum attribute
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.

In the future, this attribute may be replaced with data layout
properties.

Differential Revision: https://reviews.llvm.org/D78862
2020-05-15 19:41:07 +02:00
Reid Kleckner 1370757dd0 Revert "[BrachProbablityInfo] Set edge probabilities at once. NFC."
This reverts commit eef95f2746.

The new assertion about branch propability sums does not hold.
2020-05-13 08:23:09 -07:00
Yevgeny Rouban eef95f2746 [BrachProbablityInfo] Set edge probabilities at once. NFC.
Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. More over
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
2020-05-13 13:55:36 +07:00
Zequan Wu cb22ab7403 Add nomerge function attribute to supress tail merge optimization in simplifyCFG
We want to add a way to avoid merging identical calls so as to keep the
separate debug-information for those calls. There is also an asan
usecase where having this attribute would be beneficial to avoid
alternative work-arounds.

Here is the link to the feature request:
https://bugs.llvm.org/show_bug.cgi?id=42783.

`nomerge` is different from `noline`. `noinline` prevents function from
inlining at callsites, but `nomerge` prevents multiple identical calls
from being merged into one.

This patch adds `nomerge` to disable the optimization in IR level. A
followup patch will be needed to let backend understands `nomerge` and
avoid tail merge at backend.

Reviewed By: asbirlea, rnk

Differential Revision: https://reviews.llvm.org/D78659
2020-05-12 16:49:20 -07:00
Arthur Eubanks 3b0450acec Add IR constructs for preallocated (inalloca replacement)
Add llvm.call.preallocated.{setup,arg} instrinsics.
Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup.
Add "preallocated" parameter attribute, which is like byval but without the copy.

Verifier changes for these IR constructs.

See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md

Subscribers: hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74651
2020-04-27 16:15:50 -07:00
Ehud Katz 64249f177e [CodeExtractor] Fix extraction of a value used only by intrinsics outside of region
We should only skip `lifetime` and `dbg` intrinsics when searching for users.
Other intrinsics are legit users that can't be ignored.

Without this fix, the testcase would result in an invalid IR. `memcpy`
will have a reference to the, now, external value (local to the
extracted loop function).

Fix PR42194

Differential Revision: https://reviews.llvm.org/D78749
2020-04-25 11:44:47 +03:00
Eli Friedman 565b56a72c [NFC] Clean up uses of LoadInst constructor. 2020-04-07 16:28:53 -07:00
Tyker c5ec8890c9 [NFC] Try fix ubsan buildbot after 876d133789 2020-03-03 17:53:02 +01:00
Johannes Doerfert 70cac41a2b Reapply "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
Reapply 8a56d64d76 with minor fixes.

The problem was that cancellation can cause new edges to the parallel
region exit block which is not outlined. The CodeExtractor will encode
the information which "exit" was taken as a return value. The fix is to
ensure we do not return any value from the outlined function, to prevent
control to value conversion we ensure a single exit block for the
outlined region.

This reverts commit 3aac953afa.
2020-02-12 22:29:07 -06:00
Johannes Doerfert 3aac953afa Revert "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
This reverts commit 8a56d64d76.

Will be recommitted once the clang test problem is addressed.
2020-02-12 18:50:43 -06:00
Johannes Doerfert 8a56d64d76 [OpenMP][IRBuilder] Perform finalization (incl. outlining) late
In order to fix PR44560 and to prepare for loop transformations we now
finalize a function late, which will also do the outlining late. The
logic is as before but the actual outlining step happens now after the
function was fully constructed. Once we have loop transformations we
can apply them in the finalize step before the outlining.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D74372
2020-02-12 17:55:01 -06:00
Vedant Kumar 8359511c62 [CodeExtractor] Remove stale llvm.assume calls from extracted region
During extraction, stale llvm.assume handles may be retained in the
original function. The setup is:

1) CodeExtractor unregisters assumptions in the blocks that are to be
   extracted.

2) Extraction happens. There are now two functions: f1 and f1.extracted.

3) Leftover assumptions in f1 (/not/ removed as they were not in the set of
   blocks to be extracted) now have affected-value llvm.assume handles in
   f1.extracted.

When assumptions for a value used in f1 are looked up, ValueTracking can assert
as some of the handles are in the wrong function. To fix this, simply erase the
llvm.assume calls in the extracted function.

Alternatives include flushing the assumption cache in the original function, or
walking all values used in the original function to prune stale affected-value
handles. Both seem more expensive.

Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled

rdar://58460728
2020-01-28 17:18:01 -08:00
Vedant Kumar 360abb7ee5 [CodeExtractor] Transfer debug info to extracted function
After extracting, fix up debug info in both the old and new functions by

1) Pointing line locations and debug intrinsics to the new subprogram
   scope, and

2) Deleting intrinsics which point to values outside of the new
   function.

Depends on https://reviews.llvm.org/D72795.

Testing: check-llvm, check-clang, a build of LNT in the `-Os -g` config
with "-mllvm -hot-cold-split=1" set, and end-to-end debugging of a toy
program which undergoes splitting to verify that lldb can find
variables, single step, etc. in extracted code.

rdar://45507940

Differential Revision: https://reviews.llvm.org/D72801
2020-01-15 15:38:36 -08:00
Simon Pilgrim 2740b2d5d5 Fix uninitialized value clang static analyzer warning. NFC. 2020-01-11 16:02:22 +00:00
Aditya Kumar 9d10b9d99b CodeExtractor: NFC: Use Range based loop
Reviewers: vsk, tejohnson, fhahn

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68924

llvm-svn: 374963
2019-10-16 01:50:21 +00:00
Vedant Kumar 9852699dcb [CodeExtractor] Factor out and reuse shrinkwrap analysis
Factor out CodeExtractor's analysis of allocas (for shrinkwrapping
purposes), and allow the analysis to be reused.

This resolves a quadratic compile-time bug observed when compiling
AMDGPUDisassembler.cpp.o.

Pre-patch (Release + LTO clang):

```
   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  176.5278 ( 57.8%)   0.4915 ( 18.5%)  177.0192 ( 57.4%)  177.4112 ( 57.3%)  Hot Cold Splitting
```

Post-patch (ReleaseAsserts clang):

```
   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  1.4051 (  3.3%)   0.0079 (  0.3%)   1.4129 (  3.2%)   1.4129 (  3.2%)  Hot Cold Splitting
```

Testing: check-llvm, and comparing the AMDGPUDisassembler.cpp.o binary
pre- vs. post-patch.

An alternate approach is to hide CodeExtractorAnalysisCache from clients
of CodeExtractor, and to recompute the analysis from scratch inside of
CodeExtractor::extractCodeRegion(). This eliminates some redundant work
in the shrinkwrapping legality check. However, some clients continue to
exhibit O(n^2) compile time behavior as computing the analysis is O(n).

rdar://55912966

Differential Revision: https://reviews.llvm.org/D68616

llvm-svn: 374089
2019-10-08 17:17:51 +00:00
Aditya Kumar 6a2673605e Invalidate assumption cache before outlining.
Subscribers: llvm-commits

Tags: #llvm

Reviewers: compnerd, vsk, sebpop, fhahn, tejohnson

Reviewed by: vsk

Differential Revision: https://reviews.llvm.org/D68478

llvm-svn: 373807
2019-10-04 22:46:42 +00:00
Aditya Kumar c4a7b912c2 [CodeExtractor] NFC: Refactor sanity checks into isEligible
Reviewers: fhahn

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68331

llvm-svn: 373479
2019-10-02 15:36:39 +00:00
Aditya Kumar b1fe6c90e6 NFC: directly return when CommonExitBlock != Succ
Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68330

llvm-svn: 373456
2019-10-02 12:15:17 +00:00
Evgeniy Stepanov c5e7f56249 ARM MTE stack sanitizer.
Add "memtag" sanitizer that detects and mitigates stack memory issues
using armv8.5 Memory Tagging Extension.

It is similar in principle to HWASan, which is a software implementation
of the same idea, but there are enough differencies to warrant a new
sanitizer type IMHO. It is also expected to have very different
performance properties.

The new sanitizer does not have a runtime library (it may grow one
later, along with a "debugging" mode). Similar to SafeStack and
StackProtector, the instrumentation pass (in a follow up change) will be
inserted in all cases, but will only affect functions marked with the
new sanitize_memtag attribute.

Reviewers: pcc, hctim, vitalybuka, ostannard

Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64169

llvm-svn: 366123
2019-07-15 20:02:23 +00:00
Stefan Stipanovic 0626367202 [Attributor] Deduce "nosync" function attribute.
Introduce and deduce "nosync" function attribute to indicate that a function
does not synchronize with another thread in a way that other thread might free memory.

Reviewers: jdoerfert, jfb, nhaehnle, arsenm

Subscribers: wdng, hfinkel, nhaenhle, mehdi_amini, steven_wu,
dexonsmith, arsenm, uenoku, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D62766

llvm-svn: 365830
2019-07-11 21:37:40 +00:00
Vedant Kumar 5eb6ba060a [CodeExtractor] Fix sinking of allocas with multiple bitcast uses (PR42451)
An alloca which can be sunk into the extraction region may have more
than one bitcast use. Move these uses along with the alloca to prevent
use-before-def.

Testing: check-llvm, stage2 build of clang

Fixes llvm.org/PR42451.

Differential Revision: https://reviews.llvm.org/D64463

llvm-svn: 365660
2019-07-10 16:32:20 +00:00
Vedant Kumar f65f302cc7 [CodeExtractor] Simplify findAllocas, NFC
Split getLifetimeMarkers out into its own method and have it return a
struct.

Differential Revision: https://reviews.llvm.org/D64467

llvm-svn: 365659
2019-07-10 16:32:16 +00:00
Brian Homerding b4b21d807e Add, and infer, a nofree function attribute
This patch adds a function attribute, nofree, to indicate that a function does
not, directly or indirectly, call a memory-deallocation function (e.g., free,
C++'s operator delete).

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D49165

llvm-svn: 365336
2019-07-08 15:57:56 +00:00
Johannes Doerfert 3b77583e95 [Attr] Add "willreturn" function attribute
This patch introduces a new function attribute, willreturn, to indicate
that a call of this function will either exhibit undefined behavior or
comes back and continues execution at a point in the existing call stack
that includes the current invocation.

This attribute guarantees that the function does not have any endless
loops, endless recursion, or terminating functions like abort or exit.

Patch by Hideto Ueno (@uenoku)

Reviewers: jdoerfert

Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62801

llvm-svn: 364555
2019-06-27 15:51:40 +00:00
Quentin Colombet 474a9679bd [CodeExtractor] Add a few debug lines to understand why a region is not extracted
The CodeExtractor is not smart enough to compute which basic block is
the entry of a region. Instead it relies on the order of the list
of basic blocks that is handed to it and assumes that the entry
is the first block in the list.

Without the additional debug information, it is hard to understand
why a valid region does not get extracted, because we would miss
that the order of in the list just doesn't match what the CodeExtractor
wants.

NFC

llvm-svn: 358471
2019-04-16 02:12:05 +00:00
Fangrui Song 2c5c12c041 Change some dyn_cast to more apropriate isa. NFC
llvm-svn: 357773
2019-04-05 16:16:23 +00:00
Matt Arsenault caf1316f71 IR: Add immarg attribute
This indicates an intrinsic parameter is required to be a constant,
and should not be replaced with a non-constant value.

Add the attribute to all AMDGPU and generic intrinsics that comments
indicate it should apply to. I scanned other target intrinsics, but I
don't see any obvious comments indicating which arguments are intended
to be only immediates.

This breaks one questionable testcase for the autoupgrade. I'm unclear
on whether the autoupgrade is supposed to really handle declarations
which were never valid. The verifier fails because the attributes now
refer to a parameter past the end of the argument list.

llvm-svn: 355981
2019-03-12 21:02:54 +00:00
Vedant Kumar 5f5cac3ae2 [CodeExtractor] Do not lift lifetime.end markers for region inputs
If a lifetime.end marker occurs along one path through the extraction
region, but not another, then it's still incorrect to lift the marker,
because there is some path through the extracted function which would
ordinarily not reach the marker. If the call to the extracted function
is in a loop, unrolling can cause inputs to the function to become
optimized out as undef after the first iteration.

To prevent incorrect stack slot merging in the calling function, it
should be sufficient to lift lifetime.start markers for region inputs.
I've tested this theory out by doing a stage2 check-all with randomized
splitting enabled.

This is a follow-up to r353973, and there's additional context for this
change in https://reviews.llvm.org/D57834.

rdar://47896986

Differential Revision: https://reviews.llvm.org/D58253

llvm-svn: 354159
2019-02-15 18:46:58 +00:00
Vedant Kumar 4b0cc9a7c8 [CodeExtractor] Only lift lifetime markers present in the extraction region
When CodeExtractor finds liftime markers referencing inputs to the
extraction region, it lifts these markers out of the region and inserts
them around the call to the extracted function (see r350420, PR39671).

However, it should *only* lift lifetime markers that are actually
present in the extraction region. I.e., if a start marker is present in
the extraction region but a corresponding end marker isn't (or vice
versa), only the start marker (or end marker, resp.) should be lifted.

Differential Revision: https://reviews.llvm.org/D57834

llvm-svn: 353973
2019-02-13 19:53:38 +00:00
Vedant Kumar 0e5dd512aa [CodeExtractor] Restore outputs after creating exit stubs
When CodeExtractor saves the result of InvokeInst at the first insertion
point of the 'normal destination' basic block, this block can be omitted
in the outlined region, so store is placed outside of the function. The
suggested solution is to process saving outputs after creating exit
stubs for new function, and stores will be placed in that blocks before
return in this case.

Patch by Sergei Kachkov!

Fixes llvm.org/PR40455.

Differential Revision: https://reviews.llvm.org/D57919

llvm-svn: 353562
2019-02-08 20:48:04 +00:00
Sergey Dmitriev 807960e6ef [CodeExtractor] Update function's assumption cache after extracting blocks from it
Summary: Assumption cache's self-updating mechanism does not correctly handle the case when blocks are extracted from the function by the CodeExtractor. As a result function's assumption cache may have stale references to the llvm.assume calls that were moved to the outlined function. This patch fixes this problem by removing extracted llvm.assume calls from the function’s assumption cache.

Reviewers: hfinkel, vsk, fhahn, davidxl, sanjoy

Reviewed By: hfinkel, vsk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57215

llvm-svn: 353500
2019-02-08 06:55:18 +00:00
James Y Knight 14359ef1b6 [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

llvm-svn: 352911
2019-02-01 20:44:24 +00:00
Vedant Kumar 1c3694a4d4 [CodeExtractor] Add support for the `swifterror` attribute
When passing a `swifterror` argument or alloca as an input to an
extraction region, mark the input parameter `swifterror`.

llvm-svn: 352408
2019-01-28 19:13:37 +00:00
Julian Lettner b62e9dc46b Revert "[Sanitizers] UBSan unreachable incompatible with ASan in the presence of `noreturn` calls"
This reverts commit cea84ab93a.

llvm-svn: 352069
2019-01-24 18:04:21 +00:00
Julian Lettner cea84ab93a [Sanitizers] UBSan unreachable incompatible with ASan in the presence of `noreturn` calls
Summary:
UBSan wants to detect when unreachable code is actually reached, so it
adds instrumentation before every `unreachable` instruction. However,
the optimizer will remove code after calls to functions marked with
`noreturn`. To avoid this UBSan removes `noreturn` from both the call
instruction as well as from the function itself. Unfortunately, ASan
relies on this annotation to unpoison the stack by inserting calls to
`_asan_handle_no_return` before `noreturn` functions. This is important
for functions that do not return but access the the stack memory, e.g.,
unwinder functions *like* `longjmp` (`longjmp` itself is actually
"double-proofed" via its interceptor). The result is that when ASan and
UBSan are combined, the `noreturn` attributes are missing and ASan
cannot unpoison the stack, so it has false positives when stack
unwinding is used.

Changes:
  # UBSan now adds the `expect_noreturn` attribute whenever it removes
    the `noreturn` attribute from a function
  # ASan additionally checks for the presence of this attribute

Generated code:
```
call void @__asan_handle_no_return    // Additionally inserted to avoid false positives
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable
unreachable
```

The second call to `__asan_handle_no_return` is redundant. This will be
cleaned up in a follow-up patch.

rdar://problem/40723397

Reviewers: delcypher, eugenis

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D56624

llvm-svn: 352003
2019-01-24 01:06:19 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Vedant Kumar 17d9f14bff [CodeExtractor] Emit lifetime markers around reloads of outputs
CodeExtractor permits extracting a region of blocks from a function even
when values defined within the region are used outside of it.

This is typically done by creating an alloca in the original function
and reloading the alloca after a call to the extracted function.

Wrap the reload in lifetime start/end markers to promote stack coloring.

Suggested by Sergei Kachkov!

Differential Revision: https://reviews.llvm.org/D56045

llvm-svn: 351621
2019-01-19 02:37:59 +00:00
Vedant Kumar a1778df474 [CodeExtractor] Do not extract unsafe lifetime markers
Lifetime markers which reference inputs to the extraction region are not
safe to extract. Example ('rhs' will be extracted):

```
               entry:
              +------------+
              | x = alloca |
              | y = alloca |
              +------------+
             /              \
   lhs:                      rhs:
  +-------------------+     +-------------------+
  | lifetime_start(x) |     | lifetime_start(x) |
  | use(x)            |     | lifetime_start(y) |
  | lifetime_end(x)   |     | use(x, y)         |
  | lifetime_start(y) |     | lifetime_end(y)   |
  | use(y)            |     | lifetime_end(x)   |
  | lifetime_end(y)   |     +-------------------+
  +-------------------+
```

Prior to extraction, the stack coloring pass sees that the slots for 'x'
and 'y' are in-use at the same time. After extraction, the coloring pass
infers that 'x' and 'y' are *not* in-use concurrently, because markers
from 'rhs' are no longer available to help decide otherwise.

This leads to a miscompile, because the stack slots actually are in-use
concurrently in the extracted function.

Fix this by moving lifetime start/end markers for memory regions defined
in the calling function around the call to the extracted function.

Fixes llvm.org/PR39671 (rdar://45939472).

Differential Revision: https://reviews.llvm.org/D55967

llvm-svn: 350420
2019-01-04 17:43:22 +00:00
Vedant Kumar b264d69de7 [IR] Add Instruction::isLifetimeStartOrEnd, NFC
Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an
llvm.lifetime.start or an llvm.lifetime.end intrinsic.

This was suggested as a cleanup in D55967.

Differential Revision: https://reviews.llvm.org/D56019

llvm-svn: 349964
2018-12-21 21:49:40 +00:00
Vedant Kumar b2a6f8e505 [CodeExtractor] Store outputs at the first valid insertion point
When CodeExtractor outlines values which are used by the original
function, it must store those values in some in-out parameter. This
store instruction must not be inserted in between a PHI and an EH pad
instruction, as that results in invalid IR.

This fixes the following verifier failure seen while outlining within
ObjC methods with live exit values:

  The unwind destination does not have an exception handling instruction!
    %call35 = invoke i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %exn.adjusted, i8* %1)
            to label %invoke.cont34 unwind label %lpad33, !dbg !4183
  The unwind destination does not have an exception handling instruction!
    invoke void @objc_exception_throw(i8* %call35) #12
            to label %invoke.cont36 unwind label %lpad33, !dbg !4184
  LandingPadInst not the first non-PHI instruction in the block.
    %3 = landingpad { i8*, i32 }
            catch i8* null, !dbg !1411

rdar://46540815

llvm-svn: 348562
2018-12-07 03:01:54 +00:00
Vedant Kumar 09415a850e [CodeExtractor] Do not marked outlined calls which may resume EH as noreturn
Treat terminators which resume exception propagation as returning instructions
(at least, for the purposes of marking outlined functions `noreturn`). This is
to avoid inserting traps after calls to outlined functions which unwind.

rdar://46129950

llvm-svn: 348404
2018-12-05 19:35:37 +00:00
Vedant Kumar d129569e34 [CodeExtractor] Split PHI nodes with incoming values from outlined region (PR39433)
If a PHI node out of extracted region has multiple incoming values from it,
split this PHI on two parts. First PHI has incomings only from region and
extracts with it (they are placed to the separate basic block that added to the
list of outlined), and incoming values in original PHI are replaced by first
PHI. Similar solution is already used in CodeExtractor for PHIs in entry block
(severSplitPHINodes method). It covers PR39433 bug.

Patch by Sergei Kachkov!

Differential Revision: https://reviews.llvm.org/D55018

llvm-svn: 348205
2018-12-03 22:40:21 +00:00
Vedant Kumar d6699423f1 [CodeExtractor] Mark functions noreturn when applicable
This eliminates the outlining penalty for llvm.trap/unreachable, because
callers no longer have to emit cleanup/ret instructions after calling an
outlined `noreturn` function.

rdar://45523626

llvm-svn: 346421
2018-11-08 17:57:09 +00:00
Vedant Kumar 1e209e284f [CodeExtractor] Do not extract calls to eh_typeid_for (PR39545)
The lowering for a call to eh_typeid_for changes when it's moved from
one function to another.

There are several proposals for fixing this issue in llvm.org/PR39545.
Until some solution is in place, do not allow CodeExtractor to extract
calls to eh_typeid_for, as that results in serious miscompilations.

llvm-svn: 346256
2018-11-06 19:06:08 +00:00
Vedant Kumar 09b7aa443d [CodeExtractor] Erase use-without-def debug intrinsics in parent func
When CodeExtractor moves instructions to a new function, debug
intrinsics referring to those instructions within the parent function
become invalid.

This results in the same verifier failure which motivated r344545, about
function-local metadata being used in the wrong function.

llvm-svn: 346255
2018-11-06 19:05:53 +00:00
Vedant Kumar c299006879 [HotColdSplitting] Identify larger cold regions using domtree queries
The current splitting algorithm works in three stages:

  1) Identify cold blocks, then
  2) Use forward/backward propagation to mark hot blocks, then
  3) Grow a SESE region of blocks *outside* of the set of hot blocks and
  start outlining.

While testing this pass on Apple internal frameworks I noticed that some
kinds of control flow (e.g. loops) are never outlined, even though they
unconditionally lead to / follow cold blocks. I noticed two other issues
related to how cold regions are identified:

  - An inconsistency can arise in the internal state of the hotness
  propagation stage, as a block may end up in both the ColdBlocks set
  and the HotBlocks set. Further inconsistencies can arise as these sets
  do not match what's in ProfileSummaryInfo.

  - It isn't necessary to limit outlining to single-exit regions.

This patch teaches the splitting algorithm to identify maximal cold
regions and outline them. A maximal cold region is defined as the set of
blocks post-dominated by a cold sink block, or dominated by that sink
block. This approach can successfully outline loops in the cold path. As
a side benefit, it maintains less internal state than the current
approach.

Due to a limitation in CodeExtractor, blocks within the maximal cold
region which aren't dominated by a single entry point (a so-called "max
ancestor") are filtered out.

Results:
  - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to
  47KB pre-patch, or a ~3x improvement. Did not see a performance impact
  across two runs.
  - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB
  of TEXT were outlined. Ditto re: performance impact.
  - Outlining results improve marginally in the internal frameworks I
  tested.

Follow-ups:
  - Outline more than once per function, outline large single basic
  blocks, & try to remove unconditional branches in outlined functions.

Differential Revision: https://reviews.llvm.org/D53627

llvm-svn: 345209
2018-10-24 22:15:41 +00:00
Teresa Johnson c8dba682bb [hot-cold-split] Name split functions with ".cold" suffix
Summary:
The current default of appending "_"+entry block label to the new
extracted cold function breaks demangling. Change the deliminator from
"_" to "." to enable demangling. Because the header block label will
be empty for release compile code, use "extracted" after the "." when
the label is empty.

Additionally, add a mechanism for the client to pass in an alternate
suffix applied after the ".", and have the hot cold split pass use
"cold."+Count, where the Count is currently 1 but can be used to
uniquely number multiple cold functions split out from the same function
with D53588.

Reviewers: sebpop, hiraditya

Subscribers: llvm-commits, erik.pilkington

Differential Revision: https://reviews.llvm.org/D53534

llvm-svn: 345178
2018-10-24 18:53:47 +00:00
Chandler Carruth 8b7a8123dd [TI removal] Update CodeExtractor to use Instruction directly.
llvm-svn: 344716
2018-10-18 00:38:54 +00:00
Vedant Kumar 15718a6190 [CodeExtractor] Erase debug intrinsics in outlined thunks (fix PR22900)
Variable updates within the outlined function are invisible to
debuggers. This could be improved by defining a DISubprogram for the
new function. For the moment, simply erase the debug intrinsics instead.

This fixes verifier failures about function-local metadata being used in
the wrong function, seen while testing the hot/cold splitting pass.

rdar://45142482

Differential Revision: https://reviews.llvm.org/D53267

llvm-svn: 344545
2018-10-15 19:22:20 +00:00
Chandler Carruth edb12a838a [TI removal] Make variables declared as `TerminatorInst` and initialized
by `getTerminator()` calls instead be declared as `Instruction`.

This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).

llvm-svn: 344502
2018-10-15 10:04:59 +00:00
Chandler Carruth 664aa868f5 [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative
Load Hardening.

Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.

Most of the churn here is adding the IR attribute. I talked about this
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.

While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.

This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.

Differential Revision: https://reviews.llvm.org/D51157

llvm-svn: 341363
2018-09-04 12:38:00 +00:00
Alexander Richardson 6bcf2ba2f0 Allow creating llvm::Function in non-zero address spaces
Most users won't have to worry about this as all of the
'getOrInsertFunction' functions on Module will default to the program
address space.

An overload has been added to Function::Create to abstract away the
details for most callers.

This is based on https://reviews.llvm.org/D37054 but without the changes to
make passing a Module to Function::Create() mandatory. I have also added
some more tests and fixed the LLParser to accept call instructions for
types in the program address space.

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D47541

llvm-svn: 340519
2018-08-23 09:25:17 +00:00
Florian Hahn 7cdf52e425 [CodeExtractor] Use 'normal destination' BB as insert point to store invoke results.
Currently CodeExtractor tries to use the next node after an invoke to
place the store for the result of the invoke, if it is an out parameter
of the region. This fails, as the invoke terminates the current BB.
In that case, we can place the store in the 'normal destination' BB, as
the result will only be available in that case.


Reviewers: davidxl, davide, efriedma

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D51037

llvm-svn: 340331
2018-08-21 20:07:46 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Sergey Dmitriev 69c9cd277d [CodeExtractor] Allow extracting blocks with exception handling
This is a CodeExtractor improvement which adds support for extracting blocks
which have exception handling constructs if that is legal to do. CodeExtractor
performs validation checks to ensure that extraction is legal when it finds
invoke instructions or EH pads (landingpad, catchswitch, or cleanuppad) in
blocks to be extracted.

I have also added an option to allow extraction of blocks with alloca
instructions, but no validation is done for allocas. CodeExtractor caller has
to validate it himself before allowing alloca instructions to be extracted.
By default allocas are still not allowed in extraction blocks.

Differential Revision: https://reviews.llvm.org/D45904

llvm-svn: 332151
2018-05-11 22:49:49 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Vlad Tsyrklevich d17f61ea3b Add the ShadowCallStack attribute
Summary:
Introduce the ShadowCallStack function attribute. It's added to
functions compiled with -fsanitize=shadow-call-stack in order to mark
functions to be instrumented by a ShadowCallStack pass to be submitted
in a separate change.

Reviewers: pcc, kcc, kubamracek

Reviewed By: pcc, kcc

Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44800

llvm-svn: 329108
2018-04-03 20:10:40 +00:00
Matt Morehouse 236cdaf84c [SimplifyCFG] Create attribute for fuzzing-specific optimizations.
Summary:
When building with libFuzzer, converting control flow to selects or
obscuring the original operands of CMPs reduces the effectiveness of
libFuzzer's heuristics.

This patch provides an attribute to disable or modify certain optimizations
for optimal fuzzing signal.

Provides a less aggressive alternative to https://reviews.llvm.org/D44057.

Reviewers: vitalybuka, davide, arsenm, hfinkel

Reviewed By: vitalybuka

Subscribers: junbuml, mehdi_amini, wdng, javed.absar, hiraditya, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44232

llvm-svn: 328214
2018-03-22 17:07:51 +00:00
Oren Ben Simhon fdd72fd522 [X86] Added support for nocf_check attribute for indirect Branch Tracking
X86 Supports Indirect Branch Tracking (IBT) as part of Control-Flow Enforcement Technology (CET).
IBT instruments ENDBR instructions used to specify valid targets of indirect call / jmp.
The `nocf_check` attribute has two roles in the context of X86 IBT technology:
	1. Appertains to a function - do not add ENDBR instruction at the beginning of the function.
	2. Appertains to a function pointer - do not track the target function of this pointer by adding nocf_check prefix to the indirect-call instruction.

This patch implements `nocf_check` context for Indirect Branch Tracking.
It also auto generates `nocf_check` prefixes before indirect branchs to jump tables that are guarded by range checks.

Differential Revision: https://reviews.llvm.org/D41879

llvm-svn: 327767
2018-03-17 13:29:46 +00:00
Easwaran Raman e5b8de2f1f Add a ProfileCount class to represent entry counts.
Summary:
The class wraps a uint64_t and an enum to represent the type of profile
count (real and synthetic) with some helper methods.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41883

llvm-svn: 322771
2018-01-17 22:24:23 +00:00
Florian Hahn 55be37e7d4 [CodeExtractor] Use subset of function attributes for extracted function.
In addition to target-dependent attributes, we can also preserve a
white-listed subset of target independent function attributes. The white-list
excludes problematic attributes, most prominently:

* attributes related to memory accesses, as alloca instructions
  could be moved in/out of the extracted block

* control-flow dependent attributes, like no_return or thunk, as the
  relerelevant instructions might or might not get extracted.

Thanks @efriedma and @aemerson for providing a set of attributes that cannot be
propagated.


Reviewers: efriedma, davidxl, davide, silvas

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D41334

llvm-svn: 321961
2018-01-07 11:22:25 +00:00
Florian Hahn e5089e2e94 [CodeExtractor] Add debug locations for new call and branch instrs.
Summary:
If a partially inlined function has debug info, we have to add debug
locations to the call instruction calling the outlined function.
We use the debug location of the first instruction in the outlined
function, as the introduced call transfers control to this statement and
there is no other equivalent line in the source code.

We also use the same debug location for the branch instruction added
to jump from artificial entry block for the outlined function, which just
jumps to the first actual basic block of the outlined function.

Reviewers: davide, aprantl, rriddle, dblaikie, danielcdh, wmi

Reviewed By: aprantl, rriddle, danielcdh

Subscribers: eraman, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D40413

llvm-svn: 320199
2017-12-08 21:49:03 +00:00
Florian Hahn 7114755913 [CodeExtractor] Add missing AllowVarArgs initialization.
llvm-svn: 318029
2017-11-13 11:08:47 +00:00
Florian Hahn 0e9dec672d [PartialInliner] Inline vararg functions that forward varargs.
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.

It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.

The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handing is done in the outlined function. It checks that vastart
is not part of the function to be inlined.

`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.

Reviewers: davide, davidxl, grosser

Reviewed By: davide, davidxl, grosser

Subscribers: gyiu, grosser, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D39607

llvm-svn: 318028
2017-11-13 10:35:52 +00:00
Florian Hahn b93c06331e [CodeExtractor] Fix iterator invalidation in findOrCreateBlockForHoisting.
Summary:
By replacing branches to CommonExitBlock, we remove the node from
CommonExitBlock's predecessors, invalidating the iterator. The problem
is exposed when the common exit block has multiple predecessors and
needs to sink lifetime info. The modification in the test case trigger
the issue.

Reviewers: davidxl, davide, wmi

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39112

llvm-svn: 317084
2017-11-01 09:48:12 +00:00
Eugene Zelenko 286d5897d6 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315516
2017-10-11 21:41:43 +00:00
Jakub Kuderski cbe9fae99d [CodeExtractor] Fix multiple bugs under certain shape of extracted region
Summary:
If the extracted region has multiple exported data flows toward the same BB which is not included in the region, correct resotre instructions and PHI nodes won't be generated inside the exitStub. The solution is simply put the restore instructions right after the definition of output values instead of putting in exitStub.
Unittest for this bug is included.

Author: myhsu

Reviewers: chandlerc, davide, lattner, silvas, davidxl, wmi, kuhar

Subscribers: dberlin, kuhar, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D37902

llvm-svn: 315041
2017-10-06 03:37:06 +00:00
Davide Italiano e3f7dda1fb [CodeExtractor] Remove unneded and commented out debugging stmts.
llvm-svn: 306966
2017-07-02 00:07:18 +00:00
Serge Guelton 7bc405aa4c [CodeExtractor] Prevent extraction of block involving blockaddress
BlockAddress are only valid within their function context, which does not
interact well with CodeExtractor. Detect this case and prevent it.

Differential Revision: https://reviews.llvm.org/D33839

llvm-svn: 306448
2017-06-27 18:57:53 +00:00
Xinliang David Li 7ed6cd32ea [PartialInlining] Support shrinkwrap life_range markers
Differential Revision: http://reviews.llvm.org/D33847

llvm-svn: 305170
2017-06-11 20:46:05 +00:00
Xinliang David Li 74480adafd [PartialInlining] Shrinkwrap allocas with live range contained in outline region.
Differential Revision: http://reviews.llvm.org/D33618

llvm-svn: 304245
2017-05-30 21:22:18 +00:00
Xinliang David Li f12a0faf88 [CodeExtractor]: Fixup use refs of the old phi.
Differential Revision: http://reviews.llvm.org/D32468

llvm-svn: 301291
2017-04-25 04:51:19 +00:00
Davide Italiano fa15de34b7 [PartialInliner] Fix crash when inlining functions with unreachable blocks.
CodeExtractor looks up the dominator node corresponding to return blocks
when splitting them. If one of these blocks is unreachable, there's no
node in the Dom and CodeExtractor crashes because it doesn't check
for domtree node validity.
In theory, we could add just a check for skipping null DTNodes in
`splitReturnBlock` but the fix I propose here is slightly different. To the
best of my knowledge, unreachable blocks are irrelevant for the algorithm,
therefore we can just skip them when building the candidate set in the
constructor.

Differential Revision:  https://reviews.llvm.org/D32335

llvm-svn: 300946
2017-04-21 04:25:00 +00:00
Davide Italiano 059574c537 [CodeExtractor] Remove an unneeded level of indirection. NFCI.
llvm-svn: 300931
2017-04-21 00:21:09 +00:00
Xinliang David Li 99e3ca1526 Use basicblock split block utility function
Instead of calling BasicBlock::SplitBasicBlock directly in 
CodeExtractor.

Differential Revision: https://reviews.llvm.org/D32308

llvm-svn: 300899
2017-04-20 21:40:22 +00:00
Davide Italiano b965121ba8 [CodeExtractor] Remove a bunch of unneeded constructors.
Differential Revision:  https://reviews.llvm.org/D32305

llvm-svn: 300869
2017-04-20 18:33:40 +00:00
Reid Kleckner eb9dd5b87f Reland "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies"
This re-lands r299875.

I introduced a bug in Clang code responsible for replacing K&R, no
prototype declarations with a real function definition with a prototype.
The bug was here:

       // Collect any return attributes from the call.
  -    if (oldAttrs.hasAttributes(llvm::AttributeList::ReturnIndex))
  -      newAttrs.push_back(llvm::AttributeList::get(newFn->getContext(),
  -                                                  oldAttrs.getRetAttributes()));
  +    newAttrs.push_back(oldAttrs.getRetAttributes());

Previously getRetAttributes() carried AttributeList::ReturnIndex in its
AttributeList. Now that we return the AttributeSetNode* directly, it no
longer carries that index, and we call this overload with a single node:
  AttributeList::get(LLVMContext&, ArrayRef<AttributeSetNode*>)

That aborted with an assertion on x86_32 targets. I added an explicit
triple to the test and added CHECKs to help find issues like this in the
future sooner.

llvm-svn: 299899
2017-04-10 23:31:05 +00:00
Matt Arsenault 3c1fc768ed Allow DataLayout to specify addrspace for allocas.
LLVM makes several assumptions about address space 0. However,
alloca is presently constrained to always return this address space.
There's no real way to avoid using alloca, so without this
there is no way to opt out of these assumptions.

The problematic assumptions include:
- That the pointer size used for the stack is the same size as
  the code size pointer, which is also the maximum sized pointer.

- That 0 is an invalid, non-dereferencable pointer value.

These are problems for AMDGPU because alloca is used to
implement the private address space, which uses a 32-bit
index as the pointer value. Other pointers are 64-bit
and behave more like LLVM's notion of generic address
space. By changing the address space used for allocas,
we can change our generic pointer type to be LLVM's generic
pointer type which does have similar properties.

llvm-svn: 299888
2017-04-10 22:27:50 +00:00
Reid Kleckner 211b1f324f Revert "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies"
This reverts r299875. A Linux bot came back with a test failure:
http://bb.pgr.jp/builders/test-clang-i686-linux-RA/builds/741/steps/test_clang/logs/Clang%20%3A%3A%20CodeGen__2006-05-19-SingleEltReturn.c

llvm-svn: 299878
2017-04-10 20:34:19 +00:00
Reid Kleckner 324c99dee5 [IR] Make AttributeSetNode public, avoid temporary AttributeList copies
Summary:
AttributeList::get(Fn|Ret|Param)Attributes no longer creates a temporary
AttributeList just to hide the AttributeSetNode type.

I've also added a factory method to create AttributeLists from a
parallel array of AttributeSetNodes. I think this simplifies
construction of AttributeLists when rewriting function prototypes.
Previously we would test if a particular index had attributes, and
conditionally add a temporary attribute list to a vector. Now the
attribute set vector is parallel to the argument vector already that
these passes already construct.

My long term vision is to wrap AttributeSetNode* inside an AttributeSet
type that holds the enum attributes, but that will come in a follow up
change.

I haven't done any performance measurements for this change because
profiling hasn't shown that any of the affected code is hot.

Reviewers: pete, chandlerc, sanjoy, hfinkel

Reviewed By: pete

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31198

llvm-svn: 299875
2017-04-10 20:18:10 +00:00