Summary:
Trying to add the plumbing necessary to add tuning options to the new pass manager.
Testing with the flags for loop vectorize.
Reviewers: chandlerc
Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59723
llvm-svn: 358763
As motivated in D60598, this drops support for specifying both NUW and
NSW in makeGuaranteedNoWrapRegion(). None of the users of this function
currently make use of this.
When both NUW and NSW are specified, the exact nowrap region has two
disjoint parts and makeGNWR() returns one of them. This result doesn't
seem to be useful for anything, but makes the semantics of the function
fuzzier.
Differential Revision: https://reviews.llvm.org/D60632
llvm-svn: 358340
makeGuaranteedNoWrapRegion() is actually makeExactNoWrapRegion() as
long as only one of NUW or NSW is specified. This is not obvious from
the current documentation, and some code seems to think that it is
only exact for single-element ranges. Clarify docs and add tests to
be more confident this really holds.
There are currently no users of makeGuaranteedNoWrapRegion() that
pass both NUW and NSW. I think it would be best to drop support for
this entirely and then rename the function to makeExactNoWrapRegion().
Knowing that the no-wrap region is exact is useful, because we can
backwards-constrain values. What I have in mind in particular is
that LVI should be able to constrain values on edges where the
with.overflow overflow flag is false.
Differential Revision: https://reviews.llvm.org/D60598
llvm-svn: 358305
Same as the other ConstantRange overflow checking methods, but for
unsigned mul. In this case there is no cheap overflow criterion, so
using umul_ov for the implementation.
Differential Revision: https://reviews.llvm.org/D60574
llvm-svn: 358228
This extends D59959 to unionWith(), allowing to specify that a
non-wrapping unsigned/signed range is preferred. This is somewhat
less useful than the intersect case, because union operations are
rarer. An example use would the the phi union computed in SCEV.
The implementation is mostly a straightforward use of getPreferredRange(),
but I also had to adjust some <=/< checks to make sure that no ranges with
lower==upper get constructed before they're passed to getPreferredRange(),
as these have additional constraints.
Differential Revision: https://reviews.llvm.org/D60377
llvm-svn: 357876
The intersection of two ConstantRanges may consist of two disjoint
ranges. As we can only return one range as the result, we need to
return one of the two possible ranges that cover both. Currently the
result is picked based on set size. However, this is not always
optimal: If we're in an unsigned context, we'd prefer to get a large
unsigned range over a small signed range -- the latter effectively
becomes a full set in the unsigned domain.
This revision adds a PreferredRangeType, which can be either Smallest,
Unsigned or Signed. Smallest is the current behavior and Unsigned and
Signed are new variants that prefer not to wrap the unsigned/signed
domain. The new type isn't used anywhere yet (but SCEV will be a good
first user, see D60035).
I've also added some comments to illustrate the various cases in
intersectWith(), which should hopefully make it more obvious what is
going on.
Differential Revision: https://reviews.llvm.org/D59959
llvm-svn: 357873
Add isAllNegative() and isAllNonNegative() methods to ConstantRange,
which determine whether all values in the constant range are
negative/non-negative.
This is useful for replacing KnownBits isNegative() and isNonNegative()
calls when changing code to use constant ranges.
Differential Revision: https://reviews.llvm.org/D60264
llvm-svn: 357871
if we do SHL of two 16-bit ranges like [0, 30000) with [1,2) we get
"full-set" instead of what I would have expected [0, 60000) which is
still in the 16-bit unsigned range.
This patch changes the SHL algorithm to allow getting a usable range
even in this case.
Differential Revision: https://reviews.llvm.org/D57983
llvm-svn: 357854
Add a test that checks the intersectWith() implementation against
all 4-bit range pairs. The test uses a more explicit way of
calculating the possible intersections, and checks that the right
one is picked out according to the smallest set heuristic.
This is in preparation for introducing intersectWith() variants that
use different heuristics to pick an intersection range, if there are
multiple possibilities.
llvm-svn: 357119
Split off from D59749. This adds isWrappedSet() and
isUpperSignWrapped() set with the same behavior as isSignWrappedSet()
and isUpperWrapped() for the respectively other domain.
The methods isWrappedSet() and isSignWrappedSet() will not consider
ranges of the form [X, Max] == [X, 0) and [X, SignedMax] == [X, SignedMin)
to be wrapping, while isUpperWrapped() and isUpperSignWrapped() will.
Also replace the checks in getUnsignedMin() and friends with method
calls that implement the same logic.
llvm-svn: 357112
Split out from D59749. The current implementation of isWrappedSet()
doesn't do what it says on the tin, and treats ranges like
[X, Max] as wrapping, because they are represented as [X, 0) when
using half-inclusive ranges. This also makes it inconsistent with
the semantics of isSignWrappedSet().
This patch renames isWrappedSet() to isUpperWrapped(), in preparation
for the introduction of a new isWrappedSet() method with corrected
behavior.
llvm-svn: 357107
Split off from D59749. This uses a simpler and more efficient
implementation of isSignWrappedSet(), and considers full sets
as non-wrapped, to be consistent with isWrappedSet(). Otherwise
the behavior is unchanged.
There are currently only two users of this function and both already
check for isFullSet() || isSignWrappedSet(), so this is not going to
cause a change in overall behavior.
Differential Revision: https://reviews.llvm.org/D59848
llvm-svn: 357039
This adds ConstantRange::getFull(BitWidth) and
ConstantRange::getEmpty(BitWidth) named constructors as more readable
alternatives to the current ConstantRange(BitWidth, /* full */ false)
and similar. Additionally private getFull() and getEmpty() member
functions are added which return a full/empty range with the same bit
width -- these are commonly needed inside ConstantRange.cpp.
The IsFullSet argument in the ConstantRange(BitWidth, IsFullSet)
constructor is now mandatory for the few usages that still make use of it.
Differential Revision: https://reviews.llvm.org/D59716
llvm-svn: 356852
As a followup to newpm -time-passes fix (D59366), now adding a similar
functionality to legacy time-passes.
Enhancing llvm::reportAndResetTimings to accept an optional stream
for reporting output. By default it still reports into the stream created
by CreateInfoOutputFile (-info-output-file).
Also fixing to actually reset after printing as declared.
Reviewed By: philip.pfaffe
Differential Revision: https://reviews.llvm.org/D59416
llvm-svn: 356824
Following the suggestion in D59450, I'm moving the code for constructing
a ConstantRange from KnownBits out of ValueTracking, which also allows us
to test this code independently.
I'm adding this method to ConstantRange rather than KnownBits (which
would have been a bit nicer API wise) to avoid creating a dependency
from Support to IR, where ConstantRange lives.
Differential Revision: https://reviews.llvm.org/D59475
llvm-svn: 356339
TimePassesHandler object (implementation of time-passes for new pass manager)
gains ability to report into a stream customizable per-instance (per pipeline).
Intended use is to specify separate time-passes output stream per each compilation,
setting up TimePasses member of StandardInstrumentation during PassBuilder setup.
That allows to get independent non-overlapping pass-times reports for parallel
independent compilations (in JIT-like setups).
By default it still puts timing reports into the info-output-file stream
(created by CreateInfoOutputFile every time report is requested).
Unit-test added for non-default case, and it also allowed to discover that print() does not work
as declared - it did not reset the timers, leading to yet another report being printed into the default stream.
Fixed print() to actually reset timers according to what was declared in print's comments before.
Reviewed By: philip.pfaffe
Differential Revision: https://reviews.llvm.org/D59366
llvm-svn: 356305
Add functions to ConstantRange that determine whether the
unsigned/signed addition/subtraction of two ConstantRanges
may/always/never overflows. This will allow checking overflow
conditions based on known constant ranges in addition to known bits.
I'm implementing these methods on ConstantRange to allow them to be
unit tested independently of any ValueTracking machinery. The tests
include exhaustive testing on 4-bit ranges, to make sure the result
is both conservatively correct and maximally precise.
The OverflowResult enum is redeclared on ConstantRange, because
I wanted to avoid a dependency in either direction between
ValueTracking.h and ConstantRange.h.
Differential Revision: https://reviews.llvm.org/D59193
llvm-svn: 356276
Use this feature to fix a bug on ARM where 4 byte alignment is
incorrectly assumed.
Differential Revision: https://reviews.llvm.org/D57335
llvm-svn: 355685
Use this feature to fix a bug on ARM where 4 byte alignment is
incorrectly assumed.
Differential Revision: https://reviews.llvm.org/D57335
llvm-svn: 355585
Use this feature to fix a bug on ARM where 4 byte alignment is
incorrectly assumed.
Differential Revision: https://reviews.llvm.org/D57335
llvm-svn: 355522
OptBisect is in IR due to LLVMContext using it. However, it uses IR units from
Analysis as well. This change moves getDescription functions from OptBisect
to their respective IR units. Generating names for IR units will now be up
to the callers, keeping the Analysis IR units in Analysis. To prevent
unnecessary string generation, isEnabled function is added so that callers know
when the description needs to be generated.
Differential Revision: https://reviews.llvm.org/D58406
llvm-svn: 355068
DomTreeUpdater depends on headers from Analysis, but is in IR. This is a
layering violation since Analysis depends on IR. Relocate this code from IR
to Analysis to fix the layering violation.
llvm-svn: 353265
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57173
llvm-svn: 352913
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57172
llvm-svn: 352911
This cleans up all InvokeInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57171
llvm-svn: 352910
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57170
llvm-svn: 352909
Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc
doesn't choke on it, hopefully.
Original Message:
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352827
This reverts commit f47d6b38c7 (r352791).
Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.
llvm-svn: 352800
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352791
Summary:
Renamed setBaseDiscriminator to cloneWithBaseDiscriminator, to match
similar APIs. Also changed its behavior to copy over the other
discriminator components, instead of eliding them.
Renamed cloneWithDuplicationFactor to
cloneByMultiplyingDuplicationFactor, which more closely matches what
this API does.
Reviewers: dblaikie, wmi
Reviewed By: dblaikie
Subscribers: zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D56220
llvm-svn: 351996
As noted in https://bugs.llvm.org/show_bug.cgi?id=36651, the specialization for
isPodLike<std::pair<...>> did not match the expectation of
std::is_trivially_copyable which makes the memcpy optimization invalid.
This patch renames the llvm::isPodLike trait into llvm::is_trivially_copyable.
Unfortunately std::is_trivially_copyable is not portable across compiler / STL
versions. So a portable version is provided too.
Note that the following specialization were invalid:
std::pair<T0, T1>
llvm::Optional<T>
Tests have been added to assert that former specialization are respected by the
standard usage of llvm::is_trivially_copyable, and that when a decent version
of std::is_trivially_copyable is available, llvm::is_trivially_copyable is
compared to std::is_trivially_copyable.
As of this patch, llvm::Optional is no longer considered trivially copyable,
even if T is. This is to be fixed in a later patch, as it has impact on a
long-running bug (see r347004)
Note that GCC warns about this UB, but this got silented by https://reviews.llvm.org/D50296.
Differential Revision: https://reviews.llvm.org/D54472
llvm-svn: 351701
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This shortcut mechanism for creating types was added 10 years ago, but
has seen almost no uptake since then, neither internally nor in
external projects.
The very small number of characters saved by using it does not seem
worth the mental overhead of an additional type-creation API, so,
delete it.
Differential Revision: https://reviews.llvm.org/D56573
llvm-svn: 351020
Summary:
Added a pair of APIs for encoding/decoding the 3 components of a DWARF discriminator described in http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html: the base discriminator, the duplication factor (useful in profile-guided optimization) and the copy index (used to identify copies of code in cases like loop unrolling)
The encoding packs 3 unsigned values in 32 bits. This CL addresses 2 issues:
- communicates overflow back to the user
- supports encoding all 3 components together. Current APIs assume a sequencing of events. For example, creating a new discriminator based on an existing one by changing the base discriminator was not supported.
Reviewers: davidxl, danielcdh, wmi, dblaikie
Reviewed By: dblaikie
Subscribers: zzheng, dmgreen, aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D55681
llvm-svn: 349973
IR-printing AfterPass instrumentation might be called on a loop
that has just been invalidated. We should skip printing it to
avoid spurious asserts.
Reviewed By: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D54740
llvm-svn: 348887
This patch fixes the issue noticed in D54532.
The problem is that cst_pred_ty-based matchers like m_Zero() currently do not match
scalar undefs (as expected), but *do* match vector undefs. This may lead to optimization
inconsistencies in rare cases.
There is only one existing test for which output changes, reverting the change from D53205.
The reason here is that vector fsub undef, %x is no longer matched as an m_FNeg(). While I
think that the new output is technically worse than the previous one, it is consistent with
scalar, and I don't think it's really important either way (generally that undef should have
been folded away prior to reassociation.)
I've also added another test case for this issue based on InstructionSimplify. It took some
effort to find that one, as in most cases undef folds are either checked first -- and in the
cases where they aren't it usually happens to not make a difference in the end. This is the
only case I was able to come up with. Prior to this patch the test case simplified to undef
in the scalar case, but zeroinitializer in the vector case.
Patch by: @nikic (Nikita Popov)
Differential Revision: https://reviews.llvm.org/D54631
llvm-svn: 347318
This will hold flags specific to subprograms. In the future
we could potentially free up scarce bits in DIFlags by moving
subprogram-specific flags from there to the new flags word.
This patch does not change IR/bitcode formats, that will be
done in a follow-up.
Differential Revision: https://reviews.llvm.org/D54597
llvm-svn: 347239
Summary:
Ranges base address specifiers can save a lot of object size in
relocation records especially in optimized builds.
For an optimized self-host build of Clang with split DWARF and debug
info compression in object files, but uncompressed debug info in the
executable, this change produces about 18% smaller object files and 6%
larger executable.
While it would've been nice to turn this on by default, gold's 32 bit
gdb-index support crashes on this input & I don't think there's any
perfect heuristic to implement solely in LLVM that would suffice - so
we'll need a flag one way or another (also possible people might want to
aggressively optimized for executable size that contains debug info
(even with compression this would still come at some cost to executable
size)) - so let's plumb it through.
Differential Revision: https://reviews.llvm.org/D54242
llvm-svn: 346788
Summary:
Change the dynamic cast in CallBase::getCalledFunction() to allow
null-valued function operands.
This patch fixes a crash that occurred when a funtion operand of a
call instruction was dropped, and later on a metadata-carrying
instruction was printed out. When allocating the metadata slot numbers,
getCalledFunction() would be invoked on the call with the dropped
operand, resulting in a failed non-null assertion in isa<>.
This fixes PR38924, in which a printout in DBCE crashed due to this.
This aligns getCalledFunction() with getCalledValue(), as the latter
allows the operand to be null.
Reviewers: vsk, dexonsmith, hfinkel
Reviewed By: dexonsmith
Subscribers: hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D52537
llvm-svn: 345966
All the PassBuilder::parse interfaces now return descriptive StringError
instead of a plain bool. It allows to make -passes/aa-pipeline parsing
errors context-specific and thus less confusing.
TODO: ideally we should also make suggestions for misspelled pass names,
but that requires some extensions to PassBuilder.
Reviewed By: philip.pfaffe, chandlerc
Differential Revision: https://reviews.llvm.org/D53246
llvm-svn: 344685
Summary:
All the PassBuilder::parse interfaces now return descriptive StringError
instead of a plain bool. It allows to make -passes/aa-pipeline parsing
errors context-specific and thus less confusing.
TODO: ideally we should also make suggestions for misspelled pass names,
but that requires some extensions to PassBuilder.
Reviewed By: philip.pfaffe, chandlerc
Differential Revision: https://reviews.llvm.org/D53246
llvm-svn: 344519
This removes the primary remaining API producing `TerminatorInst` which
will reduce the rate at which code is introduced trying to use it and
generally make it much easier to remove the remaining APIs across the
codebase.
Also clean up some of the stragglers that the previous mechanical update
of variables missed.
Users of LLVM and out-of-tree code generally will need to update any
explicit variable types to handle this. Replacing `TerminatorInst` with
`Instruction` (or `auto`) almost always works. Most of these edits were
made in prior commits using the perl one-liner:
```
perl -i -ple 's/TerminatorInst(\b.* = .*getTerminator\(\))/Instruction\1/g'
```
This also my break some rare use cases where people overload for both
`Instruction` and `TerminatorInst`, but these should be easily fixed by
removing the `TerminatorInst` overload.
llvm-svn: 344504
Summary:
These new intrinsics have the semantics of the `minimum` and `maximum`
operations specified by the latest draft of IEEE 754-2018. Unlike
llvm.minnum and llvm.maxnum, these new intrinsics propagate NaNs and
always treat -0.0 as less than 0.0. `minimum` and `maximum` lower
directly to the existing `fminnan` and `fmaxnan` ISel DAG nodes. It is
safe to reuse these DAG nodes because before this patch were only
emitted in situations where there were known to be no NaN arguments or
where NaN propagation was correct and there were known to be no zero
arguments. I know of only four backends that lower fminnan and
fmaxnan: WebAssembly, ARM, AArch64, and SystemZ, and each of these
lowers fminnan and fmaxnan to instructions that are compatible with
the IEEE 754-2018 semantics.
Reviewers: aheejin, dschuff, sunfish, javed.absar
Subscribers: kristof.beyls, dexonsmith, kristina, llvm-commits
Differential Revision: https://reviews.llvm.org/D52764
llvm-svn: 344437
The IRBuilder CreateIntrinsic method wouldn't allow you to specify the
types that you wanted the intrinsic to be mangled with. To fix this
I've:
- Added an ArrayRef<Type *> member to both CreateIntrinsic overloads.
- Used that array to pass into the Intrinsic::getDeclaration call.
- Added a CreateUnaryIntrinsic to replace the most common use of
CreateIntrinsic where the type was auto-deduced from operand 0.
- Added a bunch more unit tests to test Create*Intrinsic calls that
weren't being tested (including the FMF flag that wasn't checked).
This was suggested as part of the AMDGPU specific atomic optimizer
review (https://reviews.llvm.org/D51969).
Differential Revision: https://reviews.llvm.org/D52087
llvm-svn: 343962
As a prerequisite to time-passes implementation which needs to time both passes
and analyses, adding instrumentation points to the Analysis Manager.
The are two functional differences between Pass and Analysis instrumentation:
- the latter does not increment pass execution counter
- it does not provide ability to skip execution of the corresponding analysis
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D51275
llvm-svn: 342778
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@
The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.
Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
and access to them.
* PassInstrumentation class that handles instrumentation-point interfaces
that call into PassInstrumentationCallbacks.
* Callbacks accept StringRef which is just a name of the Pass right now.
There were some ideas to pass an opaque wrapper for the pointer to pass instance,
however it appears that pointer does not actually identify the instance
(adaptors and managers might have the same address with the pass they govern).
Hence it was decided to go simple for now and then later decide on what the proper
mental model of identifying a "pass in a phase of pipeline" is.
* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
on different IRUnits (e.g. Analyses).
* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
usual AnalysisManager::getResult. All pass managers were updated to run that
to get PassInstrumentation object for instrumentation calls.
* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
args out of a generic PassManager's extra args. This is the only way I was able to explicitly
run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
RepeatedPass::run.
TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
and then get rid of getAnalysisResult by improving RepeatedPass implementation.
* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
PassInstrumentationAnalysis. Callbacks registration should be performed directly
through PassInstrumentationCallbacks.
* new-pm tests updated to account for PassInstrumentationAnalysis being run
* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.
Made getName helper to return std::string (instead of StringRef initially) to fix
asan builtbot failures on CGSCC tests.
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858
llvm-svn: 342664
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@
The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.
Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
and access to them.
* PassInstrumentation class that handles instrumentation-point interfaces
that call into PassInstrumentationCallbacks.
* Callbacks accept StringRef which is just a name of the Pass right now.
There were some ideas to pass an opaque wrapper for the pointer to pass instance,
however it appears that pointer does not actually identify the instance
(adaptors and managers might have the same address with the pass they govern).
Hence it was decided to go simple for now and then later decide on what the proper
mental model of identifying a "pass in a phase of pipeline" is.
* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
on different IRUnits (e.g. Analyses).
* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
usual AnalysisManager::getResult. All pass managers were updated to run that
to get PassInstrumentation object for instrumentation calls.
* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
args out of a generic PassManager's extra args. This is the only way I was able to explicitly
run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
RepeatedPass::run.
TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
and then get rid of getAnalysisResult by improving RepeatedPass implementation.
* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
PassInstrumentationAnalysis. Callbacks registration should be performed directly
through PassInstrumentationCallbacks.
* new-pm tests updated to account for PassInstrumentationAnalysis being run
* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858
llvm-svn: 342597
Summary:
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@
The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.
Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
and access to them.
* PassInstrumentation class that handles instrumentation-point interfaces
that call into PassInstrumentationCallbacks.
* Callbacks accept StringRef which is just a name of the Pass right now.
There were some ideas to pass an opaque wrapper for the pointer to pass instance,
however it appears that pointer does not actually identify the instance
(adaptors and managers might have the same address with the pass they govern).
Hence it was decided to go simple for now and then later decide on what the proper
mental model of identifying a "pass in a phase of pipeline" is.
* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
on different IRUnits (e.g. Analyses).
* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
usual AnalysisManager::getResult. All pass managers were updated to run that
to get PassInstrumentation object for instrumentation calls.
* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
args out of a generic PassManager's extra args. This is the only way I was able to explicitly
run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
RepeatedPass::run.
TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
and then get rid of getAnalysisResult by improving RepeatedPass implementation.
* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
PassInstrumentationAnalysis. Callbacks registration should be performed directly
through PassInstrumentationCallbacks.
* new-pm tests updated to account for PassInstrumentationAnalysis being run
* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858
llvm-svn: 342544
This was one of the potential follow-ups suggested in D48236,
and these will be used to make matching the patterns in PR38691 cleaner:
https://bugs.llvm.org/show_bug.cgi?id=38691
About the vocabulary: in the DAG, these would be concat_vector with an
undef operand or extract_subvector. Alternate names are discussed in the
review, but I think these are familiar/good enough to proceed. Once we
have uses of them in code, we might adjust if there are better options.
https://reviews.llvm.org/D51392
llvm-svn: 341075
The function's new implementation from r340583 had a bug in it that
could cause an invalid scope to be generated when merging two
DILocations with no common ancestor scope.
This patch detects this situation and picks the scope of the first
location. This is not perfect, because the scope is misleading, but on
the other hand, this will be a line 0 location.
rdar://problem/43687474
Differential Revision: https://reviews.llvm.org/D51238
llvm-svn: 340672
In cases where the debugger load time is a worthwhile tradeoff (or less
costly - such as loading from a DWP instead of a variety of DWOs
(possibly over a high-latency/distributed filesystem)) against object
file size, it can be reasonable to disable pubnames and corresponding
gdb-index creation in the linker.
A backend-flag version of this was implemented for NVPTX in
D44385/r327994 - which was fine for NVPTX which wouldn't mix-and-match
CUs. Now that it's going to be a user-facing option (likely powered by
"-gno-pubnames", the same as GCC) it should be encoded in the
DICompileUnit so it can vary per-CU.
After this, likely the NVPTX support should be migrated to the metadata
& the previous flag implementation should be removed.
Reviewers: aprantl
Differential Revision: https://reviews.llvm.org/D50213
llvm-svn: 339939
Flags in DIBasicType will be used to pass attributes used in
DW_TAG_base_type, such as DW_AT_endianity.
Patch by Chirag Patel!
Differential Revision: https://reviews.llvm.org/D49610
llvm-svn: 339714
Summary:
Clean-up following D50479.
Make Update and LegalizeUpdate refer to the utilities in Support/CFGUpdate.
Reviewers: kuhar
Subscribers: sanjoy, jlebar, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D50669
llvm-svn: 339694
Summary: After converting all existing passes to use the new DomTreeUpdater interface, there isn't any usage of the original DeferredDominance class. Thus, we can safely remove it from the codebase.
Reviewers: kuhar, brzycki, dmgreen, davide, grosser
Reviewed By: kuhar, brzycki
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D49747
llvm-svn: 339502
Summary:
This patch refines the logic of `recalculate()` in the `DomTreeUpdater` in the following two aspects:
1. Previously, `recalculate()` tests whether there are pending updates/BBs awaiting deletion and then do recalculation under Lazy UpdateStrategy; and do recalculation immediately under Eager UpdateStrategy. (The former behavior is inherited from the `DeferredDominance` class). This is an inconsistency between two strategies and there is no obvious reason to do this. So the behavior is changed to always recalculate available trees when calling `recalculate()`.
2. Fix the issue of when DTU under Lazy UpdateStrategy holds nothing but with BBs awaiting deletion, after calling `recalculate()`, BBs awaiting deletion aren't flushed. An additional unittest is added to cover this case.
Reviewers: kuhar, dmgreen, brzycki, grosser, davide
Reviewed By: kuhar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50173
llvm-svn: 338822
LowerDbgDeclare inserts a dbg.value before each use of an address
described by a dbg.declare. When inserting a dbg.value before a CallInst
use, however, it fails to append DW_OP_deref to the DIExpression.
The DW_OP_deref is needed to reflect the fact that a dbg.value describes
a source variable directly (as opposed to a dbg.declare, which relies on
pointer indirection).
This patch adds in the DW_OP_deref where needed. This results in the
correct values being shown during a debug session for a program compiled
with ASan and optimizations (see https://reviews.llvm.org/D49520). Note
that ConvertDebugDeclareToDebugValue is already correct -- no changes
there were needed.
One complication is that SelectionDAG is unable to distinguish between
direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also
fixes this long-standing issue in order to not regress integration tests
relying on the incorrect assumption that all frame-index SDDbgValues are
indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot
be lowered properly otherwise. Basically the fix prevents a direct
SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice
by a debugger. There were a handful of tests relying on this incorrect
"FRAMEIX => indirect" assumption which actually had incorrect
DW_AT_locations: these are all fixed up in this patch.
Testing:
- check-llvm, and an end-to-end test using lldb to debug an optimized
program.
- Existing unit tests for DIExpression::appendToStack fully cover the
new DIExpression::append utility.
- check-debuginfo (the debug info integration tests)
Differential Revision: https://reviews.llvm.org/D49454
llvm-svn: 338069
Summary:
Previously, when both DT and PDT are nullptrs and the UpdateStrategy is Lazy, DomTreeUpdater still pends updates inside.
After this patch, DomTreeUpdater will ignore all updates from(`applyUpdates()/insertEdge*()/deleteEdge*()`) in this case. (call `delBB()` still pends BasicBlock deletion until a flush event according to the doc).
The behavior of DomTreeUpdater previously documented won't change after the patch.
Reviewers: dmgreen, davide, kuhar, brzycki, grosser
Reviewed By: kuhar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48974
llvm-svn: 336968
Summary:
Previously, when people need to deal with DTU with different UpdateStrategy using different actions, they need to
```
if (DTU.getUpdateStrategy() == DomTreeUpdater::UpdateStrategy::Lazy) {
...
}
if (DTU.getUpdateStrategy() == DomTreeUpdater::UpdateStrategy::Eager) {
...
}
```
After the patch, they can avoid code patterns above
```
if (DTU.isUpdateLazy()){
...
}
if (!DTU.isUpdateLazy()){
...
}
```
Reviewers: kuhar, brzycki, dmgreen
Reviewed By: kuhar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49056
llvm-svn: 336886
Summary:
This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].
This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.
—Prior to the patch—
- Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
- DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
- Functions receiving DT/DDT need to branch a lot which is currently necessary.
- Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
- People need to construct an additional DeferredDominance class to use functions only receiving DDT.
—After the patch—
Patch by Chijun Sima <simachijun@gmail.com>.
Reviewers: kuhar, brzycki, dmgreen, grosser, davide
Reviewed By: kuhar, brzycki
Author: NutshellySima
Subscribers: vsk, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D48383
llvm-svn: 336163
Summary:
This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].
This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.
—Prior to the patch—
- Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
- DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
- Functions receiving DT/DDT need to branch a lot which is currently necessary.
- Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
- People need to construct an additional DeferredDominance class to use functions only receiving DDT.
—After the patch—
Patch by Chijun Sima <simachijun@gmail.com>.
Reviewers: kuhar, brzycki, dmgreen, grosser, davide
Reviewed By: kuhar, brzycki
Subscribers: vsk, mgorny, llvm-commits
Author: NutshellySima
Differential Revision: https://reviews.llvm.org/D48383
llvm-svn: 336114
This addresses post-commit feedback about the name 'skipDebugInfo' being
misleading. This name could be interpreted as meaning 'a function that
skips instructions with debug locations'.
The new name, 'skipDebugIntrinsics', makes it clear that this function
only skips debug info intrinsics.
Thanks to Adrian Prantl for pointing this out!
llvm-svn: 335667
Summary: This is trying to add support for r334428.
Reviewers: sanjoy
Subscribers: jlebar, hiraditya, bixia, llvm-commits
Differential Revision: https://reviews.llvm.org/D48399
llvm-svn: 335646
This patch introduces two helpers to make it easier to ignore debug
intrinsics:
- Instruction::getNextNonDebugInstruction()
This is just like Instruction::getNextNode(), except that it skips debug
info.
- skipDebugInfo(BasicBlock::iterator)
A free function which advances a BasicBlock iterator past any debug
info. This is a no-op when the iterator already points to a non-debug
instruction.
Part of: llvm.org/PR37728
Related to: https://reviews.llvm.org/D47874
Differential Revision: https://reviews.llvm.org/D48305
llvm-svn: 335083
The optimizer is getting smarter (eg, D47986) about differentiating shuffles
based on its mask values, so we should make queries on the mask constant
operand generally available to avoid code duplication.
We'll probably use this soon in the vectorizers and instcombine (D48023 and
https://bugs.llvm.org/show_bug.cgi?id=37806).
We might clean up TTI a bit more once all of its current 'SK_*' options are
covered.
Differential Revision: https://reviews.llvm.org/D48236
llvm-svn: 335067
DominatorTreeBase::getNode does not modify its parameter and this change
allows callers that only have access to const pointers to use it without
casting.
Reviewers: kuhar, dblaikie, chandlerc
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D48231
llvm-svn: 334892
and using the latter in DIBuilder::createArtificialType and
DIBuilder::createObjectPointerType methods as well as introducing
mirroring DISubprogram::cloneWithFlags and
DIBuilder::createArtificialSubprogram methods.
The primary goal here is to add createArtificialSubprogram to support
a pass downstream while keeping the method consistent with the
existing ones and making sure we don't encourage changing already
created DI-nodes.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D47615
llvm-svn: 333806
Summary: This patch adds a PDT constructor from Function and lets codes previously using a local class to do this use PostDominatorTree class directly.
Reviewers: davide, kuhar, grosser, dberlin
Reviewed By: kuhar
Author: NutshellySima
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46709
llvm-svn: 333102
r332057 introduced distance() for ranges. Based on post-commit feedback,
this renames distance() to size(). The new size() is also only enabled
when the operation is O(1).
Differential Revision: https://reviews.llvm.org/D46976
llvm-svn: 332551
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
Summary:
Currently the order of blocks returned by `IDF::calculate` can be
non-deterministic. This was discovered in several attempts to enable
SSAUpdaterBulk for JumpThreading (which led to miscompare in bootstrap between
stage 3 and stage4). Originally, the blocks were put into a priority queue with
a depth level as their key, and this patch adds a DFSIn number as a second key
to specify a deterministic order across blocks from one level.
The solution was suggested by Daniel Berlin.
Reviewers: dberlin, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46646
llvm-svn: 332167
Summary: Fix two typos which result in verifying wrong data structures (DT) instead of PDT in DominatorTreeBatchUpdatesTest.
Reviewers: davide, kuhar, grosser, dberlin
Reviewed By: davide, kuhar, dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46696
llvm-svn: 332086
This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.
Differential Revision: https://reviews.llvm.org/D46668
llvm-svn: 332057
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is
!DILabel(scope: !1, name: "foo", file: !2, line: 3)
We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is
llvm.dbg.label(metadata !1)
It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.
We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.
Differential Revision: https://reviews.llvm.org/D45024
Patch by Hsiangkai Wang.
llvm-svn: 331841
Inspired by r331508, I did a grep and found these.
Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa.
llvm-svn: 331577