This reverts commit r208930, r208933, and r208975.
It seems not all fission consumers are ready to handle this behavior.
Reverting until tools are brought up to spec.
llvm-svn: 209338
Committed in r209178 then reverted in r209251 due to LTO breakage,
here's a proper fix for the case of the missing subprogram DIE. The DIEs
were there, just in other compile units. Using the SPMap we can find the
right compile unit to search for and produce cross-unit references to
describe this kind of inlining.
One existing test case needed to be updated because it had a function
that wasn't in the CU's subprogram list, so it didn't appear in the
SPMap.
llvm-svn: 209335
This reverts commit r209178.
This seems to be asserting in an LTO build on some internal Apple
buildbots. No upstream reproduction (and I don't have an LLVM-aware gold
built right now to reproduce it personally) but it's a small patch & the
failure's semi-plausible so I'm going to revert first while I try to
reproduce this.
llvm-svn: 209251
Undecided whether this should include a test case - SROA produces bad
dbg.value metadata describing a value for a reference that is actually
the value of the thing the reference refers to. For now, loosening the
assert lets this not assert, but it's still bogus/wrong output...
If someone wants to tell me to add a test, I'm willing/able, just
undecided. Hopefully we'll get SROA fixed soon & we can tighten up this
assertion again.
llvm-svn: 209240
This change preserves the original algorithm of generating history
for user variables, but makes it more clear.
High-level description of algorithm:
Scan all the machine basic blocks and machine instructions in the order
they are emitted to the object file. Do the following:
1) If we see a DBG_VALUE instruction, add it to the history of the
corresponding user variable. Keep track of all user variables, whose
locations are described by a register.
2) If we see a regular instruction, look at all the registers it clobbers,
and terminate the location range for all variables described by these registers.
3) At the end of the basic block, terminate location ranges for all
user variables described by some register.
Although this change shouldn't be user-visible (the contents of .debug_loc section
should be the same), it changes some internal assumptions about the set
of instructions used to track the variable locations. Watching the bots.
llvm-svn: 209225
In refactoring DwarfUnit::isUnsignedDIType I restricted it to only work
on values with signedness (unsigned or signed), asserting on anything
else (which did uncover some bugs). But it turns out that we do need to
emit constants of signless data, such as pointer constants - only null
pointer constants are known to need this so far, but it's conceivable
that there might be non-null pointer constants at some point (hardcoded
address offsets for device drivers?).
This patch just uses 'unsigned' for signless data such as pointer
constants. Arguably we could use signless representations
(DW_FORM_dataN) instead, allowing a trinary result from isUnsignedDIType
(signed, unsigned, signless), but this seems reasonable for now.
llvm-svn: 209223
This workaround (presumably for ancient GDB) doesn't appear to be
required (GDB 7.5 seems to tolerate function definition DIEs in
namespace scope just fine).
llvm-svn: 209189
Since we visit the whole list of subprograms for each CU at module
start, this is clearly true - don't test for the case, just assert it.
A few old test cases seemed to have incomplete subprogram lists, but any
attempt to reproduce them shows full subprogram lists that even include
entities that have been completely inlined and the out of line
definition removed.
llvm-svn: 209178
When I refactored this in r208636 I accidentally caused this to be added
multiple times to each abstract subprogram (not accounting for the
deduplicating effect of the InlinedSubprogramDIEs set).
This got better in r208798 when the abstract definitions got the
attribute added to them at construction time, but still had the
redundant copies introduced in r208636.
This commit removes those excess DW_AT_inlines and relies solely on the
insertion in r208798.
llvm-svn: 209166
The check in DwarfDebug::constructScopeDIE was meant to consider inlined
subroutines as any non-top-level scope that was a subprogram. Instead of
checking "not top level scope" it was checking if the /subprogram's/
scope was non-top-level.
Fix this and beef up a test case to demonstrate some of the missing
inlined_subroutines are no longer missing.
In the course of fixing this I also found that r208748 (with this fix)
found one /extra/ inlined_subroutine in concrete_out_of_line.ll due to
two inlined_subroutines having the same inlinedAt location. The previous
implementation was collapsing these into a single inlined subroutine.
I'm not sure what the original code was that created this .ll file so
I'm not sure if this actually happens in practice today. Since we
deliberately include column information to disambiguate two calls on the
same line, that may've addressed this bug in the frontend, but it's good
to know that workaround isn't necessary for this particular case
anymore.
llvm-svn: 209165
- On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
- On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
- On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.
llvm-svn: 209123
This is mostly a mechanical change changing all the call sites to the newer
chained-function construction pattern. This removes the horrible 15-parameter
constructor for the CallLoweringInfo in favour of setting properties of the call
via chained functions. No functional change beyond the removal of the old
constructors are intended.
llvm-svn: 209082
This is a preliminary step to help ease the construction of CallLoweringInfo.
Changing the construction to a chained function pattern requires that the
parameter be nullable. However, rather than copying the vector, save a pointer
rather than the reference to permit a late binding of the arguments.
llvm-svn: 209080
This allows us to put dynamic initializers for weak data into the same
comdat group as the data being initialized. This is necessary for MSVC
ABI compatibility. Once we have comdats for guard variables, we can use
the combination to help GlobalOpt fire more often for weak data with
guarded initialization on other platforms.
Reviewers: nlewycky
Differential Revision: http://reviews.llvm.org/D3499
llvm-svn: 209015
I'm not sure this is how it'll be going forward (I'd rather prefer the
definition to be in the main SP mapping, for various reasons) but this
helps me understand how it is today.
llvm-svn: 209009
DIBuilder maintains this invariant and the current DwarfDebug code could
end up doing weird things if it contained declarations (such as putting
the definition DIE inside a CU that contained the declaration - this
doesn't seem like a good idea, so rather than adding logic to handle
this case we'll just ban in for now & cross that bridge if we come to
it later).
llvm-svn: 209004
This reverts commit r208934.
The patch depends on aliases to GEPs with non zero offsets. That is not
supported and fairly broken.
The good news is that GlobalAlias is being redesigned and will have support
for offsets, so this patch should be a nice match for it.
llvm-svn: 208978
This commit implements two command line switches -global-merge-on-external
and -global-merge-aligned, and both of them are false by default, so this
optimization is disabled by default for all targets.
For ARM64, some back-end behaviors need to be tuned to get this optimization
further enabled.
llvm-svn: 208934
Since type units in the dwo file are handled by a debug aware tool, they
don't need to leverage the ELF comdat grouping to implement
deduplication. Avoid creating all the .group sections for these as a
space optimization.
llvm-svn: 208930
Abstract variables should never have/use locations. In this case the
data wasn't used, so no functional change intended here, just
simplification.
llvm-svn: 208820
Many old tests using prior schemas still had some brokenness here (both
indirect arrays and arrays with single bogus elements). Fixed those up
so they don't hit the new assertions.
Also reduced nesting in some places, etc.
llvm-svn: 208817
This is just unneccessary - we only create abstract definitions when
we're inlining anyway, so there's no reason to delay this to see if
we're going to inline anything.
llvm-svn: 208798
If the function has the landingpad instruction, then the
handlerdata should be emitted even if the function has
nouwnind attribute. Otherwise, following code will not
work:
void test1() noexcept {
try {
throw_exception();
} catch (...) {
log_unexpected_exception();
}
}
Since the cantunwind was incorrectly emitted and the
LSDA is not available.
llvm-svn: 208791
This was reverted in r208642 due to regressions surrounding file changes
within lexical scopes causing inlining information to be lost.
The issue was in LexicalScopes::getOrCreateInlinedScope, where I was
previously testing "isLexicalBlock" which is false for
"DILexicalBlockFile" (a scope used to represent changes in the current
file name) and assuming it was then a function (breaking out of the
inlined scope path and reaching for the parent non-inlined scopes). By
inverting the condition and testing for "isSubprogram" the correct
behavior is attained.
(also found some weirdness in Clang, see r208742 when reducing this test
case - the resulting test case doesn't apply with the Clang fix, but
I've added a more realistic test case to inline-scopes.ll which does
reproduce the issue and demonstrate the fix)
llvm-svn: 208748
This allows code to statically accept a Function or a GlobalVariable, but
not an alias. This is already a cleanup by itself IMHO, but the main
reason for it is that it gives a lot more confidence that the refactoring to fix
the design of GlobalAlias is correct. That will be a followup patch.
llvm-svn: 208716
compared to 'AddrMode.BaseReg'. In the case that 'AddrMode.BaseReg' is
nullptr, 'Result' will also be nullptr, so the cast causes an assertion. We
should use dyn_cast_or_null here to check 'Result' is not null and it is an
instruction.
Bug found by Mats Petersson, and I reduced his IR to get a test case.
llvm-svn: 208705
The problem occurs when a non-i1 setcc is inverted. For example 'i8 = setcc' will get 'xor 0xff' to invert this. This is clearly wrong when the boolean contents are ZeroOrOne.
This patch introduces getLogicalNOT and updates SetCC legalisation to use it.
Reviewed by Hal Finkel.
llvm-svn: 208641
Right now the load may not get DCE'd because of the side-effect of updating
the base pointer.
This can happen if we lower a read-modify-write of an illegal larger type
(e.g. i48) such that the modification only affects one of the subparts (the
lower i32 part but not the higher i16 part). See the testcase.
In order to spot the dead load we need to revisit it when SimplifyDemandedBits
decided that the value of the load is masked off. This is the
CommitTargetLoweringOpt piece.
I checked compile time with ARM64 by sending SPEC bitcode files through llc.
No measurable change.
Fixes <rdar://problem/16031651>
llvm-svn: 208640
One test case had to be updated as it still had the extra indirection
for the variable list - removing the extra indirection got it back to
passing.
llvm-svn: 208608
We must validate the value type in TLI::getRegisterByName, because if we
don't and the wrong type was used with the IR intrinsic, then we'll assert
(because we won't be able to find a valid register class with which to
construct the requested copy operation). For PPC64, additionally, the type
information is necessary to decide between the 64-bit register and the 32-bit
subregister.
No functionality change.
llvm-svn: 208508
Filed as PR19712, LLVM fails to detect the right type of an enum
constant when a frontend does not provide an underlying type for the
enumeration type.
llvm-svn: 208502
And the winner by a nose is isUnsignedDIType, for no particular reason.
These two functions were just complements of each other and used in very
related code, so refactor callers to just use one of them.
llvm-svn: 208500
Doesn't seem a good reason to duplicate this code (it was more literally
duplicated prior to r208494, and while the dataN code /does/ actually
fire in this case, it doesn't seem necessary (and the DWARF standard
recommends using udata/sdata pervasively instead of dataN, so as to
indicate signedness of the values))
llvm-svn: 208495
This code looks to have become dead at some time in the past. I tried to
reproduce cases where LLVM would emit constants with dataN, but could
not. Upon inspection it seems the code doesn't do that anymore - the
only time a size is provided by isTypeSigned is when the type is signed,
and in those cases we use sdata. dataN is only used for unsigned types
and isTypeSigned doesn't provide a value for sizeInBits in that case.
Remove the dead cases/size plumbing.
llvm-svn: 208494
When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must
be passed in a block of consecutive floating-point registers, or on the stack.
This means that unused floating-point registers cannot be back-filled with
part of an HFA, however this can currently happen. This patch, along with the
corresponding clang patch (http://reviews.llvm.org/D3083) prevents this.
llvm-svn: 208413
comment of the API.
Relaxes the behavior of TargetInstrInfo::commuteInstruction when
TargetInstrInfo::findCommutedOpIndices returns false.
Previously TargetInstrInfo triggered a fatal error in such situation whereas based
on the comment in the API it should just return nullptr. Indeed the only
precondition that should be ensured is that the instruction must be commutable.
llvm-svn: 208371
(r207876 was reverted in r208131 after seeing some consistent buildbot
failure for MSVC 2012. The original commits were in r207724-r207726)
Takumi was nice enough to dig into this and locate this Microsoft
Connect issue:
http://connect.microsoft.com/VisualStudio/feedback/details/814899/forward-as-tuple-debug-implementation-error
describing a bug in MSVC2012's forward_as_tuple implementation.
Since the parameters in this instance are trivial/small, pass them by
value (using make_tuple) instead of perfectly-forwarded tuple of rvalue
references (involving the broken forward_as_tuple). Hopefully this will
satisfy MSVC2012.
llvm-svn: 208364
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).
This does represent a small functionality change:
- For generic x86-64 (which uses the SB model and, thus, will get some
unrolling).
- For AMD cores (because they still currently use the SB scheduling model)
- For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.
llvm-svn: 208289
When reducing the bitwidth of a comparison against a constant, the
original setcc's result type was used, which was incorrect.
No test since I don't think any other in tree targets change the
bitwidth of the setcc type depending on the bitwidth of the compared
type.
llvm-svn: 208236
1) Fix for printing debug locations for absolute paths.
2) Location printing is moved into public method DebugLoc::print() to avoid re-inventing the wheel.
Differential Revision: http://reviews.llvm.org/D3513
llvm-svn: 208177
Speculatively reverting due to a suspicious failure on a Windows
buildbot.
This reverts commit 10c37a012ea11596d44cd9059fe09c959caf30c8.
llvm-svn: 208131
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).
So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.
llvm-svn: 208104
Committed initially in r207724-r207726 and reverted due to compiler-rt
crashes in r207732.
Instead, fix this harder with unordered_map and store the LexicalScopes
by value in the map. This did necessitate moving the definition of
LexicalScope above the definition of LexicalScopes.
Let's see how the buildbots/compilers tolerate unordered_map::emplace +
std::piecewise_construct + std::forward_as_tuple...
llvm-svn: 207876
Summary:
Get rid of UserVariables set, and turn DbgValues into MapVector
to get a fixed ordering, as suggested in review for http://reviews.llvm.org/D3573.
Test Plan: llvm regression tests
Reviewers: dblaikie
Reviewed By: dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3579
llvm-svn: 207720
Breaks GDB buildbot
(http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/14517)
GCC emits DW_AT_object_pointer /everywhere/ (declaration, abstract
definition, inlined subroutine), but it looks like GCC relies on it
being somewhere other than the declaration, at least. I'll experiment
further & can hopefully still remove it from the inlined_subroutine.
This reverts commit r207705.
llvm-svn: 207719
They just don't need to be there - they're inherited from the abstract
definition. In theory I would like them to be inherited from the
declaration, but the DWARF standard doesn't quite say that... we can
probably do it anyway but I'm less confident about that so I'll leave it
for a separate commit.
llvm-svn: 207717
This effectively reverts r164326, but adds some comments and
justification and ensures we /don't/ emit the DW_AT_object_pointer on
the (abstract and concrete) definitions. (while still preserving it on
standalone definitions involving ObjC Blocks)
This does increase the size of member function declarations from 7 to 11
bytes, unfortunately, but still seems like the Right Thing to do so that
callers that see only the declaration still have the information about
the object pointer. That said, I don't know what, if any, DWARF
consumers don't have a heuristic to guess this in the case of normal
C++ member functions - perhaps we can remove it entirely.
llvm-svn: 207705
For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it
into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx
more difficult.
For example:
Given
%shr = lshr i64 %x, 4
%and = and i64 %shr, 15
%arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, %i64 2, i64 %and
%0 = load i64* %arrayidx
With current shift folding, it takes 3 instrs to compute base address:
lsr x8, x0, #1
and x8, x8, #0x78
add x8, x9, x8
If using ubfx, it only needs 2 instrs:
ubfx x8, x0, #4, #4
add x8, x9, x8, lsl #3
This fixes bug 19589
llvm-svn: 207702
DwarfDebug.h has a SmallVector member containing a unique_ptr of an
incomplete type. MSVC doesn't have key functions, so the vtable and
dtor are emitted in AsmPrinter.cpp, where DwarfDebug's ctor is called.
AsmPrinter.cpp include DwarfUnit.h and doesn't get a complete definition
of DwarfTypeUnit. We could fix the problem by including DwarfUnit.h in
DwarfDebug.h, but that would increase header bloat. Instead, define
~DwarfDebug out of line.
llvm-svn: 207701
These were called from distinct places and had significant distinct
behavior. No need to make that a dynamic check inside the function
rather than just having two functions (refactoring some common code into
a helper function to be called from the two separate functions).
llvm-svn: 207539
Seems at some point the intent was to emit fission ranges_base as unique
per CU but the code today emits ranges_base as the start of the ranges
section for all CUs being compiled and all the ranges_base relative
addresses are relative to that. So removing this dead code and leaving
the status quo until there's a reason to change it (perhaps something's
faster if it has distinct ranges for each CU).
llvm-svn: 207464
(Clang doesn't warn here because it knows the string is benign - the
assert still checks what it's intended to - though putting the correct
parens does make clang-format format the code a little better)
llvm-svn: 207456
Since all 4 ctor calls in DwarfDebug just pass in a trivially
constructed DIE with the right tag type, sink the tag selection down
into the Dwarf*Unit ctors (removing the argument entirely from callers
in DwarfDebug) and initialize the DIE member in DwarfUnit.
llvm-svn: 207448
Now that the subtle constructScopeDIE has been refactored into two
functions - one returning memory to take ownership of, one returning a
pointer to already owning memory - push unique_ptr through more APIs.
I think this completes most of the unique_ptr ownership of DIEs.
llvm-svn: 207442
While refactoring out constructScopeDIE into two functions I realized we
were emitting DW_AT_object_pointer in the inlined subroutine when we
didn't need to (GCC doesn't, and the abstract subprogram definition has
the information already).
So here's the refactoring and the bug fix. This is one step of
refactoring to remove some subtle memory ownership semantics. It turns
out the original constructScopeDIE returned ownership in its return
value in some cases and not in others. The split into two functions now
separates those two semantics - further cleanup (unique_ptr, etc) will
follow.
llvm-svn: 207441
entry. This is in preparation for generic DW_OP_piece support.
No functional change so far.
http://reviews.llvm.org/D3373
rdar://problem/15928306
llvm-svn: 207368
Since there's no way to ensure the type unit in the .dwo and the type
unit skeleton in the .o are correlated, this cannot work.
This implementation is a bit inefficient for a few reasons, called out
in comments.
llvm-svn: 207323
Sinking addition of the declaration attribute down to where the
signature is added. So that if the signature is not added neither is the
declaration attribute (this will come in handy when aborting type unit
construction to instead emit the type into the CU directly in some
cases)
Pull out type unit identifier hashing just to simplify the function a
little, it'll be getting longer.
llvm-svn: 207321
Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.
I had to disable the mulhi nodes on ARM because there aren't any patterns
for it. As far as I know ARM has instructions for getting the high part of
a multiply so this should be fixed.
llvm-svn: 207315
The included test case would return the incorrect results, because the expansion
of an shift with a constant shift amount of 0 would generate undefined behavior.
This is because ExpandShiftByConstant assumes that all shifts by constants with
a value of 0 have already been optimized away. This doesn't happen for opaque
constants and usually this isn't a problem, because opaque constants won't take
this code path - they are not supposed to. In the case that the opaque constant
has to be expanded by the legalizer, the legalizer would drop the opaque flag.
In this case we hit the limitations of ExpandShiftByConstant and create incorrect
code.
This commit fixes the legalizer by not dropping the opaque flag when expanding
opaque constants and adding an assertion to ExpandShiftByConstant to catch this
not supported case in the future.
This fixes <rdar://problem/16718472>
llvm-svn: 207304
This also avoids the need for subtly side-effecting calls to manifest
strings in the string table at the point where items are added to the
accelerator tables.
llvm-svn: 207281
Pulls out some more code from some of the rather monolithic DWARF
classes. Unlike the address table, the string table won't move up into
DwarfDebug - each DWARF file has its own string table (but there can be
only one address table).
llvm-svn: 207277
buildbot - do not insert debug intrinsics before phi nodes.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
llvm-svn: 207269
This should reduce the chance of memory leaks like those fixed in
r207240.
There's still some unclear ownership of DIEs happening in DwarfDebug.
Pushing unique_ptr and references through more APIs should help expose
the cases where ownership is a bit fuzzy.
llvm-svn: 207263
Since this doesn't return ownership (the DIE has been added to the
specified parent already) nor return null, just return by reference.
llvm-svn: 207259
This'll make changing to unique_ptr ownership of DIEs easier since the
usages will now have '*' on them making them textually compatible
between unique_ptr and raw pointer.
llvm-svn: 207253
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
llvm-svn: 207235
AllocaInst that was missing in one location.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
llvm-svn: 207165
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine-intrinsics testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
llvm-svn: 207130
There's only ever one address pool, not one per DWARF output file, so
let's just have one.
(similar refactoring of the string pool to come soon)
llvm-svn: 207026
Some of these types (DwarfDebug in particular) are quite large to begin
with (and I keep forgetting whether DwarfFile is in DwarfDebug or
DwarfUnit... ) so having a few smaller files seems like goodness.
llvm-svn: 207010
For now it contains a single flag, SanitizeAddress, which enables
AddressSanitizer instrumentation of inline assembly.
Patch by Yuri Gorshenin.
llvm-svn: 206971
This prompted me to push references through most of DwarfDebug. Sorry
for the churn.
Honestly it's a bit silly that we're passing around units all over the
place like that anyway and I think it's mostly due to the DIE attribute
adding utility functions being utilities in DwarfUnit. I should have
another go at moving them out of DwarfUnit...
llvm-svn: 206925
This reverts commit r206780.
This commit was regressing gdb.opt/inline-locals.exp in the GDB 7.5 test
suite. Reverting until I can fix the issue.
llvm-svn: 206867
define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind of ODR
violation yet, it is likely to grow support for it in the future. It
also removes one layer of macro pollution across all the included
headers.
Other sub-trees will follow.
llvm-svn: 206837
while checking candidate for bit field extract.
Otherwise the value may not fit in uint64_t and this will trigger an
assertion.
This fixes PR19503.
llvm-svn: 206834
behavior based on other files defining DEBUG_TYPE, which means it cannot
define DEBUG_TYPE at all. This is actually better IMO as it forces folks
to define relevant DEBUG_TYPEs for their files. However, it requires all
files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't
already. I've updated all such files in LLVM and will do the same for
other upstream projects.
This still leaves one important change in how LLVM uses the DEBUG_TYPE
macro going forward: we need to only define the macro *after* header
files have been #include-ed. Previously, this wasn't possible because
Debug.h required the macro to be pre-defined. This commit removes that.
By defining DEBUG_TYPE after the includes two things are fixed:
- Header files that need to provide a DEBUG_TYPE for some inline code
can do so by defining the macro before their inline code and undef-ing
it afterward so the macro does not escape.
- We no longer have rampant ODR violations due to including headers with
different DEBUG_TYPE definitions. This may be mostly an academic
violation today, but with modules these types of violations are easy
to check for and potentially very relevant.
Where necessary to suppor headers with DEBUG_TYPE, I have moved the
definitions below the includes in this commit. I plan to move the rest
of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big
enough.
The comments in Debug.h, which were hilariously out of date already,
have been updated to reflect the recommended practice going forward.
llvm-svn: 206822
The rationale for this artificial dependency seems to have been lost to the
ravages of time, it is covered by no regression tests, and has no impact on
test-suite performance numbers on either x86 or PPC.
For the test suite, on both x86 and PPC, I ran the test suite 10 times (both as
a baseline and with this change), and found no statistically-significant
changes. For PPC, I used a P7 box. For x86, I used an Intel Xeon E5430. Both
with -O3 -mcpu=native.
This was discussed on-list back in January, but I've not had a chance to run
the performance tests until today.
llvm-svn: 206795
Requires switching some vectors to lists to maintain pointer validity.
These could be changed to forward_lists (singly linked) with a bit more
work - I've left comments to that effect.
llvm-svn: 206780
This reverts commit r206707, reapplying r206704. The preceding commit
to CalcSpillWeights should have sorted out the failing buildbots.
<rdar://problem/14292693>
llvm-svn: 206766
This reverts commit r206677, reapplying my BlockFrequencyInfo rewrite.
I've done a careful audit, added some asserts, and fixed a couple of
bugs (unfortunately, they were in unlikely code paths). There's a small
chance that this will appease the failing bots [1][2]. (If so, great!)
If not, I have a follow-up commit ready that will temporarily add
-debug-only=block-freq to the two failing tests, allowing me to compare
the code path between what the failing bots and what my machines (and
the rest of the bots) are doing. Once I've triggered those builds, I'll
revert both commits so the bots go green again.
[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
[2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445
<rdar://problem/14292693>
llvm-svn: 206704
Win64 stack unwinder gets confused when execution flow "falls through" after
a call to 'noreturn' function. This fixes the "missing epilogue" problem by
emitting a trap instruction for IR 'unreachable' on x86_x64-pc-windows.
A secondary use for it would be for anyone wanting to make double-sure that
'noreturn' functions, indeed, do not return.
llvm-svn: 206684
This reverts commit r206666, as planned.
Still stumped on why the bots are failing. Sanitizer bots haven't
turned anything up. If anyone can help me debug either of the failures
(referenced in r206666) I'll owe them a beer. (In the meantime, I'll be
auditing my patch for undefined behaviour.)
llvm-svn: 206677
This reverts commit r206628, reapplying r206622 (and r206626).
Two tests are failing only on buildbots [1][2]: i.e., I can't reproduce
on Darwin, and Chandler can't reproduce on Linux. Asan and valgrind
don't tell us anything, but we're hoping the msan bot will catch it.
So, I'm applying this again to get more feedback from the bots. I'll
leave it in long enough to trigger builds in at least the sanitizer
buildbots (it was failing for reasons unrelated to my commit last time
it was in), and hopefully a few others.... and then I expect to revert a
third time.
[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
[2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445
llvm-svn: 206666
This reverts commit r206622 and the MSVC fixup in r206626.
Apparently the remotely failing tests are still failing, despite my
attempt to fix the nondeterminism in r206621.
llvm-svn: 206628
This reverts commit r206556, effectively reapplying commit r206548 and
its fixups in r206549 and r206550.
In an intervening commit I've added target triples to the tests that
were failing remotely [1] (but passing locally). I'm hoping the mystery
is solved? I'll revert this again if the tests are still failing
remotely.
[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816
llvm-svn: 206622
Rewrite the shared implementation of BlockFrequencyInfo and
MachineBlockFrequencyInfo entirely.
The old implementation had a fundamental flaw: precision losses from
nested loops (or very wide branches) compounded past loop exits (and
convergence points).
The @nested_loops testcase at the end of
test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating. This
function has three nested loops, with branch weights in the loop headers
of 1:4000 (exit:continue). The old analysis gives non-sensical results:
Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
---- Block Freqs ----
entry = 1.0
for.cond1.preheader = 1.00103
for.cond4.preheader = 5.5222
for.body6 = 18095.19995
for.inc8 = 4.52264
for.inc11 = 0.00109
for.end13 = 0.0
The new analysis gives correct results:
Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
block-frequency-info: nested_loops
- entry: float = 1.0, int = 8
- for.cond1.preheader: float = 4001.0, int = 32007
- for.cond4.preheader: float = 16008001.0, int = 128064007
- for.body6: float = 64048012001.0, int = 512384096007
- for.inc8: float = 16008001.0, int = 128064007
- for.inc11: float = 4001.0, int = 32007
- for.end13: float = 1.0, int = 8
Most importantly, the frequency leaving each loop matches the frequency
entering it.
The new algorithm leverages BlockMass and PositiveFloat to maintain
precision, separates "probability mass distribution" from "loop
scaling", and uses dithering to eliminate probability mass loss. I have
unit tests for these types out of tree, but it was decided in the review
to make the classes private to BlockFrequencyInfoImpl, and try to shrink
them (or remove them entirely) in follow-up commits.
The new algorithm should generally have a complexity advantage over the
old. The previous algorithm was quadratic in the worst case. The new
algorithm is still worst-case quadratic in the presence of irreducible
control flow, but it's linear without it.
The key difference between the old algorithm and the new is that control
flow within a loop is evaluated separately from control flow outside,
limiting propagation of precision problems and allowing loop scale to be
calculated independently of mass distribution. Loops are visited
bottom-up, their loop scales are calculated, and they are replaced by
pseudo-nodes. Mass is then distributed through the function, which is
now a DAG. Finally, loops are revisited top-down to multiply through
the loop scales and the masses distributed to pseudo nodes.
There are some remaining flaws.
- Irreducible control flow isn't modelled correctly. LoopInfo and
MachineLoopInfo ignore irreducible edges, so this algorithm will
fail to scale accordingly. There's a note in the class
documentation about how to get closer. See also the comments in
test/Analysis/BlockFrequencyInfo/irreducible.ll.
- Loop scale is limited to 4096 per loop (2^12) to avoid exhausting
the 64-bit integer precision used downstream.
- The "bias" calculation proposed on llvmdev is *not* incorporated
here. This will be added in a follow-up commit, once comments from
this review have been handled.
llvm-svn: 206548
Summary:
This prevents the discriminator generation pass from triggering if
the DWARF version being used in the module is prior to 4.
Reviewers: echristo, dblaikie
CC: llvm-commits
Differential Revision: http://reviews.llvm.org/D3413
llvm-svn: 206507
Previously, SSPBufferSize was assigned the value of the "stack-protector-buffer-size"
attribute after all uses of SSPBufferSize. The effect was that the default
SSPBufferSize was always used during analysis. I moved the check for the
attribute before the analysis; now --param ssp-buffer-size= works correctly again.
Differential Revision: http://reviews.llvm.org/D3349
llvm-svn: 206486
Still only 32-bit ARM using it at this stage, but the promotion allows
direct testing via opt and is a reasonably self-contained patch on the
way to switching ARM64.
At this point, other targets should be able to make use of it without
too much difficulty if they want. (See ARM64 commit coming soon for an
example).
llvm-svn: 206485
This particular DAG combine is designed to kick in when both ConstantFPs will
end up being loaded via a litpool, however those nodes have a semi-legal
status, dictated by isFPImmLegal so in some cases there wouldn't have been a
litpool in the first place. Don't try to be clever in those circumstances.
Picked up while merging some AArch64 tests.
llvm-svn: 206365
handles Intrinsic::trap if TargetOptions::TrapFuncName is set.
This fixes a bug in which the trap function was not taken into consideration
when a program was compiled without optimization (at -O0).
<rdar://problem/16291933>
llvm-svn: 206323
Implement DebugInfoVerifier, which steals verification relying on
DebugInfoFinder from Verifier.
- Adds LegacyDebugInfoVerifierPassPass, a ModulePass which wraps
DebugInfoVerifier. Uses -verify-di command-line flag.
- Change verifyModule() to invoke DebugInfoVerifier as well as
Verifier.
- Add a call to createDebugInfoVerifierPass() wherever there was a
call to createVerifierPass().
This implementation as a module pass should sidestep efficiency issues,
allowing us to turn debug info verification back on.
<rdar://problem/15500563>
llvm-svn: 206300
ARM64 suffered multiple -verify-machineinstr failures (principally over the
xsp/xzr issue) because FastISel was completely ignoring which subset of the
general-purpose registers each instruction required.
More fixes are coming in ARM64 specific FastISel, but this should cover the
generic problems.
llvm-svn: 206283
Got bored, removed some manual memory management.
Pushed references (rather than pointers) through a few APIs rather than
replacing *x with x.get().
llvm-svn: 206222
Thanks to dblaikie for updating the testcase!
Debug info: (bugfix) C++ C/Dtors can be compiled to multiple functions,
therefore, their declaration cannot have one DW_AT_linkage_name.
The specific instances however can and should have that attribute.
This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE()
to emit linkage names for C/Dtors.
rdar://problem/16362674.
llvm-svn: 206210
BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting
AllowUnknown=true with TLI->getSimpleValueType is not sufficient because, for
example, non-power-of-two vector types return non-simple EVTs (not MVT::Other).
llvm-svn: 206150
Nice to be able to just print out the Tag and have the debugger print
dwarf::DW_TAG_subprogram or whatever, rather than an int.
It's a bit finicky (for example DIDescriptor::getTag still returns
unsigned) because some places still handle real dwarf tags + our fake
tags (one day we'll remove the fake tags, hopefully).
llvm-svn: 206098
therefore, their declaration cannot have one DW_AT_linkage_name.
The specific instances however can and should have that attribute.
This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE()
to emit linkage names for C/Dtors.
rdar://problem/16362674.
llvm-svn: 206096
We had disabled use of TBAA during CodeGen (even when otherwise using AA)
because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to
miss basic type punning that it should catch (and, thus, we'd fail to override
TBAA when we should).
However, when AA is in use during CodeGen, CGP now uses normal GEPs and
bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a
result, BasicAA should be able to make us do the right thing in the face of
type-punning, and it seems safe to enable use of TBAA again. self-hosting seems
fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle.
Note: We still don't update TBAA when merging stack slots, although because
BasicAA should now catch all such cases, this is no longer a blocking issue.
Nevertheless, I plan to commit code to deal with this properly in the near
future.
llvm-svn: 206093
The current memory-instruction optimization logic in CGP, which sinks parts of
the address computation that can be adsorbed by the addressing mode, does this
by explicitly converting the relevant part of the address computation into
IR-level integer operations (making use of ptrtoint and inttoptr). For most
targets this is currently not a problem, but for targets wishing to make use of
IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a
problem for two reasons:
1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr
2. In cases where type-punning was used, and BasicAA was used
to override TBAA, BasicAA may no longer do so. (this had forced us to disable
all use of TBAA in CodeGen; something which we can now enable again)
This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by
default (except for those targets that use AA during CodeGen), and so aside
from some PowerPC subtargets and SystemZ, there should be no change in
behavior. We may be able to switch completely away from the ptrtoint/inttoptr
sinking on all targets, but further testing is required.
I've doubled-up on a number of existing tests that are sensitive to the
address sinking behavior (including some store-merging tests that are
sensitive to the order of the resulting ADD operations at the SDAG level).
llvm-svn: 206092
This is a shared implementation class for BlockFrequencyInfo and
MachineBlockFrequencyInfo, not for BlockFrequency, a related (but
distinct) class.
No functionality change.
<rdar://problem/14292693>
llvm-svn: 206083
-fexhaustive-register-search option to allow an exhaustive search during last
chance recoloring.
This is related to PR18747
Patch by MAYUR PANDEY <mayur.p@samsung.com>.
llvm-svn: 206072
When rematerializing an instruction that defines a super register that would be
used by a physical subregisters we use the related physical super register for
the definition.
To keep the live-range information accurate, all the defined subregisters must be
marked as dead def, otherwise the register allocation may miss some
interferences.
Working on a reduced test-case!
<rdar://problem/16582185>
llvm-svn: 206060
The TargetLowering::expandMUL() helper contains lowering code extracted
from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more
ISD::MUL patterns without having to use a library call.
llvm-svn: 206037
This code has been moved to a new function in the TargetLowering
class called expandMUL(). The purpose of this is to be able
to share lowering code between the SelectionDAGLegalize and
DAGTypeLegalizer classes.
No functionality changed intended.
llvm-svn: 206036
Also updated as many loops as I could find using df_begin/idf_begin -
strangely I found no uses of idf_begin. Is that just used out of tree?
Also a few places couldn't use df_begin because either they used the
member functions of the depth first iterators or had specific ordering
constraints (I added a comment in the latter case).
Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T>
where you needed iterator_range<idf_iterator<T>>)
llvm-svn: 206016