Given something like:
struct S {
int a;
struct { int b; };
};
We would fail to give 'b' offset 4. Instead, we would give it the
offset it has inside of it's struct.
llvm-svn: 274400
A namespace without a name should be written out as `anonymous
namespace' while a tag type without a name should be written out as
<unnamed-tag>.
llvm-svn: 274399
MSVC makes up names for these anonymous structs, but we don't (yet).
Eventually Clang should use getTypedefNameForAnonDecl() to put some name
in the debug info, and we can update the test case when that happens.
llvm-svn: 274391
We were asserting that our type records were valid when emitting
assembly, but not when emitting an object file.
I've been seeing lots of LNK1285 errors (corrupt PDB) during incremental
debug self-host builds with the MSVC linker, and hopefully this will
catch some of them earlier.
llvm-svn: 274373
Use MachineInstr& to avoid implicit conversions from
MachineBasicBlock::iterator to MachineInstr*. In one case, this could
use a range-based for loop, but the other loops iterated in reverse
order.
One of the reverse-loops checked the MachineInstr* for nullptr, a
condition that is provably unreachable. (And even if my proof has a
flaw, UBSan would catch the bug.)
llvm-svn: 274360
Use MachineInstr& instead of MachineInstr* in RegAllocFast to avoid
implicit conversions from MachineInstrBundleIterator. RAFast::spillAll
and RAFast::spillVirtReg still take iterators, since their argument may
be an end iterator from MachineBasicBlock::getFirstTerminator.
llvm-svn: 274353
For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array.
llvm-svn: 274337
Group" sections while lowering. In particular, for ELF sections this is
useful for creating function-specific groups that get merged into the
same named section.
Also use const Twine& instead of StringRef for the getELF functions
while we're here.
Differential Revision: http://reviews.llvm.org/D21743
llvm-svn: 274336
Summary:
This represents the adjustment applied to the implicit 'this' parameter
in the prologue of a virtual method in the MS C++ ABI. The adjustment is
always zero unless multiple inheritance is involved.
This increases the size of DISubprogram by 8 bytes, unfortunately. The
adjustment really is a signed 32-bit integer. If this size increase is
too much, we could probably win it back by splitting out a subclass with
info specific to virtual methods (virtuality, vindex, thisadjustment,
containingType).
Reviewers: aprantl, dexonsmith
Subscribers: aaboud, amccarth, llvm-commits
Differential Revision: http://reviews.llvm.org/D21614
llvm-svn: 274325
Change all the methods in LiveVariables that expect non-null
MachineInstr* to take MachineInstr& and update the call sites. This
clarifies the API, and designs away a class of iterator to pointer
implicit conversions.
llvm-svn: 274319
Convert a loop to a range-based for, using MachineInstr& instead of
MachineInstr* and removing an implicit conversion from iterator to
pointer.
llvm-svn: 274311
Use MachineInstr& over MachineInstr* to avoid implicit iterator to
pointer conversions. MachineInstr*-as-nullptr was being used as a flag
for whether the for loop terminated normally; I added an explicit `bool`
instead.
llvm-svn: 274310
TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr*
arguments (begin and end) that invite implicit conversions from
MachineInstrBundleIterator. One option would be to change their type to
an iterator, but since they don't seem to have been used since the API
was added in 2010, I'm deleting the dead code.
llvm-svn: 274304
Push MachineInstr& through helper APIs for consistency. This doesn't
remove any more implicit conversions, but it's a nice cleanup after
r274300.
llvm-svn: 274301
Switch to a range-based for in IfConverter::PredicateBlock and take
MachineInstr& in MaySpeculate to avoid an implicit conversion from
MachineBasicBlock::iterator to MachineInstr*.
llvm-svn: 274290
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr. In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
llvm-svn: 274287
Summary:
MSVC provide exception handlers with enhanced information to deal with security buffer feature (/GS).
To be more secure, the security cookies (GS and SEH) are validated when unwinding the stack.
The following code:
```
void f() {}
void foo() {
__try {
f();
} __except(1) {
f();
}
}
```
Reviewers: majnemer, rnk
Subscribers: thakis, llvm-commits, chrisha
Differential Revision: http://reviews.llvm.org/D21101
llvm-svn: 274239
CodeView need to know the offset of the storage allocation for a
bitfield. Encode this via the "extraData" field in DIDerivedType and
introduced a new flag, DIFlagBitField, to indicate whether or not a
member is a bitfield.
This fixes PR28162.
Differential Revision: http://reviews.llvm.org/D21782
llvm-svn: 274200
- Use range based for loops
- No need for some !Reg checks: isPhysicalRegister() reports false for
NoRegister anyway
- Do not repeat function name in documentation comment.
- Do not repeat documentation comment in implementation when we already
have one at the declaration.
- Factor some common subexpressions out.
- Change file comments to use doxygen syntax.
llvm-svn: 274194
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr. This is a
general API improvement.
Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other. Instead I've done everything as a block and just
updated what was necessary.
This is mostly mechanical fixes: adding and removing `*` and `&`
operators. The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy. I couldn't run tests
for AVR since llc doesn't link with it turned on.
llvm-svn: 274189
- Use range based for
- Use the more common variable names MBB and MF for
MachineBasicBlock/MachineFunction variables.
- Add a few const modifiers
llvm-svn: 274187
This is a fix for PR27842.
An IR-level implementation of stack coloring tailored to work with
SafeStack. It is a bit weaker than the MI implementation in that it
does not the "lifetime start at first access" logic. This can be
improved in the future.
This patch also replaces the naive implementation of stack frame
layout with a greedy algorithm that can split existing stack slots
and even fit small objects inside the alignment padding of other
objects.
llvm-svn: 274162
I think this converts all the simple cases that really just care about
the generated code being position independent or not. The remaining
uses are a bit more complicated and are checking things like "is this
a library or executable" or "can this symbol be preempted".
llvm-svn: 274055
This patch enhances dot graph viewer to show hot regions
with hot bbs/edges displayed in red. The ratio of the bb
freq to the max freq of the function needs to be no less
than the value specified by view-hot-freq-percent option.
The default value is 10 (i.e. 10%).
llvm-svn: 273996
MBFI supports profile count dumping and function
name based filtering. Add these two feature to
BFI as well. The filtering option is shared between
BFI and MBFI: -view-bfi-func-name=..
llvm-svn: 273992
BFI and MBFI's dot traits class share most of the
code and all future enhancement. This patch extracts
common implementation into base class BFIDOTGraphTraitsBase.
This patch also enables BFI graph to show branch probability
on edges as MBFI does before.
llvm-svn: 273990
The main issue here is that the "thumb" flag wasn't set for some of these
sections, making MSVC's link.exe fails to correctly relocate code
against the symbols inside these sections. link.exe could fail for
instance with the "fixup is not aligned for target 'XX'" error. If
linking doesn't fail, the relocation process goes wrong in the end and
invalid code is generated by the linker.
This patch adds Thumb/ARM information so that the right flags are set
on COFF/Windows.
Patch by Adrien Guinet.
llvm-svn: 273880
SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.
While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.
There are much better tools than llvm.trap: UBSan and ASan.
N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...
llvm-svn: 273778
Remember the last choice for the top/bottom scheduling boundary in
bidirectional scheduling mode. The top choice should not change if we
schedule at the bottom and vice versa.
This allows us to improve compiletime: We only recalculate the best pick
for one border and re-use the cached top-pick from the other border.
Differential Revision: http://reviews.llvm.org/D19350
llvm-svn: 273766
In bidirectional scheduling this gives more stable results than just
comparing the "reason" fields of the top/bottom node because the reason
field may be higher depending on what other nodes are in the queue.
Differential Revision: http://reviews.llvm.org/D19401
llvm-svn: 273755
This fixes an embarrassing bug when emitting .debug_loc entries for 64-bit+ constants,
which were previously silently truncated to 32 bits.
<rdar://problem/26843232>
llvm-svn: 273736
Tail merge was making the assumption that a layout successor or
predecessor was always a cfg successor/predecessor. Remove that
assumption. Changes to tests are necessary because the errant cfg edges
were preventing optimizations.
llvm-svn: 273700
Clang emits them in reverse order to conform to the ABI, which requires
left-to-right destruction. As a result, the order doesn't fall out
naturally, and we have to sort things out in the backend.
Fixes PR28213
llvm-svn: 273696
There are two remaining issues here:
1. No vbptr information
2. Need to mention indirect virtual bases
Getting indirect virtual bases is just a matter of adding an "indirect"
flag, emitting them in the frontend, and ignoring them when appropriate
for DWARF.
All virtual bases use the same artificial vbptr field, so I think the
vbptr offset will be best represented by an implicit __vbptr$ClassName
member similar to our existing __vptr$ member.
llvm-svn: 273688
When considering whether to split an instruction with a memory operand
into an explicit load and a register-based instruction, we currently
check that the resulting instruction has exactly 1 def. This prevents 2
important LICM optimizations: compares with memory operands, and double
indirect calls. All the tests and the test-suite pass without the check.
My guess as to original intent is to limit the additional register pressure
created by the new instruction, but given that we only split out a single
register, it is already limited.
The licm-dominance test now checks actual memory loads for hoisting instead of
undef, and it tests compares.
hoist-invariant-load.ll now checks for 2 hoists, the intended hoist, and a bonus
from calling a got-relative function in a loop.
llvm-svn: 273616
Recommiting after correcting over-eager Debug Value transfer fixing PR28270.
[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.
Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.
This refixes PR9817 which was being incompletely checked in the
testsuite.
Reviewers: jyknight
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D21037
llvm-svn: 273585
IfConversion used to always add the undef flag when adding a use operand
on a newly predicated instruction. This would be an operand for the register
being conditionally redefined. Due to the undef flag, the liveness of this
register prior to the predicated instruction would get lost.
This patch changes this so that such use operands are added only when the
register is live, without the undef flag.
Reviewed by Quentin Colombet.
http://reviews.llvm.org/D209077
llvm-svn: 273545
When trying to convert a loading instruction into a FAULTING_LOAD, we
sometimes face code like this:
if %R10 is not null:
%R9<def> = MOV32ri Immediate
%R9<def, tied> = AND32rm %R9, 0x20(%R10)
else:
goto TRAP
In these cases we would like to use the AND32rm instruction as the
faulting operation by hoisting the "depedency" def-ing %R9 also above
the control flow, transforming the program into:
%R9<def> = MOV32ri Immediate
%R9<def, tied> = FAULTING_LOAD_OP(AND32rm %R9, 0x20(%R10), FailPath: TRAP)
This change teaches ImplicitNullChecks to do the above, when safe.
llvm-svn: 273501
This is a convenience iterator that allows clients to enumerate the
GlobalObjects within a Module.
Also start using it in a few places where it is obviously the right thing
to use.
Differential Revision: http://reviews.llvm.org/D21580
llvm-svn: 273470
-view-machine-block-freq-propagation-dags currently
support integer and fraction as the suboptions. This
patch adds the 'count' suboption to display actual
profile count if available.
llvm-svn: 273460
Recommiting after fixing over-aggressive assertion
[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.
Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.
This refixes PR9817 which was being incompletely checked in the
testsuite.
Reviewers: jyknight
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D21037
llvm-svn: 273456
CodeView needs to know if a virtual method was introduced in the current
class, and base classes may not have complete type information, so we
need to thread this bit through from the frontend.
llvm-svn: 273453
This is the motivating example:
struct B { int b; };
struct A { B *b; };
int f(A *p) { return p->b->b; }
Clang emits complete types for both A and B because they are required to
be complete, but our CodeView emission would only emit forward
declarations of A and B. This was a consequence of the fact that the A*
type must reference the forward declaration of A, which doesn't
reference B at all.
We can't eagerly emit complete definitions of A and B when we request
the forward declaration's type index because of recursive types like
linked lists. If we did that, our stack usage could get out of hand, and
it would be possible to lower a type while attempting to lower a type,
and we would need to double check if our type is already present in the
TypeIndexMap after all recursive getTypeIndex calls.
Instead, defer complete type emission until after all type lowering has
completed. This ensures that all referenced complete types are emitted,
and that type lowering is not re-entrant.
llvm-svn: 273443
From a design perspective, complete record type emission should not
depend on information from other complete record types.
Currently this map is unused, and needlessly accumulates data throughout
compilation.
llvm-svn: 273431
The setCallee function will set the number of fixed arguments based
on the size of the argument list. The FixedArgs parameter was often
explicitly set to 0, leading to a lack of consistent value for non-
vararg functions.
Differential Revision: http://reviews.llvm.org/D20376
llvm-svn: 273403
We now include namespace scope info in LF_FUNC_ID records and we emit
LF_MFUNC_ID records for member functions as we should.
Class names are now fully qualified, which is what MSVC does.
Add a little bit of scaffolding to handle ThisAdjustment when it arrives
in DISubprogram.
llvm-svn: 273358
Summary:
Fix the computation of the offsets present in the scopetable when using the
SEH (__except_handler4).
This patch added an intrinsic to track the position of the allocation on the
stack of the EHGuard. This position is needed when producing the ScopeTable.
```
struct _EH4_SCOPETABLE {
DWORD GSCookieOffset;
DWORD GSCookieXOROffset;
DWORD EHCookieOffset;
DWORD EHCookieXOROffset;
_EH4_SCOPETABLE_RECORD ScopeRecord[1];
};
struct _EH4_SCOPETABLE_RECORD {
DWORD EnclosingLevel;
long (*FilterFunc)();
union {
void (*HandlerAddress)();
void (*FinallyFunc)();
};
};
```
The code to generate the EHCookie is added in `X86WinEHState.cpp`.
Which is adding these instructions when using SEH4.
```
Lfunc_begin0:
# BB#0: # %entry
pushl %ebp
movl %esp, %ebp
pushl %ebx
pushl %edi
pushl %esi
subl $28, %esp
movl %ebp, %eax <<-- Loading FramePtr
movl %esp, -36(%ebp)
movl $-2, -16(%ebp)
movl $L__ehtable$use_except_handler4_ssp, %ecx
xorl ___security_cookie, %ecx
movl %ecx, -20(%ebp)
xorl ___security_cookie, %eax <<-- XOR FramePtr and Cookie
movl %eax, -40(%ebp) <<-- Storing EHGuard
leal -28(%ebp), %eax
movl $__except_handler4, -24(%ebp)
movl %fs:0, %ecx
movl %ecx, -28(%ebp)
movl %eax, %fs:0
movl $0, -16(%ebp)
calll _may_throw_or_crash
LBB1_1: # %cont
movl -28(%ebp), %eax
movl %eax, %fs:0
addl $28, %esp
popl %esi
popl %edi
popl %ebx
popl %ebp
retl
```
And the corresponding offset is computed:
```
Luse_except_handler4_ssp$parent_frame_offset = -36
.p2align 2
L__ehtable$use_except_handler4_ssp:
.long -2 # GSCookieOffset
.long 0 # GSCookieXOROffset
.long -40 # EHCookieOffset <<----
.long 0 # EHCookieXOROffset
.long -2 # ToState
.long _catchall_filt # FilterFunction
.long LBB1_2 # ExceptionHandler
```
Clang is not yet producing function using SEH4, but it's a work in progress.
This patch is a step toward having a valid implementation of SEH4.
Unfortunately, it is not yet fully working. The EH registration block is not
allocated at the right offset on the stack.
Reviewers: rnk, majnemer
Subscribers: llvm-commits, chrisha
Differential Revision: http://reviews.llvm.org/D21231
llvm-svn: 273281
When you have a map holding a unique_ptr, hold a reference to the raw
pointer instead of the unique pointer. The unique_ptr will be moved on
rehash.
llvm-svn: 273268
Summary:
canCombineSinCosLibcall() would previously combine sin+cos into sincos for
GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath were set or not.
However, GNU would only combine them for UnsafeFPMath because sincos does not
set errno like sin and cos do. It seems likely that this is an oversight.
Reviewers: t.p.northover
Subscribers: t.p.northover, aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D21431
llvm-svn: 273259
Summary:
Using isOutOfOrder makes the code more clear.
Reviewers: rengolin, atrick, hfinkel.
Differential Revision: http://reviews.llvm.org/D21548
llvm-svn: 273255
An incomplete member pointer type will always have a size of zero, so we
don't need an extra flag. Credit to David Majnemer for the idea.
llvm-svn: 273057
Summary:
This seems like the least intrusive way to pass this information
through.
Fixes PR28151
Reviewers: majnemer, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21444
llvm-svn: 273053
Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1
or 2-byte. For those, you need to mask and shift the 1 or 2 byte values
appropriately to use the 4-byte instruction.
This change adds support for cmpxchg-based instruction sets (only SPARC,
in LLVM). The support can be extended for LL/SC-based PPC and MIPS in
the future, supplanting the ISel expansions those architectures
currently use.
Tests added for the IR transform and SPARCv9.
Differential Revision: http://reviews.llvm.org/D21029
llvm-svn: 273025
Names in function id records don't include nested name specifiers or
template arguments, but names in the symbol stream include both.
For the symbol stream, instead of having Clang put the fully qualified
name in the subprogram display name, recreate it from the subprogram
scope chain. For the type stream, take the unqualified name and chop of
any template arguments.
This makes it so that CodeView DI metadata is more similar to DWARF DI
metadata.
llvm-svn: 273009
This is a fix for PR27844.
When replacing uses of unsafe allocas, emit the new location
immediately after each use. Without this, the pointer stays live from
the function entry to the last use, while it's usually cheaper to
recalculate.
llvm-svn: 272969
When moving unsafe allocas to the unsafe stack, dbg.declare intrinsics are
updated to refer to the new location.
This change does the same to dbg.value intrinsics.
llvm-svn: 272968
MSVC handles enums differently from structs and classes: a forward
declaration is not emitted unconditionally. MSVC does not emit an S_UDT
record for the enum.
Differential Revision: http://reviews.llvm.org/D21442
llvm-svn: 272960
Summary:
... into getFrameIndexReferencePreferSP. This change folds the
fail-then-retry logic into getFrameIndexReferencePreferSP.
There is a non-functional but behaviorial change in WinException --
earlier if `getFrameIndexReferenceFromSP` failed we'd trip an assert,
but now we'll silently use the (wrong) offset from the base pointer. I
could not write the assert I'd like to write ("FrameReg ==
StackRegister", like I've done in X86FrameLowering) since there is no
easy way to get to the stack register from WinException (happy to be
proven wrong here). One solution to this is to add a `bool
OnlyStackPointer` parameter to `getFrameIndexReferenceFromSP` that
asserts if it could not satisfy its promise of returning an offset from
a stack pointer, but that seems overkill.
Reviewers: rnk
Subscribers: sanjoy, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D21427
llvm-svn: 272938
This allows better catching of compiler errors since we can use
the override keyword to verify that methods are actually
overridden.
Also in this patch I've changed from storing a boolean Error
code everywhere to returning an llvm::Error, to propagate richer
error information up the call stack.
Reviewed By: ruiu, rnk
Differential Revision: http://reviews.llvm.org/D21410
llvm-svn: 272926
When calculating a square root using Newton-Raphson with two constants,
a naive implementation is to use five multiplications (four muls to calculate
reciprocal square root and another one to calculate the square root itself).
However, after some reassociation and CSE the same result can be obtained
with only four multiplications. Unfortunately, there's no reliable way to do
such a reassociation in the back-end. So, the patch modifies NR code itself
so that it directly builds optimal code for SQRT and doesn't rely on any
further reassociation.
Patch by Nikolai Bozhenov!
Differential Revision: http://reviews.llvm.org/D21127
llvm-svn: 272920
Emit a S_UDT record for typedefs. We still need to do something for
class types.
Differential Revision: http://reviews.llvm.org/D21149
llvm-svn: 272813
[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.
Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.
This refixes PR9817 which was being incompletely checked in the
testsuite.
Reviewers: jyknight
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D21037
llvm-svn: 272792
Summary:
... when the offset is not statically known.
Prioritize addresses relative to the stack pointer in the stackmap, but
fallback gracefully to other modes of addressing if the offset to the
stack pointer is not a known constant.
Patch by Oscar Blumberg!
Reviewers: sanjoy
Subscribers: llvm-commits, majnemer, rnk, sanjoy, thanm
Differential Revision: http://reviews.llvm.org/D21259
llvm-svn: 272756
Document the new parameter and threshod computation
model. Also fix a bug when the threshold parameter
is set to be different from the default.
llvm-svn: 272749
Summary: With runtime profile, we have more confidence in branch probability, thus during basic block layout, we set a lower hot prob threshold so that blocks can be layouted optimally.
Reviewers: djasper, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20991
llvm-svn: 272729
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.
This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
the normal rule that the global must have a unique address can be broken without
being observable by the program by performing comparisons against the global's
address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
its own copy of the global if it requires one, and the copy in each linkage unit
must be the same)
- It is a constant or a function (which means that the program cannot observe that
the unique-address rule has been broken by writing to the global)
Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.
See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.
Part of the fix for PR27553.
Differential Revision: http://reviews.llvm.org/D20348
llvm-svn: 272709
Summary:
Split NumInstrDups statistic into separate added/removed counts to avoid
negative stat being printed as unsigned.
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D21335
llvm-svn: 272700
For <N x i32> type mul, pmuludq will be used for targets without SSE41, which
often introduces many extra pack and unpack instructions in vectorized loop
body because pmuludq generates <N/2 x i64> type value. However when the operands
of <N x i32> mul are extended from smaller size values like i8 and i16, the type
of mul may be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which
generates better code. For targets with SSE41, pmulld is supported so no
shrinking is needed.
Differential Revision: http://reviews.llvm.org/D20931
llvm-svn: 272694
Change EmitGlobalVariable to check final assembler section is in BSS
before using .lcomm/.comm directive. This prevents globals from being
put into .bss erroneously when -data-sections is used.
This fixes PR26570.
Reviewers: echristo, rafael
Subscribers: llvm-commits, mehdi_amini
Differential Revision: http://reviews.llvm.org/D21146
llvm-svn: 272674
The exit-on-error flag in the ARM test is necessary in order to avoid an
unreachable in the DAGTypeLegalizer, when trying to expand a physical register.
We can also avoid this situation by introducing a bitcast early on, where the
invalid scalar-to-vector conversion is detected.
We also add a test for PowerPC, which goes through a similar code path in the
SelectionDAGBuilder.
Fixes PR27765.
Differential Revision: http://reviews.llvm.org/D21061
llvm-svn: 272644
Save machine function pointer so that
the reference does not need to be passed around.
This also gives other methods access to machine
function for information such as entry count etc.
llvm-svn: 272594
This is third patch to clean up the code.
Included in this patch:
1. Further unclutter trace/chain formation main routine;
2. Isolate the logic to compute global cost/conflict detection
into its own method;
3. Heavily document the selection algorithm;
4. Added helper hook to allow PGO specific logic to be
added in the future.
llvm-svn: 272582