This adds the actual MachineLegalizeHelper to do the work and a trivial pass
wrapper that legalizes all instructions in a MachineFunction. Currently the
only transformation supported is splitting up a vector G_ADD into one acting on
smaller vectors.
llvm-svn: 276461
This provides a better layering of responsibilities among different
aspects of PDB writing code. Some of the MSF related code was
contained in CodeView, and some was in PDB prior to this. Further,
we were often saying PDB when we meant MSF, and the two are
actually independent of each other since in theory you can have
other types of data besides PDB data in an MSF. So, this patch
separates the MSF specific code into its own library, with no
dependencies on anything else, and DebugInfoCodeView and
DebugInfoPDB take dependencies on DebugInfoMsf.
llvm-svn: 276458
An extension of D19978, this patch replaces the default BITREVERSE evaluation of individual bit masks+shifts with block mask+shifts when we have integer elements of power-of-2 bits in size.
After calling BSWAP to reverse the order of the constituent bytes (which typically follows a similar approach), every neighbouring 4-bits, 2-bits and finally 1-bit pairs are masked off and swapped over with shifts.
In doing so we can significantly reduce the number of operations required.
Differential Revision: https://reviews.llvm.org/D21578
llvm-svn: 276432
When we failed to parse a function in the mir parser, we should abort
the whole compilation instead of continuing in a weird state. Indeed,
this was creating strange machine function passes failures that were
hard to understand, until we notice that the function actually did not
get parsed correctly!
llvm-svn: 276348
The clearance calculation did not take into account registers defined as outputs or clobbers in inline assembly machine instructions because these register defs are implicit.
Differential Revision: http://reviews.llvm.org/D22580
llvm-svn: 276266
This patch fixes a very subtle bug in regmask calculation. Thanks to zan
jyu Wong <zyfwong@gmail.com> for bringing this to notice.
For example if CL is only clobbered than CH should not be marked
clobbered but CX, RCX and ECX should be mark clobbered. Previously for
each modified register all of its aliases are marked clobbered by
markRegClobbred() in RegUsageInfoCollector.cpp but that is wrong because
when CL is clobbered then MRI::isPhysRegModified() will return true for
CL, CX, ECX, RCX which is correct behavior but then for CX, EXC, RCX we
mark CH also clobbered as CH is aliased to CX,ECX,RCX so
markRegClobbred() is not required because isPhysRegModified already take
cares of proper aliasing register. A very simple test case has been
added to verify this change.
Please find relevant bug report here :
http://llvm.org/PR28567
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: https://reviews.llvm.org/D22400
llvm-svn: 276235
This should be all the low-level instruction selection needs to determine how
to implement an operation, with the remaining context taken from the opcode
(e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math).
llvm-svn: 276158
Reverting this commit for now as it seems to be causing failures on
test-suite tests on the clang-ppc64le-linux-lnt bot.
This reverts commit r276044.
llvm-svn: 276068
Add a check that the layout predecessor of a block is an actual CFG
predecssor of the block as well. No current code fails this check, but
upcoming patches can trigger this, and it makes sense to separate it
out.
llvm-svn: 276066
canTailDuplicate accepts two blocks and returns true if the first can be
duplicated into the second successfully. Use this function to
encapsulate the heuristic.
llvm-svn: 276062
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.
This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.
Differential Revision: http://reviews.llvm.org/D21885
llvm-svn: 276044
This adds two pieces:
- RegisterScavenger:::enterBasicBlockEnd() which behaves similar to
enterBasicBlock() but starts tracking at the end of the basic block.
- A RegisterScavenger::backward() method. It is subtly different
from the existing unprocess() method which only considers uses with
the kill flag set: If a value is dead at the end of a basic block with
a last use inside the basic block, unprocess() will fail to mark it as
live. However we cannot change/fix this behaviour because unprocess()
needs to perform the exact reverse operation of forward().
Differential Revision: http://reviews.llvm.org/D21873
llvm-svn: 276043
Also verify that we never try to set the size of a vreg associated
to a register class.
Report an error when we encounter that in MIR. Fix a testcase that
hit that error and had a size for no reason.
llvm-svn: 276012
The following condition expression ( a >> n) & 1 is converted to "bt a, n" instruction. It works on all intel targets.
But on AVX-512 it was broken because the expression is modified to (truncate (a >>n) to i1).
I added the new sequence (truncate (a >>n) to i1) to the BT pattern.
Differential Revision: https://reviews.llvm.org/D22354
llvm-svn: 275950
Elsewhere (particularly computeKnownBits) we assume that a global will be
aligned to the value returned by Value::getPointerAlignment. This is used to
boost the alignment on memcpy/memset, so any target-specific request can only
increase that value.
llvm-svn: 275866
DAGTypeLegalizer::CanSkipSoftenFloatOperand should allow
SELECT op code for x86_64 fp128 type for MME targets,
so SoftenFloatOperand does not abort on SELECT op code.
Differential Revision: http://reviews.llvm.org/D21758
llvm-svn: 275818
When SelectionDAGISel transforms a node representing an inline asm
block, memory constraint information is not preserved. This can cause
constraints to be broken when a memory offset is of the form:
offset + frame index
when the frame is resolved.
By propagating the constraints all the way to the backend, targets can
enforce memory operands of inline assembly to conform to their constraints.
For MIPSR6, some instructions had their offsets reduced to 9 bits from
16 bits such as ll/sc. This becomes problematic when using inline assembly
to perform atomic operations, as an offset can generated that is too big to
encode in the instruction.
Reviewers: dsanders, vkalintris
Differential Review: https://reviews.llvm.org/D21615
llvm-svn: 275786
Previously, we would expand:
%BL<def> = COPY %DL<kill>, %EBX<imp-use,kill>, %EBX<imp-def>
Into:
%BL<def> = MOV8rr %DL<kill>, %EBX<imp-def>
Dropping the imp-use on the floor.
That confused CriticalAntiDepBreaker, which (correctly) assumes that if an
instruction defs but doesn't use a register, that register is dead immediately
before the instruction - while in this case, the high lanes of EBX can be very
much alive.
This fixes PR28560.
Differential Revision: https://reviews.llvm.org/D22425
llvm-svn: 275634
Remove unnecessary clutter in assembly output. When using SjLj EH, the CFI is
not actually used for anything. Do not emit the CFI needlessly. The minor test
adjustments are interesting. The prologue test was just overzealous matcching.
The interesting case is the LSDA change. It was originally added to ensure that
various compilations did not mangle the name (it explicitly checked the name!).
However, subsequent cleanups made it more reliant on the CFI to find the name.
Parse the generated code flow to generically find the label still.
llvm-svn: 275614
Summary:
Instead, we take a single flags arg (a bitset).
Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.
This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted. It also greatly simplifies the process of adding another flag
to getLoad.
Reviewers: chandlerc, tstellarAMD
Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits
Differential Revision: http://reviews.llvm.org/D22249
llvm-svn: 275592
Summary:
Previously we took an unsigned.
Hooray for type-safety.
Reviewers: chandlerc
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D22282
llvm-svn: 275591
For a fully inlined call chain like a -> b -> c -> d, we were emitting
line info for 'd' 3 separate times: once for d's actual InlineSite line
table, and twice for 'b' and 'c'. This is particularly inefficient when
all these functions are in different headers, because now we need to
encode the file change. Windbg was coping with our suboptimal output, so
this should not be noticeable from the debugger.
llvm-svn: 275502
Summary:
Make the target-specific flags in MachineMemOperand::Flags real, bona
fide enum values. This simplifies users, prevents various constants
from going out of sync, and avoids the false sense of security provided
by declaring static members in classes and then forgetting to define
them inside of cpp files.
Reviewers: MatzeB
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22372
llvm-svn: 275451
Summary:
- Give it a shorter name (because we're going to refer to it often from
SelectionDAG and friends).
- Split the flags and alignment into separate variables.
- Specialize FlagsEnumTraits for it, so we can do bitwise ops on it
without losing type information.
- Make some enum values constants in MachineMemOperand instead.
MOMaxBits should not be a valid Flag.
- Simplify some of the bitwise ops for dealing with Flags.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22281
llvm-svn: 275438
Summary:
In this patch we implement the following parts of XRay:
- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.
There are some caveats here:
1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet.
2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library.
Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk
Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D19904
llvm-svn: 275367
Avoid exposing a cl::opt in a public header and instead promote this
option in the API.
Alternatively, we could land the cl::opt in CommandFlags.h so that
it is available to every tool, but we would still have to find an
option for clang.
llvm-svn: 275348
IPRA try to optimize caller saved register by propagating register
usage information from callee to caller so it is beneficial to have
caller saved registers compare to callee saved registers when IPRA
is enabled. Please find more detailed explanation here
https://groups.google.com/d/msg/llvm-dev/XRzGhJ9wtZg/tjAJqb0eEgAJ.
This change makes local function do not have any callee preserved
register when IPRA is enabled. A simple test case is also added to
verify this change.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D21561
llvm-svn: 275347
Code cleanup: Move references to SlotMapping and SourceMgr into the
PerFunctionMIParsingState to avoid unnecessary passing around in
parameters.
llvm-svn: 275342
Summary:
Previously it would say we had an invariant load if any of the memory
operands were invariant. But the load should be invariant only if *all*
the memory operands are invariant.
No testcase because this has proven to be very difficult to tickle in
practice. As just one example, ARM's ldrd instruction, which loads 64
bits into two 32-bit regs, is theoretically affected by this. But when
it's produced, it loses its memoperands' invariance bits!
Reviewers: jfb
Subscribers: llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D22318
llvm-svn: 275331
Code cleanup: The PerFunctionMIParsingState is per function, moving a
reference into PFS we can avoid passing around the MachineFunction in an
extra parameter most of the time.
Also change most signatures to consistently pass PFS reference first.
llvm-svn: 275329
Currently the MIR framework prints all its outputs (errors and actual
representation) on stderr.
This patch fixes that by printing the regular output in the output
specified with -o.
Differential Revision: http://reviews.llvm.org/D22251
llvm-svn: 275314
We can freeze the registers after the MachineFrameInfo has been configured (by
telling it about calls, inline asm, ...). This doesn't happen at all yet, but
will be part of IR translation.
Fixes -verify-machineinstrs assertion.
llvm-svn: 275221
Use LivePhysRegs with a backwards walking algorithm to update live in
lists, this way the results do not depend on the presence of kill flags
anymore.
This patch also reduces the number of registers added as live-in.
Previously all pristine registers as well as all sub registers of a
super register were added resulting in unnecessarily large live in
lists. This fixed https://llvm.org/PR25263.
Differential Revision: http://reviews.llvm.org/D22027
llvm-svn: 275201
Added support for:
1. Multi dimension array.
2. Array of structure type, which previously was declared incompletely.
3. Dynamic size array.
4. Array where element type is a typedef, volatile or constant (this should resolve PR28311).
Differential Revision: http://reviews.llvm.org/D21526
llvm-svn: 275167
Blocks to be tail-merged may share more than one successor. Correct the
comment to state that they share a specific successor, SuccBB, rather
than a single successor, which is not true.
llvm-svn: 275104
Preserve assembly comments from input in output assembly and flags to
toggle property. This is on by default for inline assembly and off in
llvm-mc.
Parsed comments are emitted immediately before an EOL which generally
places them on the expected line.
Reviewers: rtrieu, dwmw2, rnk, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20020
llvm-svn: 275058
Drive-by improvement: We would 1) add CSRs, 2) remove callee saved CSRs
and 3) add all CSRs again for the return block. Just adding CSRs once
obviously gives the same results.
llvm-svn: 274955
An identity COPY like this:
%AL = COPY %AL, %EAX<imp-def>
has no semantic effect, but encodes liveness information: Further users
of %EAX only depend on this instruction even though it does not define
the full register.
Replace the COPY with a KILL instruction in those cases to maintain this
liveness information. (This reverts a small part of r238588 but this
time adds a comment explaining why a KILL instruction is useful).
llvm-svn: 274952
Because isReallyTriviallyReMaterializableGeneric puts many limits on
rematerializable instructions, this fix can prevent instructions with
tied virtual operands and instructions with virtual register uses from
being kept in DeadRemat, so as to workaround the live interval consistency
problem for the dummy instructions kept in DeadRemat.
But we still need to fix the live interval consistency problem. This patch
is just a short time relieve. PR28464 has been filed as a reminder.
Differential Revision: http://reviews.llvm.org/D19486
llvm-svn: 274928
Mostly through preferring MachineInstr&, avoid implicit conversions from
iterator to pointer.
Although this may bitrot (since there are other uses blocking me from
removing the implicit operator), this removes the last of the implicit
conversions from MachineInstrBundleIterator to MachineInstr* in the
LLVMCodeGen build target.
llvm-svn: 274893
The createRegAllocPass reads and writes to a global variable 'Registry'
via calls to getDefault and setDefault. Run this under a call_once to
avoid races.
llvm-svn: 274875
As a result, the urem instruction will not be expanded to a sequence of umull,
lsrs, muls and sub instructions, but just a call to __aeabi_uidivmod.
Differential Revision: http://reviews.llvm.org/D22131
llvm-svn: 274843
findScratchNonCalleeSaveRegister() just needs a simple liveness
analysis, use LivePhysRegs for that as it is simpler and does not depend
on the kill flags.
This commit adds a convenience function available() to LivePhysRegs:
This function returns true if the given register is not reserved and
neither the register nor any of its aliases are alive.
Differential Revision: http://reviews.llvm.org/D21865
llvm-svn: 274685
Now with a corrected test to account for a recently supported properties bit in the debug info of a struct.
Original review: http://reviews.llvm.org/D21939
This reverts commit 970c3fd497a28d25dd69526eb52594a696c37968.
llvm-svn: 274661
This logic was introduced in r157663 and does not make any sense to me.
The motivating example in rdar://11538365 looks like this:
This is the tail:
BB#16: derived from LLVM BB %if.end68
Live Ins: %R0 %R4 %R5
Predecessors according to CFG: BB#15 BB#5
tBLXi pred:14, pred:%noreg, <ga:@CFRelease>, %R0<kill>, <regmask>, %LR<imp-def,dead>, %SP<imp-use>, %SP<imp-def>
t2B <BB#20>, pred:14, pred:%noreg
Successors according to CFG: BB#20
This is the predBB:
BB#5:
Live Ins: %R5
Predecessors according to CFG: BB#4
%R4<def> = t2MOVi 0, pred:14, pred:%noreg, opt:%noreg
t2B <BB#16>, pred:14, pred:%noreg
Successors according to CFG: BB#16
However this is invalid machine code to begin with, if %R0 is live-in to
BB#16 then it must be live-in to BB#5 as well if BB#5 does not define
it. We should not need logic to retroactively fix broken machine code
and in fact the example from r157663 passes cleanly with the code
removed and I do not see any (newly) failing tests with the machine
verifier enabled.
Differential Revision: http://reviews.llvm.org/D22031
llvm-svn: 274655
Summary:
findBetterNeighborChains may or may not find a better chain for each node it finds, which include the node ("St") that visitSTORE is currently processing. If no better chain is found for St, visitSTORE should continue instead of return SDValue(St, 0), as if it's CombinedTo'ed.
This fixes bug 28130. There might be other ways to make the test pass (see D21409). I think both of the patches are fixing actual bugs revealed by the same testcase.
Reviewers: echristo, wschmidt, hfinkel, kbarton, amehsan, arsenm, nemanjai, bogner
Subscribers: mehdi_amini, nemanjai, llvm-commits
Differential Revision: http://reviews.llvm.org/D21692
llvm-svn: 274644
StratifiedSets (as implemented) is very fast, but its accuracy is also
limited. If we take a more aggressive andersens-like approach, we can be
way more accurate, but we'll also end up being slower.
So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA.
Long-term, we want to end up in a place where CFLSteens is queried
first; if it can provide an answer, great (since queries are basically
map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc.
This patch splits everything out so we can try to do something like
that when we get a reasonable CFLAnders implementation.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21910
llvm-svn: 274589
This reverts commit r259387 because it inserts illegal code after legalization
in some backends where i64 OR type is illegal for example.
llvm-svn: 274573
We can now handle concatenation of each source multiple times. The previous code just checked for each source to appear once in either order.
This also now handles an entire source vector sized piece having undef indices correctly. We now concat with UNDEF instead of using one of the sources. This is responsible for the test case change.
llvm-svn: 274483
After the block placement, if a block ends with a conditional branch, but the
next block is not its successor. The conditional branch should be changed to
unconditional branch. This patch fixes PR28307, PR28297, PR28402.
Differential Revision: http://reviews.llvm.org/D21811
llvm-svn: 274470
Given something like:
struct S {
int a;
struct { int b; };
};
We would fail to give 'b' offset 4. Instead, we would give it the
offset it has inside of it's struct.
llvm-svn: 274400
A namespace without a name should be written out as `anonymous
namespace' while a tag type without a name should be written out as
<unnamed-tag>.
llvm-svn: 274399
MSVC makes up names for these anonymous structs, but we don't (yet).
Eventually Clang should use getTypedefNameForAnonDecl() to put some name
in the debug info, and we can update the test case when that happens.
llvm-svn: 274391
We were asserting that our type records were valid when emitting
assembly, but not when emitting an object file.
I've been seeing lots of LNK1285 errors (corrupt PDB) during incremental
debug self-host builds with the MSVC linker, and hopefully this will
catch some of them earlier.
llvm-svn: 274373
Use MachineInstr& to avoid implicit conversions from
MachineBasicBlock::iterator to MachineInstr*. In one case, this could
use a range-based for loop, but the other loops iterated in reverse
order.
One of the reverse-loops checked the MachineInstr* for nullptr, a
condition that is provably unreachable. (And even if my proof has a
flaw, UBSan would catch the bug.)
llvm-svn: 274360
Use MachineInstr& instead of MachineInstr* in RegAllocFast to avoid
implicit conversions from MachineInstrBundleIterator. RAFast::spillAll
and RAFast::spillVirtReg still take iterators, since their argument may
be an end iterator from MachineBasicBlock::getFirstTerminator.
llvm-svn: 274353
For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array.
llvm-svn: 274337
Group" sections while lowering. In particular, for ELF sections this is
useful for creating function-specific groups that get merged into the
same named section.
Also use const Twine& instead of StringRef for the getELF functions
while we're here.
Differential Revision: http://reviews.llvm.org/D21743
llvm-svn: 274336
Summary:
This represents the adjustment applied to the implicit 'this' parameter
in the prologue of a virtual method in the MS C++ ABI. The adjustment is
always zero unless multiple inheritance is involved.
This increases the size of DISubprogram by 8 bytes, unfortunately. The
adjustment really is a signed 32-bit integer. If this size increase is
too much, we could probably win it back by splitting out a subclass with
info specific to virtual methods (virtuality, vindex, thisadjustment,
containingType).
Reviewers: aprantl, dexonsmith
Subscribers: aaboud, amccarth, llvm-commits
Differential Revision: http://reviews.llvm.org/D21614
llvm-svn: 274325
Change all the methods in LiveVariables that expect non-null
MachineInstr* to take MachineInstr& and update the call sites. This
clarifies the API, and designs away a class of iterator to pointer
implicit conversions.
llvm-svn: 274319
Convert a loop to a range-based for, using MachineInstr& instead of
MachineInstr* and removing an implicit conversion from iterator to
pointer.
llvm-svn: 274311
Use MachineInstr& over MachineInstr* to avoid implicit iterator to
pointer conversions. MachineInstr*-as-nullptr was being used as a flag
for whether the for loop terminated normally; I added an explicit `bool`
instead.
llvm-svn: 274310
TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr*
arguments (begin and end) that invite implicit conversions from
MachineInstrBundleIterator. One option would be to change their type to
an iterator, but since they don't seem to have been used since the API
was added in 2010, I'm deleting the dead code.
llvm-svn: 274304
Push MachineInstr& through helper APIs for consistency. This doesn't
remove any more implicit conversions, but it's a nice cleanup after
r274300.
llvm-svn: 274301
Switch to a range-based for in IfConverter::PredicateBlock and take
MachineInstr& in MaySpeculate to avoid an implicit conversion from
MachineBasicBlock::iterator to MachineInstr*.
llvm-svn: 274290
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr. In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
llvm-svn: 274287
Summary:
MSVC provide exception handlers with enhanced information to deal with security buffer feature (/GS).
To be more secure, the security cookies (GS and SEH) are validated when unwinding the stack.
The following code:
```
void f() {}
void foo() {
__try {
f();
} __except(1) {
f();
}
}
```
Reviewers: majnemer, rnk
Subscribers: thakis, llvm-commits, chrisha
Differential Revision: http://reviews.llvm.org/D21101
llvm-svn: 274239
CodeView need to know the offset of the storage allocation for a
bitfield. Encode this via the "extraData" field in DIDerivedType and
introduced a new flag, DIFlagBitField, to indicate whether or not a
member is a bitfield.
This fixes PR28162.
Differential Revision: http://reviews.llvm.org/D21782
llvm-svn: 274200
- Use range based for loops
- No need for some !Reg checks: isPhysicalRegister() reports false for
NoRegister anyway
- Do not repeat function name in documentation comment.
- Do not repeat documentation comment in implementation when we already
have one at the declaration.
- Factor some common subexpressions out.
- Change file comments to use doxygen syntax.
llvm-svn: 274194
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr. This is a
general API improvement.
Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other. Instead I've done everything as a block and just
updated what was necessary.
This is mostly mechanical fixes: adding and removing `*` and `&`
operators. The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy. I couldn't run tests
for AVR since llc doesn't link with it turned on.
llvm-svn: 274189
- Use range based for
- Use the more common variable names MBB and MF for
MachineBasicBlock/MachineFunction variables.
- Add a few const modifiers
llvm-svn: 274187
This is a fix for PR27842.
An IR-level implementation of stack coloring tailored to work with
SafeStack. It is a bit weaker than the MI implementation in that it
does not the "lifetime start at first access" logic. This can be
improved in the future.
This patch also replaces the naive implementation of stack frame
layout with a greedy algorithm that can split existing stack slots
and even fit small objects inside the alignment padding of other
objects.
llvm-svn: 274162
I think this converts all the simple cases that really just care about
the generated code being position independent or not. The remaining
uses are a bit more complicated and are checking things like "is this
a library or executable" or "can this symbol be preempted".
llvm-svn: 274055
This patch enhances dot graph viewer to show hot regions
with hot bbs/edges displayed in red. The ratio of the bb
freq to the max freq of the function needs to be no less
than the value specified by view-hot-freq-percent option.
The default value is 10 (i.e. 10%).
llvm-svn: 273996
MBFI supports profile count dumping and function
name based filtering. Add these two feature to
BFI as well. The filtering option is shared between
BFI and MBFI: -view-bfi-func-name=..
llvm-svn: 273992
BFI and MBFI's dot traits class share most of the
code and all future enhancement. This patch extracts
common implementation into base class BFIDOTGraphTraitsBase.
This patch also enables BFI graph to show branch probability
on edges as MBFI does before.
llvm-svn: 273990
The main issue here is that the "thumb" flag wasn't set for some of these
sections, making MSVC's link.exe fails to correctly relocate code
against the symbols inside these sections. link.exe could fail for
instance with the "fixup is not aligned for target 'XX'" error. If
linking doesn't fail, the relocation process goes wrong in the end and
invalid code is generated by the linker.
This patch adds Thumb/ARM information so that the right flags are set
on COFF/Windows.
Patch by Adrien Guinet.
llvm-svn: 273880
SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.
While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.
There are much better tools than llvm.trap: UBSan and ASan.
N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...
llvm-svn: 273778
Remember the last choice for the top/bottom scheduling boundary in
bidirectional scheduling mode. The top choice should not change if we
schedule at the bottom and vice versa.
This allows us to improve compiletime: We only recalculate the best pick
for one border and re-use the cached top-pick from the other border.
Differential Revision: http://reviews.llvm.org/D19350
llvm-svn: 273766
In bidirectional scheduling this gives more stable results than just
comparing the "reason" fields of the top/bottom node because the reason
field may be higher depending on what other nodes are in the queue.
Differential Revision: http://reviews.llvm.org/D19401
llvm-svn: 273755
This fixes an embarrassing bug when emitting .debug_loc entries for 64-bit+ constants,
which were previously silently truncated to 32 bits.
<rdar://problem/26843232>
llvm-svn: 273736
Tail merge was making the assumption that a layout successor or
predecessor was always a cfg successor/predecessor. Remove that
assumption. Changes to tests are necessary because the errant cfg edges
were preventing optimizations.
llvm-svn: 273700
Clang emits them in reverse order to conform to the ABI, which requires
left-to-right destruction. As a result, the order doesn't fall out
naturally, and we have to sort things out in the backend.
Fixes PR28213
llvm-svn: 273696
There are two remaining issues here:
1. No vbptr information
2. Need to mention indirect virtual bases
Getting indirect virtual bases is just a matter of adding an "indirect"
flag, emitting them in the frontend, and ignoring them when appropriate
for DWARF.
All virtual bases use the same artificial vbptr field, so I think the
vbptr offset will be best represented by an implicit __vbptr$ClassName
member similar to our existing __vptr$ member.
llvm-svn: 273688
When considering whether to split an instruction with a memory operand
into an explicit load and a register-based instruction, we currently
check that the resulting instruction has exactly 1 def. This prevents 2
important LICM optimizations: compares with memory operands, and double
indirect calls. All the tests and the test-suite pass without the check.
My guess as to original intent is to limit the additional register pressure
created by the new instruction, but given that we only split out a single
register, it is already limited.
The licm-dominance test now checks actual memory loads for hoisting instead of
undef, and it tests compares.
hoist-invariant-load.ll now checks for 2 hoists, the intended hoist, and a bonus
from calling a got-relative function in a loop.
llvm-svn: 273616
Recommiting after correcting over-eager Debug Value transfer fixing PR28270.
[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.
Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.
This refixes PR9817 which was being incompletely checked in the
testsuite.
Reviewers: jyknight
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D21037
llvm-svn: 273585
IfConversion used to always add the undef flag when adding a use operand
on a newly predicated instruction. This would be an operand for the register
being conditionally redefined. Due to the undef flag, the liveness of this
register prior to the predicated instruction would get lost.
This patch changes this so that such use operands are added only when the
register is live, without the undef flag.
Reviewed by Quentin Colombet.
http://reviews.llvm.org/D209077
llvm-svn: 273545
When trying to convert a loading instruction into a FAULTING_LOAD, we
sometimes face code like this:
if %R10 is not null:
%R9<def> = MOV32ri Immediate
%R9<def, tied> = AND32rm %R9, 0x20(%R10)
else:
goto TRAP
In these cases we would like to use the AND32rm instruction as the
faulting operation by hoisting the "depedency" def-ing %R9 also above
the control flow, transforming the program into:
%R9<def> = MOV32ri Immediate
%R9<def, tied> = FAULTING_LOAD_OP(AND32rm %R9, 0x20(%R10), FailPath: TRAP)
This change teaches ImplicitNullChecks to do the above, when safe.
llvm-svn: 273501
This is a convenience iterator that allows clients to enumerate the
GlobalObjects within a Module.
Also start using it in a few places where it is obviously the right thing
to use.
Differential Revision: http://reviews.llvm.org/D21580
llvm-svn: 273470
-view-machine-block-freq-propagation-dags currently
support integer and fraction as the suboptions. This
patch adds the 'count' suboption to display actual
profile count if available.
llvm-svn: 273460
Recommiting after fixing over-aggressive assertion
[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.
Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.
This refixes PR9817 which was being incompletely checked in the
testsuite.
Reviewers: jyknight
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D21037
llvm-svn: 273456
CodeView needs to know if a virtual method was introduced in the current
class, and base classes may not have complete type information, so we
need to thread this bit through from the frontend.
llvm-svn: 273453
This is the motivating example:
struct B { int b; };
struct A { B *b; };
int f(A *p) { return p->b->b; }
Clang emits complete types for both A and B because they are required to
be complete, but our CodeView emission would only emit forward
declarations of A and B. This was a consequence of the fact that the A*
type must reference the forward declaration of A, which doesn't
reference B at all.
We can't eagerly emit complete definitions of A and B when we request
the forward declaration's type index because of recursive types like
linked lists. If we did that, our stack usage could get out of hand, and
it would be possible to lower a type while attempting to lower a type,
and we would need to double check if our type is already present in the
TypeIndexMap after all recursive getTypeIndex calls.
Instead, defer complete type emission until after all type lowering has
completed. This ensures that all referenced complete types are emitted,
and that type lowering is not re-entrant.
llvm-svn: 273443
From a design perspective, complete record type emission should not
depend on information from other complete record types.
Currently this map is unused, and needlessly accumulates data throughout
compilation.
llvm-svn: 273431
The setCallee function will set the number of fixed arguments based
on the size of the argument list. The FixedArgs parameter was often
explicitly set to 0, leading to a lack of consistent value for non-
vararg functions.
Differential Revision: http://reviews.llvm.org/D20376
llvm-svn: 273403
We now include namespace scope info in LF_FUNC_ID records and we emit
LF_MFUNC_ID records for member functions as we should.
Class names are now fully qualified, which is what MSVC does.
Add a little bit of scaffolding to handle ThisAdjustment when it arrives
in DISubprogram.
llvm-svn: 273358
Summary:
Fix the computation of the offsets present in the scopetable when using the
SEH (__except_handler4).
This patch added an intrinsic to track the position of the allocation on the
stack of the EHGuard. This position is needed when producing the ScopeTable.
```
struct _EH4_SCOPETABLE {
DWORD GSCookieOffset;
DWORD GSCookieXOROffset;
DWORD EHCookieOffset;
DWORD EHCookieXOROffset;
_EH4_SCOPETABLE_RECORD ScopeRecord[1];
};
struct _EH4_SCOPETABLE_RECORD {
DWORD EnclosingLevel;
long (*FilterFunc)();
union {
void (*HandlerAddress)();
void (*FinallyFunc)();
};
};
```
The code to generate the EHCookie is added in `X86WinEHState.cpp`.
Which is adding these instructions when using SEH4.
```
Lfunc_begin0:
# BB#0: # %entry
pushl %ebp
movl %esp, %ebp
pushl %ebx
pushl %edi
pushl %esi
subl $28, %esp
movl %ebp, %eax <<-- Loading FramePtr
movl %esp, -36(%ebp)
movl $-2, -16(%ebp)
movl $L__ehtable$use_except_handler4_ssp, %ecx
xorl ___security_cookie, %ecx
movl %ecx, -20(%ebp)
xorl ___security_cookie, %eax <<-- XOR FramePtr and Cookie
movl %eax, -40(%ebp) <<-- Storing EHGuard
leal -28(%ebp), %eax
movl $__except_handler4, -24(%ebp)
movl %fs:0, %ecx
movl %ecx, -28(%ebp)
movl %eax, %fs:0
movl $0, -16(%ebp)
calll _may_throw_or_crash
LBB1_1: # %cont
movl -28(%ebp), %eax
movl %eax, %fs:0
addl $28, %esp
popl %esi
popl %edi
popl %ebx
popl %ebp
retl
```
And the corresponding offset is computed:
```
Luse_except_handler4_ssp$parent_frame_offset = -36
.p2align 2
L__ehtable$use_except_handler4_ssp:
.long -2 # GSCookieOffset
.long 0 # GSCookieXOROffset
.long -40 # EHCookieOffset <<----
.long 0 # EHCookieXOROffset
.long -2 # ToState
.long _catchall_filt # FilterFunction
.long LBB1_2 # ExceptionHandler
```
Clang is not yet producing function using SEH4, but it's a work in progress.
This patch is a step toward having a valid implementation of SEH4.
Unfortunately, it is not yet fully working. The EH registration block is not
allocated at the right offset on the stack.
Reviewers: rnk, majnemer
Subscribers: llvm-commits, chrisha
Differential Revision: http://reviews.llvm.org/D21231
llvm-svn: 273281
When you have a map holding a unique_ptr, hold a reference to the raw
pointer instead of the unique pointer. The unique_ptr will be moved on
rehash.
llvm-svn: 273268
Summary:
canCombineSinCosLibcall() would previously combine sin+cos into sincos for
GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath were set or not.
However, GNU would only combine them for UnsafeFPMath because sincos does not
set errno like sin and cos do. It seems likely that this is an oversight.
Reviewers: t.p.northover
Subscribers: t.p.northover, aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D21431
llvm-svn: 273259
Summary:
Using isOutOfOrder makes the code more clear.
Reviewers: rengolin, atrick, hfinkel.
Differential Revision: http://reviews.llvm.org/D21548
llvm-svn: 273255
An incomplete member pointer type will always have a size of zero, so we
don't need an extra flag. Credit to David Majnemer for the idea.
llvm-svn: 273057
Summary:
This seems like the least intrusive way to pass this information
through.
Fixes PR28151
Reviewers: majnemer, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21444
llvm-svn: 273053
Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1
or 2-byte. For those, you need to mask and shift the 1 or 2 byte values
appropriately to use the 4-byte instruction.
This change adds support for cmpxchg-based instruction sets (only SPARC,
in LLVM). The support can be extended for LL/SC-based PPC and MIPS in
the future, supplanting the ISel expansions those architectures
currently use.
Tests added for the IR transform and SPARCv9.
Differential Revision: http://reviews.llvm.org/D21029
llvm-svn: 273025
Names in function id records don't include nested name specifiers or
template arguments, but names in the symbol stream include both.
For the symbol stream, instead of having Clang put the fully qualified
name in the subprogram display name, recreate it from the subprogram
scope chain. For the type stream, take the unqualified name and chop of
any template arguments.
This makes it so that CodeView DI metadata is more similar to DWARF DI
metadata.
llvm-svn: 273009
This is a fix for PR27844.
When replacing uses of unsafe allocas, emit the new location
immediately after each use. Without this, the pointer stays live from
the function entry to the last use, while it's usually cheaper to
recalculate.
llvm-svn: 272969
When moving unsafe allocas to the unsafe stack, dbg.declare intrinsics are
updated to refer to the new location.
This change does the same to dbg.value intrinsics.
llvm-svn: 272968
MSVC handles enums differently from structs and classes: a forward
declaration is not emitted unconditionally. MSVC does not emit an S_UDT
record for the enum.
Differential Revision: http://reviews.llvm.org/D21442
llvm-svn: 272960
Summary:
... into getFrameIndexReferencePreferSP. This change folds the
fail-then-retry logic into getFrameIndexReferencePreferSP.
There is a non-functional but behaviorial change in WinException --
earlier if `getFrameIndexReferenceFromSP` failed we'd trip an assert,
but now we'll silently use the (wrong) offset from the base pointer. I
could not write the assert I'd like to write ("FrameReg ==
StackRegister", like I've done in X86FrameLowering) since there is no
easy way to get to the stack register from WinException (happy to be
proven wrong here). One solution to this is to add a `bool
OnlyStackPointer` parameter to `getFrameIndexReferenceFromSP` that
asserts if it could not satisfy its promise of returning an offset from
a stack pointer, but that seems overkill.
Reviewers: rnk
Subscribers: sanjoy, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D21427
llvm-svn: 272938
This allows better catching of compiler errors since we can use
the override keyword to verify that methods are actually
overridden.
Also in this patch I've changed from storing a boolean Error
code everywhere to returning an llvm::Error, to propagate richer
error information up the call stack.
Reviewed By: ruiu, rnk
Differential Revision: http://reviews.llvm.org/D21410
llvm-svn: 272926
When calculating a square root using Newton-Raphson with two constants,
a naive implementation is to use five multiplications (four muls to calculate
reciprocal square root and another one to calculate the square root itself).
However, after some reassociation and CSE the same result can be obtained
with only four multiplications. Unfortunately, there's no reliable way to do
such a reassociation in the back-end. So, the patch modifies NR code itself
so that it directly builds optimal code for SQRT and doesn't rely on any
further reassociation.
Patch by Nikolai Bozhenov!
Differential Revision: http://reviews.llvm.org/D21127
llvm-svn: 272920
Emit a S_UDT record for typedefs. We still need to do something for
class types.
Differential Revision: http://reviews.llvm.org/D21149
llvm-svn: 272813
[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.
Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.
This refixes PR9817 which was being incompletely checked in the
testsuite.
Reviewers: jyknight
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D21037
llvm-svn: 272792
Summary:
... when the offset is not statically known.
Prioritize addresses relative to the stack pointer in the stackmap, but
fallback gracefully to other modes of addressing if the offset to the
stack pointer is not a known constant.
Patch by Oscar Blumberg!
Reviewers: sanjoy
Subscribers: llvm-commits, majnemer, rnk, sanjoy, thanm
Differential Revision: http://reviews.llvm.org/D21259
llvm-svn: 272756
Document the new parameter and threshod computation
model. Also fix a bug when the threshold parameter
is set to be different from the default.
llvm-svn: 272749
Summary: With runtime profile, we have more confidence in branch probability, thus during basic block layout, we set a lower hot prob threshold so that blocks can be layouted optimally.
Reviewers: djasper, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20991
llvm-svn: 272729
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.
This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
the normal rule that the global must have a unique address can be broken without
being observable by the program by performing comparisons against the global's
address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
its own copy of the global if it requires one, and the copy in each linkage unit
must be the same)
- It is a constant or a function (which means that the program cannot observe that
the unique-address rule has been broken by writing to the global)
Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.
See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.
Part of the fix for PR27553.
Differential Revision: http://reviews.llvm.org/D20348
llvm-svn: 272709
Summary:
Split NumInstrDups statistic into separate added/removed counts to avoid
negative stat being printed as unsigned.
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D21335
llvm-svn: 272700
For <N x i32> type mul, pmuludq will be used for targets without SSE41, which
often introduces many extra pack and unpack instructions in vectorized loop
body because pmuludq generates <N/2 x i64> type value. However when the operands
of <N x i32> mul are extended from smaller size values like i8 and i16, the type
of mul may be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which
generates better code. For targets with SSE41, pmulld is supported so no
shrinking is needed.
Differential Revision: http://reviews.llvm.org/D20931
llvm-svn: 272694
Change EmitGlobalVariable to check final assembler section is in BSS
before using .lcomm/.comm directive. This prevents globals from being
put into .bss erroneously when -data-sections is used.
This fixes PR26570.
Reviewers: echristo, rafael
Subscribers: llvm-commits, mehdi_amini
Differential Revision: http://reviews.llvm.org/D21146
llvm-svn: 272674
The exit-on-error flag in the ARM test is necessary in order to avoid an
unreachable in the DAGTypeLegalizer, when trying to expand a physical register.
We can also avoid this situation by introducing a bitcast early on, where the
invalid scalar-to-vector conversion is detected.
We also add a test for PowerPC, which goes through a similar code path in the
SelectionDAGBuilder.
Fixes PR27765.
Differential Revision: http://reviews.llvm.org/D21061
llvm-svn: 272644
Save machine function pointer so that
the reference does not need to be passed around.
This also gives other methods access to machine
function for information such as entry count etc.
llvm-svn: 272594
This is third patch to clean up the code.
Included in this patch:
1. Further unclutter trace/chain formation main routine;
2. Isolate the logic to compute global cost/conflict detection
into its own method;
3. Heavily document the selection algorithm;
4. Added helper hook to allow PGO specific logic to be
added in the future.
llvm-svn: 272582
This is second patch to clean up the code.
In this patch, the logic to determine block outlinining
is refactored and more comments are added.
llvm-svn: 272514
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
llvm-svn: 272512
This is one of the patches to clean up the code so that
it is in a better form to make future enhancements easier.
In htis patch, the logic to collect viable successors are
extrated as a helper to unclutter the caller which gets very
large recenty. Also cleaned up BP adjustment code.
llvm-svn: 272482
undef uses are no real uses of a register and must be ignored by
findLastUseBefore() so that handleMove() does not produce invalid live
intervals in some cases.
This fixed http://llvm.org/PR28083
llvm-svn: 272446
Summary:
When stack-protection is activated and WinEH exceptions is used,
the EHRegNode (exception handling registration) is allocated twice on the stack.
This was not breaking anything except loosing space on the stack.
```
D:\src\llvm\examples>llc exc2.ll -debug-only=pei
alloc FI(0) at SP[-24]
alloc FI(1) at SP[-48] <<-- Allocated
alloc FI(1) at SP[-72] <<-- Allocated twice!?
alloc FI(2) at SP[-76]
alloc FI(4) at SP[-80]
alloc FI(3) at SP[-84]
```
Reviewers: rnk, majnemer
Subscribers: chrisha, llvm-commits
Differential Revision: http://reviews.llvm.org/D21188
llvm-svn: 272426
Adds a MachineFunctionPass that scans the body to find calls, and
update the register mask with the one saved by the
RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D21180
llvm-svn: 272414
Add an option to enable the analysis of MachineFunction register
usage to extract the list of clobbered registers.
When enabled, the CodeGen order is changed to be bottom up on the Call
Graph.
The analysis is split in two parts, RegUsageInfoCollector is the
MachineFunction Pass that runs post-RA and collect the list of
clobbered registers to produce a register mask.
An immutable pass, RegisterUsageInfo, stores the RegMask produced by
RegUsageInfoCollector, and keep them available. A future tranformation
pass will use this information to update every call-sites after
instruction selection.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D20769
llvm-svn: 272403
When we delete a live-range, we check if that live-range is the origin of others
to keep it around for rematerialization. For that we check that the instruction
we are about to remove is the same as the definition of the VNI of the original
live-range.
If this is the case, we just shrink the live-range to an empty one.
Now, when we try to delete one of the children of such live-range (product of
splitting), we do the same check.
However, now the original live-range is empty and there is no way we can
access the VNI to check its definition, and we crash.
When we cannot get the VNI for the original live-range, that means we are not in
the presence of the original definition. Thus, this check does not need to happen
in that case and the crash is sloved!
This bug was introduced in r266162 | wmi | 2016-04-12 20:08:27. It affects every
target that uses the greedy register allocator.
To happen, we need to delete both a the original instruction and its split
products, in that order. This is likely to happen when rematerialization comes
into play.
Trying to produce a more robust test case. Will follow in a coming commit.
This fixes llvm.org/PR27983.
rdar://problem/26651519
llvm-svn: 272314
This reapplies commit r271930, r271915, r271923. They hit a bug in
Thumb which is fixed in r272258 now.
The original message:
The code layout that TailMerging (inside BranchFolding) works on is not the
final layout optimized based on the branch probability. Generally, after
BlockPlacement, many new merging opportunities emerge.
This patch calls Tail Merging after MBP and calls MBP again if Tail Merging
merges anything.
llvm-svn: 272267
Without that check it was possible to write test cases where the size
was not specified and we ended up with weird asserts down the road,
because the default value (1) would not make sense.
llvm-svn: 272226
For complex rewrittings, which do not occur currently, the related
machine instruction may have been deleted in the process. Therefore, do
not try to print it after the mapping is applied.
llvm-svn: 272209
Refactor the code so that we do not compute in two different places the
end iterator for the range of new virtual registers for a given operand.
Although this refactoring was intended as NFC, this is not the case
because it actually fixes a bug where we were returning a range off by 1
(too long). Right now, this could not result in an actual bug because we
were accessing this range via the BreakDown size of the related operand.
llvm-svn: 272208
Summary:
Consider the following diamond CFG:
A
/ \
B C
\/
D
Suppose A->B and A->C have probabilities 81% and 19%. In block-placement, A->B is called a hot edge and the final placement should be ABDC. However, the current implementation outputs ABCD. This is because when choosing the next block of B, it checks if Freq(C->D) > Freq(B->D) * 20%, which is true (if Freq(A) = 100, then Freq(B->D) = 81, Freq(C->D) = 19, and 19 > 81*20%=16.2). Actually, we should use 25% instead of 20% as the probability here, so that we have 19 < 81*25%=20.25, and the desired ABDC layout will be generated.
Reviewers: djasper, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20989
llvm-svn: 272203
Summary:
Now DISubroutineType has a 'cc' field which should be a DW_CC_ enum. If
it is present and non-zero, the backend will emit it as a
DW_AT_calling_convention attribute. On the CodeView side, we translate
it to the appropriate enum for the LF_PROCEDURE record.
I added a new LLVM vendor specific enum to the list of DWARF calling
conventions. DWARF does not appear to attempt to standardize these, so I
assume it's OK to do this until we coordinate with GCC on how to emit
vectorcall convention functions.
Reviewers: dexonsmith, majnemer, aaboud, amccarth
Subscribers: mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D21114
llvm-svn: 272197
When repairing with a copy, instead of accounting for the cost of that
copy and actually inserting it, we may be able to use an alternative
source for the register to repair and just use it.
Make sure this is documented, so that we consider that opportunity at
some point.
llvm-svn: 272176
Now, the target will be able to provide its how implementation to remap
an instruction. This open the way to crazier optimizations, but to
beginning with, we will be able to handle something else than the
default mapping.
llvm-svn: 272165
When the command line option is set, it overrides any thing that the
target may have set. The rationale is that we get what we asked for.
Options are respectively regbankselect-fast and regbankselect-greedy for
fast and greedy mode.
llvm-svn: 272158
repairing.
Copies are easy because we repair only when there is a mismatch. For
non-copy repairing, i.e., cases that involves breaking down or gathering
up the value, one of the operand may not have a register bank yet. Thus,
derivate a cost from that, requires more work.
llvm-svn: 272157
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.
llvm-svn: 272126
The cost of a copy may be different based on how many bits we have to
copy around. E.g., a 8-bit copy may be different than a 32-bit copy.
llvm-svn: 272084
Summary:
This patch is adding support for the MSVC buffer security check implementation
The buffer security check is turned on with the '/GS' compiler switch.
* https://msdn.microsoft.com/en-us/library/8dbf701c.aspx
* To be added to clang here: http://reviews.llvm.org/D20347
Some overview of buffer security check feature and implementation:
* https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx
* http://www.ksyash.com/2011/01/buffer-overflow-protection-3/
* http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html
For the following example:
```
int example(int offset, int index) {
char buffer[10];
memset(buffer, 0xCC, index);
return buffer[index];
}
```
The MSVC compiler is adding these instructions to perform stack integrity check:
```
push ebp
mov ebp,esp
sub esp,50h
[1] mov eax,dword ptr [__security_cookie (01068024h)]
[2] xor eax,ebp
[3] mov dword ptr [ebp-4],eax
push ebx
push esi
push edi
mov eax,dword ptr [index]
push eax
push 0CCh
lea ecx,[buffer]
push ecx
call _memset (010610B9h)
add esp,0Ch
mov eax,dword ptr [index]
movsx eax,byte ptr buffer[eax]
pop edi
pop esi
pop ebx
[4] mov ecx,dword ptr [ebp-4]
[5] xor ecx,ebp
[6] call @__security_check_cookie@4 (01061276h)
mov esp,ebp
pop ebp
ret
```
The instrumentation above is:
* [1] is loading the global security canary,
* [3] is storing the local computed ([2]) canary to the guard slot,
* [4] is loading the guard slot and ([5]) re-compute the global canary,
* [6] is validating the resulting canary with the '__security_check_cookie' and performs error handling.
Overview of the current stack-protection implementation:
* lib/CodeGen/StackProtector.cpp
* There is a default stack-protection implementation applied on intermediate representation.
* The target can overload 'getIRStackGuard' method if it has a standard location for the stack protector cookie.
* An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast).
* Basic Blocks are added to every instrumented function to receive the code for handling stack guard validation and errors handling.
* Guard manipulation and comparison are added directly to the intermediate representation.
* lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
* lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
* There is an implementation that adds instrumentation during instruction selection (for better handling of sibbling calls).
* see long comment above 'class StackProtectorDescriptor' declaration.
* The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation. (note: getIRStackGuard MUST be nullptr).
* 'getSDagStackGuard' returns the appropriate stack guard (security cookie)
* The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'.
* include/llvm/Target/TargetLowering.h
* Contains function to retrieve the default Guard 'Value'; should be overriden by each target to select which implementation is used and provide Guard 'Value'.
* lib/Target/X86/X86ISelLowering.cpp
* Contains the x86 specialisation; Guard 'Value' used by the SelectionDAG algorithm.
Function-based Instrumentation:
* The MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instructions.
* To support function-based instrumentation, this patch is
* adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h),
* If provided, the stack protection instrumentation won't be inlined and a call to that function will be added to the prologue.
* modifying (SelectionDAGISel.cpp) do avoid producing basic blocks used for inline instrumentation,
* generating the function-based instrumentation during the ISEL pass (SelectionDAGBuilder.cpp),
* if FastISEL (not SelectionDAG), using the fallback which rely on the same function-based implemented over intermediate representation (StackProtector.cpp).
Modifications
* adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp)
* adding support function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h)
Results
* IR generated instrumentation:
```
clang-cl /GS test.cc /Od /c -mllvm -print-isel-input
```
```
*** Final LLVM Code input to ISel ***
; Function Attrs: nounwind sspstrong
define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 {
entry:
%StackGuardSlot = alloca i8* <<<-- Allocated guard slot
%0 = call i8* @llvm.stackguard() <<<-- Loading Stack Guard value
call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot) <<<-- Prologue intrinsic call (store to Guard slot)
%index.addr = alloca i32, align 4
%offset.addr = alloca i32, align 4
%buffer = alloca [10 x i8], align 1
store i32 %index, i32* %index.addr, align 4
store i32 %offset, i32* %offset.addr, align 4
%arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0
%1 = load i32, i32* %index.addr, align 4
call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false)
%2 = load i32, i32* %index.addr, align 4
%arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2
%3 = load i8, i8* %arrayidx, align 1
%conv = sext i8 %3 to i32
%4 = load volatile i8*, i8** %StackGuardSlot <<<-- Loading Guard slot
call void @__security_check_cookie(i8* %4) <<<-- Epilogue function-based check
ret i32 %conv
}
```
* SelectionDAG generated instrumentation:
```
clang-cl /GS test.cc /O1 /c /FA
```
```
"?example@@YAHHH@Z": # @"\01?example@@YAHHH@Z"
# BB#0: # %entry
pushl %esi
subl $16, %esp
movl ___security_cookie, %eax <<<-- Loading Stack Guard value
movl 28(%esp), %esi
movl %eax, 12(%esp) <<<-- Store to Guard slot
leal 2(%esp), %eax
pushl %esi
pushl $204
pushl %eax
calll _memset
addl $12, %esp
movsbl 2(%esp,%esi), %esi
movl 12(%esp), %ecx <<<-- Loading Guard slot
calll @__security_check_cookie@4 <<<-- Epilogue function-based check
movl %esi, %eax
addl $16, %esp
popl %esi
retl
```
Reviewers: kcc, pcc, eugenis, rnk
Subscribers: majnemer, llvm-commits, hans, thakis, rnk
Differential Revision: http://reviews.llvm.org/D20346
llvm-svn: 272053
This reverts commit r271962 and reinstantes r271957.
MSVC's linker doesn't appear to like it if you have an empty symbol
substream, so only open a symbol substream if we're going to emit
something about globals into it.
Makes check-asan pass.
llvm-svn: 271965
The code layout that TailMerging (inside BranchFolding) works on is not the
final layout optimized based on the branch probability. Generally, after
BlockPlacement, many new merging opportunities emerge.
This patch calls Tail Merging after MBP and calls MBP again if Tail Merging
merges anything.
Differential Revision: http://reviews.llvm.org/D20276
llvm-svn: 271925
C++ has a builtin type called wchar_t. Clang also provides a type
called __wchar_t in C mode.
In C mode, wchar_t can be a typedef to unsigned short.
llvm-svn: 271793
My first attempt at this had an overly aggressive assert - chain nodes
will only be removed, but we could hit the assert if a non-chain node
was CSE'd (NodeToMatch, for instance).
This reapplies r271706 by reverting r271713 and fixing an assert.
Original message:
Avoid relying on UB by looking into deleted nodes for a marker value.
Instead, update the list of chain nodes as we go.
llvm-svn: 271733
This only translates data members for now. Translating overloaded
methods is complicated, so I stopped short of doing that.
Reviewers: aaboud
Differential Revision: http://reviews.llvm.org/D20924
llvm-svn: 271680
The DIType* for void is the null pointer. A null DIType can never be a
qualified type, so we can just exit the loop at this point and go to
getTypeIndex(BaseTy).
Fixes PR27984
llvm-svn: 271550
Summary:
If the target requests it, use emptry spaces in the fixed and
callee-save stack area to allocate local stack objects.
AArch64: Change last callee-save reg stack object alignment instead of
size to leave a gap to take advantage of above change.
Reviewers: t.p.northover, qcolombet, MatzeB
Subscribers: rengolin, mcrosier, llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D20220
llvm-svn: 271527
Although this was intended to be NFC, the test case wiggle shows a change in
code scheduling/RA caused by a difference in the SDLoc() generation.
Depending on how you look at it, this is the (dis)advantage of exact checking
in regression tests.
llvm-svn: 271526
Use the type index of the underlying type unless we have a typedef from
long to HRESULT; HRESULT typedefs are translated to T_HRESULT.
llvm-svn: 271494
When the index is known to be constant 0, insert directly into the the low half,
instead of spilling, performing the insert in-memory, and reloading.
Differential Revision: http://reviews.llvm.org/D20763
llvm-svn: 271428
Summary:
Re-enable lifetime-start-on-first-use for stack coloring,
but explicitly disable it for slots with more than one start
or end lifetime marker.
Bug: 27903
Reviewers: wmi, tejohnson, qcolombet, gbiv
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20739
llvm-svn: 271412
Summary:
This is meant to be the tiniest step towards DIType to CV type index
translation that I could come up with. Whenever translation fails, we use type
index zero, which is the unknown type.
Reviewers: aaboud, zturner
Subscribers: llvm-commits, amccarth
Differential Revision: http://reviews.llvm.org/D20840
llvm-svn: 271408
This should have been converting the size to bytes, but wasn't really.
These should probably all be using getStoreSize instead.
I haven't been able to come up with a meaningful testcase for this.
I can trigger it using combinations of struct loads and stores,
but can't observe a difference in non-broken testcases.
isAlias is only really used during store merging, so I'm not sure how
to get into the vector splitting situation the comment describes
since store merging is only done before type legalization.
llvm-svn: 271356
Refactor LiveIntervals::renameDisconnectedComponents() to be a pass.
Also change the name to "RenameIndependentSubregs":
- renameDisconnectedComponents() worked on a MachineFunction at a time
so it is a natural candidate for a machine function pass.
- The algorithm is testable with a .mir test now.
- This also fixes a problem where the lazy renaming as part of the
MachineScheduler introduced IMPLICIT_DEF instructions after the number
of a nodes in a region were counted leading to a mismatch.
Differential Revision: http://reviews.llvm.org/D20507
llvm-svn: 271345
We think it's OK to generate half fminnan because it's legal for the
transform-to type (f32; r245196). However, PromoteFloatRes was missing
the case; simply promote like the other binops, including minnum.
llvm-svn: 271317
Adds the method MCStreamer::EmitBinaryData, which is usually an alias
for EmitBytes. In the MCAsmStreamer case, it is overridden to emit hex
dump output like this:
.byte 0x0e, 0x00, 0x08, 0x10
.byte 0x03, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x10, 0x00, 0x00
Also, when verbose asm comments are enabled, this patch prints the dump
output for each comment before its record, like this:
# ArgList (0x1000) {
# TypeLeafKind: LF_ARGLIST (0x1201)
# NumArgs: 0
# Arguments [
# ]
# }
.byte 0x06, 0x00, 0x01, 0x12
.byte 0x00, 0x00, 0x00, 0x00
This should make debugging easier and testing more convenient.
Reviewers: aaboud
Subscribers: majnemer, zturner, amccarth, aaboud, llvm-commits
Differential Revision: http://reviews.llvm.org/D20711
llvm-svn: 271313
This adds support to the backed to actually support SjLj EH as an exception
model. This is *NOT* the default model, and requires explicitly opting into it
from the frontend. GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.
Addresses PR27749!
llvm-svn: 271244
Summary:
Turn off lifetime-start-on-first-use enhancement for the moment
pending a fix for bug 27903.
Bug: 27903
Reviewers: tejohnson, wmi, qcolombet, gbiv
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20731
llvm-svn: 271003
Fix: updated clang code which was not updated by mistake.
Original commit message:
[llvm-mc] - Teach llvm-mc to generate zlib styled compression sections.
This patch is strongly based on previously reverted D20331.
(because of gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style)
Difference that this patch supports both zlib and zlib-gnu styles.
-compress-debug-sections option now supports next values:
-compress-debug-sections=zlib-gnu
-compress-debug-sections=zlib
-compress-debug-sections=none
Previously specifying -compress-debug-sections enabled zlib-gnu compression,
so anyone can put "-compress-debug-sections=zlib-gnu" to restore the behavior
that was before this patch for case when compression was enabled.
Differential revision: http://reviews.llvm.org/D20676
llvm-svn: 270987
It broke buildbot:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/13585/steps/build/logs/stdio
Initial commit message:
[llvm-mc] - Teach llvm-mc to generate zlib styled compression sections.
This patch is strongly based on previously reverted D20331.
(because of gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style)
Difference that this patch supports both zlib and zlib-gnu styles.
-compress-debug-sections option now supports next values:
-compress-debug-sections=zlib-gnu
-compress-debug-sections=zlib
-compress-debug-sections=none
Previously specifying -compress-debug-sections enabled zlib-gnu compression,
so anyone can put "-compress-debug-sections=zlib-gnu" to restore the behavior
that was before this patch for case when compression was enabled.
Differential revision: http://reviews.llvm.org/D20676
llvm-svn: 270978
This patch is strongly based on previously reverted D20331.
(because of gnuutils < 2.26 does not support compressed debug sections in non zlib-gnu style)
Difference that this patch supports both zlib and zlib-gnu styles.
-compress-debug-sections option now supports next values:
-compress-debug-sections=zlib-gnu
-compress-debug-sections=zlib
-compress-debug-sections=none
Previously specifying -compress-debug-sections enabled zlib-gnu compression,
so anyone can put "-compress-debug-sections=zlib-gnu" to restore the behavior
that was before this patch for case when compression was enabled.
Differential revision: http://reviews.llvm.org/D20676
llvm-svn: 270977
CriticalAntiDepBreaker was not correctly tracking defs of the high X86 byte
registers, leading to incorrect use of a busy register to break an
antidependence.
Fixes pr27681, and its duplicates pr27580, pr27804.
Differential Revision: http://reviews.llvm.org/D20456
llvm-svn: 270935
This patch builds upon r270776 and speeds up
LiveDebugValues::transferDebugValue() by adding an index that maps each
DebugVariable to its open VarLoc.
The transferDebugValue() function needs to close all open ranges for a
given DebugVariable. Iterating over the set bits of OpenRanges is
prohibitively slow in practice. I experimented with using the sorted map
of VarLocs in the UniqueVector to iterate only over the range of VarLocs
with a given DebugVariable, but the binary search turned out to be even
more expensive than just iterating over the set bits in OpenRanges.
Instead, this patch exploits the fact that there can only be one open
location for each DebugVariable and redundantly stores this location in a
DenseMap.
This patch brings the time spent in the LiveDebugValues pass down to an
almost neglectiable amount.
http://llvm.org/bugs/show_bug.cgi?id=26055http://reviews.llvm.org/D20636
rdar://problem/24091200
llvm-svn: 270923
Summary:
This allows the linker to discard unused symbol information for comdat
functions that were discarded during the link. Before this change,
searching for the name of an inline function in the debugger would
return multiple results, one per symbol subsection in the object file.
After this change, there is only one result, the result for the function
chosen by the linker.
Reviewers: zturner, majnemer
Subscribers: aaboud, amccarth, llvm-commits
Differential Revision: http://reviews.llvm.org/D20642
llvm-svn: 270792
This patch modifies the LiveDebugValues pass to use more efficient set
data structures as outlined in PR26055. Both VarLocSet and VarLocList are
now SparseBitVectors which allows us to perform much faster bitvector
arithmetic on them.
The speedup can be in the order of minutes especially on ASANified code.
The change is not NFC in the assembler output because the inserted
DBG_VALUEs are now sorted by variable and location.
Many thanks to Daniel Berlin for helping design the improved algorithm and
reviewing the patch.
https://llvm.org/bugs/show_bug.cgi?id=26055http://reviews.llvm.org/D20178
rdar://problem/24091200
llvm-svn: 270776
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.
Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.
This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).
Fixes PR19797.
llvm-svn: 270720
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments are
consistent with the function name.
llvm-svn: 270715
We have to modify V2SU before inserting new elements into the
CurrentVRegDefs set because that may move V2SU in memory invalidating
the reference.
llvm-svn: 270644
The benefits of this patch are
-- We call AnalyzeBranch() to optimize unanalyzable branches, but the result of
AnalyzeBranch() is not used. Now the result is useful.
-- Before the layout of all the MBBs is set, the result of AnalyzeBranch() is
not correct and needs to be fixed before using it to optimize the branch
conditions. Now this optimization is called after the layout, the code used
to fix the result of AnalyzeBranch() is not needed.
-- The branch condition of the last block is not optimized before. Now it is
optimized.
Differential Revision: http://reviews.llvm.org/D20177
llvm-svn: 270623
These attributes aren't used by other debuggers (& may be confused with
other DWARF extensions) so they just waste space (about 1.5% on .dwo
file size on a random large program I tested).
We could remove the ObjC property ones too, but I figured they were
probably more necessary when trying to understand ObjC (I could be wrong
though) & so any debugger interested in working with ObjC would use
them, perhaps? (also, there are some legacy tests in Clang that test for
them - making it one of those annoying cross-project commits and/or
cleanup to refactor those tests)
llvm-svn: 270613
Replace bidirectional flow analysis to compute liveness with forward
analysis pass. Treat lifetimes as starting when there is a first
reference to the stack slot, as opposed to starting at the point of the
lifetime.start intrinsic, so as to increase the number of stack
variables we can overlap.
Reviewers: gbiv, qcolumbet, wmi
Differential Revision: http://reviews.llvm.org/D18827
Bug: 25776
llvm-svn: 270559
Before r269750 we did the comparisons in this loop in signed ints so
that it DTRT when MinCSFrameIndex was 0. This was changed because it's
now possible for MinCSFrameIndex to be UINT_MAX, but that introduced a
bug when we were comparing `>= 0` - this is tautological in unsigned.
Rework the comparisons here to avoid issues with unsigned wrapping.
No test. I couldn't find a way to get any of the StackGrowsUp in-tree
targets to reach the code that sets MinCSFrameIndex.
llvm-svn: 270492
This effectively revers commit r270389 and re-lands r270106, but it's
almost a rewrite.
The behavior change in r270106 was that we could no longer assume that
each LF_FUNC_ID record got its own type index. This patch adds a map
from DINode* to TypeIndex, so we can stop making that assumption.
This change also emits padding bytes between type records similar to the
way MSVC does. The size of the type record includes the padding bytes.
llvm-svn: 270485
Summary:
MBBs don't necessarily have a name (in my experience, they almost never
do), in which case this logging is quite unhelpful. The number seems to
work well.
Reviewers: iteratee
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20533
llvm-svn: 270477
This will pave the way to introduce a full fledged symbol visitor
similar to how we have a type visitor, thus allowing the same
dumping code to be used in llvm-readobj and llvm-pdbdump.
Differential Revision: http://reviews.llvm.org/D20384
Reviewed By: rnk
llvm-svn: 270475
This fixes a bug introduced in:
r262115 - CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC
The iterator End here might == MBB->end(), and so we can't unconditionally
dereference it. This often goes unnoticed (I don't have a test case that always
crashes, and ASAN does not catch it either) because the function call arguments are
turned right back into iterators. MachineInstrBundleIterator's constructor,
however, does have an assert which might randomly fire.
llvm-svn: 270323
Prior to this patch, we were using 1 for all the repairing costs.
Now, we use the information from the target to get this information.
llvm-svn: 270304
We now use LiveRangeCalc::extendToUses() instead of a specially designed
algorithm in constructMainRangeFromSubranges():
- The original motivation for constructMainRangeFromSubranges() were
differences between the main liverange and subranges because of hidden
dead definitions. This case however cannot happen anymore with the
DetectDeadLaneMasks pass in place.
- It simplifies the code.
- This fixes a longstanding bug where we did not properly create new SSA
values on merging control flow (the MachineVerifier missed most of
these cases).
- Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and
LiveRangeCalc to better match the implementation/available helper
functions.
This re-applies r269016. The fixes from r270290 and r270259 should avoid
the machine verifier problems this time.
llvm-svn: 270291
It is fine for subregister ranges to be undefined on some CFG paths as
we may have a "vregX:other_subreg<read-undef> =" def on that path. We
do not (and should not) have live segments for the subregister ranges.
The MachineVerifier should not complain about this.
This is a slight variant of http://llvm.org/PR27705
llvm-svn: 270290
Depending on the compiler used to build LLVM, llvm_unreachable can either
expand to a call to abort(), or to a __builtin_unreachable. The latter
does not have a predictable behavior at runtime.
llvm-svn: 270260
Fix renameDisconnectedComponents() creating vreg uses that can be
reached from function begin withouthaving a definition (or explicit
live-in). Fix this by inserting IMPLICIT_DEF instruction before
control-flow joins as necessary.
Removes an assert from MachineScheduler because we may now get
additional IMPLICIT_DEF when preparing the scheduling policy.
This fixes the underlying problem of http://llvm.org/PR27705
llvm-svn: 270259
This gives AsmPrinter a chance to initialize its DD field before
we call beginModule(), which is about to start using it.
Differential Revision: http://reviews.llvm.org/D20413
llvm-svn: 270258
We are about to start using DIEDwarfExpression to create global variable
DIEs, which happens before we generate code for functions.
Differential Revision: http://reviews.llvm.org/D20412
llvm-svn: 270257
The Fast mode takes the first mapping, the greedy mode loops over all
the possible mapping for an instruction and choose the cheaper one.
Test case will come with target specific code, since we currently do not
have instructions that have several mappings.
llvm-svn: 270249
computeMapping.
Computing the cost of a mapping takes some time.
Since in Fast mode, the cost is irrelevant, just spare some cycles by not
computing it.
In Greedy mode, we need to choose the best cost, that means that when
the local cost gets more expensive than the best cost, we can stop
computing the repairing and cost for the current mapping.
llvm-svn: 270245
The previous choice of the insertion points for repairing was
straightfoward but may introduce some basic block or edge splitting. In
some situation this is something we can avoid.
For instance, when repairing a phi argument, instead of placing the
repairing on the related incoming edge, we may move it to the previous
block, before the terminators. This is only possible when the argument
is not defined by one of the terminator.
llvm-svn: 270232
an instruction.
Use the previously introduced RepairingPlacement class to split the code
computing the repairing placement from the code doing the actual
placement. That way, we will be able to consider different placement and
then, only apply the best one.
llvm-svn: 270168
When assigning the register banks we may have to insert repairing code
to move already assigned values accross register banks.
Introduce a few helper classes to keep track of what is involved in the
repairing of an operand:
- InsertPoint and its derived classes record the positions, in the CFG,
where repairing has to be inserted.
- RepairingPlacement holds all the insert points for the repairing of an
operand plus the kind of action that is required to do the repairing.
This is going to be used to keep track of how the repairing should be
done, while comparing different solutions for an instruction. Indeed, we
will need the repairing placement to capture the cost of a solution and
we do not want to compute it a second time when we do the actual
repairing.
llvm-svn: 270167
register bank twice.
Prior to this change, we were checking if the assignment for the current
machine operand was matching, then we would check if the mismatch
requires to insert repair code.
We actually already have this information from the first check, so just
pass it along.
NFCI.
llvm-svn: 270166
Sorry for the lack testcase. There is one in the pr, but it depends on
std::sort and the .ll version is 110 lines, so I don't think it is
wort it.
The bug was that we were sorting after adding a terminator, and the
sorting algorithm could end up putting the terminator in the middle of
the List vector.
With that we would create a Spans map entry keyed on nullptr which would
then be added to CUs and fail in that sorting.
llvm-svn: 270165
This helper class will be used to represent the cost of mapping an
instruction to a specific register bank.
The particularity of these costs is that they are mostly local, thus the
frequency of the basic block is irrelevant. However, for few
instructions (e.g., phis and terminators), the cost may be non-local and
then, we need to account for the frequency of the involved basic blocks.
This will be used by the greedy mode I am working on.
llvm-svn: 270163