fix: add a flag to MapValue and friends which indicates whether
any module-level mappings are being made. In the common case of
inlining, no module-level mappings are needed, so MapValue doesn't
need to examine non-function-local metadata, which can be very
expensive in the case of a large module with really deep metadata
(e.g. a large C++ program compiled with -g).
This flag is a little awkward; perhaps eventually it can be moved
into the ClonedCodeInfo class.
llvm-svn: 112190
I think there are good reasons to change this, but in the interests
of short-term stability, make SmallVector<...,0> reserve non-zero
capacity in its constructors. This means that SmallVector<...,0>
uses more memory than SmallVector<...,1> and should really only be
used (unless/until this workaround is removed) by clients that
care about using SmallVector with an incomplete type.
llvm-svn: 112147
expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This
affects two places in the code: handling cross block values and handling
function return and arguments. Since vectors are already widened by
legalizetypes, this gives us much better code and unblocks x86-64 abi
and SPU abi work.
For example, this (which is a silly example of a cross-block value):
define <4 x float> @test2(<4 x float> %A) nounwind {
%B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
%C = fadd <2 x float> %B, %B
br label %BB
BB:
%D = fadd <2 x float> %C, %C
%E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
ret <4 x float> %E
}
Now compiles into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
addps %xmm0, %xmm0
ret
previously it compiled into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
pshufd $1, %xmm0, %xmm1
## kill: XMM0<def> XMM0<kill> XMM0<def>
insertps $0, %xmm0, %xmm0
insertps $16, %xmm1, %xmm0
addps %xmm0, %xmm0
ret
This implements rdar://8230384
llvm-svn: 112101
needed parsing for the .loc directive and saves the current info from that
into the context. The next patch will take the current loc info after an
instruction is assembled and save that info into a vector for each section for
use to build the line number tables. The patch after that will encode the info
from those vectors into the output file as the dwarf line tables.
llvm-svn: 111956
which does the same thing. This eliminates redundant code and
handles MDNodes better. MDNode linking still doesn't fully
work yet though.
llvm-svn: 111941
- Cache used characters in a bitset to reduce memory overhead to just 32 bytes.
- On my core2 this code is faster except when the checked string was very short
(smaller than the list of delimiters).
llvm-svn: 111817
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.
The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.
llvm-svn: 111691
It's similar to "linker_private_weak", but it's known that the address of the
object is not taken. For instance, functions that had an inline definition, but
the compiler decided not to inline it. Note, unlike linker_private and
linker_private_weak, linker_private_weak_def_auto may have only default
visibility. The symbols are removed by the linker from the final linked image
(executable or dynamic library).
llvm-svn: 111684
not part of the IR, are not uniqued, and may be safely RAUW'd.
This replaces a variety of alternate mechanisms for achieving
the same effect.
llvm-svn: 111681
functionality that most command-line tools need: ensuring that the
output file gets deleted if the tool is interrupted or encounters an
error.
llvm-svn: 111595
base registers were required. This will allow for slightly better packing
of the locals when alignment padding is necessary after callee saved registers.
llvm-svn: 111508
constructed with an output filename of "-". In particular, allow the
file descriptor to be closed, and close the file descriptor in the
destructor if it hasn't been explicitly closed already, to ensure
that any write errors are detected.
llvm-svn: 111436
We must complete the DFS, otherwise we might miss needed phi-defs, and
prematurely color live ranges with a non-dominating value.
This is not a big deal since we get to color more of the CFG and the next
mapValue call will be faster.
llvm-svn: 111397
Nothing fancy, just ask the target if any currently available base reg
is in range for the instruction under consideration and use the first one
that is. Placeholder ARM implementation simply returns false for now.
ongoing saga of rdar://8277890
llvm-svn: 111374
the local block. Resolve references to those indices to a new base register.
For simplification and testing purposes, a new virtual base register is
allocated for each frame index being resolved. The result is truly horrible,
but correct, code that's good for exercising the new code paths.
Next up is adding thumb1 support, which should be very simple. Following that
will be adding base register re-use and implementing a reasonable ARM
heuristic for when a virtual base register should be generated at all.
llvm-svn: 111315
whether to allocate a virtual frame base register to resolve the frame
index reference in it. Implement a simple version for ARM to aid debugging.
In LocalStackSlotAllocation, scan the function for frame index references
to local frame indices and ask the target whether to allocate virtual
frame base registers for any it encounters. Purely infrastructural for
debug output. Next step is to actually allocate base registers, then add
intelligent re-use of them.
rdar://8277890
llvm-svn: 111262
a Pass abstraction, since that's the level it's actually used at.
Rename Pass' dumpPassStructure to dumpPass.
This eliminates an awkward use of getAsPass() to convert a PMDataManager*
into a Pass* just to permit a dumpPassStructure call.
llvm-svn: 111199
mapping. Have the local block track its alignment requirement, and then
apply that when the block itself is allocated. Previously, offsets could
get adjusted in PEI to be different, relative to one another, than the
block allocation thought they would be, which defeats the point of doing
the allocation this way. Continuing rdar://8277890
llvm-svn: 111197
Introduce a helper method to add a section to the end of a layout. This
will be used by the ELF ObjectWriter code to add the metadata sections
(symbol table, etc) to the end of an object file.
llvm-svn: 111171
expression being loop invariant is not equivalent to the property of
properly dominating the loop header.
Other optimizations have also made this optimization less important.
llvm-svn: 111160
implementations of equality comparison and hash computation. This
can be used to optimize node lookup by avoiding creating lots of
temporary ID values just for hashing and comparison purposes.
llvm-svn: 111130
- Eliminate redundant successors.
- Convert an indirectbr with one successor into a direct branch.
Also, generalize SimplifyCFG to be able to be run on a function entry block.
It knows quite a few simplifications which are applicable to the entry
block, and it only needs a few checks to avoid trouble with the entry block.
llvm-svn: 111060
experimental pass that allocates locals relative to one another before
register allocation and then assigns them to actual stack slots as a block
later in PEI. This will eventually allow targets with limited index offset
range to allocate additional base registers (not just FP and SP) to
more efficiently reference locals, as well as handle situations where
locals cannot be referenced via SP or FP at all (dynamic stack realignment
together with variable sized objects, for example). It's currently
incomplete and almost certainly buggy. Work in progress.
Disabled by default and gated via the -enable-local-stack-alloc command
line option.
rdar://8277890
llvm-svn: 111059
target triple and straightens it out. This does less than gcc's script
config.sub, for example it turns i386-mingw32 into i386--mingw32 not
i386-pc-mingw32, but it does a decent job of turning funky triples into
something that the rest of the Triple class can understand. The plan
is to use this to canonicalize triple's when they are first provided
by users, and have the rest of LLVM only deal with canonical triples.
Once this is done the special case workarounds in the Triple constructor
can be removed, making the class more regular and easier to use. The
comments and unittests for the Triple class are already adjusted in this
patch appropriately for this brave new world of increased uniformity.
llvm-svn: 110909
- remove ashr which never worked.
- fix lshr and shl and add tests.
- remove dead function "intersect1Wrapped".
- add a new sub method to subtract ranges, with test.
llvm-svn: 110861
that many of these things, so the memory savings isn't significant,
and there are now situations where there can be alignments greater
than 128.
llvm-svn: 110836
When splitting a live range, the new registers have fewer uses and the
permissible register class may be less constrained. Recompute the register class
constraint from the uses of new registers created for a split. This may let them
be allocated from a larger set, possibly avoiding a spill.
llvm-svn: 110703
register at a time. This turns out to be slightly faster than iterating over
instructions, but more importantly, it allows us to compute spill weights for
new registers created after the spill weight pass has run.
Also compute the allocation hint at the same time as the spill weight. This
allows us to use the spill weight as a cost metric for copies, and choose the
most profitable hint if there is more than one possibility.
The new hints provide a very small (< 0.1%) but universal code size improvement.
llvm-svn: 110631
previously collected info from the .file directives and outputs the encoded
bytes for it. For now this is only in the Mach-O streamer but at some point
will move to a more generic place.
llvm-svn: 110617
relatively expensive comparison analyzer on each instruction. Also rename the
comparison analyzer method to something more in line with what it actually does.
This pass is will eventually be folded into the Machine CSE pass.
llvm-svn: 110539
After heavy editing of a live interval, it is much easier to simply renumber the
live values instead of trying to keep track of the unused ones.
llvm-svn: 110463
Without this what was happening was:
* R3 is not marked as "used"
* ARM backend thinks it has to save it to the stack because of vaarg
* Offset computation correctly ignores it
* Offsets are wrong
llvm-svn: 110446
need the Compare flag after all.
--- Reverse-merging r109901 into '.':
U include/llvm/Target/TargetInstrDesc.h
U include/llvm/Target/Target.td
U utils/TableGen/InstrInfoEmitter.cpp
U utils/TableGen/CodeGenInstruction.cpp
U utils/TableGen/CodeGenInstruction.h
llvm-svn: 110424
This pass tries to remove comparison instructions when possible. For instance,
if you have this code:
sub r1, 1
cmp r1, 0
bz L1
and "sub" either sets the same flag as the "cmp" instruction or could be
converted to set the same flag, then we can eliminate the "cmp" instruction all
together. This is a important for ARM where the ALU instructions could set the
CPSR flag, but need a special suffix ('s') to do so.
llvm-svn: 110423
be killed before being redefined.
These checks are usually disabled, and usually fail when enabled. We de facto
allow live registers to be redefined without a kill, the corresponding
assertions in RegScavenger were removed long ago.
llvm-svn: 110362
eliminate several const_casts.
Make CallSite implicitly convertible to ImmutableCallSite.
Rename the getModRefBehavior for intrinsic IDs to
getIntrinsicModRefBehavior to avoid overload ambiguity with CallSite,
which happens to be implicitly convertible to bool.
llvm-svn: 110155
Add support for using the FPSCR in conjunction with the vcvtr instruction, for controlling fp to int rounding.
Add support for the FLT_ROUNDS_ node now that the FPSCR is exposed.
llvm-svn: 110152
as soon as we properly codegen the simple vector operations in clang, remove the
unnecessary builti-ins/intrinsics from clang and llvm.
llvm-svn: 110094
of Value deletions and RAUWs, instead of relying on ScalarEvolution's
Scalars map being notified, as that's complicated at best, and
insufficient in general.
This means SCEVUnknown needs a non-trivial destructor, so introduce
a mechanism to allow ScalarEvolution to locate all the SCEVUnknowns.
llvm-svn: 110086
exactly what bugpoint expected it to do.
There was also only one user of
BlockExtractorPass(const std::vector<BasicBlock*> &B), so just remove it and
make BlockExtractorPass read BlockFile.
This fixes bugpoint's block extraction.
Nick, please review.
llvm-svn: 109936
later to identify and possibly remove superfluous compare instructions -- those
that are testing for and setting a status flag that should already be set.
llvm-svn: 109901
handles with a pointer to the containing map. When a map is copied, these
pointers need to be corrected to point to the new map. If not, then consider
the case of a map M1 which maps a value V to something. Create a copy M2 of
M1. At this point there are two value handles on V, one representing V as a
key in M1, the other representing V as a key in M2. But both value handles
point to M1 as the containing map. Now delete V. The value handles remove
themselves from their containing map (which destroys them), but only the first
value handle is successful: the second one cannot remove itself from M1 as
(once the first one has removed itself) there is nothing there to remove; it
is therefore not destroyed. This causes an assertion failure "All references
to V were not removed?".
llvm-svn: 109851
extend it to handle the case where multiple RAUWs affect a single
SCEVUnknown.
Add a ScalarEvolution unittest to test for this situation.
llvm-svn: 109705
the info from the .file directive and makes file and directory tables that
will eventually be put out as part of the dwarf info in the output file.
llvm-svn: 109651
alloca instructions (constrained by their internal encoding),
and add error checking for it. Fix an instcombine bug which
generated huge alignment values (null is infinitely aligned).
This fixes undefined behavior noticed by John Regehr.
llvm-svn: 109643
- Designed as a simple wrapper to allow clients to attempt to catch crashes
(memory errors, assertion violations, etc.) and do some kind of recovery.
- Currently doesn't actually attempt to catch crashes.
llvm-svn: 109586
protectors, to be near the stack protectors on the stack. Accomplish this by
tagging the stack object with a predicate that indicates that it would trigger
this. In the prolog-epilog inserter, assign these objects to the stack after the
stack protector but before the other objects.
llvm-svn: 109481
that the values they refer to aren't being deleted underneath them.
Make sure these containters get cleared by clear(), which IndVarSimplify
and LSR both use before deleting instructions.
llvm-svn: 109478
Simplifying use_iterators by dereferencing
is not a good idea. The codebase does not depend
in this any more, and it may introduce hidden
runtime cost. If you get compile errors, please
dereference your iterator before passing to cast<>
(and friends).
Also: please consider caching the result of
operator* and reusing that instead of dereferencing
many times.
llvm-svn: 109425
dependence on DominanceFrontier. Instead, add an explicit DominanceFrontier
pass in StandardPasses.h to ensure that it gets scheduled at the right
time.
Declare that loop unrolling preserves ScalarEvolution, and shuffle some
getAnalysisUsages.
This eliminates one LoopSimplify and one LCCSA run in the standard
compile opts sequence.
llvm-svn: 109413
appropriate for targets without detailed instruction iterineries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.
On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.
llvm-svn: 109300
it's too late to start backing off aggressive latency scheduling when most
of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
For ARM, this is almost always a win on # of instructions. It's runtime
neutral for most of the tests. But for some kernels with high register
pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
54 and sped up by 20%.
llvm-svn: 109279
is not a good idea. The codebase does not depend
in this any more, and it may introduce hidden
runtime cost. If you get compile errors, please
dereference your iterator before passing to cast<>
(and friends).
Also: please consider caching the result of
operator* and reusing that instead of dereferencing
many times.
llvm-svn: 109220
with an existing allocator. The interesting use case of this
is that it allows "StringMap<whatever, BumpPtrAllocator&>" for
when you want to allocate out of a preexisting bump pointer
allocator owned by someone else.
llvm-svn: 109213
ARM/PPC/MSP430-specific code (which are the only targets that
implement the hook) can directly reference their target-specific
instrinfo classes.
llvm-svn: 109171
The RegionInfo pass detects single entry single exit regions in a function,
where a region is defined as any subgraph that is connected to the remaining
graph at only two spots.
Furthermore an hierarchical region tree is built.
Use it by calling "opt -regions analyze" or "opt -view-regions".
llvm-svn: 109089
Make MDNode::destroy private.
Fix the one thing that used MDNode::destroy, outside of MDNode itself.
One should never delete or destroy an MDNode explicitly. MDNodes
implicitly go away when there are no references to them (implementation
details aside).
llvm-svn: 109028
to Parse mach-o files. All defines have been renamed to not conflict with
#defines in mach header files, all structures were left named the same but
are in the llvm::MachO namespace.
llvm-svn: 108953
bitcode file, so that two bitcode files where the same metadata kind
name happens to have been assigned a different ID can still be
linked together.
Eliminate the restriction that metadata kind IDs can't be 0.
Change MD_dbg from 1 to 0, because we can now, and because it's
less mysterious that way.
llvm-svn: 108939
better in the llvm world. Among other things, this changes:
1. The guts of libedis are now moved into lib/MC/MCDisassembler
2. llvm-mc now depends on lib/MC/MCDisassembler, not tools/edis,
so edis and mc don't have to be built in series.
3. lib/MC/MCDisassembler no longer depends on the C api, the C
API depends on it.
4. Various code cleanup changes.
There is still a lot to be done to make edis fit with the llvm
design, but this is an incremental step in the right direction.
llvm-svn: 108869
linked list. This is a little slower and involves more malloc'ing, but these lists are
typically short, and it allows PassInfo to be entirely constant initializable.
llvm-svn: 108755
- Currently includes a hack to limit ourselves to "In32BitMode" and "In64BitMode", because we don't have the other infrastructure to properly deal with setting SSE, etc. features on X86.
llvm-svn: 108677
- Unfortunate, but necessary for now to handle subtarget instruction matching. Eventually we should factor out the lower level target machine information so we don't need to do this.
llvm-svn: 108664
Still very much under development. Comments and fixes will be forthcoming.
(This commit includes some small tweaks to LiveIntervals & LoopInfo to support the splitter)
llvm-svn: 108615
since it doesn't work for front-ends which don't emit column information
(which includes llvm-gcc in its present configuration), and doesn't
work for clang for K&R style variables where the variables are declared
in a different order from the parameter list.
Instead, make a separate pass through the instructions to collect the
llvm.dbg.declare instructions in order. This ensures that the debug
information for variables is emitted in this order.
llvm-svn: 108538
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.
llvm-svn: 108465
independent of the order that isel happens to visit the dbg_declare
intrinsics. This fixes a bug in which the formal arguments were
being printed in reverse order, now that fast isel is going bottom up.
llvm-svn: 108369
with this commit the callee moves to the end of
the operand array (from the start) and the call
arguments now start at index 0 (formerly 1)
this ordering is now consistent with InvokeInst
this commit only flips the switch,
functionally it is equivalent to
r101465
I intend to commit several cleanups after a few
days of soak period
llvm-svn: 108240
- Currently initialization is a bit of a hack, but harmless. We need to rework
various parts of target initialization to clean this up.
llvm-svn: 108165
AggressiveAntiDepBreaker should not be using getPhysicalRegisterRegClass. An
instruction might be using a register that can only be replaced with one from
a subclass of getPhysicalRegisterRegClass.
With this patch we use getMinimalPhysRegClass. This is correct, but
conservative. We should check the uses of the register and select the
largest register class that can be used in all of them.
llvm-svn: 108122
Targets must now implement TargetInstrInfo::copyPhysReg instead. There is no
longer a default implementation forwarding to copyRegToReg.
llvm-svn: 108095
Use a COPY instruction instead for register copies, or TII::copyPhysReg() after
COPY instructions are lowered.
Targets should implement copyPhysReg instead of copyRegToReg.
llvm-svn: 108075
correct alignment information, which simplifies ExpandRes_VAARG a bit.
The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:
* The 's' in target data: If this is set to the minimal alignment of any
argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
example.
* The getTransientStackAlignment method. It is possible for an architecture to
have argument less aligned than what we maintain the stack pointer.
llvm-svn: 108072
- Check getBytesToPopOnReturn().
- Eschew ST0 and ST1 for return values.
- Fix the PIC base register initialization so that it doesn't ever
fail to end up the top of the entry block.
llvm-svn: 108039
inserted in a MBB, and return an already inserted MI.
This target API change is necessary to allow foldMemoryOperand to call
storeToStackSlot and loadFromStackSlot when folding a COPY to a stack slot
reference in a target independent way.
The foldMemoryOperandImpl hook is going to change in the same way, but I'll wait
until COPY folding is actually implemented. Most targets only fold copies and
won't need to specialize this hook at all.
llvm-svn: 107991
U utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U test/CodeGen/X86/fast-isel.ll
U test/CodeGen/X86/fast-isel-loads.ll
U include/llvm/Target/TargetLowering.h
U include/llvm/Support/PassNameParser.h
U include/llvm/CodeGen/FunctionLoweringInfo.h
U include/llvm/CodeGen/CallingConvLower.h
U include/llvm/CodeGen/FastISel.h
U include/llvm/CodeGen/SelectionDAGISel.h
U lib/CodeGen/LLVMTargetMachine.cpp
U lib/CodeGen/CallingConvLower.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U lib/CodeGen/SelectionDAG/FastISel.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U lib/CodeGen/SelectionDAG/TargetLowering.cpp
U lib/Target/XCore/XCoreISelLowering.cpp
U lib/Target/XCore/XCoreISelLowering.h
U lib/Target/X86/X86ISelLowering.cpp
U lib/Target/X86/X86FastISel.cpp
U lib/Target/X86/X86ISelLowering.h
llvm-svn: 107987
Unlike insertMachineInstrInMaps this does not guarantee live intervals will
remain correct. The caller will need to manually update intervals to account
for the changes made to the CFG.
llvm-svn: 107958
.weak_def_can_be_hidden directive. Chris pointed out that the MCAsmInfo.h/.cpp
chunks aren't needed for this until the compiler starts generating these. And
when that happens it will be more convenient for it to be a bool than a const
char*.
llvm-svn: 107906
EXTRACT_SUBREG no longer appears as a machine instruction. Use COPY instead.
Add isCopy() checks in many places using isMoveInstr() and isExtractSubreg().
The isMoveInstr hook will be removed later.
llvm-svn: 107879
This target hook is intended to replace copyRegToReg entirely, but for now it
calls copyRegToReg.
Any remaining calls to copyRegToReg wil be replaced by COPY instructions.
llvm-svn: 107854
(if there are any) and use the one which remains available for the longest
rather than just using the first one. This should help enable better re-use
of the loaded frame index values. rdar://7318760
llvm-svn: 107847
around everywhere, and also give it an InsertPt member, to enable isel
to operate at an arbitrary position within a block, rather than just
appending to a block.
llvm-svn: 107791
interface needs implementations to be consistent, so any code which
wants to support different semantics must use a different interface.
It's not currently worthwhile to add a new interface for this new
concept.
Document that AliasAnalysis doesn't support cross-function queries.
llvm-svn: 107776
It is OK for an alias live range to overlap if there is a copy to or from the
physical register. CoalescerPair can work out if the copy is coalescable
independently of the alias.
This means that we can join with the actual destination interval instead of
using the getOrigDstReg() hack. It is no longer necessary to merge clobber
ranges into subregisters.
llvm-svn: 107695
making all of CallInst's low-level operand accessors
private
If you get compile errors I strongly urge you to
update your code.
I tried to write the necessary clues into the
header where the compiler may point to, but no
guarantees. It works for my GCC.
You have several options to update your code:
- you can use the v2.8 ArgOperand accessors
- you can go via a temporary CallSite
- you can upcast to, say, User and call its
low-level accessors if your code is definitely
operand-order agnostic.
If you run into serious problems, please
comment in below thread (and back out this
revision only if absolutely necessary):
<http://groups.google.com/group/llvm-dev/browse_thread/thread/64650cf343b28271>
llvm-svn: 107667
second round of low-level interface squeeze-out:
making all of CallInst's low-level operand accessors
private
If you get compile errors I strongly urge you to
update your code.
I tried to write the necessary clues into the
header where the compiler may point to, but no
guarantees. It works for my GCC.
You have several options to update your code:
- you can use the v2.8 ArgOperand accessors
- you can go via a temporary CallSite
- you can upcast to, say, User and call its
low-level accessors if your code is definitely
operand-order agnostic.
If you run into serious problems, please
comment in below thread (and back out this
revision only if absolutely necessary):
<http://groups.google.com/group/llvm-dev/browse_thread/thread/64650cf343b28271>
llvm-svn: 107580
The COPY instruction is intended to replace the target specific copy
instructions for virtual registers as well as the EXTRACT_SUBREG and
INSERT_SUBREG instructions in MachineFunctions. It won't we used in a selection
DAG.
COPY is lowered to native register copies by LowerSubregs.
llvm-svn: 107529
list of predefined instructions appear. Add some consistency checks.
Ideally, TargetOpcodes.h should be produced by TableGen from Target.td, but it
is hardly worth the effort.
llvm-svn: 107520
PrologEpilog code, and use it to determine whether
the asm forces stack alignment or not. gcc consistently
does not do this for GCC-style asms; Apple gcc inconsistently
sometimes does it for asm blocks. There is no
convenient place to put a bit in either the SDNode or
the MachineInstr form, so I've added an extra operand
to each; unlovely, but it does allow for expansion for
more bits, should we need it. PR 5125. Some
existing testcases are affected.
The operand lists of the SDNode and MachineInstr forms
are indexed with awesome mnemonics, like "2"; I may
fix this someday, but not now. I'm not making it any
worse. If anyone is inspired I think you can find all
the right places from this patch.
llvm-svn: 107506
This allows us to recognize the common case where all uses could be
rematerialized, and no stack slot allocation is necessary.
If some values could be fully rematerialized, remove them from the live range
before allocating a stack slot for the rest.
llvm-svn: 107492
second round of low-level interface squeeze-out:
making all of CallInst's low-level operand accessors
private
If you get compile errors I strongly urge you to
update your code.
I tried to write the necessary clues into the
header where the compiler may point to, but no
guarantees. It works for my GCC.
You have several options to update your code:
- you can use the v2.8 ArgOperand accessors
- you can go via a temporary CallSite
- you can upcast to, say, User and call its
low-level accessors if your code is definitely
operand-order agnostic.
If you run into serious problems, please
comment in below thread (and back out this
revision only if absolutely necessary):
<http://groups.google.com/group/llvm-dev/browse_thread/thread/64650cf343b28271>
llvm-svn: 107480
Objective-C metadata types which should be marked as "weak", but which the
linker will remove upon final linkage. However, this linkage isn't specific to
Objective-C.
For example, the "objc_msgSend_fixup_alloc" symbol is defined like this:
.globl l_objc_msgSend_fixup_alloc
.weak_definition l_objc_msgSend_fixup_alloc
.section __DATA, __objc_msgrefs, coalesced
.align 3
l_objc_msgSend_fixup_alloc:
.quad _objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_1
This is different from the "linker_private" linkage type, because it can't have
the metadata defined with ".weak_definition".
Currently only supported on Darwin platforms.
llvm-svn: 107433
to update their code to high-level interfaces
If you get compile errors in your project
please update your code according to the
comments.
This is a re-commit of r107396 which causes
compile errors for the indicated usage patterns
instead of link errors (which are less easy to
fix because of missing source location).
If you get compile errors please perform
following functionally equivalent transformations:
- getOperand(0) ---> getCalledValue()
- setOperand(0, V) ---> setCalledFunction(V)
This will make your code more future-proof
and avoid potentially hard-to-debug bugs.
please refer to this thread on llvm-dev:
<http://groups.google.com/group/llvm-dev/browse_thread/thread/64650cf343b28271>
llvm-svn: 107432
such a way that debug info for symbols preserved even if symbols are
optimized away by the optimizer.
Add new special pass to remove debug info for such symbols.
llvm-svn: 107416
to update their code to high-level interfaces
If you get compile errors in your project
please update your code according to the
comments.
llvm-svn: 107396
replaced by a bigger array in SmallPtrSet (by overridding it), instead just use a
pointer to the start of the storage, and have SmallPtrSet pass in the value to use.
This has the disadvantage that SmallPtrSet becomes bigger by one pointer. It has
the advantage that it no longer uses tricky C++ rules, and is clearly correct while
I'm not sure the previous version was. This was inspired by g++-4.6 pointing out
that SmallPtrSetImpl was writing off the end of SmallArray, which it was. Since
SmallArray is replaced with a bigger array in SmallPtrSet, the write was still to
valid memory. But it was writing off the end of the declared array type - sounds
kind of dubious to me, like it sounded dubious to g++-4.6. Maybe g++-4.6 is wrong
and this construct is perfectly valid and correctly compiled by all compilers, but
I think it is better to avoid the whole can of worms by avoiding this construct.
llvm-svn: 107285
InlineSpiller inserts loads and spills immediately instead of deferring to
VirtRegMap. This is possible now because SlotIndexes allows instructions to be
inserted and renumbered.
This is work in progress, and is mostly a copy of TrivialSpiller so far. It
works very well for functions that don't require spilling.
llvm-svn: 107227
metadata types which should be marked as "weak", but which the linker will
remove upon final linkage. For example, the "objc_msgSend_fixup_alloc" symbol is
defined like this:
.globl l_objc_msgSend_fixup_alloc
.weak_definition l_objc_msgSend_fixup_alloc
.section __DATA, __objc_msgrefs, coalesced
.align 3
l_objc_msgSend_fixup_alloc:
.quad _objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_1
This is different from the "linker_private" linkage type, because it can't have
the metadata defined with ".weak_definition".
llvm-svn: 107205
SmallArray[SmallSize] in the SmallPtrSetIteratorImpl, and this is
one off the end of the array. For those who care, right now gcc
warns about writing off the end because it is confused about the
declaration of SmallArray as having length 1 in the parent class
SmallPtrSetIteratorImpl. However if you tweak code to unconfuse
it, then it still warns about writing off the end of the array,
because of this buffer overflow. In short, even with this fix
gcc-4.6 will warn about writing off the end of the array, but now
that is only because it is confused.
llvm-svn: 107200
and make PATypeHolder work with null pointers.
The implicitly generated one didn't work on numerous levels, but was still
accepted, allowing all sorts of bugs with default constructed pa type holders.
Previously, they "sort of" worked if they were default constructed and then
destructed. Now they really work, and you can even default construct one,
then assign to it, amazing.
llvm-svn: 107195
of getPhysicalRegisterRegClass with it.
If we want to make a copy (or estimate its cost), it is better to use the
smallest class as more efficient operations might be possible.
llvm-svn: 107140
(in both CallInst and InvokeInst)
also add a (short-lived) constant to CallInst, that names
the operand index of the first call argument. This is
strictly transitional and should not be used for new code.
llvm-svn: 107001
The VNInfo.kills vector was almost unused except for all the code keeping it
updated. The few places using it were easily rewritten to check for interval
ends instead.
The two new methods LiveInterval::killedAt and killedInRange are replacements.
This brings us down to 3 independent data structures tracking kills.
llvm-svn: 106905
for an "i" constraint should get lowered; PR 6309. While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.
llvm-svn: 106893
buffer in the same chunk of memory.
2 less mallocs for every uninitialized MemoryBuffer and 1 less malloc for every
MemoryBuffer pointing to a memory range translate into 20% less mallocs on
clang -cc1 -Eonly Cocoa_h.m.
llvm-svn: 106839
CoalescerPair can determine if a copy can be coalesced, and which register gets
merged away. The old logic in SimpleRegisterCoalescing had evolved into
something a bit too convoluted.
This second attempt fixes some crashes that only occurred Linux.
llvm-svn: 106769
None of the existing implementations of commuteInstruction create new
instructions unless the NewMI parameter is true, but the comment had
implied otherwise.
findCommutedOpIndices returns false, not true, when it doesn't know
how to commute the instruction.
llvm-svn: 106761
CoalescerPair can determine if a copy can be coalesced, and which register gets
merged away. The old logic in SimpleRegisterCoalescing had evolved into
something a bit too convoluted.
llvm-svn: 106701
atomic intrinsics, either because the use locking instructions for the
atomics, or because they perform the locking directly. Add support in the
DAG combiner to fold away the fences.
llvm-svn: 106630
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
scheduler. If-converter now runs branch folding / tail merging first to
maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
change the instruction ordering in the IT block (since IT mask has been
finalized). It also ensures no other instructions can be scheduled between
instructions in the IT block.
This is not yet enabled.
llvm-svn: 106344
entries used by llvm-gcc. *_[U]MIN and such can be added later if needed.
This enables the front ends to simplify handling of the atomic intrinsics by
removing the target-specific decision about which targets can handle the
intrinsics.
llvm-svn: 106321
switch from this:
if (TimePassesIsEnabled) {
NamedRegionTimer T(Name, GroupName);
do_something();
} else {
do_something(); // duplicate the code, this time without a timer!
}
to this:
{
NamedRegionTimer T(Name, GroupName, TimePassesIsEnabled);
do_something();
}
llvm-svn: 106285
addresses a longstanding deficiency noted in many FIXMEs scattered
across all the targets.
This effectively moves the problem up one level, replacing eleven
FIXMEs in the targets with eight FIXMEs in CodeGen, plus one path
through FastISel where we actually supply a DebugLoc, fixing Radar
7421831.
llvm-svn: 106243
Given a copy instruction, CoalescerPair can determine which registers to
coalesce in order to eliminate the copy. It deals with all the subreg fun to
determine a tuple (DstReg, SrcReg, SubIdx) such that:
- SrcReg is a virtual register that will disappear after coalescing.
- DstReg is a virtual or physical register whose live range will be extended.
- SubIdx is 0 when DstReg is a physical register.
- SrcReg can be joined with DstReg:SubIdx.
CoalescerPair::isCoalescable() determines if another copy instruction is
compatible with the same tuple. This fixes some NEON miscompilations where
shuffles are getting coalesced as if they were copies.
The CoalescerPair class will replace a lot of the spaghetti logic in JoinCopy
later.
llvm-svn: 105997
the Profile method. Currently this only works with the default FoldingSetTraits
implementation.
The point of this is to allow nodes to not store context values which are only
used during profiling. A better solution would thread this value through the
folding algorithms, but then those would need to be (1) templated and
(2) non-opaque.
llvm-svn: 105819
with gcc-4.6. The warning is wrong, since Sub *is* used (perhaps gcc is confused
because the use of Sub is constant folded away?), but since it is trivial to avoid,
and massively reduces the amount of warning spew, just workaround the wrong warning.
llvm-svn: 105788
'class llvm::DAGDeltaAlgorithm' has virtual functions and accessible non-virtual destructor
Not sure if this is the best solution, but this class has state and some of the
classes that inherit from it also do, so it looks appropriate.
llvm-svn: 105675
scrounging through SCEVUnknown contents and SCEVNAryExpr operands;
instead just do a simple deterministic comparison of the precomputed
hash data.
Also, since this is more precise, it eliminates the need for the slow
N^2 duplicate detection code.
llvm-svn: 105540
encapsulation to force the users of these classes to know about the internal
data structure of the Operands structure. It also can lead to errors, like in
the MSIL writer.
llvm-svn: 105539
In file included from X86InstrInfo.cpp:16:
X86GenInstrInfo.inc:2789: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2790: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2792: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2793: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2808: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2809: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2816: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2817: error: integer constant is too large for 'long' type
llvm-svn: 105524
instruction defines subregisters.
Any existing subreg indices on the original instruction are preserved or
composed with the new subreg index.
Also substitute multiple operands mentioning the original register by using the
new MachineInstr::substituteRegister() function. This is necessary because there
will soon be <imp-def> operands added to non read-modify-write partial
definitions. This instruction:
%reg1234:foo = FLAP %reg1234<imp-def>
will reMaterialize(%reg3333, bar) like this:
%reg3333:bar-foo = FLAP %reg333:bar<imp-def>
Finally, replace the TargetRegisterInfo pointer argument with a reference to
indicate that it cannot be NULL.
llvm-svn: 105358
implementation that is correct for most targets. Tablegen will override where
needed.
Add MachineOperand::subst{Virt,Phys}Reg methods that correctly handle existing
subreg indices when sustituting registers.
llvm-svn: 104985
optimization level.
This only really affects llc for now because both the llvm-gcc and clang front
ends override the default register allocator. I intend to remove that code later.
llvm-svn: 104904
implementing pop with a linear search for a "best" element. The priority
queue was a neat idea, but in practice the comparison functions depend
on dynamic information.
llvm-svn: 104718
A Register with subregisters must also provide SubRegIndices for adressing the
subregisters. TableGen automatically inherits indices for sub-subregisters to
minimize typing.
CompositeIndices may be specified for the weirder cases such as the XMM sub_sd
index that returns the same register, and ARM NEON Q registers where both D
subregs have ssub_0 and ssub_1 sub-subregs.
It is now required that all subregisters are named by an index, and a future
patch will also require inherited subregisters to be named. This is necessary to
allow composite subregister indices to be reduced to a single index.
llvm-svn: 104704
A Register with subregisters must also provide SubRegIndices for adressing the
subregisters. TableGen automatically inherits indices for sub-subregisters to
minimize typing.
CompositeIndices may be specified for the weirder cases such as the XMM sub_sd
index that returns the same register, and ARM NEON Q registers where both D
subregs have ssub_0 and ssub_1 sub-subregs.
It is now required that all subregisters are named by an index, and a future
patch will also require inherited subregisters to be named. This is necessary to
allow composite subregister indices to be reduced to a single index.
llvm-svn: 104654
structure that represents a mapping without any dependencies on SubRegIndex
numbering.
This brings us closer to being able to remove the explicit SubRegIndex
numbering, and it is now possible to specify any mapping without inventing
*_INVALID register classes.
llvm-svn: 104563
This is the beginning of purely symbolic subregister indices, but we need a bit
of jiggling before the explicit numeric indices can be completely removed.
llvm-svn: 104492
that are aliases of the specified register.
- Rename modifiesRegister to definesRegister since it's looking a def of the
specific register or one of its super-registers. It's not looking for def of a
sub-register or alias that could change the specified register.
- Added modifiesRegister to look for defs of aliases.
llvm-svn: 104377
reads or writes a register.
This takes partial redefines and undef uses into account.
Don't actually use it yet. That caused miscompiles.
llvm-svn: 104372
If the size of the string is greater than the zero fill size, the function will attempt to write a very large string of zeros to the object file (~4GB on 32 bit platforms). This assertion will catch the scenario and crash the program before the write occurs.
llvm-svn: 104334
isn't ideal if we want to be able to use another object file format.
Add a createObjectStreamer() factory method so that the correct object
file streamer can be instantiated for a given target triple.
llvm-svn: 104318
pipeline stall. It's useful for targets like ARM cortex-a8. NEON has a lot
of long latency instructions so a strict register pressure reduction
scheduler does not work well.
Early experiments show this speeds up some NEON loops by over 30%.
llvm-svn: 104216
partial redefines.
We are going to treat a partial redefine of a virtual register as a
read-modify-write:
%reg1024:6 = OP
Unless the register is fully clobbered:
%reg1024:6 = OP, %reg1024<imp-def>
MachineInstr::readsVirtualRegister() knows the difference. The first case is a
read, the second isn't.
llvm-svn: 104149
- Of questionable utility, since in general anything which wants to do this should probably be within a target specific hook, which can rely on the sections being of the appropriate type. However, it can be useful for short term hacks.
llvm-svn: 103980
variable has not yet been used in an expression. This allows us to support a few
cases that show up in real code (mostly because gcc generates it for Objective-C
on Darwin), without giving up a reasonable semantic model for assignment.
llvm-svn: 103950
instructions.
e.g.
%reg1026<def> = VLDMQ %reg1025<kill>, 260, pred:14, pred:%reg0
%reg1027<def> = EXTRACT_SUBREG %reg1026, 6
%reg1028<def> = EXTRACT_SUBREG %reg1026<kill>, 5
...
%reg1029<def> = REG_SEQUENCE %reg1028<kill>, 5, %reg1027<kill>, 6, %reg1028, 7, %reg1027, 8, %reg1028, 9, %reg1027, 10, %reg1030<kill>, 11, %reg1032<kill>, 12
After REG_SEQUENCE is eliminated, we are left with:
%reg1026<def> = VLDMQ %reg1025<kill>, 260, pred:14, pred:%reg0
%reg1029:6<def> = EXTRACT_SUBREG %reg1026, 6
%reg1029:5<def> = EXTRACT_SUBREG %reg1026<kill>, 5
The regular coalescer will not be able to coalesce reg1026 and reg1029 because it doesn't
know how to combine sub-register indices 5 and 6. Now 2-address pass will consult the
target whether sub-registers 5 and 6 of reg1026 can be combined to into a larger
sub-register (or combined to be reg1026 itself as is the case here). If it is possible,
it will be able to replace references of reg1026 with reg1029 + the larger sub-register
index.
llvm-svn: 103835
the variable actually tracks.
N.B., several back-ends are using "HasCalls" as being synonymous for something
that adjusts the stack. This isn't 100% correct and should be looked into.
llvm-svn: 103802
on RAUW of functions, this is a correctness issue instead of a mere memory
usage problem.
No testcase until the new MergeFunctions can land.
llvm-svn: 103653
- This provides a convenient alternative to using something llvm::prior or
manual iterator access, for example::
if (T *Prev = foo->getPrevNode())
...
instead of::
iterator it(foo);
if (it != begin()) {
--it;
...
}
- Chris, please review.
llvm-svn: 103647
be diced into atoms, and adjust getAtom() to take this into account.
- This fixes relocations to symbols in fixed size literal sections, for
example.
llvm-svn: 103532
and the others use the regular addPassesToEmitFile hook now, and
llc no longer needs a bunch of redundant code to handle the
whole-file case.
llvm-svn: 103492
Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and
EmitTargetCodeForMemmove out of TargetLowering and into
SelectionDAGInfo to exercise this.
llvm-svn: 103481
- This eliminates getAtomForAddress() (which was a linear search) and
simplifies getAtom().
- This also fixes some correctness problems where local labels at the same
address as non-local labels could be assigned to the wrong atom.
llvm-svn: 103480
string of features for that target. However LTO was using that string to pass
into the "create target machine" stuff. That stuff needed the feature string to
be in a particular form. In particular, it needed the CPU specified first and
then the attributes. If there isn't a CPU specified, it required it to be blank
-- e.g., ",+altivec". Yuck.
Modify the getDefaultSubtargetFeatures method to be a non-static member
function. For all attributes for a specific subtarget, it will add them in like
normal. It will also take a CPU string so that it can satisfy this horrible
syntax.
llvm-svn: 103451
This includes a patch by Roman Divacky to fix the initial crash.
Move the actual addition of passes from *PassManager::add to
*PassManager::addImpl. That way, when adding printer passes we won't
recurse infinitely.
Finally, check to make sure that we are actually adding a FunctionPass
to a FunctionPassManager before doing a print before or after it.
Immutable passes are strange in this way because they aren't
FunctionPasses yet they can be and are added to the FunctionPassManager.
llvm-svn: 103425
getConstantFP to accept the two supported long double
target types. This was not the original intent, but
there are other places that assume this works and it's
easy enough to do.
llvm-svn: 103299
handled cases where a block had zero predecessors, but failed to detect other
cases like loops with no entries. The SSAUpdater is already doing a forward
traversal through the blocks, so it is not hard to identify the blocks that
were never reached on that traversal. This fixes the crash for ppc on the
stepanov_vector test.
llvm-svn: 103184
Microoptimize Twine's with unsigned and int to not pin their value to
the stack. This saves stack space in common cases and allows mem2reg
in the caller. A simple example is:
void foo(const Twine &);
void bar(int x) {
foo("xyz: " + Twine(x));
}
Before:
__Z3bari:
subq $40, %rsp
movl %edi, 36(%rsp)
leaq L_.str3(%rip), %rax
leaq 36(%rsp), %rcx
leaq 8(%rsp), %rdi
movq %rax, 8(%rsp)
movq %rcx, 16(%rsp)
movb $3, 24(%rsp)
movb $7, 25(%rsp)
callq __Z3fooRKN4llvm5TwineE
addq $40, %rsp
ret
After:
__Z3bari:
subq $24, %rsp
leaq L_.str3(%rip), %rax
movq %rax, (%rsp)
movslq %edi, %rax
movq %rax, 8(%rsp)
movb $3, 16(%rsp)
movb $7, 17(%rsp)
leaq (%rsp), %rdi
callq __Z3fooRKN4llvm5TwineE
addq $24, %rsp
ret
It saves 16 bytes of stack and one instruction in this case.
llvm-svn: 103107
sub-register indices and outputs a single super register which is formed from
a consecutive sequence of registers.
This is used as register allocation / coalescing aid and it is useful to
represent instructions that output register pairs / quads. For example,
v1024, v1025 = vload <address>
where v1024 and v1025 forms a register pair.
This really should be modelled as
v1024<3>, v1025<4> = vload <address>
but it would violate SSA property before register allocation is done.
Currently we use insert_subreg to form the super register:
v1026 = implicit_def
v1027 - insert_subreg v1026, v1024, 3
v1028 = insert_subreg v1027, v1025, 4
...
= use v1024
= use v1028
But this adds pseudo live interval overlap between v1024 and v1025.
We can now modeled it as
v1024, v1025 = vload <address>
v1026 = REG_SEQUENCE v1024, 3, v1025, 4
...
= use v1024
= use v1026
After coalescing, it will be
v1026<3>, v1025<4> = vload <address>
...
= use v1026<3>
= use v1026
llvm-svn: 102815
Limit alignment in SmallVector 8, otherwise GCC assumes 16 byte alignment.
opetaror new, and malloc only return 8-byte aligned memory on 32-bit Linux,
which cause a crash if code is compiled with -O3 (or -ftree-vectorize) and some
SmallVector code is vectorized.
llvm-svn: 102604
add a version of createLowerInvokePass that allows the client
to specify whether it wants "expensive" or "cheap" lowering.
Patch by Alex Mac!
llvm-svn: 102402
otherwise labels get incorrectly merged. We handled this by emitting a
".byte 0", but this isn't correct on thumb/arm targets where the text segment
needs to be a multiple of 2/4 bytes. Handle this by emitting a noop. This
is more gross than it should be because arm/ppc are not fully mc'ized yet.
This fixes rdar://7908505
llvm-svn: 102400
form of DEBUG_VALUE, as it doesn't have reasonable default
behavior for unsupported targets. Add a new hook instead.
No functional change.
llvm-svn: 102320
This fixes a bug where calls inlined into an invoke would get
changed into an invoke but the array would keep pointing to
the (now dead) call. The improved inliner behavior is still
disabled for now.
llvm-svn: 102196
that appear in the SCC as a result of inlining as candidates
for inlining. Change this so that it *does* consider call
sites that change from being indirect to being direct as a
result of inlining. This allows it to completely
"devirtualize" the testcase.
llvm-svn: 102146
arguments are handled with a new InlineFunctionInfo class. This
makes it easier to extend InlineFunction to return more info in the
future.
llvm-svn: 102137
So far this is just a clone of -regalloc=local that has been lobotomized to run
25% faster. It drops the least-recently-used calculations, and is just plain
stupid when it runs out of registers.
The plan is to make this go even faster for -O0 by taking advantage of the short
live intervals in unoptimized code. It should not be necessary to calculate
liveness when most virtual registers are killed 2-3 instructions after they are
born.
llvm-svn: 102006
user-defined operations that use MMX register types, but
the compiler shouldn't generate them on its own. This adds
a Synthesizable abstraction to represent this, and changes
the vector widening computation so it won't produce MMX types.
(The motivation is to remove noise from the ABI compatibility
part of the gcc test suite, which has some breakage right now.)
llvm-svn: 101951
just ask ScalarEvolution for it on demand. This helps IVUsers be more robust
in the case of expressions changing underneath it. This fixes PR6862.
llvm-svn: 101819
const_casts, and it reinforces the design of the Target classes being
immutable.
SelectionDAGISel::IsLegalToFold is now a static member function, because
PIC16 uses it in an unconventional way. There is more room for API
cleanup here.
And PIC16's AsmPrinter no longer uses TargetLowering.
llvm-svn: 101635
CGSCC can delete nodes in regions of the callgraph that
have already been visited. If new CG nodes are allocated
to the same pointer, we shouldn't abort, just handle it
correctly by assigning a new number. This should restore
stability by removing invalidated pointers that *will* be
reused from the densemap in the iterator.
llvm-svn: 101628
to determine where to place PHIs by iteratively comparing reaching definitions
at each block. That was just plain wrong. This version now computes the
dominator tree within the subset of the CFG where PHIs may need to be placed,
and then places the PHIs in the iterated dominance frontier of each definition.
The rest of the patch is mostly the same, with a few more performance
improvements added in.
llvm-svn: 101612
to keep the node entries in scc_iterator up to date instead of dangling as
the SCC mutates.
This is a really terrible problem which was causing -g to affect codegen
because it would permute the memory image of the compiler process.
Thanks to Dale for expertly hunting it down.
llvm-svn: 101565
to CallGraphSCCPass's instead of passing around a
std::vector<CallGraphNode*>. No functionality change,
but now we have a much tidier interface.
llvm-svn: 101558
with a fix for self-hosting
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101465
JIT doesn't use the MC back-end asm printer to emit labels that it uses, the
section for the MCSymbol is never set. And thus the MCSymbol for the EH label
isn't marked as "defined". Because of that, TidyLandingPads removes the needed
landing pads from the JIT output. This breaks EH for every JIT program.
This is a work-around for this limitation. We pass in the label locations
map. If the label has a non-zero value, then it was "emitted" by the JIT and
TidyLandingPads shouldn't remove that label.
A nicer solution would be to mark the MCSymbol as "used" by the JIT and not rely
upon the section being set to determine if it's defined or not.
llvm-svn: 101453
with a fix
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101397
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101364
current PC. rdar://7834775
We now produce an identical .o file compared to the cctools
assembler for something like this:
_f0:
L0:
jmp L1
.long . - L0
L1:
jmp A
.long . - L1
.zerofill __DATA,_bss,A,0
llvm-svn: 101227
code. It used to #include the enhanced disassembly
information for the targets it supported straight
out of lib/Target/{X86,ARM,...} but now it uses a
new interface provided by MCDisassembler, and (so
far) implemented by X86 and ARM.
Also removed hacky #define-controlled initialization
of targets in edis. If clients only want edis to
initialize a limited set of targets, they can set
--enable-targets on the configure command line.
llvm-svn: 101179
MachineBasicBlock::livein_iterator a const_iterator, because
clients shouldn't ever be using the iterator interface to
mutate the livein set.
llvm-svn: 101147