Implement the TII hooks needed by EarlyIfConversion to create cmov
instructions and estimate their latency.
Early if-conversion is still not enabled by default.
llvm-svn: 159695
This is a preliminary step toward having TargetPassConfig be able to
start and stop the compilation at specified passes for unit testing
and debugging. No functionality change.
llvm-svn: 159567
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.
llvm-svn: 157818
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468
llvm-svn: 155902
Passes prior to instructon selection are now split into separate configurable stages.
Header dependencies are simplified.
The bulk of this diff is simply removal of the silly DisableVerify flags.
Sorry for the target header churn. Attempting to stabilize them.
llvm-svn: 149754
Allows command line overrides to be centralized in LLVMTargetMachine.cpp.
LLVMTargetMachine can intercept common passes and give precedence to command line overrides.
Allows adding "internal" target configuration options without touching TargetOptions.
Encapsulates the PassManager.
Provides a good point to initialize all CodeGen passes so that Pass ID's can be used in APIs.
Allows modifying the target configuration hooks without rebuilding the world.
llvm-svn: 149672
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
Adds a test to verify that the scheduler is working.
Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
Patch by Preston Gurd!
llvm-svn: 149558
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.
One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.
llvm-svn: 145714
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.
llvm-svn: 144788
promoting allocas to preferred alignments that exceed the natural
alignment. This avoids some potentially expensive dynamic stack realignments.
The natural stack alignment is set in target data strings via the "S<size>"
option. Size is in bits and must be a multiple of 8. The natural stack alignment
defaults to "unspecified" (represented by a zero value), and the "unspecified"
value does not prevent any alignment promotions. Target maintainers that care
about avoiding promotions should explicitly add the "S<size>" option to their
target data strings.
llvm-svn: 141599
This also enables domain swizzling for AVX code which required a few
trivial test changes.
The pass will be moved to lib/CodeGen shortly.
llvm-svn: 140659
SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper"
llc command line option. This is only the first step (very naive and
conservative one) to sketch out the idea, but proper DFA is coming next
to allow smarter decisions. Comments and ideas now and in further commits
will be very appreciated.
llvm-svn: 138317
- Introduce JITDefault code model. This tells targets to set different default
code model for JIT. This eliminates the ugly hack in TargetMachine where
code model is changed after construction.
llvm-svn: 135580
(including compilation, assembly). Move relocation model Reloc::Model from
TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine.
llvm-svn: 135468
and MCSubtargetInfo.
- Added methods to update subtarget features (used when targets automatically
detect subtarget features or switch modes).
- Teach X86Subtarget to update MCSubtargetInfo features bits since the
MCSubtargetInfo layer can be shared with other modules.
- These fixes .code 16 / .code 32 support since mode switch is updated in
MCSubtargetInfo so MC code emitter can do the right thing.
llvm-svn: 134884
- Each target asm parser now creates its own MCSubtatgetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
to generate asm matcher subtarget feature queries. e.g.
"ModeThumb,FeatureThumb2" is translated to
"(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".
llvm-svn: 134678
be the first encoded as the first feature. It then uses the CPU name to look up
features / scheduling itineray even though clients know full well the CPU name
being used to query these properties.
The fix is to just have the clients explictly pass the CPU name!
llvm-svn: 134127
No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. In llvm side, i686 and x64 can be treated as similar way.
llvm-svn: 125747
- The COFF backend doesn't support MingW/Cygwin at the moment, it'll report an
error, but it's still much better than random assertions from the MachO backend.
- We want to make ELF the default eventually, it's what the majority of targets use.
llvm-svn: 110197
FP_REG_KILL instructions are still inserted, but can be disabled by passing
-live-x87 to llc. The X87FPRegKillInserterPass is going to be removed shortly.
CFG edges are partioned into bundles where the x87 stack must be allocated
identically. Code is insertad at the end of each basic block that shuffles the
live FP registers to match the outgoing bundles expectations.
This fix is in preparation for some upcoming register allocator improvements
that may extend the live range of registers beyond a basic block, similar to
LICM. It also provides a nice runtime speedup if you are building with
-mfpmath=387.
llvm-svn: 108529
- Check getBytesToPopOnReturn().
- Eschew ST0 and ST1 for return values.
- Fix the PIC base register initialization so that it doesn't ever
fail to end up the top of the entry block.
llvm-svn: 108039
isn't ideal if we want to be able to use another object file format.
Add a createObjectStreamer() factory method so that the correct object
file streamer can be instantiated for a given target triple.
llvm-svn: 104318
Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and
EmitTargetCodeForMemmove out of TargetLowering and into
SelectionDAGInfo to exercise this.
llvm-svn: 103481
When a frame pointer is not otherwise required, and dynamic stack alignment
is necessary solely due to the spilling of a register with larger alignment
requirements than the default stack alignment, the frame pointer can be both
used as a general purpose register and a frame pointer. That goes poorly, for
obvious reasons. This patch brings back a bit of old logic for identifying
the use of such registers and conservatively reserves the frame pointer
during register allocation in such cases.
For now, implement for X86 only since it's 32-bit linux which is hitting this,
and we want a targeted fix for 2.7. As a follow-on, this will be expanded
to handle other targets, as theoretically the problem could arise elsewhere
as well.
llvm-svn: 100559
On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register
in a different domain than where it was defined. Some instructions have
equvivalents for different domains, like por/orps/orpd.
The SSEDomainFix pass tries to minimize the number of domain crossings by
changing between equvivalent opcodes where possible.
This is a work in progress, in particular the pass doesn't do anything yet. SSE
instructions are tagged with their execution domain in TableGen using the last
two bits of TSFlags. Note that not all instructions are tagged correctly. Life
just isn't that simple.
The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
issue handled by NEONMoveFixPass. This pass may become target independent to
handle both.
llvm-svn: 99524
This is work in progress. So far, SSE execution domain tables are added to
X86InstrInfo, and a skeleton pass is enabled with -sse-domain-fix.
llvm-svn: 99345
only run for x86 with fastisel. I've found it being very effective in
eliminating some obvious dead code as result of formal parameter lowering
especially when tail call optimization eliminated the need for some of the loads
from fixed frame objects. It also shrinks a number of the tests. A couple of
tests no longer make sense and are now eliminated.
llvm-svn: 95493
function can support dynamic stack realignment. That's a much easier question
to answer at instruction selection stage than whether the function actually
will have dynamic alignment prologue. This allows the removal of the
stack alignment heuristic pass, and improves code quality for cases where
the heuristic would result in dynamic alignment code being generated when
it was not strictly necessary.
llvm-svn: 93885
idea, but unfortunately necessary.
- Default to using 4-bytes for the LSDA pointer encoding to agree with the
encoded value in the CIE.
llvm-svn: 93753
The CIE says that the LSDA point in the FDE section is an "sdata4". That's fine,
but we need it to actually be 4-bytes in the FDE for some platforms. Allow
individual platforms to decide for themselves.
llvm-svn: 93616
by allowing backends to override routines that will default
the JIT and Static code generation to an appropriate code model
for the architecture.
Should fix PR 5773.
llvm-svn: 91824
incarnations), integrated into the MC framework.
The disassembler is table-driven, using a custom TableGen backend to
generate hierarchical tables optimized for fast decode. The disassembler
consumes MemoryObjects and produces arrays of MCInsts, adhering to the
abstract base class MCDisassembler (llvm/MC/MCDisassembler.h).
The disassembler is documented in detail in
- lib/Target/X86/Disassembler/X86Disassembler.cpp (disassembler runtime)
- utils/TableGen/DisassemblerEmitter.cpp (table emitter)
You can test the disassembler by running llvm-mc -disassemble for i386
or x86_64 targets. Please let me know if you encounter any problems
with it.
llvm-svn: 91749
The large code model is documented at
http://www.x86-64.org/documentation/abi.pdf and says that calls should
assume their target doesn't live within the 32-bit pc-relative offset
that fits in the call instruction.
To do this, we turn off the global-address->target-global-address
conversion in X86TargetLowering::LowerCall(). The first attempt at
this broke the lazy JIT because it can separate the movabs(imm->reg)
from the actual call instruction. The lazy JIT receives the address of
the movabs as a relocation and needs to record the return address from
the call; and then when that call happens, it needs to patch the
movabs with the newly-compiled target. We could thread the call
instruction into the relocation and record the movabs<->call mapping
explicitly, but that seems to require at least as much new
complication in the code generator as this change.
To fix this, we make lazy functions _always_ go through a call
stub. You'd think we'd only have to force lazy calls through a stub on
difficult platforms, but that turns out to break indirect calls
through a function pointer. The right fix for that is to distinguish
between calls and address-of operations on uncompiled functions, but
that's complex enough to leave for someone else to do.
Another attempt at this defined a new CALL64i pseudo-instruction,
which expanded to a 2-instruction sequence in the assembly output and
was special-cased in the X86CodeEmitter's emitInstruction()
function. That broke indirect calls in the same way as above.
This patch also removes a hack forcing Darwin to the small code model.
Without far-call-stubs, the small code model requires things of the
JITMemoryManager that the DefaultJITMemoryManager can't provide.
Thanks to echristo for lots of testing!
llvm-svn: 88984
- Note, this is a gigantic hack, with the sole purpose of unblocking further
work on the assembler (its also possible to test the mathcer more completely
now).
- Despite being a hack, its actually good enough to work over all of 403.gcc
(although some encodings are probably incorrect). This is a testament to the
beauty of X86's MachineInstr, no doubt! ;)
llvm-svn: 80234
pair instead of from a virtual method on TargetMachine. This cuts the final
ties of TargetAsmInfo to TargetMachine, meaning that MC can now use
TargetAsmInfo.
llvm-svn: 78802