Commit Graph

7987 Commits

Author SHA1 Message Date
Nate Begeman 87abe955fc Make register scavenging happy by not using a reg (CR0) that isn't defined
llvm-svn: 47045
2008-02-13 02:58:33 +00:00
Evan Cheng 244183ef0d commuteInstr() can now commute non-ssa machine instrs.
llvm-svn: 47043
2008-02-13 02:46:49 +00:00
Dan Gohman f990faf23b Convert SelectionDAG::ComputeMaskedBits to use APInt instead of uint64_t.
Add an overload that supports the uint64_t interface for use by clients
that haven't been updated yet.

llvm-svn: 47039
2008-02-13 00:35:47 +00:00
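
As a rough illustration of the new interface (the signatures below are assumed from the era's API, not quoted from the patch), a client that used to pass uint64_t masks would now build APInt values of the operand's bit width:

#include "llvm/ADT/APInt.h"
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Hedged sketch: ask whether the top half of Op is known to be zero.
// SDOperand and the ComputeMaskedBits parameter order are assumptions.
static bool TopHalfKnownZero(SelectionDAG &DAG, SDOperand Op, unsigned Bits) {
  APInt Mask = APInt::getHighBitsSet(Bits, Bits / 2);   // only query the high bits
  APInt KnownZero(Bits, 0), KnownOne(Bits, 0);
  DAG.ComputeMaskedBits(Op, Mask, KnownZero, KnownOne);
  return (KnownZero & Mask) == Mask;                    // every queried bit is known zero
}
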
Dale Johannesen ffde4ff5b1 __DATA not __DATA__ is the right segment name on darwin.
Spotted by Nick Kledzik.

llvm-svn: 47037
2008-02-12 23:35:09 +00:00
Nate Begeman bcc182f50d Remove some dead code
llvm-svn: 47036
2008-02-12 22:54:40 +00:00
Nate Begeman 8ef50214f0 SSE4.1 64b integer insert/extract pattern support
Move formats into the formats file

llvm-svn: 47035
2008-02-12 22:51:28 +00:00
Evan Cheng 83af1197ca Revert r46916 PPCTargetAsmInfo.cpp.
llvm-svn: 47020
2008-02-12 19:25:12 +00:00
Evan Cheng 8a25d6ac53 Only using x86-64 rip relative addressing in non-static mode?
llvm-svn: 47019
2008-02-12 19:20:46 +00:00
Evan Cheng 352acec37e Update comment.
llvm-svn: 47002
2008-02-12 07:59:55 +00:00
Evan Cheng 4d8c98b8f9 Unbreak various insert_vector_elt and extract_vector_elt tests in presence of SSE4.
llvm-svn: 47001
2008-02-12 07:59:45 +00:00
Nate Begeman fd82c308e5 Stuff noticed while grepping code
llvm-svn: 46979
2008-02-11 23:47:56 +00:00
Nate Begeman 2d77e8e446 Enable SSE4 codegen and pattern matching.
Add some notes to the README.

llvm-svn: 46949
2008-02-11 04:19:36 +00:00
Nate Begeman 3090b0fbd1 additional missing feature
llvm-svn: 46948
2008-02-11 04:16:09 +00:00
Nate Begeman 3050f74a1d xmm0 variable blends
llvm-svn: 46931
2008-02-10 18:47:57 +00:00
Dan Gohman 3a4be0fdef Rename MRegisterInfo to TargetRegisterInfo.
llvm-svn: 46930
2008-02-10 18:45:23 +00:00
Nick Lewycky 52ea27db17 Match GCC's behaviour for these sections.
llvm-svn: 46916
2008-02-10 00:03:54 +00:00
Nate Begeman 727c7634c7 memopv16i8 had wrong alignment requirement, would have broken pabsb
pabs{b,w,d} are not two address
fix extract-to-mem sse4 ops
add sse4 vector sign extend nodes

llvm-svn: 46915
2008-02-09 23:46:37 +00:00
Nate Begeman 6715f755cc Skeleton of insert and extract matching, more to come
llvm-svn: 46902
2008-02-09 01:38:08 +00:00
Nate Begeman 17bedbc500 Tablegen support for insert & extract element matching
llvm-svn: 46901
2008-02-09 01:37:05 +00:00
Evan Cheng 3b3286d4bc It's not always safe to fold movsd into xorpd, etc. Check the alignment of the load address first to make sure it's 16-byte aligned.
llvm-svn: 46893
2008-02-08 21:20:40 +00:00
Dale Johannesen 36c2967d89 64-bit (MMX) vectors do not need restrictive alignment.
128-bit vectors need it only when SSE is on.

llvm-svn: 46890
2008-02-08 19:48:20 +00:00
Dan Gohman 7a55a94ba1 Avoid needlessly casting away const qualifiers.
llvm-svn: 46877
2008-02-08 03:29:40 +00:00
Evan Cheng 8d59dd119b Added missing entries in X86 load / store folding tables.
llvm-svn: 46866
2008-02-08 00:12:56 +00:00
Dan Gohman 16d4bc3dc0 Follow Chris' suggestion; change the PseudoSourceValue accessors
to return pointers instead of references, since this is always what
is needed.

llvm-svn: 46857
2008-02-07 18:41:25 +00:00
Dan Gohman 63a8452e9c Add SourceValue information for outgoing argument stores on x86.
llvm-svn: 46854
2008-02-07 16:28:05 +00:00
Evan Cheng a20a773654 Fix an x86-64 codegen deficiency. Allow gv + offset when using rip addressing mode.
Before:
_main:
        subq    $8, %rsp
        leaq    _X(%rip), %rax
        movsd   8(%rax), %xmm1
        movss   _X(%rip), %xmm0
        call    _t
        xorl    %ecx, %ecx
        movl    %ecx, %eax
        addq    $8, %rsp
        ret
Now:
_main:
        subq    $8, %rsp
        movsd   _X+8(%rip), %xmm1
        movss   _X(%rip), %xmm0
        call    _t
        xorl    %ecx, %ecx
        movl    %ecx, %eax
        addq    $8, %rsp
        ret

Notice there is another idiotic codegen issue that needs to be fixed asap:
xorl    %ecx, %ecx
movl    %ecx, %eax

llvm-svn: 46850
2008-02-07 08:53:49 +00:00
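
For reference, source of roughly this shape exercises the gv + offset case above; the real _X and _t are not shown in the log, so this layout (float at offset 0, double at offset 8) is only a guess:

struct S { float f; double d; };   // d lands at offset 8, matching _X+8 above
extern struct S X;                 // stands in for the external global _X
void t(float, double);             // stands in for the external function _t

int main() {
  t(X.f, X.d);   // after the fix: movss _X(%rip), movsd _X+8(%rip)
  return 0;
}
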
Evan Cheng 1bc1cae318 In some cases, e.g. ADD32ri, no transformation is made. Guard against it.
llvm-svn: 46849
2008-02-07 08:29:53 +00:00
Dan Gohman 2d489b5081 Re-apply the memory operand changes, with a fix for the static
initializer problem, a minor tweak to the way the
DAGISelEmitter finds load/store nodes, and a renaming of the
new PseudoSourceValue objects.

llvm-svn: 46827
2008-02-06 22:27:42 +00:00
Evan Cheng 0f32916111 Move to getCALLSEQ_END to ensure the CALLSEQ_END node produces a flag. This is consistent with the definition in the td file.
llvm-svn: 46775
2008-02-05 22:44:06 +00:00
Dale Johannesen d88f1d060e Implement sseregparm.
llvm-svn: 46764
2008-02-05 20:46:33 +00:00
Nate Begeman f3c89be368 Indent mnemonics appropriately
llvm-svn: 46746
2008-02-05 08:49:09 +00:00
Evan Cheng 2cb9068c78 Dwarf requires variable entries to be in source order. Right now, since we are recording variable information at isel time, parameters would appear in reverse order. The short term fix is to issue recordVariable() at asm printing time instead.
llvm-svn: 46724
2008-02-04 23:06:48 +00:00
Nate Begeman 5420516b3f This method should be virtual
llvm-svn: 46723
2008-02-04 23:04:24 +00:00
Nate Begeman ef14d5f926 Eliminate some redundant code.
llvm-svn: 46720
2008-02-04 21:44:06 +00:00
Nate Begeman e146c0e3fd The rest of the SSE4.1 intrinsic patterns that are obvious to me. Getting
Evan's help with the rest.

llvm-svn: 46697
2008-02-04 06:00:24 +00:00
Nate Begeman ccdfd4aa17 Some more SSE 4.1 intrinsic patterns.
llvm-svn: 46696
2008-02-04 05:34:34 +00:00
Nate Begeman e14fdfaecd SSE 4.1 Intrinsics and detection
llvm-svn: 46681
2008-02-03 07:18:54 +00:00
Chris Lattner 1770fb883b explicitly include Compiler.h instead of getting it from tblgen in the middle of a class.
llvm-svn: 46676
2008-02-03 05:43:57 +00:00
Chris Lattner e99faac423 don't do ReplaceUses on a result that doesn't exist.
llvm-svn: 46673
2008-02-03 03:20:59 +00:00
Evan Cheng 32e5347eb8 Get rid of the annoying blank lines before labels.
llvm-svn: 46667
2008-02-02 08:39:46 +00:00
Nick Lewycky f5b9938ef6 Don't use uninitialized values. Fixes vec_align.ll on X86 Linux.
llvm-svn: 46666
2008-02-02 08:29:58 +00:00
Evan Cheng 2aa360adf8 Unbreak ppc debug support.
llvm-svn: 46665
2008-02-02 05:06:29 +00:00
Evan Cheng efd142a920 SDIsel processes llvm.dbg.declare by recording the variable debug information descriptor and its corresponding stack frame index in MachineModuleInfo. This only works if the local variable is "homed" in the stack frame. It does not work for byval parameters, etc.
Added an ISD::DECLARE node type to represent the llvm.dbg.declare intrinsic. Now the intrinsic calls are lowered into an SDNode and live on throughout the codegen passes.
For now, since all the debugging information recording is done at isel time, when an ISD::DECLARE node is selected, it has the side effect of also recording the variable. This is a short term solution that should be fixed in time.

llvm-svn: 46659
2008-02-02 04:07:54 +00:00
Evan Cheng 4e7ff941f1 Frame index can be negative.
llvm-svn: 46655
2008-02-02 00:17:00 +00:00
Lauro Ramos Venancio 192c07b727 CBackend: Implement unaligned load/store.
llvm-svn: 46646
2008-02-01 21:25:59 +00:00
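
The commit message does not show what the generated C looks like; the snippet below is just one portable way a C backend could express an unaligned access without assuming alignment, not necessarily what this patch emits:

#include <string.h>

// Illustrative only: memcpy carries no alignment requirement, so this reads a
// 32-bit value from a possibly misaligned address.
static int load_unaligned_i32(const void *p) {
  int v;
  memcpy(&v, p, sizeof v);
  return v;
}
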
Evan Cheng d6e44ab5ec Replace the nasty LABEL hack with a much less evil one. Now llvm.dbg.func.start implies a stoppoint is set. SelectionDAGISel records a new source line but does not create an ISD::LABEL node for this special stoppoint. The asm printer will magically print this label. This ensures nothing is emitted before it.
llvm-svn: 46635
2008-02-01 09:10:45 +00:00
Evan Cheng 27b32b87ed Revert 46556 and 46585. Dan please fix the PseudoSourceValue problem and re-commit.
llvm-svn: 46623
2008-01-31 21:00:00 +00:00
Evan Cheng 1c6c16ea11 Add an extra operand to LABEL nodes which distinguishes between debug, EH, and misc labels. This fixes the EH breakage. However, I am not convinced this is *the* solution.
llvm-svn: 46609
2008-01-31 09:59:15 +00:00
Christopher Lamb 0592cf7e74 Allow ComplexExpressions in InstrInfo.td files to be slightly more... complex! ComplexExpressions can now have attributes which affect how TableGen interprets
the pattern when generating matching code.

The first (and currently, only) attribute causes the immediate parent node of the ComplexPattern operand to be passed into the matching code rather than the node at the root of the entire DAG containing the pattern.

llvm-svn: 46606
2008-01-31 07:27:46 +00:00
Evan Cheng 6332dbec69 Add x86 specific getFrameIndexOffset(). This fixes local variable debugging info.
llvm-svn: 46598
2008-01-31 04:06:00 +00:00
Evan Cheng a41d3bcb12 MRegisterInfo::getLocation() is a really bad idea. Its function is to calculate the offset from the frame pointer to a stack slot and then store the delta in a MachineLocation object. The name is bad (it implies a getter), and MRegisterInfo doesn't need to know about MachineLocation.
Replace getLocation() with getFrameIndexOffset() which returns the delta from frame pointer to stack slot. Dwarf writer can then use the information for whatever it wants.

llvm-svn: 46597
2008-01-31 03:37:28 +00:00
Evan Cheng 3c0486fb38 Makes the same change in ppc backend: avoid inserting prologue before debug labels.
llvm-svn: 46596
2008-01-31 03:33:38 +00:00
Dan Gohman ed346f2ed5 Avoid unnecessarily casting away const.
llvm-svn: 46590
2008-01-31 01:01:48 +00:00
Dan Gohman 9ba4d76816 Rename ISD::FLT_ROUNDS to ISD::FLT_ROUNDS_ to avoid conflicting
with the real FLT_ROUNDS (defined in <float.h>).

llvm-svn: 46587
2008-01-31 00:41:03 +00:00
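
The collision is easy to reproduce: <float.h> defines FLT_ROUNDS as a macro, so an enumerator spelled that way gets rewritten by the preprocessor. A minimal, hypothetical enum (not the real ISD namespace) shows why the trailing underscore sidesteps it:

#include <float.h>

namespace demo {
  enum NodeType {
    // FLT_ROUNDS,   // would expand into the macro's expression and fail to parse
    FLT_ROUNDS_      // collision-free spelling, as in the rename above
  };
}

int main() { return demo::FLT_ROUNDS_; }
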
Dan Gohman 3646fdda67 Create a new class, MemOperand, for describing memory references
in the backend. Introduce a new SDNode type, MemOperandSDNode, for
holding a MemOperand in the SelectionDAG IR, and add a MemOperand
list to MachineInstr, and code to manage them. Remove the offset
field from SrcValueSDNode; uses of SrcValueSDNode that were using
it are all using MemOperandSDNode now.

Also, begin updating some getLoad and getStore calls to use the
PseudoSourceValue objects.

Most of this was written by Florian Brander, some
reorganization and updating to TOT by me.

llvm-svn: 46585
2008-01-31 00:25:39 +00:00
Evan Cheng a3395a61cc Treat the label for the first @llvm.dbg.stoppoint the same way as the dbg_func_start label. Make sure nothing else is inserted before them.
Note this solution might be somewhat fragile since ISD::LABEL may be used for other
purposes. If that ends up being an issue, we may need to introduce a different node
for debug labels.

llvm-svn: 46571
2008-01-30 20:08:35 +00:00
Evan Cheng 29cfb67e28 Even though InsertAtEndOfBasicBlock is an ugly hack, it still deserves a proper name. Rename it to EmitInstrWithCustomInserter since it does not necessarily insert the
instruction at the end.

llvm-svn: 46562
2008-01-30 18:18:23 +00:00
Evan Cheng ed17ef7e18 Skip over the label which marks the beginning of the function before inserting prologue code.
llvm-svn: 46546
2008-01-30 03:57:33 +00:00
Scott Michel bb713ae0c7 More cleanups for CellSPU:
- Expand tabs... (poss 80-col violations, will get them later...)
- Consolidate logic for SelectDFormAddr and SelectDForm2Addr into a single
  function, simplifying maintenance. Also reduced custom instruction
  generation for SPUvecinsert/INSERT_MASK.

llvm-svn: 46544
2008-01-30 02:55:46 +00:00
Dan Gohman 47a7d6fafe Factor the addressing mode and the load/store VT out of LoadSDNode
and StoreSDNode into their common base class LSBaseSDNode. Member
functions getLoadedVT and getStoredVT are replaced with the common
getMemoryVT to simplify code that will handle both loads and stores.

llvm-svn: 46538
2008-01-30 00:15:11 +00:00
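
A hedged sketch of the pattern this enables — one accessor on the shared base class instead of separate load/store paths (the MVT::ValueType spelling and headers are assumed from the era's tree):

#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

// Handles both LoadSDNode and StoreSDNode through LSBaseSDNode::getMemoryVT().
static bool IsByteSizedMemAccess(SDNode *N) {
  if (LSBaseSDNode *LS = dyn_cast<LSBaseSDNode>(N))
    return LS->getMemoryVT() == MVT::i8;
  return false;
}
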
Evan Cheng 084a1cdcdd Work in progress. This patch *fixes* x86-64 calls which are modelled as StructRet but really should be returned in registers, e.g. _Complex long double, some 128-bit aggregates. This is a short term solution that is necessary only because llvm, for now, cannot model i128 nor calls with multiple results.
Status: This only works for direct calls, and only the caller side is done. Disabled for now.

llvm-svn: 46527
2008-01-29 19:34:22 +00:00
Duncan Sands 05837edae7 Use getPreferredAlignmentLog or getPreferredAlignment
to get the alignment of global variables, rather than
using hand-made versions.

llvm-svn: 46495
2008-01-29 06:23:44 +00:00
Dale Johannesen 2b3bc30420 Handle the 'X' constraint in asms better.
llvm-svn: 46485
2008-01-29 02:21:21 +00:00
Scott Michel ceae3bbf4d Overhaul Cell SPU's addressing mode internals so that there are now
only two addressing mode nodes, SPUaform and SPUindirect (vice the
three previous ones, SPUaform, SPUdform and SPUxform). This improves
code somewhat because we now avoid reg+reg addressing where
possible. It also simplifies the address selection logic,
which was the main point for doing this.

Also, for various global variables that would be loaded using SPU's
A-form addressing, prefer D-form offs[reg] addressing, keeping the
base in a register if the variable is used more than once.

llvm-svn: 46483
2008-01-29 02:16:57 +00:00
Bill Wendling 96a1b810ec If the function has no machine instructions, then emit a "nop" so that
the function label isn't associated with something it shouldn't be.

llvm-svn: 46449
2008-01-28 09:15:03 +00:00
Chris Lattner 2e4719ec55 add a note
llvm-svn: 46413
2008-01-27 07:31:41 +00:00
Chris Lattner d05d2011d0 Use fldz and fld1 for long double constants instead of a constant pool load.
llvm-svn: 46411
2008-01-27 06:19:31 +00:00
Chris Lattner 2dd23b9f32 Add some notes.
llvm-svn: 46405
2008-01-26 20:12:07 +00:00
Chris Lattner 250789f1bd Remove some code for inferring alignment info from the x86 backend
now that the dag combiner does it.

llvm-svn: 46404
2008-01-26 20:07:42 +00:00
Bill Wendling 1a17ef02c8 If there are no instructions being emitted on X86 for a function, emit a
nop. Emit the nop directly for PPC.

llvm-svn: 46398
2008-01-26 09:03:52 +00:00
Bill Wendling 5079483957 If there are no machine instructions emitted for a function, then insert
a "nop" instruction so that we don't have the function's label associated
with something it's not supposed to be.

llvm-svn: 46394
2008-01-26 06:51:24 +00:00
Chris Lattner 919ad97c01 JITEmitter.cpp was trying to sync the icache for function stubs, but
was actually passing a completely incorrect size to sys_icache_invalidate.
Instead of having the JITEmitter do this (which doesn't have the correct 
size), just make the target sync its own stubs.

llvm-svn: 46354
2008-01-25 16:41:09 +00:00
Chris Lattner f4523c35cb optimize fxor like for
llvm-svn: 46345
2008-01-25 06:14:17 +00:00
Chris Lattner 84ab724e06 Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows
us to compile:

double test(double X) {
  return copysign(0.0, X);
}

into:

_test:
	andpd	LCPI1_0(%rip), %xmm0
	ret

instead of:
_test:
	pxor	%xmm1, %xmm1
	andpd	LCPI1_0(%rip), %xmm1
	movapd	%xmm0, %xmm2
	andpd	LCPI1_1(%rip), %xmm2
	movapd	%xmm1, %xmm0
	orpd	%xmm2, %xmm0
	ret

llvm-svn: 46344
2008-01-25 05:46:26 +00:00
Anton Korobeynikov fcde616864 Provide correct DWARF register numbering for debug information emission on x86-32/Darwin.
This should fix a bunch of issues.

llvm-svn: 46337
2008-01-25 00:34:13 +00:00
Chris Lattner a91f77eaac Significantly simplify and improve handling of FP function results on x86-32.
This case returns the value in ST(0) and then has to convert it to an SSE
register.  This causes significant codegen ugliness in some cases.  For 
example in the trivial fp-stack-direct-ret.ll testcase we used to generate:

_bar:
	subl	$28, %esp
	call	L_foo$stub
	fstpl	16(%esp)
	movsd	16(%esp), %xmm0
	movsd	%xmm0, 8(%esp)
	fldl	8(%esp)
	addl	$28, %esp
	ret

because we move the result of foo() into an XMM register, then have to
move it back for the return of bar.

Instead of hacking ever-more special cases into the call result lowering code
we take a much simpler approach: on x86-32, fp return is modeled as always 
returning into an f80 register which is then truncated to f32 or f64 as needed.
Similarly for a result, we model it as an extension to f80 + return.

This exposes the truncate and extensions to the dag combiner, allowing target
independent code to hack on them, eliminating them in this case.  This gives 
us this code for the example above:

_bar:
	subl	$12, %esp
	call	L_foo$stub
	addl	$12, %esp
	ret

The nasty aspect of this is that these conversions are not legal, but we want
the second pass of dag combiner (post-legalize) to be able to hack on them.
To handle this, we lie to legalize and say they are legal, then custom expand
them on entry to the isel pass (PreprocessForFPConvert).  This is gross, but
less gross than the code it is replacing :)

This also allows us to generate better code in several other cases.  For 
example on fp-stack-ret-conv.ll, we now generate:

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstps	8(%esp)
	movl	16(%esp), %eax
	cvtss2sd	8(%esp), %xmm0
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

where before we produced (incidentally, the old bad code is identical to what
gcc produces):

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstpl	(%esp)
	cvtsd2ss	(%esp), %xmm0
	cvtss2sd	%xmm0, %xmm0
	movl	16(%esp), %eax
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

Note that we generate slightly worse code on pr1505b.ll due to a scheduling 
deficiency that is unrelated to this patch.

llvm-svn: 46307
2008-01-24 08:07:48 +00:00
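
The log does not reproduce fp-stack-direct-ret.ll itself; source of roughly this shape shows the win, since bar() merely forwards foo()'s result and the implied f80 extend/truncate pair cancels in the combiner (this is a guess at the testcase, not a quote):

extern double foo(void);

double bar(void) {
  return foo();   // previously an fstpl/movsd/fldl round trip; now just call + ret
}
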
Evan Cheng 35abd840a6 Let each target decide byval alignment. For X86, it's 4 bytes unless the aggregate contains SSE vector(s). For x86-64, it's the max of 8 and the alignment of the type.
llvm-svn: 46286
2008-01-23 23:17:41 +00:00
Duncan Sands 95d46ef887 The last pieces needed for loading arbitrary
precision integers.  This won't actually work
(and most of the code is dead) unless the new
legalization machinery is turned on.  While
there, I rationalized the handling of i1, and
removed some bogus (and unused) sextload patterns.
For i1, this could result in microscopically
better code for some architectures (not X86).
It might also result in worse code if annotating
with AssertZExt nodes turns out to be more harmful
than helpful.

llvm-svn: 46280
2008-01-23 20:39:46 +00:00
Dale Johannesen 7f1ff5fedd Honor explicit section information on Darwin.
llvm-svn: 46267
2008-01-23 00:58:14 +00:00
Evan Cheng 1e0d4d2aa8 SSE varargs arguments are passed in memory.
llvm-svn: 46262
2008-01-22 23:26:53 +00:00
Chris Lattner 1dea406e73 Trivial patch to fix two warnings, please pull into llvm 2.2
llvm-svn: 46243
2008-01-22 04:47:47 +00:00
Anton Korobeynikov da19b1c875 Honour ByVal parameter attribute for name decoration
llvm-svn: 46200
2008-01-20 14:00:07 +00:00
Anton Korobeynikov c7ffe0f4db Remove Darwin'ism
llvm-svn: 46199
2008-01-20 13:59:37 +00:00
Anton Korobeynikov 28d4302807 Enable PIC codegen on x86-64/linux
llvm-svn: 46198
2008-01-20 13:58:16 +00:00
Duncan Sands 3e95d963e9 Need to handle any 'nest' parameter before integer
parameters, since otherwise it won't be passed in
the right register.  With this change trampolines
work on x86-64 (thanks to Luke Guest for providing
access to an x86-64 box).

llvm-svn: 46192
2008-01-19 16:42:10 +00:00
Dale Johannesen 5c94cb3596 Implement flt_rounds for PowerPC.
llvm-svn: 46174
2008-01-18 19:55:37 +00:00
Chris Lattner 87757d38b3 get symbolic information for ppc ldbl nodes.
llvm-svn: 46165
2008-01-18 18:51:16 +00:00
Chris Lattner f5b46f7dad Fix a latent bug exposed by my truncstore patch. We compiled stfiwx-2.ll to:
_test:
	fctiwz f0, f1
	stfiwx f0, 0, r4
	blr 

instead of:

_test:
	fctiwz f0, f1
	stfd f0, -8(r1)
	nop
	nop
	lwz r2, -4(r1)
	stb r2, 0(r4)
	blr 

The former is not correct (stores 4 bytes, not 1).

llvm-svn: 46161
2008-01-18 16:54:56 +00:00
Chris Lattner 7dc00e8021 make a method public
llvm-svn: 46159
2008-01-18 06:52:41 +00:00
Dale Johannesen 8ef89eabc2 Revert the part of 45849 that treated weak globals
as weak globals rather than commons.  While not wrong,
this change tickled a latent bug in Darwin's strip,
so revert it for now as a workaround.

llvm-svn: 46147
2008-01-17 23:36:04 +00:00
Dale Johannesen 60a9855799 Revert the part of 45848 that treated weak globals
as weak globals rather than commons.  While not wrong,
this change tickled a latent bug in Darwin's strip,
so revert it for now as a workaround.

llvm-svn: 46144
2008-01-17 23:04:07 +00:00
Scott Michel e4d3e3c0e7 Forward progress: crtbegin.c now compiles successfully!
Fixed CellSPU's A-form (local store) address mode, so that all globals,
externals, constant pool and jump table symbols are now wrapped within
a SPUISD::AFormAddr pseudo-instruction. This now identifies all local
store memory addresses, although it requires a bit of legerdemain during
instruction selection to properly select loads from and stores to local
store, properly generating "LQA" instructions.

Also added mul_ops.ll test harness for exercising integer multiplication.

llvm-svn: 46142
2008-01-17 20:38:41 +00:00
Chris Lattner 1ea55cf816 This commit changes:
1. Legalize now always promotes truncstore of i1 to i8. 
2. Remove patterns and gunk related to truncstore i1 from targets.
3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions.
5. Mark a wide variety of invalid truncstores as such in various targets, e.g.
   X86 currently doesn't support truncstore of any of its integer types.
6. Add legalize support for truncstores with invalid value input types.
7. Add a dag combine transform to turn store(truncate) into truncstore when
   safe.

The latter allows us to compile CodeGen/X86/storetrunc-fp.ll to:

_foo:
	fldt	20(%esp)
	fldt	4(%esp)
	faddp	%st(1)
	movl	36(%esp), %eax
	fstps	(%eax)
	ret

instead of:

_foo:
	subl	$4, %esp
	fldt	24(%esp)
	fldt	8(%esp)
	faddp	%st(1)
	fstps	(%esp)
	movl	40(%esp), %eax
	movss	(%esp), %xmm0
	movss	%xmm0, (%eax)
	addl	$4, %esp
	ret

llvm-svn: 46140
2008-01-17 19:59:44 +00:00
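
The storetrunc-fp.ll source is not quoted in the log; code of roughly this shape matches the asm above (two long double arguments added, the sum truncated on store to float) and is only a guess at the testcase:

void foo(long double a, long double b, float *p) {
  *p = a + b;   // store(truncate) now folds into a single truncating fstps
}
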
Chris Lattner 72733e573b * Introduce a new SelectionDAG::getIntPtrConstant method
and switch various codegen pieces and the X86 backend over
  to using it.

* Add some comments to SelectionDAGNodes.h

* Introduce a second argument to FP_ROUND, which indicates
  whether the FP_ROUND changes the value of its input. If
  not it is safe to xform things like fp_extend(fp_round(x)) -> x.

llvm-svn: 46125
2008-01-17 07:00:52 +00:00
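
A hedged sketch of how the two additions compose; the getNode overload, SDOperand, and the getIntPtrConstant signature are assumed from the era's API rather than quoted:

#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Build an FP_ROUND whose second operand (1) says the value is unchanged, so
// fp_extend(fp_round(x)) is eligible to fold back to x.
static SDOperand EmitExactFPRound(SelectionDAG &DAG, SDOperand Val,
                                  MVT::ValueType VT) {
  return DAG.getNode(ISD::FP_ROUND, VT, Val, DAG.getIntPtrConstant(1));
}
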
Duncan Sands 32b0ff6814 Trampoline support for x86-64. This looks like
it should work, but I have no machine to test
it on.  Committed because it will at least
cause no harm, and maybe someone can test it
for me!

llvm-svn: 46098
2008-01-16 22:55:25 +00:00
Chris Lattner e8bb9f2190 make it more clear that this predicate only applies to scalar FP types.
llvm-svn: 46058
2008-01-16 06:24:21 +00:00
Chris Lattner 14e616ef0b introduce a isTypeInSSEReg predicate, which allows us to simplify
some code.  No functionality change.

llvm-svn: 46055
2008-01-16 06:19:45 +00:00
Chris Lattner 8f7cec859e My previous commit had an incomplete message; it should have been:
make the 'fp return in ST(0)' optimization smart enough to
look through token factor nodes.  This allows us to compile
testcases like CodeGen/X86/fp-stack-retcopy.ll into:

_carg:
	subl	$12, %esp
	call	L_foo$stub
	fstpl	(%esp)
	fldl	(%esp)
	addl	$12, %esp
	ret

instead of:

_carg:
	subl	$28, %esp
	call	L_foo$stub
	fstpl	16(%esp)
	movsd	16(%esp), %xmm0
	movsd	%xmm0, 8(%esp)
	fldl	8(%esp)
	addl	$28, %esp
	ret

Still not optimal, but much better, and this is a trivial patch.  Fixing
the rest requires invasive surgery that is not llvm 2.2 material.

llvm-svn: 46054
2008-01-16 05:56:59 +00:00
Chris Lattner ea001f1db7 make the 'fp return in ST(0)' optimization smart enough to
look through token factor

llvm-svn: 46053
2008-01-16 05:53:06 +00:00
Chris Lattner de5c74f18e various whitespace cleanups, no functionality change.
llvm-svn: 46052
2008-01-16 05:52:18 +00:00