Commit Graph

799 Commits

Author SHA1 Message Date
Bill Schmidt ef17c14254 PPC calling convention cleanup.
Most of PPCCallingConv.td is used only by the 32-bit SVR4 ABI.  Rename
things to clarify this.  Also delete some code that's been commented out
for a long time.

llvm-svn: 174526
2013-02-06 17:33:58 +00:00
Jakob Stoklund Olesen 8660a8c0fc Move MRI liveouts to PowerPC return instructions.
llvm-svn: 174409
2013-02-05 18:12:00 +00:00
Benjamin Kramer c35d526489 Disable a couple more vector splat optimizations on PPC.
I didn't see those because the test case used "not grep". FileCheck the test and
XFAIL it, preserving the old optimization, so this can be fixed eventually.

llvm-svn: 174330
2013-02-04 15:52:32 +00:00
Benjamin Kramer 548ffa274a SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors.
This required disabling a PowerPC optimization that did the following:
input:
x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16>
lowered to:
tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8>
x = ADD tmp, tmp

The add now gets folded immediately and we're back at the BUILD_VECTOR we
started from. I don't see a way to fix this currently so I left it disabled
for now.

Fix some trivially foldable X86 tests too.

llvm-svn: 174325
2013-02-04 15:19:18 +00:00
Evan Cheng 0e88c7d897 Teach SDISel to combine fsin / fcos into a fsincos node if the following
conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204

llvm-svn: 173755
2013-01-29 02:32:37 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Bill Wendling 698e84fc4f Remove the Function::getFnAttributes method in favor of using the AttributeSet
directly.

This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.

llvm-svn: 171253
2012-12-30 10:32:01 +00:00
Hal Finkel 1b5ff08d43 Expand PPC64 atomic load and store
Use of store or load with the atomic specifier on 64-bit types would
cause instruction-selection failures. As with the 32-bit case, these
can use the default expansion in terms of cmp-and-swap.

llvm-svn: 171072
2012-12-25 17:22:53 +00:00
Benjamin Kramer c5071466d4 PowerPC: Expand VSELECT nodes.
There's probably a better expansion for those nodes than the default for
altivec, but this is better than crashing. VSELECTs occur in loop vectorizer
output.

llvm-svn: 170551
2012-12-19 15:49:14 +00:00
Bill Wendling 3d7b0b8ac7 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
llvm-svn: 170502
2012-12-19 07:18:57 +00:00
Bill Schmidt 9f0b4ec0f5 This patch improves the 64-bit PowerPC InitialExec TLS support by providing
for a wider range of GOT entries that can hold thread-relative offsets.
This matches the behavior of GCC, which was not documented in the PPC64 TLS
ABI.  The ABI will be updated with the new code sequence.

Former sequence:

  ld 9,x@got@tprel(2)
  add 9,9,x@tls

New sequence:

  addis 9,2,x@got@tprel@ha
  ld 9,x@got@tprel@l(9)
  add 9,9,x@tls

Note that a linker optimization exists to transform the new sequence into
the shorter sequence when appropriate, by replacing the addis with a nop
and modifying the base register and relocation type of the ld.

llvm-svn: 170209
2012-12-14 17:02:38 +00:00
Bill Schmidt 9ed4dbcb75 This is another cleanup patch for 64-bit PowerPC TLS processing. I had
some hackery in place that hid my poor use of TblGen, which I've now sorted
out and cleaned up.  No change in observable behavior, so no new test cases.

llvm-svn: 170149
2012-12-13 20:57:10 +00:00
Bill Schmidt 732eb91f05 This is just a clean-up patch that simplifies the initial-exec TLS logic by
avoiding use of machine operand flags.  No change in observable behavior, so
no new test cases.

llvm-svn: 170141
2012-12-13 18:45:54 +00:00
Bill Schmidt 24b8dd6eb7 This patch implements local-dynamic TLS model support for the 64-bit
PowerPC target.  This is the last of the four models, so we now have 
full TLS support.

This is mostly a straightforward extension of the general dynamic model.
I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the
register copy following ADDI_TLSLD_L; otherwise everything above the
ADDIS_DTPREL_HA appeared dead and was removed.

As before, there are new test cases to test the assembly generation, and
the relocations output during integrated assembly.  The expected code
gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll.

There are a couple of things I think can be done more efficiently in the
overall TLS code, so there will likely be a clean-up patch forthcoming;
but for now I want to be sure the functionality is in place.

Bill

llvm-svn: 170003
2012-12-12 19:29:35 +00:00
Evan Cheng 962711ee71 Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I
mention the inline memcpy / memset expansion code is a mess?

This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset.
The first indicates whether it is expanding a memset or a memcpy / memmove.
The later is whether the memset is a memset of zero. It's totally possible
(likely even) that targets may want to do different things for memcpy and
memset of zero.

llvm-svn: 169959
2012-12-12 02:34:41 +00:00
Evan Cheng c3d1aca657 - Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term.
Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.

llvm-svn: 169954
2012-12-12 01:32:07 +00:00
Bill Schmidt c56f1d34bc This patch implements the general dynamic TLS model for 64-bit PowerPC.
Given a thread-local symbol x with global-dynamic access, the generated
code to obtain x's address is:

     Instruction                            Relocation            Symbol
  addis ra,r2,x@got@tlsgd@ha           R_PPC64_GOT_TLSGD16_HA       x
  addi  r3,ra,x@got@tlsgd@l            R_PPC64_GOT_TLSGD16_L        x
  bl __tls_get_addr(x@tlsgd)           R_PPC64_TLSGD                x
                                       R_PPC64_REL24           __tls_get_addr
  nop
  <use address in r3>

The implementation borrows from the medium code model work for introducing
special forms of ADDIS and ADDI into the DAG representation.  This is made
slightly more complicated by having to introduce a call to the external
function __tls_get_addr.  Using the full call machinery is overkill and,
more importantly, makes it difficult to add a special relocation.  So I've
introduced another opcode GET_TLS_ADDR to represent the function call, and
surrounded it with register copies to set up the parameter and return value.

Most of the code is pretty straightforward.  I ran into one peculiarity
when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like
BL8_NOP_ELF except that it takes another parameter to represent the symbol
("x" above) that requires a relocation on the call.  Something in the 
TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated
identically during the emit phase, so this second operand was never
visited to generate relocations.  This is the reason for the slightly
messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding().

Two new tests are included to demonstrate correct external assembly and
correct generation of relocations using the integrated assembler.

Comments welcome!

Thanks,
Bill

llvm-svn: 169910
2012-12-11 20:30:11 +00:00
Bill Schmidt ca4a0c9dbd This patch introduces initial-exec model support for thread-local storage
on 64-bit PowerPC ELF.

The patch includes code to handle external assembly and MC output with the
integrated assembler.  It intentionally does not support the "old" JIT.

For the initial-exec TLS model, the ABI requires the following to calculate
the address of external thread-local variable x:

 Code sequence            Relocation                  Symbol
  ld 9,x@got@tprel(2)      R_PPC64_GOT_TPREL16_DS      x
  add 9,9,x@tls            R_PPC64_TLS                 x

The register 9 is arbitrary here.  The linker will replace x@got@tprel
with the offset relative to the thread pointer to the generated GOT
entry for symbol x.  It will replace x@tls with the thread-pointer
register (13).

The two test cases verify correct assembly output and relocation output
as just described.

PowerPC-specific selection node variants are added for the two
instructions above:  LD_GOT_TPREL and ADD_TLS.  These are inserted
when an initial-exec global variable is encountered by
PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to
machine instructions LDgotTPREL and ADD8TLS.  LDgotTPREL is a pseudo
that uses the same LDrs support added for medium code model's LDtocL,
with a different relocation type.

The rest of the processing is straightforward.

llvm-svn: 169281
2012-12-04 16:18:08 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Bill Schmidt 34627e3434 This patch implements medium code model support for 64-bit PowerPC.
The default for 64-bit PowerPC is small code model, in which TOC entries
must be addressable using a 16-bit offset from the TOC pointer.  Additionally,
only TOC entries are addressed via the TOC pointer.

With medium code model, TOC entries and data sections can all be addressed
via the TOC pointer using a 32-bit offset.  Cooperation with the linker
allows 16-bit offsets to be used when these are sufficient, reducing the
number of extra instructions that need to be executed.  Medium code model
also does not generate explicit TOC entries in ".section toc" for variables
that are wholly internal to the compilation unit.

Consider a load of an external 4-byte integer.  With small code model, the
compiler generates:

	ld 3, .LC1@toc(2)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc ei[TC],ei

With medium model, it instead generates:

	addis 3, 2, .LC1@toc@ha
	ld 3, .LC1@toc@l(3)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc ei[TC],ei

Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the
32-bit offset of ei's TOC entry from the TOC base pointer.  Similarly,
.LC1@toc@l is a relocation requesting the lower 16 bits.  Note that if
the linker determines that ei's TOC entry is within a 16-bit offset of
the TOC base pointer, it will replace the "addis" with a "nop", and
replace the "ld" with the identical "ld" instruction from the small
code model example.

Consider next a load of a function-scope static integer.  For small code
model, the compiler generates:

	ld 3, .LC1@toc(2)
	lwz 4, 0(3)

	.section	.toc,"aw",@progbits
.LC1:
	.tc test_fn_static.si[TC],test_fn_static.si
	.type	test_fn_static.si,@object
	.local	test_fn_static.si
	.comm	test_fn_static.si,4,4

For medium code model, the compiler generates:

	addis 3, 2, test_fn_static.si@toc@ha
	addi 3, 3, test_fn_static.si@toc@l
	lwz 4, 0(3)

	.type	test_fn_static.si,@object
	.local	test_fn_static.si
	.comm	test_fn_static.si,4,4

Again, the linker may replace the "addis" with a "nop", calculating only
a 16-bit offset when this is sufficient.

Note that it would be more efficient for the compiler to generate:

	addis 3, 2, test_fn_static.si@toc@ha
        lwz 4, test_fn_static.si@toc@l(3)

The current patch does not perform this optimization yet.  This will be
addressed as a peephole optimization in a later patch.

For the moment, the default code model for 64-bit PowerPC will remain the
small code model.  We plan to eventually change the default to medium code
model, which matches current upstream GCC behavior.  Note that the different
code models are ABI-compatible, so code compiled with different models will
be linked and execute correctly.

I've tested the regression suite and the application/benchmark test suite in
two ways:  Once with the patch as submitted here, and once with additional
logic to force medium code model as the default.  The tests all compile
cleanly, with one exception.  The mandel-2 application test fails due to an
unrelated ABI compatibility with passing complex numbers.  It just so happens
that small code model was incredibly lucky, in that temporary values in 
floating-point registers held the expected values needed by the external
library routine that was called incorrectly.  My current thought is to correct
the ABI problems with _Complex before making medium code model the default,
to avoid introducing this "regression."

Here are a few comments on how the patch works, since the selection code
can be difficult to follow:

The existing logic for small code model defines three pseudo-instructions:
LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for
constant pool addresses.  These are expanded by SelectCodeCommon().  The
pseudo-instruction approach doesn't work for medium code model, because
we need to generate two instructions when we match the same pattern.
Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY
node for medium code model, and generates an ADDIStocHA followed by either
a LDtocL or an ADDItocL.  These new node types correspond naturally to
the sequences described above.

The addis/ld sequence is generated for the following cases:
 * Jump table addresses
 * Function addresses
 * External global variables
 * Tentative definitions of global variables (common linkage)

The addis/addi sequence is generated for the following cases:
 * Constant pool entries
 * File-scope static global variables
 * Function-scope static variables

Expanding to the two-instruction sequences at select time exposes the
instructions to subsequent optimization, particularly scheduling.

The rest of the processing occurs at assembly time, in
PPCAsmPrinter::EmitInstruction.  Each of the instructions is converted to
a "real" PowerPC instruction.  When a TOC entry needs to be created, this
is done here in the same manner as for the existing LDtoc, LDtocJTI, and
LDtocCPT pseudo-instructions (I factored out a new routine to handle this).

I had originally thought that if a TOC entry was needed for LDtocL or
ADDItocL, it would already have been generated for the previous ADDIStocHA.
However, at higher optimization levels, the ADDIStocHA may appear in a 
different block, which may be assembled textually following the block
containing the LDtocL or ADDItocL.  So it is necessary to include the
possibility of creating a new TOC entry for those two instructions.

Note that for LDtocL, we generate a new form of LD called LDrs.  This
allows specifying the @toc@l relocation for the offset field of the LD
instruction (i.e., the offset is replaced by a SymbolLo relocation).
When the peephole optimization described above is added, we will need
to do similar things for all immediate-form load and store operations.

The seven "mcm-n.ll" test cases are kept separate because otherwise the
intermingling of various TOC entries and so forth makes the tests fragile
and hard to understand.

The above assumes use of an external assembler.  For use of the
integrated assembler, new relocations are added and used by
PPCELFObjectWriter.  Testing is done with "mcm-obj.ll", which tests for
proper generation of the various relocations for the same sequences
tested with the external assembler.

llvm-svn: 168708
2012-11-27 17:35:46 +00:00
Adhemerval Zanella bdface5699 PowerPC: Lowering floor intrinsic for Altivec
This patch lowers the llvm.floor, llvm.ceil, llvm.trunc, and
llvm.nearbyint to Altivec instruction when using 4 single-precision
float vectors.

llvm-svn: 168086
2012-11-15 20:56:03 +00:00
Craig Topper c8a2adf1ca Make a bunch of floating point operations on vectors Expand so that instruction selection won't fail.
llvm-svn: 168028
2012-11-15 08:02:19 +00:00
Craig Topper 61d045781a Add llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint intrinsics.
llvm-svn: 168025
2012-11-15 06:51:10 +00:00
Craig Topper c4343f2c45 Set FFLOOR of vectors to expand to keep intruction selection from failing.
llvm-svn: 167922
2012-11-14 08:11:25 +00:00
Ulrich Weigand 339d0597d3 On PowerPC64, integer return values (as well as arguments) are supposed
to be extended to a full register.   This is modeled in the IR by marking
the return value (or argument) with a signext or zeroext attribute.

However, while these attributes are respected for function arguments,
they are currently ignored for function return values by the PowerPC
back-end.  This patch updates PPCCallingConv.td to ask for the promotion
to i64, and fixes LowerReturn and LowerCallResult to implement it.

The new test case verifies that both arguments and return values are
properly extended when passing them; and also that the optimizers
understand incoming argument and return values are in fact guaranteed
by the ABI to be extended.

The patch caused a spurious breakage in CodeGen/PowerPC/coalesce-ext.ll,
since the test case used a "ret" instruction to create a use of an i32
value at the end of the function (to set up data flow as required for
what the test is intended to test).  Since there's now an implicit
promotion to i64, that data flow no longer works as expected.  To fix
this, this patch now adds an extra "add" to ensure we have an appropriate
use of the i32 value.

llvm-svn: 167396
2012-11-05 19:39:45 +00:00
Hal Finkel 4f24c621d9 Add support for the PowerPC-specific inline asm Z constraint and y modifier.
The Z constraint specifies an r+r memory address, and the y modifier expands
to the "r, r" in the asm string. For this initial implementation, the base
register is forced to r0 (which has the special meaning of 0 for r+r addressing
on PowerPC) and the full address is taken in the second register. In the
future, this should be improved.

llvm-svn: 167388
2012-11-05 18:18:42 +00:00
Adhemerval Zanella c4182d1890 [PATCH] PowerPC: Expand load extend vector operations
This patch expands the SEXTLOAD, ZEXTLOAD, and EXTLOAD operations for
vector types when altivec is enabled.

llvm-svn: 167386
2012-11-05 17:15:56 +00:00
Chandler Carruth 7ec5085e01 Revert the series of commits starting with r166578 which introduced the
getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.

These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.

Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)

After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.

Summary of reverted revisions:

r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
         Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
         since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
         on the address space.
llvm-svn: 167221
2012-11-01 08:07:29 +00:00
Bill Schmidt 9953cf294b This patch addresses an ABI compatibility issue with empty aggregate
parameters.  Examples of these are:

  struct { } a;
  union { } b[256];
  int a[0];

An empty aggregate has an address, although dereferencing that address is
pointless.  When passed as a parameter, an empty aggregate does not consume
a protocol register, nor does it consume a doubleword in the parameter save
area.  Passing an empty aggregate by reference passes an address just as
for any other aggregate.  Returning an empty aggregate uses GPR3 as a hidden
address of the return value location, just as for any other aggregate.

The patch modifies PPCTargetLowering::LowerFormalArguments_64SVR4 and
PPCTargetLowering::LowerCall_64SVR4 to properly skip empty aggregate
parameters passed by value.  The handling of return values and by-reference
parameters was already correct.

Built on powerpc64-unknown-linux-gnu and tested with no new regressions.
A test case is included to test proper handling of empty aggregate
parameters on both sides of the function call protocol.

llvm-svn: 167090
2012-10-31 01:15:05 +00:00
Adhemerval Zanella 5c043aeb1b PowerPC: Expand FSRQT for vector types
This patch expands FSQRT for floating point vector types when altivec is
used.

llvm-svn: 167034
2012-10-30 18:29:42 +00:00
Adhemerval Zanella 56775e0f13 PowerPC: More support for Altivec compare operations
This patch adds more support for vector type comparisons using altivec.
It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector
types for comparison operators ==, !=, >, >=, <, and <=.

llvm-svn: 167015
2012-10-30 13:50:19 +00:00
Bill Schmidt bd4ac26973 This patch solves a problem with passing varargs parameters under the PPC64
ELF ABI.

A varargs parameter consisting of a single-precision floating-point value,
or of a single-element aggregate containing a single-precision floating-point
value, must be passed in the low-order (rightmost) four bytes of the
doubleword stack slot reserved for that parameter.  If there are GPR protocol
registers remaining, the parameter must also be mirrored in the low-order
four bytes of the reserved GPR.

Prior to this patch, such parameters were being passed in the high-order
four bytes of the stack slot and the mirrored GPR.

The patch adds a new test case to verify the correct code generation.

llvm-svn: 166968
2012-10-29 21:18:16 +00:00
Ulrich Weigand 0de4a1e4ae Allow i32/i64 for 'f' constraint on PowerPC.
This fixes PR12757.

llvm-svn: 166943
2012-10-29 17:49:34 +00:00
Bill Schmidt 6ed3b99f43 This patch addresses a PPC64 ELF issue with passing parameters consisting of
structs having size 3, 5, 6, or 7.  Such a struct must be passed and received
as right-justified within its register or memory slot.  The problem is only
present for structs that are passed in registers.

Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register.  This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot.  Essentially I had accidentally
accounted for the right-adjustment twice.

In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.

The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this.  I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.

I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions.  I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.

llvm-svn: 166680
2012-10-25 13:38:09 +00:00
Micah Villmow 12d9127833 Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.

llvm-svn: 166578
2012-10-24 15:52:52 +00:00
Bill Schmidt 57d6de5fd9 This is another TLC patch for separating code for the Darwin and ELF ABIs
for the PowerPC target, and factoring the results.  This will ease future
maintenance of both subtargets.

PPCTargetLowering::LowerCall_Darwin_Or_64SVR4() has grown a lot of special-case
code for the different ABIs, making maintenance difficult.  This is getting
worse as we repair errors in the 64-bit ELF ABI implementation, while avoiding
changes to the Darwin ABI logic.  This patch splits the routine into
LowerCall_Darwin() and LowerCall_64SVR4(), allowing both versions to be
significantly simplified.  I've factored out chunks of similar code where it
made sense to do so.  I also performed similar factoring on
LowerFormalArguments_Darwin() and LowerFormalArguments_64SVR4().

There are no functional changes in this patch, and therefore no new test
cases have been developed.

Built and tested on powerpc64-unknown-linux-gnu with no new regressions.

llvm-svn: 166480
2012-10-23 15:51:16 +00:00
Ulrich Weigand d34b5bd610 This patch fixes failures in the SingleSource/Regression/C/uint64_to_float
test case on PowerPC caused by rounding errors when converting from a 64-bit
integer to a single-precision floating point. The reason for this are
double-rounding effects, since on PowerPC we have to convert to an
intermediate double-precision value first, which gets rounded to the
final single-precision result.

The patch fixes the problem by preparing the 64-bit integer so that the
first conversion step to double-precision will always be exact, and the
final rounding step will result in the correctly-rounded single-precision
result.  The generated code sequence is equivalent to what GCC would generate.

When -enable-unsafe-fp-math is in effect, that extra effort is omitted
and we accept possible rounding errors (just like GCC does as well).

llvm-svn: 166178
2012-10-18 13:16:11 +00:00
Bill Schmidt 48081cad0d This patch addresses PR13949.
For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
bytes are to be passed in the low-order bits ("right-adjusted") of the
doubleword register or memory slot assigned to them.  A previous patch
addressed this for aggregates passed in registers.  However, small
aggregates passed in the overflow portion of the parameter save area are
still being passed left-adjusted.

The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
the callee side.  The main fix on the callee side simply extends
existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
and correcting a constant left over from 32-bit code.  There is also a
fix to a bogus calculation of the offset to the following argument in
the parameter save area.

On the caller side, again a constant left over from 32-bit code is
fixed.  Additionally, some code for 1, 2, and 4-byte objects is
duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only.  The
LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
handle both ABIs, and I propose to separate this into two functions in a
future patch, at which time the duplication can be removed.

The patch adds a new test (structsinmem.ll) to demonstrate correct
passing of structures of all seven sizes.  Eight dummy parameters are
used to force these structures to be in the overflow portion of the
parameter save area.

As a side effect, this corrects the case when aggregates passed in
registers are saved into the first eight doublewords of the parameter
save area:  Previously they were stored left-justified, and now are
properly stored right-justified.  This requires changing the expected
output of existing test case structsinregs.ll.

llvm-svn: 166022
2012-10-16 13:30:53 +00:00
Bill Schmidt 22162470ba This patch addresses PR13947.
For function calls on the 64-bit PowerPC SVR4 target, each parameter
is mapped to as many doublewords in the parameter save area as
necessary to hold the parameter.  The first 13 non-varargs
floating-point values are passed in registers; any additional
floating-point parameters are passed in the parameter save area.  A
single-precision floating-point parameter (32 bits) must be mapped to
the second (rightmost, low-order) word of its assigned doubleword
slot.

Currently LLVM violates this ABI requirement by mapping such a
parameter to the first (leftmost, high-order) word of its assigned
doubleword slot.  This is internally self-consistent but will not
interoperate correctly with libraries compiled with an ABI-compliant
compiler.

This patch corrects the problem by adjusting the parameter addressing
on both sides of the calling convention.

llvm-svn: 165714
2012-10-11 15:38:20 +00:00
Bill Wendling c9b22d735a Create enums for the different attributes.
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.

llvm-svn: 165488
2012-10-09 07:45:08 +00:00
Adhemerval Zanella fe3f793cec PR12716: PPC crashes on vector compare
Vector compare using altivec 'vcmpxxx' instructions have as third argument
a vector register instead of CR one, different from integer and float-point
compares. This leads to a failure in code generation, where 'SelectSETCC'
expects a DAG with a CR register and gets vector register instead.

This patch changes the behavior by just returning a DAG with the 
vector compare instruction based on the type. The patch also adds a testcase
for all vector types llvm defines.

It also included a fix on signed 5-bits predicates printing, where
signed values were not handled correctly as signed (char are unsigned by
default for PowerPC). This generates 'vspltisw' (vector splat)
instruction with SIM out of range.

llvm-svn: 165419
2012-10-08 18:59:53 +00:00
Adhemerval Zanella 5c6e08435e Add floating-point to and from integer conversion
This patch add altivec support for v4i32 to v4f32 and for v4f32 to
v4i32 vector rounding conversion.

llvm-svn: 165409
2012-10-08 17:27:24 +00:00
Micah Villmow cdfe20b97f Move TargetData to DataLayout.
llvm-svn: 165402
2012-10-08 16:38:25 +00:00
Bill Schmidt d1fa36f903 This patch splits apart PPCISelLowering::LowerFormalArguments_Darwin_Or_64SVR4
into separate versions for the Darwin and 64-bit SVR4 ABIs.  This will
facilitate doing more major surgery on the 64-bit SVR4 ABI in the near future.

llvm-svn: 165336
2012-10-05 21:27:08 +00:00
Bill Wendling 863bab689a Remove the `hasFnAttr' method from Function.
The hasFnAttr method has been replaced by querying the Attributes explicitly. No
intended functionality change.

llvm-svn: 164725
2012-09-26 21:48:26 +00:00
Roman Divacky ca10389bfe Specify MachinePointerInfo as refering to the argument value and offset of the
store when handling byval arguments. Thus preventing reordering of the store
with load with post-RA scheduler.

llvm-svn: 164553
2012-09-24 20:47:19 +00:00
Bill Schmidt 019cc6fe03 Small structs for PPC64 SVR4 must be passed right-justified in registers.
lib/Target/PowerPC/PPCISelLowering.{h,cpp}
 Rename LowerFormalArguments_Darwin to LowerFormalArguments_Darwin_Or_64SVR4.
 Rename LowerFormalArguments_SVR4 to LowerFormalArguments_32SVR4.
 Receive small structs right-justified in LowerFormalArguments_Darwin_Or_64SVR4.
 Rename LowerCall_Darwin to LowerCall_Darwin_Or_64SVR4.
 Rename LowerCall_SVR4 to LowerCall_32SVR4.
 Pass small structs right-justified in LowerCall_Darwin_Or_64SVR4.

test/CodeGen/PowerPC/structsinregs.ll
 New test.

llvm-svn: 164228
2012-09-19 15:42:13 +00:00
Roman Divacky 09adf3decc Fix the isLocalCall() by checking for linker weakness as well.
llvm-svn: 164155
2012-09-18 18:27:49 +00:00
Roman Divacky 762930637c Optimize local func calls to not emit nop for TOC restoration.
Patch by Adhemerval Zanella.

llvm-svn: 164138
2012-09-18 16:47:58 +00:00
Michael Liao abb87d4857 Fix PR11985
- BlockAddress has no support of BA + offset form and there is no way to
  propagate that offset into machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify target block address forming;
- All targets are modified to use new interface and X86 backend is enhanced to
  support BA + offset addressing.

llvm-svn: 163743
2012-09-12 21:43:09 +00:00
NAKAMURA Takumi ac49029fd9 PPCISelLowering.cpp: Fix r162725.
[Tobias von Koch] What's happening here is that the CR6SET/CR6UNSET is breaking the chain of register copies glued to the function call (BL_SVR4 node). The scheduler then moves other instructions in between those and the function call, which isn't good!

Right. That's the case where there is no chain of register copies before the call, so InFlag == 0... Attached is a new revision of the patch which should fix this for good.

llvm-svn: 162916
2012-08-30 15:52:29 +00:00
NAKAMURA Takumi 8ad54e04d2 PPCISelLowering.cpp: Whitespace.
llvm-svn: 162915
2012-08-30 15:52:23 +00:00
Hal Finkel 742b535e40 Add PPC Freescale e500mc and e5500 subtargets.
Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to
the PowerPC backend.

Patch by Tobias von Koch.

llvm-svn: 162764
2012-08-28 16:12:39 +00:00
Hal Finkel 5ab378037f Eliminate redundant CR moves on PPC32.
The 32-bit ABI requires CR bit 6 to be set if the call has fp arguments and
unset if it doesn't. The solution up to now was to insert a MachineNode to
set/unset the CR bit, which produces a CR vreg. This vreg was then copied
into CR bit 6. When the register allocator saw a bunch of these in the same
function, it allocated the set/unset CR bit in some random CR register (1
extra instruction) and then emitted CR moves before every vararg function
call, rather than just setting and unsetting CR bit 6 directly before every
vararg function call. This patch instead inserts a PPCcrset/PPCcrunset
instruction which are then matched by a dedicated instruction pattern.

Patch by Tobias von Koch.

llvm-svn: 162725
2012-08-28 02:10:27 +00:00
Richard Smith 228e6d4cf3 Fix integer undefined behavior due to signed left shift overflow in LLVM.
Reviewed offline by chandlerc.

llvm-svn: 162623
2012-08-24 23:29:28 +00:00
Roman Divacky ace4707ea6 Lower constant pools and jump tables via TOC on PPC64/SVR4.
In collaboration with Adhemerval Zanella.

llvm-svn: 162562
2012-08-24 16:26:02 +00:00
Roman Divacky 1faf5b07c6 Fix typo and grammar. By Adhemerval Zanella.
llvm-svn: 162032
2012-08-16 18:19:29 +00:00
Hal Finkel 70381a7b18 Add readcyclecounter lowering on PPC64.
On PPC64, this can be done with a simple TableGen pattern.
To enable this, I've added the (otherwise missing) readcyclecounter
SDNode definition to TargetSelectionDAG.td.

llvm-svn: 161302
2012-08-04 14:10:46 +00:00
Evan Cheng 39e90029a2 Target option DisableJumpTables is a gross hack. Move it to TargetLowering instead.
llvm-svn: 159611
2012-07-02 22:39:56 +00:00
Hal Finkel 460e94d842 Add support for the PPC isel instruction.
The isel (integer select) instruction is supported on the 440 and A2
embedded cores and on the POWER7.

llvm-svn: 159045
2012-06-22 23:10:08 +00:00
Hal Finkel 0a479ae7d1 Convert the PPC backend to use the new FMA infrastructure.
The existing contraction patterns are replaced with fma/fneg.
Overall functionality should be the same.

llvm-svn: 158955
2012-06-22 00:49:52 +00:00
Hal Finkel ca542beffe Add support for generating reg+reg (indexed) pre-inc loads on PPC.
llvm-svn: 158823
2012-06-20 15:43:03 +00:00
Hal Finkel 1cc27e44a4 Add support for generating reg+reg preinc stores on PPC.
PPC will now generate STWUX and friends.

llvm-svn: 158698
2012-06-19 02:34:32 +00:00
Hal Finkel 4e9f1a859f Enable ILP scheduling for all nodes by default on PPC.
Over the entire test-suite, this has an insignificantly negative average
performance impact, but reduces some of the worst slowdowns from the
anti-dep. change (r158294).

Largest speedups:
SingleSource/Benchmarks/Stanford/Quicksort - 28%
SingleSource/Benchmarks/Stanford/Towers - 24%
SingleSource/Benchmarks/Shootout-C++/matrix - 23%
MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the
anti-dep. change)

Largest slowdowns:
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
MultiSource/Applications/d/make_dparser - 16%

llvm-svn: 158296
2012-06-10 19:32:29 +00:00
Roman Divacky c856653fb3 PPC32 uses R2 as the TLS register. Fix the copy and paste.
llvm-svn: 158004
2012-06-05 17:14:17 +00:00
Roman Divacky e3f15c98d1 Implement local-exec TLS on PowerPC.
llvm-svn: 157935
2012-06-04 17:36:38 +00:00
Hal Finkel 595817eebe Enable generating PPC pre-increment (r+imm) instructions by default.
It seems that this no longer causes test suite failures on PPC64 (after r157159),
and often gives a performance benefit, so it can be enabled by default.

llvm-svn: 157911
2012-06-04 02:21:00 +00:00
Justin Holewinski aa58397b3c Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

llvm-svn: 157479
2012-05-25 16:35:28 +00:00
Gabor Greif c8a9abe9df effectively back out my last change (r155190)
llvm-svn: 155195
2012-04-20 11:41:38 +00:00
Gabor Greif 9eccbe9c82 fix obviously bogus (IMO) operand index of the load in asserts
(load only has one operand) and smuggle in some whitespace changes too

NB: I am obviously testing the water here, and believe that the unguarded
    cast is still wrong, but why is the getZExtValue of the load's operand
    tested against zero here? Any review is appreciated.
llvm-svn: 155190
2012-04-20 08:58:49 +00:00
Craig Topper abadc660e0 Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.
llvm-svn: 155186
2012-04-20 06:31:50 +00:00
Gabor Greif 180c4445cf zap tabs
llvm-svn: 155128
2012-04-19 15:16:31 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Roman Divacky b9663ccd6b Implement the SVR4 byval alignment for aggregates. Fixing a FIXME.
llvm-svn: 153876
2012-04-02 15:49:30 +00:00
Hal Finkel 322e41a914 Enable prefetch generation on PPC64.
llvm-svn: 153851
2012-04-01 20:08:17 +00:00
Hal Finkel 88ed4e3b15 Set the default PPC node scheduling preference to ILP (for the embedded cores).
The 440 and A2 cores have detailed itineraries, and this allows them to be
fully used to maximize throughput.

llvm-svn: 153845
2012-04-01 19:23:08 +00:00
Hal Finkel 51861b4855 Fix dynamic linking on PPC64.
Dynamic linking on PPC64 has had problems since we had to move the top-down
hazard-detection logic post-ra. For dynamic linking to work there needs to be
a nop placed after every call. It turns out that it is really hard to guarantee
that nothing will be placed in between the call (bl) and the nop during post-ra
scheduling. Previous attempts at fixing this by placing logic inside the
hazard detector only partially worked.

This is now fixed in a different way: call+nop codegen-only instructions. As far
as CodeGen is concerned the pair is now a single instruction and cannot be split.
This solution works much better than previous attempts.

The scoreboard hazard detector is also renamed to be more generic, there is currently
no cpu-specific logic in it.

llvm-svn: 153816
2012-03-31 14:45:15 +00:00
Craig Topper f6e7e12f75 Remove unnecessary llvm:: qualifications
llvm-svn: 153500
2012-03-27 07:21:54 +00:00
Hal Finkel e44eb28807 Fix small-integer VAARG on SVR4 ABI PPC64.
The PPC64 SVR4 ABI requires integer stack arguments, and thus the var. args., that
are smaller than 64 bits be zero extended to 64 bits.

llvm-svn: 153373
2012-03-24 03:53:55 +00:00
Craig Topper b25fda95f6 Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations.
llvm-svn: 152997
2012-03-17 18:46:09 +00:00
Craig Topper bef78fc2ee Convert more static tables of registers used by calling convention to uint16_t to reduce space.
llvm-svn: 152538
2012-03-11 07:57:25 +00:00
Craig Topper ca658c2264 Use uint16_t to store registers and opcode in static tables in the target specific backends.
llvm-svn: 152537
2012-03-11 07:16:55 +00:00
Roman Divacky ef21be2cda Convert PowerPC to register mask operands.
llvm-svn: 152122
2012-03-06 16:41:49 +00:00
Evan Cheng 65f9d19c4f Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call.
llvm-svn: 151645
2012-02-28 18:51:51 +00:00
Daniel Dunbar ee7b899343 Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part.
llvm-svn: 151630
2012-02-28 15:36:07 +00:00
Evan Cheng 87c7b09d8d Some ARM implementaions, e.g. A-series, does return stack prediction. That is,
the processor keeps a return addresses stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.

Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt RAS and causes 100% return misprediction so LLVM should use a
unconditional branch instead. i.e.
mov lr, pc
b _foo
The "mov lr, pc" is issued in order to get proper backtrace.

rdar://8979299

llvm-svn: 151623
2012-02-28 06:42:03 +00:00
Craig Topper 760b134ffa Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified.
llvm-svn: 151134
2012-02-22 05:59:10 +00:00
Craig Topper e55c556a24 Convert assert(0) to llvm_unreachable
llvm-svn: 149961
2012-02-07 02:50:20 +00:00
David Blaikie 46a9f016c5 More dead code removal (using -Wunreachable-code)
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Benjamin Kramer 084b9f4134 Remove a bunch of unused variable assignments.
Found by the clang static analyzer.

llvm-svn: 148541
2012-01-20 14:42:32 +00:00
Benjamin Kramer 339ced4e34 Return an ArrayRef from ShuffleVectorSDNode::getMask and push it through CodeGen.
llvm-svn: 148218
2012-01-15 13:16:05 +00:00
Benjamin Kramer 6898db6269 Remove VectorExtras. This unused helper was written for a type of API that is discouraged now.
llvm-svn: 147738
2012-01-07 19:42:13 +00:00
Chandler Carruth 637cc6a8aa Initial CodeGen support for CTTZ/CTLZ where a zero input produces an
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.

Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.

Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.

llvm-svn: 146466
2011-12-13 01:56:10 +00:00
Owen Anderson 0b9b9da6c8 Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
llvm-svn: 146171
2011-12-08 19:32:14 +00:00
Nick Lewycky 50f02cb21b Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Hal Finkel 6f0ae783fe add basic PPC register-pressure feedback; adjust the vaarg test to match the new register-allocation pattern
llvm-svn: 145065
2011-11-22 16:21:04 +00:00
Jay Foad 0745e645e0 Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144631
2011-11-15 07:24:32 +00:00
Pete Cooper 82cd9e81fc Added invariant field to the DAG.getLoad method and changed all calls.
When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses

llvm-svn: 144100
2011-11-08 18:42:53 +00:00
Lang Hames 58dba012b6 Rename NonScalarIntSafe to something more appropriate.
llvm-svn: 143080
2011-10-26 23:50:43 +00:00
Hal Finkel 652985764e Revert change to function alignment b/c existing logic was fine
llvm-svn: 142224
2011-10-17 18:53:03 +00:00
Hal Finkel 0ade47acd0 Instructions for Book E PPC should be word aligned, set function alignment to reflect this
llvm-svn: 142194
2011-10-17 17:01:41 +00:00
Hal Finkel 450128a68c Add an implementation of the CanLowerReturn function to the PPC backend
llvm-svn: 141981
2011-10-14 19:51:36 +00:00
Duncan Sands f2641e1bc1 Add codegen support for vector select (in the IR this means a select
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons.  Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all").  Patch mostly by
Nadav Rotem.

llvm-svn: 139159
2011-09-06 19:07:46 +00:00
Duncan Sands a098436b32 Split the init.trampoline intrinsic, which currently combines GCC's
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC.  While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function.  To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function.  Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!).  Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC.  Patch mostly by Sanjoy Das.

llvm-svn: 139140
2011-09-06 13:37:06 +00:00
Roman Divacky 71038e7021 Set CR1EQ only when lowering vararg floating arguments (not any vararg
arguments as before), unset CR1EQ otherwise.

llvm-svn: 138802
2011-08-30 17:04:16 +00:00
Eli Friedman 7dfa791f4f Expand ATOMIC_LOAD and ATOMIC_STORE for architectures I don't know well enough to fix properly.
llvm-svn: 138751
2011-08-29 18:23:02 +00:00
Eli Friedman 30a49e93e3 New approach to r136737: insert the necessary fences for atomic ops in platform-independent code, since a bunch of platforms (ARM, Mips, PPC, Alpha are the relevant targets here) need to do essentially the same thing.
I think this completes the basic CodeGen for atomicrmw and cmpxchg.

llvm-svn: 136813
2011-08-03 21:06:02 +00:00
Evan Cheng 1142444565 Rename TargetAsmParser to MCTargetAsmParser and TargetAsmLexer to MCTargetAsmLexer; rename createAsmLexer to createMCAsmLexer and createAsmParser to createMCAsmParser.
llvm-svn: 136027
2011-07-26 00:24:13 +00:00
Roman Divacky aaba17e89b Set PPCII::MO_DARWIN_STUB only on MacOSX < 10.5.
llvm-svn: 135866
2011-07-24 08:22:56 +00:00
Chris Lattner 229907cd11 land David Blaikie's patch to de-constify Type, with a few tweaks.
llvm-svn: 135375
2011-07-18 04:54:35 +00:00
Cameron Zwarich f03fa189ca Add an intrinsic and codegen support for fused multiply-accumulate. The intent
is to use this for architectures that have a native FMA instruction.

llvm-svn: 134742
2011-07-08 21:39:21 +00:00
Roman Divacky cc5e53383e Remove accidentaly left node from previous iteration of the patch.
Noticed by Benjamin Kramer!

llvm-svn: 134376
2011-07-04 15:42:45 +00:00
Roman Divacky 4394e68c24 Implement ISD::VAARG lowering on PPC32.
llvm-svn: 134005
2011-06-28 15:30:42 +00:00
Roman Divacky d041962c20 Fix a few places where 32bit instructions/registerset were used on PPC64.
llvm-svn: 133260
2011-06-17 15:21:10 +00:00
Eli Friedman 164b1d753a PR10136: fix PPCTargetLowering::LowerCall_SVR4 so that a necessary CopyToReg doesn't appear to be dead.
Roman, since you're writing tests for other PPC-SVR4 vararg-related stuff, would you mind writing a test for this?

llvm-svn: 133018
2011-06-14 22:16:20 +00:00
Eric Christopher 0713a9d8fc Add a parameter to CCState so that it can access the MachineFunction.
No functional change.

Part of PR6965

llvm-svn: 132763
2011-06-08 23:55:35 +00:00
Roman Divacky a4a59aebd9 Fix wrong usages of CTR/MCTR where CTR8/MCTR8 was meant.
- Check for MTCTR8 in addition to MTCTR when looking up a hazard.

- When lowering an indirect call use CTR8 when targeting 64bit.

- Introduce BCTR8 that uses CTR8 and use it on 64bit when expanding ISD::BRIND.

The last change fixes PR8487. With those changes, we are able to compile a
running "ls" and "sh" on FreeBSD/PowerPC64.

llvm-svn: 132552
2011-06-03 15:47:49 +00:00
Eric Christopher de9399bf76 Have LowerOperandForConstraint handle multiple character constraints.
Part of rdar://9119939

llvm-svn: 132510
2011-06-02 23:16:42 +00:00
Cameron Zwarich 41025dc95b Make CodeGen/PowerPC/2007-09-11-RegCoalescerAssert.ll pass with the verifier.
llvm-svn: 131627
2011-05-19 03:11:06 +00:00
Eli Friedman 2518f8376d Make the logic for determining function alignment more explicit. No functionality change.
llvm-svn: 131012
2011-05-06 20:34:06 +00:00
Daniel Dunbar cd01ed5bd6 ADT/Triple: Renambe isOSX... methods to isMacOSX for consistency with the OS
triple component.

llvm-svn: 129838
2011-04-20 00:14:25 +00:00
Daniel Dunbar f954a0f028 Target/PPC: Eliminate a use of getDarwinVers().
llvm-svn: 129810
2011-04-19 20:57:03 +00:00
Chris Lattner 0ab5e2cded Fix a ton of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!

llvm-svn: 129558
2011-04-15 05:18:47 +00:00
Jakob Stoklund Olesen 13ce236c4c Insert code in the right location when lowering PowerPC atomics.
This causes defs to dominate uses, no instructions after terminators, and other
goodness.

llvm-svn: 128836
2011-04-04 17:57:29 +00:00
Jakob Stoklund Olesen 7067bff976 Use X0 instead of R0 for the zero register on ppc64.
The 32-bit R0 cannot be used where a 64-bit register is expected.

llvm-svn: 128828
2011-04-04 17:07:06 +00:00
Owen Anderson b2c80da4ae Allow targets to specify a the type of the RHS of a shift parameterized on the type of the LHS.
llvm-svn: 126518
2011-02-25 21:41:48 +00:00
Devang Patel f3292b2196 Revert r124611 - "Keep track of incoming argument's location while emitting LiveIns."
In other words, do not keep track of argument's location.  The debugger (gdb) is not prepared to see line table entries for arguments. For the debugger, "second" line table entry marks beginning of function body.
This requires some coordination with debugger to get this working. 
 - The debugger needs to be aware of prolog_end attribute attached with line table entries.
 - The compiler needs to accurately mark prolog_end in line table entries (at -O0 and at -O1+)

llvm-svn: 126155
2011-02-21 23:21:26 +00:00
Stuart Hastings 81c4306005 Swap VT and DebugLoc operands of getExtLoad() for consistency with
other getNode() methods.  Radar 9002173.

llvm-svn: 125665
2011-02-16 16:23:55 +00:00
Devang Patel 56cc5fdf09 Keep track of incoming argument's location while emitting LiveIns.
llvm-svn: 124611
2011-01-31 21:38:14 +00:00
Jeffrey Yasskin 249fcd4499 Remove unused variables found by gcc-4.6's -Wunused-but-set-variable.
llvm-svn: 123707
2011-01-18 00:51:23 +00:00
Anton Korobeynikov 2f93128109 Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there.
llvm-svn: 123170
2011-01-10 12:39:04 +00:00
Jeffrey Yasskin 9b43f33620 Change all self assignments X=X to (void)X, so that we can turn on a
new gcc warning that complains on self-assignments and
self-initializations.

llvm-svn: 122458
2010-12-23 00:58:24 +00:00
Chris Lattner 3e5fbd74ed rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for
something that just glues two nodes together, even if it is
sometimes used for flags.

llvm-svn: 122310
2010-12-21 02:38:05 +00:00
Wesley Peck 527da1b6e2 Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept.
llvm-svn: 119990
2010-11-23 03:31:01 +00:00
Chris Lattner dd6df84900 convert the operand bits into bitfields since they are all combinable in
different ways.  Add $non_lazy_ptr support, and proper lowering for
global values.

Now all the ppc regression tests pass with the new instruction printer.

llvm-svn: 119106
2010-11-15 03:13:19 +00:00
Chris Lattner edb9d84dcc add targetoperand flags for jump tables, constant pool and block address
nodes to indicate when ha16/lo16 modifiers should be used.  This lets
us pass PowerPC/indirectbr.ll.

The one annoying thing about this patch is that the MCSymbolExpr isn't
expressive enough to represent ha16(label1-label2) which we need on
PowerPC.  I have a terrible hack in the meantime, but this will have
to be revisited at some point.

Last major conversion item left is global variable references.

llvm-svn: 119105
2010-11-15 02:46:57 +00:00
Chris Lattner df8e17d80b implement support for the MO_DARWIN_STUB TargetOperand flag,
and have isel apply to to call operands as required.  This allows
us to get $stub suffixes on label references on ppc/tiger with the
new instprinter, fixing two tests.  Only 2 to go.

llvm-svn: 119093
2010-11-14 23:42:06 +00:00
Duncan Sands 71049f78ed In the calling convention logic, ValVT is always a legal type,
and as such can be represented by an MVT - the more complicated
EVT is not needed.  Use MVT for ValVT everywhere.

llvm-svn: 118245
2010-11-04 10:49:57 +00:00
Duncan Sands f5dda01f33 Inside the calling convention logic LocVT is always a simple
value type, so there is no point in passing it around using
an EVT.  Use the simpler MVT everywhere.  Rather than trying
to propagate this information maximally in all the code that
using the calling convention stuff, I chose to do a mainly
low impact change instead.

llvm-svn: 118167
2010-11-03 11:35:31 +00:00
John Thompson e8360b7182 Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support.
llvm-svn: 117667
2010-10-29 17:29:13 +00:00
Duncan Sands ee4eb2bad1 Remove some variables that are never really used
(gcc-4.6 warns about these).

llvm-svn: 117021
2010-10-21 16:03:28 +00:00
Jakob Stoklund Olesen 6c4353ecee PowerPC varargs functions store live-in registers on the stack. Make sure we use
virtual registers for those stores since RegAllocFast requires that each live
physreg only be used once.

This fixes PR8357.

llvm-svn: 116222
2010-10-11 20:43:09 +00:00
Chris Lattner d10babfd65 fix the expansion of va_arg instruction on PPC to know the arg
alignment for PPC32/64, avoiding some masking operations.

llvm-gcc expands vaarg inline instead of using the instruction
so it has never hit this.

llvm-svn: 116168
2010-10-10 18:34:00 +00:00
Chris Lattner 676c61db0e update a bunch of code to use the MachinePointerInfo version of getStore.
llvm-svn: 114461
2010-09-21 18:41:36 +00:00
Chris Lattner 6963c1f789 eliminate an old SelectionDAG::getTruncStore method, propagating
MachinePointerInfo around more.

llvm-svn: 114452
2010-09-21 17:42:31 +00:00
Chris Lattner 3d178ed4d4 propagate MachinePointerInfo through various uses of the old
SelectionDAG::getExtLoad overload, and eliminate it.

llvm-svn: 114446
2010-09-21 17:04:51 +00:00
Chris Lattner 7727d05dbb convert the targets off the non-MachinePointerInfo of getLoad.
llvm-svn: 114410
2010-09-21 06:44:06 +00:00
Chris Lattner 2510de2bea reimplement memcpy/memmove/memset lowering to use MachinePointerInfo
instead of srcvalue/offset pairs.  This corrects SV info for mem 
operations whose size is > 32-bits.

llvm-svn: 114401
2010-09-21 05:40:29 +00:00
Chris Lattner e3d864b857 convert targets to the new MF.getMachineMemOperand interface.
llvm-svn: 114391
2010-09-21 04:39:43 +00:00
Torok Edwin 31e90d2dd1 Use indirect calls in PowerPC JIT.
See PR5201. There is no way to know if direct calls will be within the allowed
range for BL. Hence emit all calls as indirect when in JIT mode.
Without this long-running applications will fail to JIT on PowerPC with a
relocation failure.

llvm-svn: 110246
2010-08-04 20:47:44 +00:00