Commit Graph

907 Commits

Author SHA1 Message Date
Ulrich Weigand c6fcb7a5de [PowerPC] Fix IsDarwin arg in PPCFrameLowering:: calls
As remarked in the commit message to r211493, in several places
throughout the 64-bit SVR4 ABI code there are calls to
PPCFrameLowering::getLinkageSize and getMinCallFrameSize
using an incorrect IsDarwin argument of "true".

(Some of those were made explicit by the above refactoring patch, others
have been there all along.)

This patch fixes those places to pass "false" for IsDarwin.

No change in generated code expected.

llvm-svn: 211494
2014-06-23 13:21:43 +00:00
Ulrich Weigand 2bffb95915 [PowerPC] Refactor setMinReservedArea and CalculateParameterAndLinkageAreaSize
The PPCISelLowering.cpp routines PPCTargetLowering::setMinReservedArea and
CalculateParameterAndLinkageAreaSize are currently used as subroutines
from both 64-bit SVR4 and Darwin ABI code.

However, the two ABIs are already quite different w.r.t. AltiVec
conventions, and they will become more different when the ELFv2 ABI is
supported.  Also, in general it seems better to disentangle ABI support
routines for different ABIs to avoid accidentally affecting one ABI when
intending to change only the other.

(Actually, the current code strictly speaking already contains a bug:
these routines call PPCFrameLowering::getMinCallFrameSize and
PPCFrameLowering::getLinkageSize with the IsDarwin parameter set to
"true" even on 64-bit SVR4.  This bug currently has no adverse effect
since those routines always return the same for 64-bit SVR4 and 64-bit
Darwin, but it still seems wrong ...  I'll fix this in a follow-up
commit shortly.)

To remove this code sharing, I'm simply inlining both routines into all
call sites (there are just two each, one for 64-bit SVR4 and one for
Darwin), and simplifying due to constant parameters where possible.

A small piece of code that *does* make sense to share is refactored into
the new routine EnsureStackAlignment, now also called from 32-bit SVR4
ABI code.

No change in generated code is expected.

llvm-svn: 211493
2014-06-23 13:08:27 +00:00
Ulrich Weigand 9ba552db89 [PowerPC] Fix on-stack AltiVec arguments with 64-bit SVR4
Current 64-bit SVR4 code seems to have some remnants of Darwin code
in AltiVec argument handing.  This had the effect that AltiVec arguments
(or subsequent arguments) were not correctly placed in the parameter area
in some cases.

The correct behaviour with the 64-bit SVR4 ABI is:
- All AltiVec arguments take up space in the parameter area, just like
  any other arguments, whether vararg or not.
- They are always 16-byte aligned, skipping a parameter area doubleword
  (and the associated GPR, if any), if necessary.

This patch implements the correct behaviour and adds a test case.
(Verified against GCC behaviour via the ABI compat test suite.)

llvm-svn: 211492
2014-06-23 12:36:34 +00:00
Ulrich Weigand 59c6ab20d6 [PowerPC] Fix small argument stack slot offset for LE
When small arguments (structures < 8 bytes or "float") are passed in a
stack slot in the ppc64 SVR4 ABI, they must reside in the least
significant part of that slot.  On BE, this means that an offset needs
to be added to the stack address of the parameter, but on LE, the least
significant part of the slot has the same address as the slot itself.

This changes the PowerPC back-end ABI code to only add the small
argument stack slot offset for BE.  It also adds test cases to verify
the correct behavior on both BE and LE.

llvm-svn: 211368
2014-06-20 16:34:05 +00:00
Ulrich Weigand f460d69ada [PowerPC] Remove unnecessary load of r12 in indirect call
When looking at the 64-bit SVR4 indirect call sequence, I noticed
an unnecessary load of r12.  And indeed the code says:

  // R12 must contain the address of an indirect callee. 

But this is not correct; in the 64-bit SVR4 (ELFv1) ABI, there is
no need to load r12 at this point.  It seems this code and comment
is a remnant of code originally shared with the Darwin ABI ...

This patch simply removes the unnecessary load.

llvm-svn: 211203
2014-06-18 18:33:36 +00:00
Ulrich Weigand ad0cb91ed9 [PowerPC] Simplify and improve loading into TOC register
During an indirect function call sequence on the 64-bit SVR4 ABI,
generate code must load and then restore the TOC register.

This does not use a regular LOAD instruction since the TOC
register r2 is marked as reserved.  Instead, the are two
special instruction patterns:

 let RST = 2, DS = 2 in
 def LDinto_toc: DSForm_1a<58, 0, (outs), (ins g8rc:$reg),
                     "ld 2, 8($reg)", IIC_LdStLD,
                     [(PPCload_toc i64:$reg)]>, isPPC64;
 
 let RST = 2, DS = 10, RA = 1 in
 def LDtoc_restore : DSForm_1a<58, 0, (outs), (ins),
                     "ld 2, 40(1)", IIC_LdStLD,
                     [(PPCtoc_restore)]>, isPPC64;

Note that these not only restrict the destination of the
load to r2, but they also restrict the *source* of the
load to particular address combinations.  The latter is
a problem when we want to support the ELFv2 ABI, since
there the TOC save slot is no longer at 40(1).

This patch replaces those two instructions with a single
instruction pattern that only hard-codes r2 as destination,
but supports generic addresses as source.  This will allow
supporting the ELFv2 ABI, and also helps generate more
efficient code for calls to absolute addresses (allowing
simplification of the ppc64-calls.ll test case).

llvm-svn: 211193
2014-06-18 17:52:49 +00:00
Ulrich Weigand 9aa09ef30f [PowerPC] Do not use BLA with the 64-bit SVR4 ABI
The PowerPC back-end uses BLA to implement calls to functions at
known-constant addresses, which is apparently used for certain
system routines on Darwin.

However, with the 64-bit SVR4 ABI, this is actually incorrect.
An immediate function pointer value on this platform is not
directly usable as a target address for BLA:
- in the ELFv1 ABI, the function pointer value refers to the
  *function descriptor*, not the code address
- in the ELFv2 ABI, the function pointer value refers to the
  global entry point, but BL(A) would only be correct when
  calling the *local* entry point

This bug didn't show up since using immediate function pointer
values is not usually done in the 64-bit SVR4 ABI in the first
place.  However, I ran into this issue with a certain use case
of LLVM as JIT, where immediate function pointer values were
uses to implement callbacks from JITted code to helpers in
statically compiled code.

Fixed by simply not using BLA with the 64-bit SVR4 ABI.

llvm-svn: 211174
2014-06-18 16:14:04 +00:00
Eric Christopher d90a8746df Remove an extraneous this-> to access the subtarget.
llvm-svn: 210849
2014-06-12 22:38:20 +00:00
Eric Christopher b1aaebecb1 Rename PPCSubTarget to Subtarget in PPCTargetLowering for consistency.
Also remove an extra local subtarget in the initialization functions.

llvm-svn: 210848
2014-06-12 22:38:18 +00:00
Bill Schmidt f910a0650e [PPC64LE] Recognize shufflevector patterns for little endian
Various masks on shufflevector instructions are recognizable as
specific PowerPC instructions (vector pack, vector merge, etc.).
There is existing code in PPCISelLowering.cpp to recognize the correct
patterns for big endian code.  The masks for these instructions are
different for little endian code due to the big-endian numbering
employed by these instructions.  This patch adds the recognition code
for little endian.

I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for
this.  The existing recognizer test (vec_shuffle.ll) is unnecessarily
verbose and difficult to read, so I felt it was better to add a new
test rather than modify the old one.

llvm-svn: 210536
2014-06-10 14:35:01 +00:00
Bill Schmidt 6b5a7dfc24 [PPC64LE] Generate correct code for unaligned little-endian vector loads
The code in PPCTargetLowering::PerformDAGCombine() that handles
unaligned Altivec vector loads generates a lvsl followed by a vperm.
As we've seen in numerous other places, the vperm instruction has a
big-endian bias, and this is fixed for little endian by complementing
the permute control vector and swapping the input operands.  In this
case the lvsl is providing the permute control vector.  Rather than
generating an lvsl and a complement operation, it is sufficient to
generate an lvsr instruction instead.  Thus for LE code generation we
will generate an lvsr rather than an lvsl, and swap the other input
arguments on the vperm.

The existing test/CodeGen/PowerPC/vec_misalign.ll is updated to test
the code generation for PPC64 and PPC64LE, in addition to the existing
PPC32/G5 testing.

llvm-svn: 210493
2014-06-09 22:00:52 +00:00
Bill Schmidt 42995e8c74 [PPC64LE] Generate correct little-endian code for v16i8 multiply
The existing code in PPCTargetLowering::LowerMUL() for multiplying two
v16i8 values assumes that vector elements are numbered in big-endian
order.  For little-endian targets, the vector element numbering is
reversed, but the vmuleub, vmuloub, and vperm instructions still
assume big-endian numbering.  To account for this, we must adjust the
permute control vector and reverse the order of the input registers on
the vperm instruction.

The existing test/CodeGen/PowerPC/vec_mul.ll is updated to be executed
on powerpc64 and powerpc64le targets as well as the original powerpc
(32-bit) target.

llvm-svn: 210474
2014-06-09 16:06:29 +00:00
Bill Schmidt 4aedff8995 [PPC64LE] Fix lowering of BUILD_VECTOR and SHUFFLE_VECTOR for little endian
This patch fixes a couple of lowering issues for little endian
PowerPC.  The code for lowering BUILD_VECTOR contains a number of
optimizations that are only valid for big endian.  For now, we disable
those optimizations for correctness.  In the future, we will add
analogous optimizations that are correct for little endian.

When lowering a SHUFFLE_VECTOR to a VPERM operation, we again need to
make the now-familiar transformation of swapping the input operands
and complementing the permute control vector.  Correctness of this
transformation is tested by the accompanying test case.

llvm-svn: 210336
2014-06-06 14:06:26 +00:00
Eric Christopher a84189a2e4 Omit else branch after return.
llvm-svn: 210034
2014-06-02 17:29:07 +00:00
Eric Christopher 8995833a34 Have the TLOF creation take a Triple rather than needing a subtarget.
llvm-svn: 209937
2014-05-31 00:07:32 +00:00
Eric Christopher 5435224a53 isSVR4ABI() returned !isDarwin() so just move that to the else
block and remove the unreachable code.

llvm-svn: 209927
2014-05-30 22:47:53 +00:00
Eric Christopher 174c662b7c Rename CreateTLOF->createTLOF to match the rest of the file and the
rest of the targets with a similar function name.

llvm-svn: 209926
2014-05-30 22:47:48 +00:00
Bill Schmidt 71dddd51d9 [PATCH] Correct type used for VADD_SPLAT optimization on PowerPC
In PPCISelLowering.cpp: PPCTargetLowering::LowerBUILD_VECTOR(), there
is an optimization for certain patterns to generate one or two vector
splats followed by a vector add or subtract.  This operation is
represented by a VADD_SPLAT in the selection DAG.  Prior to this
patch, it was possible for the VADD_SPLAT to be assigned the wrong
data type, causing incorrect code generation.  This patch corrects the
problem.

Specifically, the code previously assigned the value type of the
BUILD_VECTOR node to the newly generated VADD_SPLAT node.  This is
correct much of the time, but not always.  The problem is that the
call to isConstantSplat() may return a SplatBitSize that is not the
same as the number of bits in the original element vector type.  The
correct type to assign is a vector type with the same element bit size
as SplatBitSize.

The included test case shows an example of this, where the
BUILD_VECTOR node has a type of v16i8.  The vector to be built is {0,
16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16}.  isConstantSplat
detects that we can generate a splat of 16 for type v8i16, which is
the type we must assign to the VADD_SPLAT node.  If we do not, we
generate a vspltisb of 8 and a vaddubm, which generates the incorrect
result {16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
16}.  The correct code generation is a vspltish of 8 and a vadduhm.

This patch also corrected code generation for
CodeGen/PowerPC/2008-07-10-SplatMiscompile.ll, which had been marked
as an XFAIL, so we can remove the XFAIL from the test case.

llvm-svn: 209662
2014-05-27 15:57:51 +00:00
Adam Nemet 571eb5fc91 [PowerPC] PR19796: Also match ISD::TargetConstant in isIntS16Immediate
The SplitIndexingFromLoad changes exposed a latent isel bug in the PowerPC64
backend.  We matched an immediate offset with STWX8 even though it only
supports register offset.

The culprit is the complex-pattern predicate, SelectAddrIdx, which decides
that if the offset is not ISD::Constant it must be a register.

Many thanks to Bill Schmidt for testing this.

llvm-svn: 209219
2014-05-20 17:20:34 +00:00
Benjamin Kramer f3ad23551d SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not.
- On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
- On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
- On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.

llvm-svn: 209123
2014-05-19 13:12:38 +00:00
Saleem Abdulrasool f3a5a5c546 Target: remove old constructors for CallLoweringInfo
This is mostly a mechanical change changing all the call sites to the newer
chained-function construction pattern.  This removes the horrible 15-parameter
constructor for the CallLoweringInfo in favour of setting properties of the call
via chained functions.  No functional change beyond the removal of the old
constructors are intended.

llvm-svn: 209082
2014-05-17 21:50:17 +00:00
Jay Foad a0653a3e6c Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Hal Finkel 0d8db46799 [PowerPC] Add global named register support
Support for the intrinsics that read from and write to global named registers
is added for r1, r2 and r13 (depending on the subtarget).

llvm-svn: 208509
2014-05-11 19:29:11 +00:00
Craig Topper 2d2aa0ca1f Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I introduced most of these recently.
llvm-svn: 207616
2014-04-30 07:17:30 +00:00
Craig Topper 8c0b4d0791 Convert more SelectionDAG functions to use ArrayRef.
llvm-svn: 207397
2014-04-28 05:57:50 +00:00
Craig Topper e73658ddbb [C++] Use 'nullptr'.
llvm-svn: 207394
2014-04-28 04:05:08 +00:00
Craig Topper 64941d9786 Convert SelectionDAG::getMergeValues to use ArrayRef.
llvm-svn: 207374
2014-04-27 19:20:57 +00:00
Craig Topper 206fcd450a Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size.
llvm-svn: 207329
2014-04-26 19:29:41 +00:00
Craig Topper 48d114bed1 Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.
llvm-svn: 207327
2014-04-26 18:35:24 +00:00
Craig Topper 062a2baef0 [C++] Use 'nullptr'. Target edition.
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Reid Kleckner 5772b77789 Add 'musttail' marker to call instructions
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur.  It also comes with convervative IR
verification rules that ensure that tail call optimization is possible.

Reviewers: nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D3240

llvm-svn: 207143
2014-04-24 20:14:34 +00:00
Nick Lewycky aad475b324 Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead.
llvm-svn: 206255
2014-04-15 07:22:52 +00:00
Hal Finkel 34974ed503 [PowerPC] Implement some additional TLI callbacks
Add implementations of:
  bool isLegalICmpImmediate(int64_t Imm) const
  bool isLegalAddImmediate(int64_t Imm) const
  bool isTruncateFree(Type *Ty1, Type *Ty2) const
  bool isTruncateFree(EVT VT1, EVT VT2) const
  bool shouldConvertConstantLoadToIntImm(const APInt &Imm, Type *Ty) const

Unfortunately, this regresses counter-register-based loop formation because
some of the loops now end up in forms were SE cannot compute loop counts.
However, nevertheless, the test-suite results favor committing:

SingleSource/Benchmarks/BenchmarkGame/puzzle: 26% speedup
MultiSource/Benchmarks/FreeBench/analyzer/analyzer: 21% speedup
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan: 20% speedup
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv: 19% speedup
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv: 15% speedup
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2: 2% speedup

MultiSource/Benchmarks/VersaBench/bmm/bmm: 26% slowdown

llvm-svn: 206120
2014-04-12 21:52:38 +00:00
Hal Finkel a775e51274 [PowerPC] Don't return false from PPC::isVSLDOIShuffleMask
PPC::isVSLDOIShuffleMask should return -1, not false, when the shuffle
predicate should be false.

Noticed by inspection; no test case (yet).

llvm-svn: 205787
2014-04-08 19:00:27 +00:00
Craig Topper 840beec2d0 Make consistent use of MCPhysReg instead of uint16_t throughout the tree.
llvm-svn: 205610
2014-04-04 05:16:06 +00:00
Hal Finkel b4240ca0f4 [PowerPC] Don't ever expand BUILD_VECTOR of v2i64 with shuffles
If we have two unique values for a v2i64 build vector, this will always result
in two vector loads if we expand using shuffles. Only one is necessary.

llvm-svn: 205231
2014-03-31 17:48:16 +00:00
Hal Finkel 5c0d1454d6 [PowerPC] Handle VSX v2i64 SIGN_EXTEND_INREG
sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64 node
(and similarly for v2i16 and v2i8). Even though there are no sign-extension (or
algebraic shifts) for v2i64 types, we can handle v2i32 sign extensions by
converting two and from v2i64. The small trick necessary here is to shift the
i32 elements into the right lanes before the i32 -> f64 step. This is because
of the big Endian nature of the system, we need the i32 portion in the high
word of the i64 elements.

For v2i16 and v2i8 we can do the same, but we first use the default Altivec
shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to v4i32) and
then apply the above procedure.

llvm-svn: 205146
2014-03-30 13:22:59 +00:00
Hal Finkel 777c9dd90a [PowerPC] Handle v2i64 comparisons
v2i64 is a legal type under VSX, however we don't have native vector
comparisons. We can handle eq/ne by casting it to an Altivec type, but
everything else must be expanded.

llvm-svn: 205106
2014-03-29 16:04:40 +00:00
Hal Finkel 19be506a5e [PowerPC] Add subregister classes for f64 VSX values
We had stored both f64 values and v2f64, etc. values in the VSX registers. This
worked, but was suboptimal because we would always spill 16-byte values even
through we almost always had scalar 8-byte values. This resulted in an
increase in stack-size use, extra memory bandwidth, etc. To fix this, I've
added 64-bit subregisters of the Altivec registers, and combined those with the
existing scalar floating-point registers to form a class of VSX scalar
floating-point registers. The ABI code has also been enhanced to use this
register class and some other necessary improvements have been made.

llvm-svn: 205075
2014-03-29 05:29:01 +00:00
Hal Finkel 7811c6188e [PowerPC] v2[fi]64 need to be explicitly passed in VSX registers
v2[fi]64 values need to be explicitly passed in VSX registers. This is because
the code in TRI that finds the minimal register class given a register and a
value type will assert if given an Altivec register and a non-Altivec type.

llvm-svn: 205041
2014-03-28 19:58:11 +00:00
Hal Finkel 82569b6366 [PowerPC] Fix v2f64 vector extract and related patterns
First, v2f64 vector extract had not been declared legal (and so the existing
patterns were not being used). Second, the patterns for that, and for
scalar_to_vector, should really be a regclass copy, not a subregister
operation, because the VSX registers directly hold both the vector and scalar data.

llvm-svn: 204971
2014-03-27 22:22:48 +00:00
Hal Finkel ad801b7459 [PowerPC] Expand v2i64 shifts
These operations need to be expanded during legalization so that isel does not
crash. In theory, we might be able to custom lower some of these. That,
however, would need to be follow-up work.

llvm-svn: 204963
2014-03-27 21:26:33 +00:00
Hal Finkel df3e34d944 [PowerPC] Generate VSX permutations for v2[fi]64 vectors
llvm-svn: 204873
2014-03-26 22:58:37 +00:00
Hal Finkel 6e28e6aaaf [PowerPC] VSX loads and stores support unaligned access
I've not yet updated PPCTTI because I'm not sure what the actual relative cost
is compared to the aligned uses.

llvm-svn: 204848
2014-03-26 19:39:09 +00:00
Hal Finkel 7279f4b00d [PowerPC] Use v2f64 <-> v2i64 VSX conversion instructions
llvm-svn: 204843
2014-03-26 19:13:54 +00:00
Hal Finkel 9281c9a38b [PowerPC] Use VSX vector load/stores for v2[fi]64
These instructions have access to the complete VSX register file. In addition,
they "swap" the order of the elements so that element 0 (the scalar part) comes
first in memory and element 1 follows at a higher address.

llvm-svn: 204838
2014-03-26 18:26:30 +00:00
Hal Finkel a6c8b51212 [PowerPC] Add v2i64 as a legal VSX type
v2i64 needs to be a legal VSX type because it is the SetCC result type from
v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations.

This fixes the lowering for v2f64 VSELECT.

llvm-svn: 204828
2014-03-26 16:12:58 +00:00
Hal Finkel 732f0f73a7 [PowerPC] Lower VSELECT using xxsel when VSX is available
With VSX there is a real vector select instruction, and so we should use it.
Note that VSELECT will still scalarize for v2f64 because the corresponding
SetCC result type (v2i64) is not currently a legal type.

llvm-svn: 204801
2014-03-26 12:49:28 +00:00
Hal Finkel 27774d9274 [PowerPC] Initial support for the VSX instruction set
VSX is an ISA extension supported on the POWER7 and later cores that enhances
floating-point vector and scalar capabilities. Among other things, this adds
<2 x double> support and generally helps to reduce register pressure.

The interesting part of this ISA feature is the register configuration: there
are 64 new 128-bit vector registers, the 32 of which are super-registers of the
existing 32 scalar floating-point registers, and the second 32 of which overlap
with the 32 Altivec vector registers. This makes things like vector insertion
and extraction tricky: this can be free but only if we force a restriction to
the right register subclass when needed. A new "minipass" PPCVSXCopy takes care
of this (although it could do a more-optimal job of it; see the comment about
unnecessary copies below).

Please note that, currently, VSX is not enabled by default when targeting
anything because it is not yet ready for that.  The assembler and disassembler
are fully implemented and tested. However:

 - CodeGen support causes miscompiles; test-suite runtime failures:
      MultiSource/Benchmarks/FreeBench/distray/distray
      MultiSource/Benchmarks/McCat/08-main/main
      MultiSource/Benchmarks/Olden/voronoi/voronoi
      MultiSource/Benchmarks/mafft/pairlocalalign
      MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
      SingleSource/Benchmarks/CoyoteBench/almabench
      SingleSource/Benchmarks/Misc/matmul_f64_4x4

 - The lowering currently falls back to using Altivec instructions far more
   than it should. Worse, there are some things that are scalarized through the
   stack that shouldn't be.

 - A lot of unnecessary copies make it past the optimizers, and this needs to
   be fixed.

 - Many more regression tests are needed.

Normally, I'd fix these things prior to committing, but there are some
students and other contributors who would like to work this, and so it makes
sense to move this development process upstream where it can be subject to the
regular code-review procedures.

llvm-svn: 203768
2014-03-13 07:58:58 +00:00
Hal Finkel 7f908e8ef4 Fixup PPC Darwin i1 argument handling
Like on other targets, we need to zero_extend/truncate i1 args before copying
them to GPRs.

llvm-svn: 203045
2014-03-06 00:45:19 +00:00
Hal Finkel 2a9d318e4a When using CR bit registers on PPC32, handle the i1 vaarg case
When copying an i1 value into a GPR for a vaarg call, we need to explicitly
zero-extend the i1 value (otherwise an invalid CRBIT -> GPR copy will be
generated).

llvm-svn: 203041
2014-03-06 00:23:33 +00:00
Hal Finkel 6a56b21729 With PPC CR bit registers, handle int_to_fp on older cores
On cores without fpcvt support, we cannot promote int_to_fp i1 operations,
because there is nothing to promote them to. The most straightforward
implementation of this uses a select to choose between the two possible
resulting floating-point values (and that's what is done here).

llvm-svn: 203015
2014-03-05 22:14:00 +00:00
Hal Finkel 6aca2373f2 Add a PPC inline asm constraint type for single CR bits
Now that the PowerPC backend can track individual CR bits as first-class
registers, we should also have a way of allocating them for inline asm
statements. Because these registers are only one bit, if an output variable is
implicitly cast to a larger integer size, we'll get an any_extend to that
larger type (this is part of the existing target-independent logic). As a
result, regardless of the size of the output type, only the first bit is
meaningful.

The constraint identifier "wc" has been chosen for this purpose. Although gcc
does not currently support allocating individual CR bits, this identifier
choice has been coordinated with the gcc PowerPC team, and will be marked as
reserved for this purpose in the gcc constraints.md file.

llvm-svn: 202657
2014-03-02 18:23:39 +00:00
Benjamin Kramer b6d0bd48bd [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

llvm-svn: 202636
2014-03-02 12:27:27 +00:00
Hal Finkel 46043edc56 Remove extra truncs/exts around i32 bit operations on PPC64
This generalizes the code to eliminate extra truncs/exts around i1 bit
operations to also do the same on PPC64 for i32 bit operations. This eliminates
a fairly prevalent code wart:

int foo(int a) {
  return a == 5 ? 7 : 8;
}

On PPC64, because of the extension implied by the ABI, this would generate:

	cmplwi 0, 3, 5
	li 12, 8
	li 4, 7
	isel 3, 4, 12, 2
	rldicl 3, 3, 0, 32
	blr

where the 'rldicl 3, 3, 0, 32', the extension, is completely unnecessary. At
least for the single-BB case (which is all that the DAG combine mechanism can
handle), this unnecessary extension is no longer generated.

llvm-svn: 202600
2014-03-01 21:36:57 +00:00
Hal Finkel 5cae2168c7 Trying to unbreak the darwin11 builder
The CR bit tracking code broke PPC/Darwin; trying to get it working again...

(the darwin11 builder, which defaults to the darwin ABI when running PPC tests,
asserted when running test/CodeGen/PowerPC/inverted-bool-compares.ll)

llvm-svn: 202459
2014-02-28 01:17:25 +00:00
Hal Finkel 940ab934d4 Add CR-bit tracking to the PowerPC backend for i1 values
This change enables tracking i1 values in the PowerPC backend using the
condition register bits. These bits can be treated on PowerPC as separate
registers; individual bit operations (and, or, xor, etc.) are supported.
Tracking booleans in CR bits has several advantages:

 - Reduction in register pressure (because we no longer need GPRs to store
   boolean values).

 - Logical operations on booleans can be handled more efficiently; we used to
   have to move all results from comparisons into GPRs, perform promoted
   logical operations in GPRs, and then move the result back into condition
   register bits to be used by conditional branches. This can be very
   inefficient, because the throughput of these CR <-> GPR moves have high
   latency and low throughput (especially when other associated instructions
   are accounted for).

 - On the POWER7 and similar cores, we can increase total throughput by using
   the CR bits. CR bit operations have a dedicated functional unit.

Most of this is more-or-less mechanical: Adjustments were needed in the
calling-convention code, support was added for spilling/restoring individual
condition-register bits, and conditional branch instruction definitions taking
specific CR bits were added (plus patterns and code for generating bit-level
operations).

This is enabled by default when running at -O2 and higher. For -O0 and -O1,
where the ability to debug is more important, this feature is disabled by
default. Individual CR bits do not have assigned DWARF register numbers,
and storing values in CR bits makes them invisible to the debugger.

It is critical, however, that we don't move i1 values that have been promoted
to larger values (such as those passed as function arguments) into bit
registers only to quickly turn around and move the values back into GPRs (such
as happens when values are returned by functions). A pair of target-specific
DAG combines are added to remove the trunc/extends in:
  trunc(binary-ops(binary-ops(zext(x), zext(y)), ...)
and:
  zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...)
In short, we only want to use CR bits where some of the i1 values come from
comparisons or are used by conditional branches or selects. To put it another
way, if we can do the entire i1 computation in GPRs, then we probably should
(on the POWER7, the GPR-operation throughput is higher, and for all cores, the
CR <-> GPR moves are expensive).

POWER7 test-suite performance results (from 10 runs in each configuration):

SingleSource/Benchmarks/Misc/mandel-2: 35% speedup
MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup
MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup
SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup
SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup
SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup

SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown
MultiSource/Applications/lemon/lemon: 8% slowdown

llvm-svn: 202451
2014-02-28 00:27:01 +00:00
Matt Arsenault 25793a3f22 Add address space argument to allowsUnalignedMemoryAccess.
On R600, some address spaces have more strict alignment
requirements than others.

llvm-svn: 200887
2014-02-05 23:15:53 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Hal Finkel 3e4a34c8c3 Fix pointer info on PPC byval stores
For PPC64 SVR (and Darwin), the stores that take byval aggregate parameters
from registers into the stack frame had MachinePointerInfo objects with
incorrect offsets. These offsets are relative to the object itself, not to the
stack frame base.

This fixes self hosting on PPC64 when compiling with -enable-aa-sched-mi.

llvm-svn: 199763
2014-01-21 20:15:58 +00:00
Bill Wendling 13199b17f8 Remove unnecessary #includes.
llvm-svn: 198585
2014-01-06 06:00:00 +00:00
Bill Wendling 908bf814e7 Refactor function that checks that __builtin_returnaddress's argument is constant.
This moves the check up into the parent class so that all targets can use it
without having to copy (and keep in sync) the same error message.

llvm-svn: 198579
2014-01-06 00:43:20 +00:00
Bill Wendling df7dd28dc8 Emit an error message if the value passed to __builtin_returnaddress isn't a constant
__builtin_returnaddress requires that the value passed into is be a constant.
However, at -O0 even a constant expression may not be converted to a constant.
Emit an error message intead of crashing.

llvm-svn: 198531
2014-01-05 01:47:20 +00:00
Roman Divacky 32143e2bda Implement initial-exec TLS for PPC32.
llvm-svn: 197824
2013-12-20 18:08:54 +00:00
Alp Toker f907b891da Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

llvm-svn: 196471
2013-12-05 05:44:44 +00:00
Bill Schmidt cea1596205 [PowerPC] Fix PR17354: Generate nop after local calls for PIC code.
When generating code for shared libraries, even local calls may be
intercepted, so we need a nop after the call for the linker to fix up the
TOC.  Test case adapted from the one provided in PR17354.

llvm-svn: 191440
2013-09-26 17:09:28 +00:00
Bill Schmidt bdae03f227 [PowerPC] Add a FIXME.
Documenting a design choice to generate only medium model sequences for TLS
addresses at this time.  Small and large code models could be supported if
necessary.

llvm-svn: 190883
2013-09-17 20:22:05 +00:00
Hal Finkel 40c34781b5 PPC: Don't restrict lvsl generation to after type legalization
This is a re-commit of r190764, with an extra check to make sure that we're not
performing the transformation on illegal types (a small test case has been
added for this as well).

Original commit message:

The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.

llvm-svn: 190771
2013-09-15 22:09:58 +00:00
Hal Finkel 31025a6325 Revert r190764: PPC: Don't restrict lvsl generation to after type legalization
This is causing test-suite failures.

Original commit message:

The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.

llvm-svn: 190765
2013-09-15 15:41:11 +00:00
Hal Finkel 2945d4e916 PPC: Don't restrict lvsl generation to after type legalization
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.

llvm-svn: 190764
2013-09-15 15:20:54 +00:00
Hal Finkel c3cfbf8677 Add missing break statement in PPCISelLowering
As it turns out, not a problem in practice, but it should be there.

llvm-svn: 190720
2013-09-13 20:09:02 +00:00
Chandler Carruth 51428e363f Remove an unused variable, fixing -Werror build with latest Clang.
llvm-svn: 190640
2013-09-12 23:30:48 +00:00
Hal Finkel 262a224712 Fix PPC ABI for ByVal structs with vector members
When a structure is passed by value, and that structure contains a vector
member, according to the PPC ABI, the structure will receive enhanced alignment
(so that the vector within the structure will always be aligned).

This should resolve PR16641.

llvm-svn: 190636
2013-09-12 23:20:06 +00:00
Hal Finkel 1e2e3ea584 Make the PPC fast-math sqrt expansion safe at 0
In fast-math mode sqrt(x) is calculated using the fast expansion of the
reciprocal of the reciprocal sqrt expansion. The reciprocal and reciprocal
sqrt expansions use the associated estimate instructions along with some Newton
iterations. Unfortunately, as a result, sqrt(0) was being calculated as NaN,
which is not correct. Now we explicitly return a result of zero if the input is
zero.

llvm-svn: 190624
2013-09-12 19:04:12 +00:00
Hal Finkel 21442b24fb Enable MI scheduling (and CodeGen AA) by default for embedded PPC cores
For embedded PPC cores (especially the A2 core), using the MI scheduler with AA
is far superior to the other scheduling options.

llvm-svn: 190558
2013-09-11 23:05:25 +00:00
Bill Schmidt 8470b0f96c [PowerPC] Call support for fast-isel.
This patch adds fast-isel support for calls (but not intrinsic calls
or varargs calls).  It also removes a badly-formed assert.  There are
some new tests just for calls, and also for folding loads into
arguments on calls to avoid extra extends.

llvm-svn: 189701
2013-08-30 22:18:55 +00:00
Bill Schmidt ccecf26157 [PowerPC] Add loads, stores, and related things to fast-isel.
This is the next big chunk of fast-isel code.  The primary purpose is
to implement selection of loads and stores, but there is a lot of
drag-along to support this.  The common code to analyze addresses for
both loads and stores is substantial.  It's also necessary to add the
materialization code for global values.

Related to load-store processing is the code to fold loads into
integer extends, since otherwise we generate lots of redundant
instructions.  We also need to add some overrides to some FastEmit
routines to ensure we don't assign GPR 0 to a virtual register when
this would change the meaning of an instruction.

I added handling selection of a few binary arithmetic instructions, to
enable committing some test cases I wrote a while back.

Finally, ap couple of miscellaneous changes:
 * I cleaned up some poor style from a previous patch in
   PPCISelLowering.cpp, pointed out by David Blaikie.
 * I enlarged the Addr.Offset field to avoid sign problems with 32-bit
   offsets. 

llvm-svn: 189636
2013-08-30 02:29:45 +00:00
Bill Schmidt 8c3976eca1 Dummy code to silence warning from 4189266
llvm-svn: 189272
2013-08-26 20:11:46 +00:00
Hal Finkel dbc78e1f73 Add the PPC fcpsgn instruction
Modern PPC cores support a floating-point copysign instruction, and we can use
this to lower the FCOPYSIGN node (which is created from calls to the libm
copysign function). A couple of extra patterns are necessary because the
operand types of FCOPYSIGN need not agree.

llvm-svn: 188653
2013-08-19 05:01:02 +00:00
Craig Topper 5671010cbb Replace getValueType().getSimpleVT() with getSimpleValueType(). Also remove one weird cast from MVT->EVT just to call getSimpleVT().
llvm-svn: 188441
2013-08-15 02:33:50 +00:00
Hal Finkel b3ca00d2a3 Actually fix PPC64 64-bit GPR inline asm constraint matching
This is a follow-up to r187693, correcting that code to request the correct
register class. The previous version, with the wrong register class, was not
really correcting the constraints, but rather was removing them. Coincidentally,
this fixed the failing test case in r187693, but obviously created other
problems.

llvm-svn: 188407
2013-08-14 20:05:04 +00:00
Hal Finkel 2b7b2f373b PPC: Map frin to round() not nearbyint() and rint()
Making use of the recently-added ISD::FROUND, which allows for custom lowering
of round(), the PPC backend will now map frin to round(). Previously, we had
been using frin to lower nearbyint() (and rint() via some custom lowering to
handle the extra fenv flags requirements), but only in fast-math mode because
frin does not tie-to-even. Several users had complained about this behavior,
and this new mapping of frin to round is certainly more appropriate (and does
not require fast-math mode).

In effect, this reverts r178362 (and part of r178337, replacing the nearbyint
mapping with the round mapping).

llvm-svn: 187960
2013-08-08 04:31:34 +00:00
Hal Finkel b176acb6b7 Fix PPC64 64-bit GPR inline asm constraint matching
Internally, the PowerPC backend names the 32-bit GPRs R[0-9]+, and names the
64-bit parent GPRs X[0-9]+. When matching inline assembly constraints with
explicit register names, on PPC64 when an i64 MVT has been requested, we need
to follow gcc's convention of using r[0-9]+ to refer to the 64-bit (parent)
registers.

At some point, we'll probably want to arrange things so that the generic code
in TargetLowering uses the AsmName fields declared in *RegisterInfo.td in order
to match these inline asm register constraints. If we do that, this change can
be reverted.

llvm-svn: 187693
2013-08-03 12:25:10 +00:00
Bill Schmidt 0cf702fa61 [PowerPC] Skeletal FastISel support for 64-bit PowerPC ELF.
This is the first of many upcoming patches for PowerPC fast
instruction selection support.  This patch implements the minimum
necessary for a functional (but extremely limited) FastISel pass.  It
allows the table-generated portions of the selector to be created and
used, but in most cases selection will fall back to the DAG selector.
None of the block terminator instructions are implemented yet, and
most interesting instructions require some special handling.
Therefore there aren't any new test cases with this patch.  There will
be quite a few tests coming with future patches.

This patch adds the make/CMake support for the new code (including
tablegen -gen-fast-isel) and creates the FastISel object for PPC64 ELF
only.  It instantiates the necessary virtual functions
(TargetSelectInstruction, TargetMaterializeConstant,
TargetMaterializeAlloca, tryToFoldLoadIntoMI, and FastLowerArguments),
but of these, only TargetMaterializeConstant contains any useful
implementation.  This is present since the table-generated code
requires the ability to materialize integer constants for some
instructions.

This patch has been tested by building and running the
projects/test-suite code with -O0.  All tests passed with the
exception of a couple of long-running tests that time out using -O0
code generation.

llvm-svn: 187399
2013-07-30 00:50:39 +00:00
Roman Divacky c3825df87e PPC32 va_list is an actual structure so va_copy needs to copy the whole
structure not just a pointer. This implements that and thus fixes va_copy
on PPC32. Fixes #15286. Both bug and patch by Florian Zeitz!

llvm-svn: 187158
2013-07-25 21:36:47 +00:00
Hal Finkel f05d6c7843 PPC: Add base-pointer support to builtin setjmp/longjmp
First, this changes the base-pointer implementation to remove an unnecessary
complication (and one that is incompatible with how builtin SjLj is
implemented): instead of using r31 as the base pointer when it is not needed as
a frame pointer, now the base pointer will always be r30 when needed.

Second, we introduce another pseudo register, BP, which is used just like the FP
pseudo register to refer to the base register before we know for certain what
register it will be.

Third, we now save BP into the jmp_buf, and restore r30 from that slot in
longjmp.  If the function that called setjmp did not use a base pointer, then
r30 will be overwritten by the setjmp-calling-function's restore code. FP
restoration (which is restored into r31) works the same way.

llvm-svn: 186545
2013-07-17 23:50:51 +00:00
Craig Topper b94011fd28 Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
Hal Finkel 7ab3db52d3 PPC: Add a better comment about the i64 FI fixup
In discussing this change with Bill Schmidt, it was decided that the original
comment about negative FIs was incorrect. We'll still exclude them for now, but
now with a more-accurate explanation.

llvm-svn: 186005
2013-07-10 15:29:01 +00:00
Bill Schmidt 4122169308 [PowerPC] Better fix for PR16556.
A more complete example of the bug in PR16556 was recently provided,
showing that the previous fix was not sufficient.  The previous fix is
reverted herein.

The real problem is that ReplaceNodeResults() uses LowerFP_TO_INT as
custom lowering for FP_TO_SINT during type legalization, without
checking whether the input type is handled by that routine.
LowerFP_TO_INT requires the input to be f32 or f64, so we fail when
the input is ppcf128.

I'm leaving the test case from the initial fix (r185821) in place, and
adding the new test as another crash-only check.

llvm-svn: 185959
2013-07-09 18:50:20 +00:00
Stephen Lin 73de7bf5de AArch64/PowerPC/SystemZ/X86: This patch fixes the interface, usage, and all
in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in
order to resolve the following issues with fmuladd (i.e. optional FMA)
intrinsics:

1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd
intrinsics even if the subtarget does not support FMA instructions, leading
to laughably bad code generation in some situations.

2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128,
resulting in a call to a software fp128 FMA implementation.

3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types
like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize,
etc. to types that support hardware FMAs.

The function has also been slightly renamed for consistency and to force a
merge/build conflict for any out-of-tree target implementing it. To resolve,
see comments and fixed in-tree examples.

llvm-svn: 185956
2013-07-09 18:16:56 +00:00
Hal Finkel dbbf09b28e PPC: Allocate RS spill slot for unaligned i64 load/store
This fixes another bug found by llvm-stress!

If we happen to be doing an i64 load or store into a stack slot that has less
than a 4-byte alignment, then the frame-index elimination may need to use an
indexed load or store instruction (because the offset may not be a multiple of
4, a requirement of the STD/LD instructions). The extra register needed to hold
the offset comes from the register scavenger, and it is possible that the
scavenger will need to use an emergency spill slot. As a result, we need to
make sure that a spill slot is allocated when doing an i64 load/store into a
less-than-4-byte-aligned stack slot.

Because test cases for things like this tend to be fairly fragile, I've
concatenated a few small bugpoint-reduced test cases together to form the
regression test.

llvm-svn: 185907
2013-07-09 06:34:51 +00:00
Hal Finkel 21ada79757 PPC: Mark vector CC action for SETO and SETONE as Expand
Another bug found by llvm-stress! This fixes hitting
  llvm_unreachable("Invalid integer vector compare condition");
at the end of getVCmpInst in PPCISelDAGToDAG.

llvm-svn: 185855
2013-07-08 20:00:03 +00:00
Hal Finkel e39302258e PPC: Mark vector FREM as Expand by default
Another bug found by llvm-stress! This fixes crashing with:
  LLVM ERROR: Cannot select: v4f32 = frem ...

llvm-svn: 185840
2013-07-08 17:30:25 +00:00
Bill Schmidt 2db29ef467 [PowerPC] Fix PR16556 (handle undef ppcf128 in LowerFP_TO_INT).
PPCTargetLowering::LowerFP_TO_INT() expects its source operand to be
either an f32 or f64, but this is not checked.  A long double
(ppcf128) operand will normally be custom-lowered to a conversion to
f64 in this context.  However, this isn't the case for an UNDEF node.

This patch recognizes a ppcf128 as a legal source operand for
FP_TO_INT only if it's an undef, in which case it creates an undef of
the target type.

At some point we might want to do a wholesale custom lowering of
ISD::UNDEF when the type is ppcf128, but it's not really clear that's
a great idea, and probably more work than it's worth for a situation
that only arises in the case of a programming error.  At this point I
think simple is best.

The test case comes from PR16556, and is a crash-test only.

llvm-svn: 185821
2013-07-08 14:22:45 +00:00
Ulrich Weigand 5b427591d6 [PowerPC] Support @tls in the asm parser
This adds support for the last missing construct to parse TLS-related
assembler code:
   add 3, 4, symbol@tls

The ADD8TLS currently hard-codes the @tls into the assembler string.
This cannot be handled by the asm parser, since @tls is parsed as
a symbol variant.  This patch changes ADD8TLS to have the @tls suffix
printed as symbol variant on output too, which allows us to remove
the isCodeGenOnly marker from ADD8TLS.  This in turn means that we
can add a AsmOperand to accept @tls marked symbols on input.

As a side effect, this means that the fixup_ppc_tlsreg fixup type
is no longer necessary and can be merged into fixup_ppc_nofixup.

llvm-svn: 185692
2013-07-05 12:22:36 +00:00
Jakob Stoklund Olesen db429d9483 Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes.
These exception-related opcodes are not used any longer.

llvm-svn: 185625
2013-07-04 13:54:20 +00:00
Jakob Stoklund Olesen a1f5b901a5 Revert r185595-185596 which broke buildbots.
Revert "Simplify landing pad lowering."
Revert "Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes."

llvm-svn: 185600
2013-07-04 00:26:30 +00:00
Jakob Stoklund Olesen f33ec531fa Remove the EXCEPTIONADDR, EHSELECTION, and LSDAADDR ISD opcodes.
These exception-related opcodes are not used any longer.

llvm-svn: 185596
2013-07-03 23:56:31 +00:00
Ulrich Weigand d5ebc626d5 [PowerPC] Always use mfocrf if available
When accessing just a single CR register, it is always preferable to
use mfocrf instead of mfcr, if the former is available on the CPU.

Current code makes that distinction in many, but not all places
where a single CR register value is retrieved.  One missing
location is PPCRegisterInfo::lowerCRSpilling.

To fix this and make this simpler in the future, this patch changes
the bulk of the back-end to always assume mfocrf is available and
simply generate it when needed.

On machines that actually do not support mfocrf, the instruction
is replaced by mfcr at the very end, in EmitInstruction.

This has the additional benefit that we no longer need the
MFCRpseud hack, since before EmitInstruction we always have
a MFOCRF instruction pattern, which already models data flow
as required.

The patch also adds the MFOCRF8 version of the instruction,
which was missing so far.

Except for the PPCRegisterInfo::lowerCRSpilling case, no change
in generated code intended.

llvm-svn: 185556
2013-07-03 17:05:42 +00:00
Chad Rosier 295bd43adb The getRegForInlineAsmConstraint function should only accept MVT value types.
llvm-svn: 184642
2013-06-22 18:37:38 +00:00