Commit Graph

24699 Commits

Author SHA1 Message Date
Arnold Schwaighofer 2a70c69d31 Revert series of sched model patches until I figure out what is going on.
llvm-svn: 183273
2013-06-04 22:35:17 +00:00
Arnold Schwaighofer 0024b8bd73 ARM sched model: Add VFP div instruction on Swift
llvm-svn: 183271
2013-06-04 22:16:08 +00:00
Arnold Schwaighofer 89901730b1 ARM sched model: Add SIMD/VFP load/store instructions on Swift
llvm-svn: 183270
2013-06-04 22:16:07 +00:00
Arnold Schwaighofer bc61f0912c ARM sched model: Add integer VFP/SIMD instructions on Swift
llvm-svn: 183269
2013-06-04 22:16:05 +00:00
Arnold Schwaighofer 83a4197085 ARM sched model: Add integer load/store instructions on Swift
llvm-svn: 183268
2013-06-04 22:16:04 +00:00
Arnold Schwaighofer f77ea45488 ARM sched model: Add integer arithmetic instructions on Swift
llvm-svn: 183267
2013-06-04 22:16:02 +00:00
Arnold Schwaighofer be3a06c85f ARM sched model: Cortex A9 - More InstRW sched resources
Add more InstRW mappings.

llvm-svn: 183266
2013-06-04 22:16:00 +00:00
Arnold Schwaighofer 76e2394799 ARM sched model: Add branch thumb instructions
llvm-svn: 183265
2013-06-04 22:15:59 +00:00
Arnold Schwaighofer 17359d9ba2 ARM sched model: Add branch thumb2 instructions
llvm-svn: 183264
2013-06-04 22:15:57 +00:00
Arnold Schwaighofer bdb5687468 ARM sched model: Add branch instructions
llvm-svn: 183263
2013-06-04 22:15:56 +00:00
Arnold Schwaighofer e971b08765 ARM sched model: Add preload thumb2 instructions
llvm-svn: 183262
2013-06-04 22:15:54 +00:00
Arnold Schwaighofer ab88312f51 ARM sched model: Add preload instructions
llvm-svn: 183261
2013-06-04 22:15:52 +00:00
Arnold Schwaighofer 83fa45629e ARM sched model: Add more ALU and CMP thumb instructions
llvm-svn: 183260
2013-06-04 22:15:51 +00:00
Arnold Schwaighofer 529c2be334 ARM sched model: Add more ALU and CMP thumb2 instructions
llvm-svn: 183259
2013-06-04 22:15:49 +00:00
Arnold Schwaighofer b6843f17eb ARM sched model: Add more ALU and CMP instructions
llvm-svn: 183258
2013-06-04 22:15:47 +00:00
Arnold Schwaighofer d5b9794a53 ARM sched model: Add divsion, loads, branches, vfp cvt
Add some generic SchedWrites and assign resources for Swift and Cortex A9.

llvm-svn: 183257
2013-06-04 22:15:46 +00:00
Arnold Schwaighofer 279c0aff1a ARMInstrInfo: Improve isSwiftFastImmShift
An instruction with less than 3 inputs is trivially a fast immediate shift.

llvm-svn: 183256
2013-06-04 22:15:43 +00:00
Venkatraman Govindaraju a54533ed78 Sparc: No functionality change. Cleanup whitespaces, comment formatting etc.,
llvm-svn: 183243
2013-06-04 18:33:25 +00:00
David Majnemer 452f1f97bd ARM: Fix crash in ARM backend inside of ARMConstantIslandPass
The ARM backend did not expect LDRBi12 to hold a constant pool operand.
Allow for LLVM to deal with the instruction similar to how it deals with
LDRi12.

This fixes PR16215.

llvm-svn: 183238
2013-06-04 17:46:15 +00:00
Vincent Lejeune 276ceb8d5f R600: Swizzle texture/export instructions
llvm-svn: 183229
2013-06-04 15:04:53 +00:00
Vladimir Medic ea381916b0 Test commit for user vmedic, to verify commit access. One line of comment is added to MipsAsmParser.cpp.
llvm-svn: 183215
2013-06-04 08:28:53 +00:00
Aaron Ballman 19978553d4 Silencing an MSVC warning about mixing bool and unsigned int.
llvm-svn: 183176
2013-06-04 01:03:03 +00:00
Tom Stellard 94593ee8c3 R600/SI: Add support for work item and work group intrinsics
llvm-svn: 183138
2013-06-03 17:40:18 +00:00
Tom Stellard ed882c2f1b R600/SI: Add a calling convention for compute shaders
llvm-svn: 183137
2013-06-03 17:40:11 +00:00
Tom Stellard 046039e81b R600/SI: Custom lower i64 sign_extend
llvm-svn: 183136
2013-06-03 17:40:03 +00:00
Tom Stellard 0518ff89ba R600/SI: Adjust some instructions' out register class after ISel
This is necessary to avoid generating VGPR to SGPR copies in some
cases.

llvm-svn: 183135
2013-06-03 17:39:58 +00:00
Tom Stellard bad1f59212 R600/SI: Handle REG_SEQUENCE in fitsRegClass()
llvm-svn: 183134
2013-06-03 17:39:54 +00:00
Tom Stellard b5a97004fb R600/SI: Handle nodes with glue results correctly SITargetLowering::foldOperands()
llvm-svn: 183133
2013-06-03 17:39:50 +00:00
Tom Stellard 2183b70523 R600/SI: Fixup CopyToReg register class in PostprocessISelDAG()
The CopyToReg nodes will sometimes try to copy a value from a VGPR to an
SGPR.  This kind of copy is not possible, so we need to detect
VGPR->SGPR copies and do something else.  The current strategy is to
replace these copies with VGPR->VGPR copies and hope that all the users
of CopyToReg can accept VGPRs as arguments.

llvm-svn: 183132
2013-06-03 17:39:46 +00:00
Tom Stellard 07a10a3d3f R600/SI: Add support for global loads
llvm-svn: 183131
2013-06-03 17:39:43 +00:00
Tom Stellard 556d9aa841 R600/SI: Rework MUBUF store instructions
The lowering of stores is now mostly handled in the tablegen files.  No
more BUFFER_STORE nodes I generated during legalization.

llvm-svn: 183130
2013-06-03 17:39:37 +00:00
Vincent Lejeune 91a942b93e R600: 3 op instructions have no write bit but the result are store in PV
llvm-svn: 183111
2013-06-03 15:56:12 +00:00
Vincent Lejeune eabf83e0a2 R600: CALL_FS consumes a stack size entry
llvm-svn: 183108
2013-06-03 15:44:42 +00:00
Vincent Lejeune f83df1f1cb R600: use capital letter for PV channel
llvm-svn: 183107
2013-06-03 15:44:35 +00:00
Vincent Lejeune a09873dda7 R600: Constraints input regs of interp_xy,_zw
llvm-svn: 183106
2013-06-03 15:44:16 +00:00
Ahmed Bougacha 05d53a018a X86: sub_xmm registers are 128 bits wide.
llvm-svn: 183103
2013-06-03 14:42:40 +00:00
Venkatraman Govindaraju f80d72f149 Sparc: Add support for indirect branch and blockaddress in Sparc backend.
llvm-svn: 183094
2013-06-03 05:58:33 +00:00
Venkatraman Govindaraju 774fe2e29a Sparc: When storing 0, use %g0 directly in the store instruction instead of
using two instructions (sethi and store).

llvm-svn: 183090
2013-06-03 00:21:54 +00:00
Venkatraman Govindaraju 0bbe1b210e Sparc: Combine add/or/sethi instruction with restore if possible.
llvm-svn: 183088
2013-06-02 21:48:17 +00:00
Venkatraman Govindaraju 3e8c7d98be Sparc: Perform leaf procedure optimization by default
llvm-svn: 183083
2013-06-02 02:24:27 +00:00
Venkatraman Govindaraju 28e2cd0e7e Sparc: Mark functions calling llvm.vastart and llvm.returnaddress intrinsics as non-leaf functions.
llvm-svn: 183079
2013-06-01 20:42:48 +00:00
Tim Northover 339bf154cc Revert r183069: "TMP: LEA64_32r fixing"
Very sorry, it was committed from the wrong branch by mistake.

llvm-svn: 183070
2013-06-01 10:23:46 +00:00
Tim Northover 57954f04b3 TMP: LEA64_32r fixing
llvm-svn: 183069
2013-06-01 10:21:54 +00:00
Tim Northover 3a1fd4c0ac X86: change MOV64ri64i32 into MOV32ri64
The MOV64ri64i32 instruction required hacky MCInst lowering because it
was allocated as setting a GR64, but the eventual instruction ("movl")
only set a GR32. This converts it into a so-called "MOV32ri64" which
still accepts a (appropriate) 64-bit immediate but defines a GR32.
This is then converted to the full GR64 by a SUBREG_TO_REG operation,
thus keeping everyone happy.

This fixes a typo in the opcode field of the original patch, which
should make the legact JIT work again (& adds test for that problem).

llvm-svn: 183068
2013-06-01 09:55:14 +00:00
Venkatraman Govindaraju 3521dcdcc4 [Sparc] Generate correct code for leaf functions with stack objects
llvm-svn: 183067
2013-06-01 04:51:18 +00:00
Ahmed Bougacha b1a4d9da3b Make SubRegIndex size mandatory, following r183020.
This also makes TableGen able to compute sizes/offsets of synthesized
indices representing tuples.

llvm-svn: 183061
2013-05-31 23:45:26 +00:00
Eric Christopher e1e57e5ebd Temporarily Revert "X86: change MOV64ri64i32 into MOV32ri64" as it
seems to have caused PR16192 and other JIT related failures.

llvm-svn: 183059
2013-05-31 23:30:45 +00:00
Benjamin Kramer fae7ff12d2 NVPTX: Don't even create a regalloc if we're not going to use it.
Fixes a leak found by valgrind.

llvm-svn: 183031
2013-05-31 19:21:58 +00:00
Ahmed Bougacha f1ed334d55 Add a way to define the bit range covered by a SubRegIndex.
NOTE: If this broke your out-of-tree backend, in *RegisterInfo.td, change
the instances of SubRegIndex that have a comps template arg to use the
ComposedSubRegIndex class instead.

In TableGen land, this adds Size and Offset attributes to SubRegIndex,
and the ComposedSubRegIndex class, for which the Size and Offset are
computed by TableGen. This also adds an accessor in MCRegisterInfo, and
Size/Offsets for the X86 and ARM subreg indices.

llvm-svn: 183020
2013-05-31 17:08:36 +00:00
Tim Northover 4d14144024 ARM: permit upper-case BE/LE on setend instruction
Patch by Amaury de la Vieuville.

llvm-svn: 183012
2013-05-31 15:58:45 +00:00
Tim Northover 4173e29a98 ARM: add fstmx and fldmx instructions for assembly
These instructions are deprecated oddities, but we still need to be able to
disassemble (and reassemble) them if and when they're encountered.

Patch by Amaury de la Vieuville.

llvm-svn: 183011
2013-05-31 15:55:51 +00:00
Tim Northover 1bb672da81 ARM: fix VEXT encoding corner case
The disassembly of VEXT instructions was too lax in the bits checked. This
fixes the case where the instruction affects Q-registers but a misaligned lane
was specified (should be UNDEFINED).

Patch by Amaury de la Vieuville

llvm-svn: 183003
2013-05-31 13:47:25 +00:00
Richard Sandiford 30efd87f6e [SystemZ] Don't use LOAD and STORE REVERSED for volatile accesses
Unlike most -- hopefully "all other", but I'm still checking -- memory
instructions we support, LOAD REVERSED and STORE REVERSED may access
the memory location several times.  This means that they are not suitable
for volatile loads and stores.

This patch is a prerequisite for better atomic load and store support.
The same principle applies there: almost all memory instructions we
support are inherently atomic ("block concurrent"), but LOAD REVERSED
and STORE REVERSED are exceptions.

Other instructions continue to allow volatile operands.  I will add
positive "allows volatile" tests at the same time as the "allows atomic
load or store" tests.

llvm-svn: 183002
2013-05-31 13:25:22 +00:00
Justin Holewinski dbb3b2f4b6 [NVPTX] Re-enable support for virtual registers in the final output
Now that 3.3 is branched, we are re-enabling virtual registers to help
iron out bugs before the next release. Some of the post-RA passes do
not play well with virtual registers, so we disable them for now. The
needed functionality of the PrologEpilogInserter pass is copied to a
new backend-specific NVPTXPrologEpilog pass.

The test for this commit is not breaking the existing tests.

llvm-svn: 182998
2013-05-31 12:14:49 +00:00
Tim Northover d4736d67f4 X86: change MOV64ri64i32 into MOV32ri64
The MOV64ri64i32 instruction required hacky MCInst lowering because it was
allocated as setting a GR64, but the eventual instruction ("movl") only set a
GR32. This converts it into a so-called "MOV32ri64" which still accepts a
(appropriate) 64-bit immediate but defines a GR32. This is then converted to
the full GR64 by a SUBREG_TO_REG operation, thus keeping everyone happy.

llvm-svn: 182991
2013-05-31 09:57:13 +00:00
Akira Hatanaka 2bf97336af [mips] Big-endian code generation for atomic instructions.
Patch by Jyun-Yan You.

llvm-svn: 182984
2013-05-31 03:25:44 +00:00
Rafael Espindola 99bd2ae479 Revert r182937 and r182877.
r182877 broke MCJIT tests on ARM and r182937 was working around another failure
by r182877.

This should make the ARM bots green.

llvm-svn: 182960
2013-05-30 20:37:52 +00:00
Tim Northover 64ec0ff433 X86: use sub-register sequences for MOV*r0 operations
Instead of having a bunch of separate MOV8r0, MOV16r0, ... pseudo-instructions,
it's better to use a single MOV32r0 (which will expand to "xorl %reg, %reg")
and obtain other sizes with EXTRACT_SUBREG and SUBREG_TO_REG. The encoding is
smaller and partial register updates can sometimes be avoided.

Until recently, this sequence was a barrier to rematerialization though. That
should now be fixed so it's an appropriate time to make the change.

llvm-svn: 182928
2013-05-30 13:19:42 +00:00
Justin Holewinski 994d66a345 [NVPTX] Fix case where a sext load of an i1 type may produce an
ld.u1 instead of an ld.u8.

llvm-svn: 182924
2013-05-30 12:22:39 +00:00
Tim Northover 04eb4234fc X86: change zext moves to use sub-register infrastructure.
32-bit writes on amd64 zero out the high bits of the corresponding 64-bit
register. LLVM makes use of this for zero-extension, but until now relied on
custom MCLowering and other code to fixup instructions. Now we have proper
handling of sub-registers, this can be done by creating SUBREG_TO_REG
instructions at selection-time.

Should be no change in functionality.

llvm-svn: 182921
2013-05-30 10:43:18 +00:00
Richard Sandiford 46af5a2cdc [SystemZ] Enable unaligned accesses
The code to distinguish between unaligned and aligned addresses was
already there, so this is mostly just a switch-on-and-test process.

llvm-svn: 182920
2013-05-30 09:45:42 +00:00
Andrew Trick ad6d08ac6f Order CALLSEQ_START and CALLSEQ_END nodes.
Fixes PR16146: gdb.base__call-ar-st.exp fails after
pre-RA-sched=source fixes.

Patch by Xiaoyi Guo!

This also fixes an unsupported dbg.value test case. Codegen was
previously incorrect but the test was passing by luck.

llvm-svn: 182885
2013-05-29 22:03:55 +00:00
Ahmed Bougacha 00e08db393 X86: Fix Defs/Uses for insts that imp-def/imp-use both an A-register and EFLAGS.
This corrects a problem where x86 instructions that implicitly define/use both
an A-register (RAX, EAX, ..) and EFLAGS were declared as only defining/using
EFLAGS, because the outer "let Defs/Uses = [EFLAGS]" in the various multiclasses
overrides the "let Defs/Uses = [areg]" in BinOpAI.

The instructions deriving from BinOpAI were moved out of the "let Defs", and a
BinOpAI_FF class was created, for instructions that implicitly define and use
EFLAGS and the A-register (SBC, ADC).

llvm-svn: 182883
2013-05-29 21:13:57 +00:00
Chad Rosier 33b736626e Don't assume the registers will be enumerated sequentially.
llvm-svn: 182879
2013-05-29 20:42:21 +00:00
JF Bastien f60e0e44ca Enable FastISel on ARM for Linux and NaCl
FastISel was only enabled for iOS ARM and Thumb2, this patch enables it
for ARM (not Thumb2) on Linux and NaCl.

Thumb2 support needs a bit more work, mainly around register class
restrictions.

The patch punts to SelectionDAG when doing TLS relocation on non-Darwin
targets. I will fix this and other FastISel-to-SelectionDAG failures in
a separate patch.

The patch also forces FastISel to retain frame pointers: iOS always
keeps them for backtracking (so emitted code won't change because of
this), but Linux was getting much worse code that was incorrect when
using big frames (such as test-suite's lencod). I'll also fix this in a
later patch, it will probably require a peephole so that FastISel
doesn't rematerialize frame pointers back-to-back.

The test changes are straightforward, similar to:
  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html
They also add a vararg test that got dropped in that change.

I ran all of test-suite on A15 hardware with --optimize-option=-O0 and
all the tests pass.

llvm-svn: 182877
2013-05-29 20:38:10 +00:00
Bill Wendling 70b1400e6d Don't reach into the middle of TargetMachine and cache one of its ivars.
Not only does this break encapsulation, it's gross.

llvm-svn: 182876
2013-05-29 20:37:19 +00:00
JF Bastien 13969d0ab6 Tidy some register classes for ARM and Thumb
Tidy up three places where the register class for ARM and Thumb wasn't
restrictive enough:
 - No PC dest for reg-reg add/orr/sub.
 - No PC dest for shifts.
 - No PC or SP for Thumb2 reg-imm add.

I encountered this while combining FastISel with
-verify-machineinstrs. These instructions defined registers whose
classes weren't restrictive enough, and the uses failed
verification. They're also undefined in the ISA, or would produce code
that FastISel wouldn't want. This doesn't fix the register class
narrowing issue (where uses should restrict definitions), and isn't
thorough, but it's a small step in the right direction.

llvm-svn: 182863
2013-05-29 15:45:47 +00:00
NAKAMURA Takumi dbd3bbe126 SparcFrameLowering.cpp: Mark verifyLeafProcRegUse() as UNUSED. [-Wunused-function]
llvm-svn: 182850
2013-05-29 12:10:42 +00:00
Richard Sandiford e1d9f00f09 [SystemZ] Immediate compare-and-branch support
This patch adds support for the CIJ and CGIJ instructions.

llvm-svn: 182846
2013-05-29 11:58:52 +00:00
Patrik Hagglund ae8faf2e9a Temporary fix to get rid of gcc warning.
llvm-svn: 182832
2013-05-29 07:32:08 +00:00
Venkatraman Govindaraju ca0fe2f57e [Sparc] Add support for leaf functions in sparc backend.
llvm-svn: 182822
2013-05-29 04:46:31 +00:00
Jack Carter 0259300325 Mips assembler: Improve set register alias handling
This patch solves the problem of numeric register values not being accepted:

../set_alias.s:1:11: error: expected valid expression after comma
        .set    r4,$4
                    ^
The parsing of .set directive is changed and handling of symbols in code 
as well to enable this feature. 

The test example is added.

Patch by Vladimir Medic

llvm-svn: 182807
2013-05-28 22:21:05 +00:00
Tim Northover 8a1aa518a3 AArch64: clarify -help message
llvm-svn: 182804
2013-05-28 21:09:39 +00:00
Jyotsna Verma cceafb2d6d Hexagon: Typo fix.
llvm-svn: 182790
2013-05-28 19:01:45 +00:00
Richard Sandiford 0fb90ab0cb [SystemZ] Register compare-and-branch support
This patch adds support for the CRJ and CGRJ instructions.  Support for
the immediate forms will be a separate patch.

The architecture has a large number of comparison instructions.  I think
it's generally better to concentrate on using the "best" comparison
instruction first and foremost, then only use something like CRJ if
CR really was the natual choice of comparison instruction.  The patch
therefore opportunistically converts separate CR and BRC instructions
into a single CRJ while emitting instructions in ISelLowering.

llvm-svn: 182764
2013-05-28 10:41:11 +00:00
Richard Sandiford 53c9efd9c1 [SystemZ] Tweak SystemZInstrInfo::isBranch() interface
This is needed for the upcoming compare-and-branch patch.  No functional
change intended.

llvm-svn: 182762
2013-05-28 10:13:54 +00:00
Rafael Espindola f30f2cce50 Make helper functions static.
And remove header and cpp file that are empty after that.

llvm-svn: 182746
2013-05-27 22:34:59 +00:00
Preston Gurd 048f99de11 Convert sqrt functions into sqrt instructions when -ffast-math is in effect.
When -ffast-math is in effect (on Linux, at least), clang defines
__FINITE_MATH_ONLY__ > 0 when including <math.h>. This causes the
preprocessor to include <bits/math-finite.h>, which renames the sqrt functions.
For instance, "sqrt" is renamed as "__sqrt_finite". 

This patch adds the 3 new names in such a way that they will be treated
as equivalent to their respective original names.

llvm-svn: 182739
2013-05-27 15:44:35 +00:00
Hal Finkel 8ebfe6c263 PPC: Add a isConsecutiveLS utility function
isConsecutiveLS is a slightly more general form of
SelectionDAG::isConsecutiveLoad. Aside from also handling stores, it also does
not assume equality of the chain operands is necessary. In the case of the PPC
backend, this chain condition is checked in a more general way by the
surrounding code.

Mostly, this part of the refactoring in preparation for supporting optimized
unaligned stores.

llvm-svn: 182723
2013-05-27 02:06:39 +00:00
Hal Finkel 7d8a691b5d Prefer to duplicate PPC Altivec loads when expanding unaligned loads
When expanding unaligned Altivec loads, we use the decremented offset trick to
prevent page faults. Unfortunately, if we have a sequence of consecutive
unaligned loads, this leads to suboptimal code generation because the 'extra'
load from the first unaligned load can be combined with the base load from the
second (but only if the decremented offset trick is not used for the first).
Search up and down the chain, through loads and token factors, looking for
consecutive loads, and if one is found, don't use the offset reduction trick.
These duplicate loads are later combined to yield the desired sequence (in the
future, we might want a more-powerful chain search, but that will require some
changes to allow the combiner routines to access the AA object).

This should complete the initial implementation of the optimized unaligned
Altivec load expansion. There is some refactoring that should be done, but
that will happen when the unaligned store expansion is added.

llvm-svn: 182719
2013-05-26 18:08:30 +00:00
Hal Finkel bc2ee4c4e6 PPC: Combine duplicate (offset) lvsl Altivec intrinsics
The lvsl permutation control instruction is a function only of the alignment of
the pointer operand (relative to the 16-byte natural alignment of Altivec
vectors). As a result, multiple lvsl intrinsics where the operands differ by a
multiple of 16 can be combined.

llvm-svn: 182708
2013-05-25 04:05:05 +00:00
Andrew Trick e2431c64bc Track IR ordering of SelectionDAG nodes 3/4.
Remove the old IR ordering mechanism and switch to new one.  Fix unit
test failures.

llvm-svn: 182704
2013-05-25 03:08:10 +00:00
Andrew Trick ef9de2a739 Track IR ordering of SelectionDAG nodes 2/4.
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.

llvm-svn: 182703
2013-05-25 02:42:55 +00:00
Hal Finkel cf2e908014 PPC: Initial support for permutation-based unaligned Altivec loads
Altivec only directly supports aligned loads, but the loads have a strange
property: If given an unaligned address, they truncate the address to the next
lower aligned address, and load from there.  This property, along with an extra
load and some special-purpose permutation-control instructions that generate
the appropriate permutations from the original unaligned address, allow
efficient lowering of aligned loads. This code uses the trick explained in the
Apple Velocity Engine optimization overview document to prevent the needed
extra load from possibly causing a page fault if the original address happens
to be aligned.

As noted in the FIXMEs, there are several additional optimizations that can be
performed to reduce the cost of these loads even more. These will be
implemented in future commits.

llvm-svn: 182691
2013-05-24 23:00:14 +00:00
Quentin Colombet f482805c28 Follow up of the introduction of MCSymbolizer.
- Ressurect old MCDisassemble API to soften transition.
- Extend MCTargetDesc to set target specific symbolizer.

llvm-svn: 182688
2013-05-24 22:51:52 +00:00
Michael J. Spencer df1ecbd734 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Richard Sandiford dc5ed71353 [SystemZ] Improve AsmParser handling of invalid instructions
Previously, an invalid instruction like:

	foo     %r1, %r0

would generate the rather odd error message:

....: error: unknown token in expression
	foo     %r1, %r0
		^

We now get the more informative:

....: error: invalid instruction
	foo     %r1, %r0
	^

The same would happen if an address were used where a register was expected.
We now get "invalid operand for instruction" instead.

llvm-svn: 182644
2013-05-24 14:26:46 +00:00
Richard Sandiford 675f86996a [SystemZ] Improve AsmParser register parsing
The idea is to make sure that:

(1) "register expected" is restricted to cases where ParseRegister()
    is called and the token obviously isn't a register.

(2) "invalid register" is restricted to cases where a register-like "%..."
    sequence is found, but the "..." makes no sense.

(3) the generic "invalid operand for instruction" is used in cases where
    the wrong register type is used (GPR instead of FPR, etc.).

(4) the new "invalid register pair" is used if the register has the right type,
    but is not a valid register pair.

Testing of (1)-(3) is now restricted to regs-bad.s.  It uses a representative
instruction for each register class to make sure that only registers from
that class are accepted.

(4) is tested by both regs-bad.s (which checks all invalid register pairs)
and insn-bad.s (which tests one invalid pair for each instruction that
requires a pair).

While there, I changed "Number" to "Num" for consistency with the
operand class.

llvm-svn: 182643
2013-05-24 14:14:38 +00:00
Benjamin Kramer 534d3a4670 Remove the Copied parameter from MemoryObject::readBytes.
There was exactly one caller using this API right, the others were relying on
specific behavior of the default implementation. Since it's too hard to use it
right just remove it and standardize on the default behavior.

Defines away PR16132.

llvm-svn: 182636
2013-05-24 10:54:58 +00:00
Ahmed Bougacha aa79068157 MC: Disassembled CFG reconstruction.
This patch builds on some existing code to do CFG reconstruction from
a disassembled binary:
- MCModule represents the binary, and has a list of MCAtoms.
- MCAtom represents either disassembled instructions (MCTextAtom), or
  contiguous data (MCDataAtom), and covers a specific range of addresses.
- MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
  backed by an MCTextAtom, and has the usual successors/predecessors.
- MCObjectDisassembler creates a module from an ObjectFile using a
  disassembler. It first builds an atom for each section. It can also
  construct the CFG, and this splits the text atoms into basic blocks.

MCModule and MCAtom were only sketched out; MCFunction and MCBB were
implemented under the experimental "-cfg" llvm-objdump -macho option.
This cleans them up for further use; llvm-objdump -d -cfg now generates
graphviz files for each function found in the binary.

In the future, MCObjectDisassembler may be the right place to do
"intelligent" disassembly: for example, handling constant islands is just
a matter of splitting the atom, using information that may be available
in the ObjectFile. Also, better initial atom formation than just using
sections is possible using symbols (and things like Mach-O's
function_starts load command).

This brings two minor regressions in llvm-objdump -macho -cfg:
- The printing of a relocation's referenced symbol.
- An annotation on loop BBs, i.e., which are their own successor.

Relocation printing is replaced by the MCSymbolizer; the basic CFG
annotation will be superseded by more related functionality.

llvm-svn: 182628
2013-05-24 01:07:04 +00:00
Ahmed Bougacha ad1084de84 Add MCSymbolizer for symbolic/annotated disassembly.
This is a basic first step towards symbolization of disassembled
instructions. This used to be done using externally provided (C API)
callbacks. This patch introduces:
- the MCSymbolizer class, that mimics the same functions that were used
  in the X86 and ARM disassemblers to symbolize immediate operands and
  to annotate loads based off PC (for things like c string literals).
- the MCExternalSymbolizer class, which implements the old C API.
- the MCRelocationInfo class, which provides a way for targets to
  translate relocations (either object::RelocationRef, or disassembler
  C API VariantKinds) to MCExprs.
- the MCObjectSymbolizer class, which does symbolization using what it
  finds in an object::ObjectFile. This makes simple symbolization (with
  no fancy relocation stuff) work for all object formats!
- x86-64 Mach-O and ELF MCRelocationInfos.
- A basic ARM Mach-O MCRelocationInfo, that provides just enough to
  support the C API VariantKinds.

Most of what works in otool (the only user of the old symbolization API
that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
- symbol references: call _foo; jmp 15 <_foo+50>
- relocations:       call _foo-_bar; call _foo-4
- __cf?string:       leaq 193(%rip), %rax ## literal pool for "hello"
Stub support is the main missing part (because libObject doesn't know,
among other things, about mach-o indirect symbols).

As for the MCSymbolizer API, instead of relying on the disassemblers
to call the tryAdding* methods, maybe this could be done automagically
using InstrInfo? For instance, even though PC-relative LEAs are used
to get the address of string literals in a typical Mach-O file, a MOV
would be used in an ELF file. And right now, the explicit symbolization
only recognizes PC-relative LEAs. InstrInfo should have already have
most of what is needed to know what to symbolize, so this can
definitely be improved.

I'd also like to remove object::RelocationRef::getValueString (it seems
only used by relocation printing in objdump), as simply printing the
created MCExpr is definitely enough (and cleaner than string concats).

llvm-svn: 182625
2013-05-24 00:39:57 +00:00
Ulrich Weigand 9948546923 [PowerPC] Remove symbolLo/symbolHi instruction operand types
Now that there is no longer any distinction between symbolLo
and symbolHi operands in either printing, encoding, or parsing,
the operand types can be removed in favor of simply using
s16imm.

This completes the patch series to decouple lo/hi operand part
processing from the particular instruction whose operand it is.

No change in code generation expected from this patch.

llvm-svn: 182618
2013-05-23 22:48:06 +00:00
Ulrich Weigand 41789de165 [PowerPC] Clean up generation of ha16() / lo16() markers
When targeting the Darwin assembler, we need to generate markers ha16() and
lo16() to designate the high and low parts of a (symbolic) immediate.  This
is necessary not just for plain symbols, but also for certain symbolic
expression, typically along the lines of ha16(A - B).  The latter doesn't
work when simply using VariantKind flags on the symbol reference.
This is why the current back-end uses hacks (explicitly called out as such
via multiple FIXMEs) in the symbolLo/symbolHi print methods.

This patch uses target-defined MCExpr codes to represent the Darwin
ha16/lo16 constructs, following along the lines of the equivalent solution
used by the ARM back end to handle their :upper16: / :lower16: markers.
This allows us to get rid of special handling both in the symbolLo/symbolHi
print method and in the common code MCExpr::print routine.  Instead, the
ha16 / lo16 markers are printed simply in a custom print routine for the
target MCExpr types.  (As a result, the symbolLo/symbolHi print methods
can now replaced by a single printS16ImmOperand routine that also handles
symbolic operands.)

The patch also provides a EvaluateAsRelocatableImpl routine to handle
ha16/lo16 constructs.  This is not actually used at the moment by any
in-tree code, but is provided as it makes merging into David Fang's
out-of-tree Mach-O object writer simpler.

Since there is no longer any need to treat VK_PPC_GAS_HA16 and
VK_PPC_DARWIN_HA16 differently, they are merged into a single
VK_PPC_ADDR16_HA (and likewise for the _LO16 types).

llvm-svn: 182616
2013-05-23 22:26:41 +00:00
Tim Northover bc93308489 ARM: implement @llvm.readcyclecounter intrinsic
This implements the @llvm.readcyclecounter intrinsic as the specific
MRC instruction specified in the ARM manuals for CPUs with the Power
Management extensions.

Older CPUs had slightly different methods which may also have to be
implemented eventually, but this should cover all v7 cases.

rdar://problem/13939186

llvm-svn: 182603
2013-05-23 19:11:20 +00:00
Tim Northover cedd48183f ARM: Add Performance Monitor Extensions feature
Performance monitors, including a basic cycle counter, are an official
extension in the ARMv7 specification. This adds support for enabling and
disabling them, orthogonally from CPU selection.

rdar://problem/13939186

llvm-svn: 182602
2013-05-23 19:11:14 +00:00
Tom Stellard 1b086cbcb8 R600: Fix R600ControlFlowFinalizer not considering VTX_READ 128 bit dst reg
Patch by: Vincent Lejeune

https://bugs.freedesktop.org/show_bug.cgi?id=64877

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 182600
2013-05-23 18:26:42 +00:00
Benjamin Kramer d78bb468bd Move passes from namespace llvm into anonymous namespaces. Sort includes while there.
llvm-svn: 182594
2013-05-23 17:10:37 +00:00
Benjamin Kramer ad5c24f161 More symbols that should be static.
llvm-svn: 182590
2013-05-23 16:09:15 +00:00
Benjamin Kramer e79beacb32 Hexagon: Make helper functions static.
llvm-svn: 182588
2013-05-23 15:43:11 +00:00
Benjamin Kramer 635e368e33 R600: Hide symbols of implementation details.
Also removes an unused function.

llvm-svn: 182587
2013-05-23 15:43:05 +00:00
Aaron Ballman 15f193a1a3 Setting the default value (fixes CRT assertions about uninitialized variable use when doing debug MSVC builds), and fixing coding style.
llvm-svn: 182585
2013-05-23 14:55:00 +00:00
Rafael Espindola 00345fa97b Fix 32 bit build in c++11 mode.
The error was:
error: non-constant-expression cannot be narrowed from type 'long long' to 'long' in initializer list [-Wc++11-narrowing]
        MI.getOperand(6).getImm() & 0x1F,

llvm-svn: 182584
2013-05-23 13:22:30 +00:00
Rafael Espindola 39aca620db Fix a leak on the r600 backend.
This should bring the valgrind bot back to life.

llvm-svn: 182561
2013-05-23 03:31:47 +00:00
Rafael Espindola bd6847fbea clang-format this file.
llvm-svn: 182560
2013-05-23 03:28:39 +00:00
Chad Rosier abdb1d69ab Simplify logic now that r182490 is in place. No functional change intended.
llvm-svn: 182531
2013-05-22 23:17:36 +00:00
Bill Schmidt f88571e027 Change some PowerPC PatLeaf definitions to ImmLeaf for fast-isel.
Using PatLeaf rather than ImmLeaf when defining immediate predicates
prevents simple patterns using those predicates from being recognized
for fast instruction selection.  This patch replaces the immSExt16
PatLeaf predicate with two ImmLeaf predicates, imm32SExt16 and
imm64SExt16, allowing a few more patterns to be recognized (ADDI,
ADDIC, MULLI, ADDI8, and ADDIC8).  Using the new predicates does not
help for LI, LI8, SUBFIC, and SUBFIC8 because these are rejected for
other reasons, but I see no reason to retain the PatLeaf predicate.

No functional change intended, and thus no test cases yet.  This is
preliminary work for enabling fast-isel support for PowerPC.  When
that support is ready, we'll be able to test this function.

llvm-svn: 182510
2013-05-22 20:09:24 +00:00
Nadav Rotem 7b66c47051 X86: Fix a bug in EltsFromConsecutiveLoads. We can't generate new loads without chains.
llvm-svn: 182507
2013-05-22 19:28:41 +00:00
Benjamin Kramer d76cc186fc X86: When expanding PCMPGTQ to PCMPGTD we always want to compare the lower halves as unsigned.
Take #2 on fixing PR15977.

llvm-svn: 182486
2013-05-22 17:01:12 +00:00
Rafael Espindola e3d83fb8c3 Fix use after free (pr16103).
llvm-svn: 182482
2013-05-22 15:31:11 +00:00
Rafael Espindola ebd8e38849 Check that a function starts with llvm. before using GET_FUNCTION_RECOGNIZER.
Fixes a use of uninitialized memory found by asan and valgind.

llvm-svn: 182480
2013-05-22 14:57:42 +00:00
Richard Sandiford 14a4449589 [SystemZ] Rename PSW to CC
Addresses a review comment from Ulrich Weigand.  No functional change intended.

I'm not sure whether the old TODO that this patch touches still holds,
but that's something we'd get to when adding a targetted scheduling
description.

llvm-svn: 182474
2013-05-22 13:38:45 +00:00
Richard Sandiford 03528f346a [SystemZ] Fix thinko in long branch pass
The original version of the pass could underestimate the length of a backward
branch in cases like:

    alignment to N bytes or more
    ...
    relaxable branch A
    ...
 foo: (aligned to M<N bytes)
    ...
 bar: (aligned to N bytes)
    ...
    relaxable branch B to foo

We don't add any misalignment gap for "bar" because N bytes of alignment
had already been reached earlier in the function.  In this case, assuming
that A is relaxed can push "foo" closer to "bar", and make B appear to be
in range.  Similar problems can occur for forward branches.

I don't think it's possible to create blocks with mixed alignments as
things stand, not least because we haven't yet defined getPrefLoopAlignment()
for SystemZ (that would need benchmarking).  So I don't think we can test
this yet.

Thanks to Rafael Espíndola for spotting the bug.

llvm-svn: 182460
2013-05-22 09:57:57 +00:00
David Majnemer 7ea2a52a0c X86: Remove test instructions proceeding shift by immediate instructions
Allow LLVM to take advantage of shift instructions that set the ZF flag,
making instructions that test the destination superfluous.

llvm-svn: 182454
2013-05-22 08:13:02 +00:00
NAKAMURA Takumi 4f328e1c2f R600ISelLowering.cpp: Avoid "using namespace Intrinsic;" to appease MSC. Specify namespaces explicitly here.
MSC is confused about "memcpy" between <cstring> and llvm::Intrinsic::memcpy, when llvm::Intrinsic were exposed.

llvm-svn: 182452
2013-05-22 06:37:31 +00:00
NAKAMURA Takumi 18ca09c1cc R600: Whitespace and untabify.
llvm-svn: 182451
2013-05-22 06:37:25 +00:00
Owen Anderson 616852848a Create an FPOW SDNode opcode def in the target independent .td file rather than in a specific backend.
llvm-svn: 182450
2013-05-22 06:36:09 +00:00
Rafael Espindola 21ea01d132 Attempt to fix the mingw32 bot.
This should hopefully fix
http://lab.llvm.org:8011/builders/clang-x86_64-darwin11-self-mingw32

llvm-svn: 182446
2013-05-22 02:30:47 +00:00
Rafael Espindola 525cf28652 s/u_int32_t/uint32_t/
llvm-svn: 182444
2013-05-22 01:36:19 +00:00
Rafael Espindola f568827654 Fix warning in non-assert build.
llvm-svn: 182443
2013-05-22 01:29:38 +00:00
Reed Kotler c6c7e4a67c Mips16 does not use register scavenger from TargetRegisterInfo. It allocates
a RegScavenger object on it's own.
 

llvm-svn: 182430
2013-05-21 22:06:02 +00:00
Akira Hatanaka be76cd0b8e [mips] Rename option to make it compatible with gcc.
llvm-svn: 182397
2013-05-21 17:17:59 +00:00
Akira Hatanaka 6871031be9 [mips] Add instruction selection patterns for blez and bgez.
llvm-svn: 182396
2013-05-21 17:13:47 +00:00
Justin Holewinski 48f4ad3fc0 [NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic
llvm-svn: 182394
2013-05-21 16:51:30 +00:00
Jyotsna Verma 1b056e422c Hexagon: SelectionDAG should not use MVT::Other to check the legality of BR_CC.
llvm-svn: 182390
2013-05-21 15:54:32 +00:00
Hal Finkel c5211291f1 Fix PPC branch selection for counter-based branches
Although I had added some support for the BDZ/BDNZ branches into the selector
(in r158204), I had not correctly adjusted the condition at the top of the
loop. As a result, these branches were still essentially unsupported.

This fixes PR16086. Unfortunately, any test case would be very large (because
it would need to force the loop backedge to exceed the range of the 16-bit
immediate).

llvm-svn: 182385
2013-05-21 14:21:09 +00:00
Elena Demikhovsky 0dd4025ae9 removed commented lines
llvm-svn: 182377
2013-05-21 13:27:44 +00:00
Elena Demikhovsky fad029202f Removed SSEPacked domain from all forms (AVX, SSE, signed, unsigned) scalar compare instructions, like COMISS, COMISD.
No functional changes.

llvm-svn: 182371
2013-05-21 12:04:22 +00:00
Benjamin Kramer 18ef6b22b9 X86: When emulating unsigned PCMPGTQ with PCMPGTD, fix the sign bit for the smaller type.
Otherwise we'll get a mix of signed and unsigned compares.
Fixes PR15977.

llvm-svn: 182364
2013-05-21 09:58:54 +00:00
Richard Sandiford 3b105a063f Fix indentation
llvm-svn: 182356
2013-05-21 08:48:24 +00:00
Reed Kotler 0fed8d4ef7 Add some additional functions to the list of helper functions for
pic calls. These need to be there so we don't try and use helper
functions when we call those.

As part of this, make sure that we properly exclude helper functions in pic
mode when indirect calls are involved.

llvm-svn: 182343
2013-05-21 00:50:30 +00:00
Hal Finkel a969df84ab Rename LoopSimplify.h to LoopUtils.h
As discussed, LoopUtils.h is a better name.

llvm-svn: 182314
2013-05-20 20:46:30 +00:00
Akira Hatanaka 5de4416962 [mips] Add (setne $lhs, 0) instruction selection pattern.
llvm-svn: 182307
2013-05-20 18:18:07 +00:00
Akira Hatanaka 1cb024207f [mips] Trap on integer division by zero.
By default, a teq instruction is inserted after integer divide. No divide-by-zero
checks are performed if option "-mnocheck-zero-division" is used.

llvm-svn: 182306
2013-05-20 18:07:43 +00:00
Hal Finkel e6d7c285b3 Remove copied preheader insertion logic from PPCCTRLoops
Now that the preheader insertion logic in LoopSimplify is externally exposed,
use it, and remove the copy-and-pasted version.

No functionality change intended.

llvm-svn: 182300
2013-05-20 16:47:10 +00:00
Justin Holewinski 4c47d87ba6 [NVPTX] Fix mis-use of CurrentFnSym in NVPTXAsmPrinter. This was causing a symbol name error in the output PTX.
llvm-svn: 182298
2013-05-20 16:42:18 +00:00
Justin Holewinski 18f3a1ffe6 [NVPTX] Add programmatic interface to NVVMReflect pass
llvm-svn: 182297
2013-05-20 16:42:16 +00:00
Hal Finkel 0859ef29d5 Rename PPC MTCTRse to MTCTRloop
As the pairing of this instruction form with the bdnz/bdz branches is now
enforced by the verification pass, make it clear from the name that these
are used only for counter-based loops.

No functionality change intended.

llvm-svn: 182296
2013-05-20 16:08:37 +00:00
Hal Finkel 8ca3884147 Add a PPCCTRLoops verification pass
When asserts are enabled, this adds a verification pass for PPC counter-loop
formation. Unfortunately, without sacrificing code quality, there is no better
way of forming counter-based loops except at the (late) IR level. This means
that we need to recognize, at the IR level, anything which might turn into a
function call (or indirect branch). Because this is currently a finite set of
things, and because SelectionDAG lowering is basic-block local, this can be
done. Nevertheless, it is fragile, and failure results in a miscompile. This
verification pass checks that all (reachable) counter-based branches are
dominated by a loop mtctr instruction, and that no instructions in between
clobber the counter register. If these conditions are not satisfied, then an
ICE will be triggered.

In short, this is to help us sleep better at night.

llvm-svn: 182295
2013-05-20 16:08:17 +00:00
Benjamin Kramer 927ca942ce R600: Fix bug detected by GCC warning.
R600TextureIntrinsicsReplacer.cpp:232: warning: the address of ‘ArgsType’ will always evaluate as ‘true’

This doesn't have any effect on the output as a vararg intrinsic behaves the
same way as a non-vararg one.

llvm-svn: 182293
2013-05-20 15:58:43 +00:00
Tom Stellard f1ee716446 R600/SI: Use a multiclass for MUBUF_Load_Helper
This will simplify the instructions and also the pattern definitions.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182288
2013-05-20 15:02:31 +00:00
Tom Stellard b8458f88d6 R600/SI: Add a pattern for S_LOAD_DWORDX2_* instructions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182287
2013-05-20 15:02:28 +00:00
Tom Stellard d2eebf001e R600/SI: Add pattern for rotr
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182286
2013-05-20 15:02:24 +00:00
Tom Stellard 5643c4ac72 R600: Swap the legality of rotl and rotr
The hardware supports rotr and not rotl.

llvm-svn: 182285
2013-05-20 15:02:19 +00:00
Tom Stellard 1cfd7a50bb R600/SI: Add patterns for 64-bit shift operations
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182284
2013-05-20 15:02:12 +00:00
Tom Stellard 459a79a81c R600/SI: Use the same names for VOP3 operands and encoding fields
This makes it possible to reorder the operands without breaking the
encoding.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182283
2013-05-20 15:02:08 +00:00
Tom Stellard b35efba4d9 R600/SI: Make fitsRegClass() operands const
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182282
2013-05-20 15:02:01 +00:00
Mihai Popa f41e3f56a5 VSTn instructions have a number of encoding constraints which are not implemented. I have added these using wrapper methods around the original custom decoder (incidentally - this is a huge poorly written method that should be cleaned up. I have left it as is since the changes would be much to hard to review).
llvm-svn: 182281
2013-05-20 14:57:05 +00:00
Mihai Popa dcf0922720 Q registers are encoded in fields of the same length as D registers. As Q registers are half as many, the ARM reference manual mandates the least significant bit to be zeroed out. Failure to do so should result in an undefined instruction. With this change test/MC/Disassembler/ARM/invalid-VQADD-arm.txt is passing (removed XFAIL).
llvm-svn: 182279
2013-05-20 14:42:43 +00:00
Richard Sandiford 312425f32d [SystemZ] Add long branch pass
Before this change, the SystemZ backend would use BRCL for all branches
and only consider shortening them to BRC when generating an object file.
E.g. a branch on equal would use the JGE alias of BRCL in assembly output,
but might be shortened to the JE alias of BRC in ELF output.  This was
a useful first step, but it had two problems:

(1) The z assembler isn't traditionally supposed to perform branch shortening
    or branch relaxation.  We followed this rule by not relaxing branches
    in assembler input, but that meant that generating assembly code and
    then assembling it would not produce the same result as going directly
    to object code; the former would give long branches everywhere, whereas
    the latter would use short branches where possible.

(2) Other useful branches, like COMPARE AND BRANCH, do not have long forms.
    We would need to do something else before supporting them.

    (Although COMPARE AND BRANCH does not change the condition codes,
    the plan is to model COMPARE AND BRANCH as a CC-clobbering instruction
    during codegen, so that we can safely lower it to a separate compare
    and long branch where necessary.  This is not a valid transformation
    for the assembler proper to make.)

This patch therefore moves branch relaxation to a pre-emit pass.
For now, calls are still shortened from BRASL to BRAS by the assembler,
although this too is not really the traditional behaviour.

The first test takes about 1.5s to run, and there are likely to be
more tests in this vein once further branch types are added.  The feeling
on IRC was that 1.5s is a bit much for a single test, so I've restricted
it to SystemZ hosts for now.

The patch exposes (and fixes) some typos in the main CodeGen/SystemZ tests.
A later patch will remove the {{g}}s from that directory.

llvm-svn: 182274
2013-05-20 14:23:08 +00:00
Justin Holewinski 01f89f0428 [NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputs
This converter currently only handles global variables in address space 0. For
these variables, they are promoted to address space 1 (global memory), and all
uses are updated to point to the result of a cvta.global instruction on the new
variable.

The motivation for this is address space 0 global variables are illegal since we
cannot declare variables in the generic address space.  Instead, we place the
variables in address space 1 and explicitly convert the pointer to address
space 0. This is primarily intended to help new users who expect to be able to
place global variables in the default address space.

llvm-svn: 182254
2013-05-20 12:13:32 +00:00
Justin Holewinski 700b6fa934 [NVPTX] Fix i1 kernel parameters and global variables. ABI rules say we need to use .u8 for i1 parameters for kernels.
llvm-svn: 182253
2013-05-20 12:13:28 +00:00
Stepan Dyatkovskiy d0e34a200f PR15868 fix.
Introduction:
In case when stack alignment is 8 and GPRs parameter part size is not N*8:
we add padding to GPRs part, so part's last byte must be recovered at
address K*8-1.
We need to do it, since remained (stack) part of parameter starts from
address K*8, and we need to "attach" "GPRs head" without gaps to it:

Stack:
|---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes...
[ [padding] [GPRs head] ] [ ------ Tail passed via stack  ------ ...

FIX:
Note, once we added padding we need to correct *all* Arg offsets that are going
after padded one. That's why we need this fix: Arg offsets were never corrected
before this patch. See new test-cases included in patch.

We also don't need to insert padding for byval parameters that are stored in GPRs
only. We need pad only last byval parameter and only in case it outsides GPRs
and stack alignment = 8.
Though, stack area, allocated for recovered byval params, must satisfy
"Size mod 8 = 0" restriction.

This patch reduces stack usage for some cases:
We can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params my be
"packed" with alignment 4 in some cases.

llvm-svn: 182237
2013-05-20 08:01:34 +00:00
Jakob Stoklund Olesen f927800325 Also expand 64-bit bitcasts.
llvm-svn: 182229
2013-05-20 01:01:43 +00:00
Jakob Stoklund Olesen c7bc5fbc5c Implement spill and fill of I64Regs.
llvm-svn: 182228
2013-05-20 00:53:25 +00:00
Jakob Stoklund Olesen 751e9b8407 Mark i64 SETCC as expand so it is turned into a SELECT_CC.
llvm-svn: 182227
2013-05-20 00:28:36 +00:00
Benjamin Kramer 8bad66e586 Replace some bit operations with simpler ones. No functionality change.
llvm-svn: 182226
2013-05-19 22:01:57 +00:00
Jakob Stoklund Olesen 86c5469d26 Don't use %g0 to materialize 0 directly.
The wired physreg doesn't work on tied operands like on MOVXCC.

Add a README note to fix this later.

llvm-svn: 182225
2013-05-19 21:47:13 +00:00
Jakob Stoklund Olesen 92ebf1153e Select i64 values with %icc conditions.
llvm-svn: 182224
2013-05-19 20:38:21 +00:00
Jakob Stoklund Olesen 7ca944b9db Add floating point selects on %xcc predicates.
llvm-svn: 182222
2013-05-19 20:33:11 +00:00
Jakob Stoklund Olesen 4a78c86a6a Implement SPselectfcc for i64 operands.
Also clean up the arguments to all the MOVCC instructions so the
operands always are (true-val, false-val, cond-code).

llvm-svn: 182221
2013-05-19 20:20:54 +00:00
Venkatraman Govindaraju 3320e5a921 [Sparc] Rearrange integer registers' allocation order so that register allocator will use I and G registers before using L and O registers.
Also, enable registers %g2-%g4 to be used in application and %g5 in 64 bit mode.

llvm-svn: 182219
2013-05-19 20:07:20 +00:00
Jakob Stoklund Olesen ead983cec9 Handle i64 FrameIndex nodes in SPARC v9 mode.
llvm-svn: 182216
2013-05-19 19:14:24 +00:00
Hal Finkel 2f474f0e8a Check InlineAsm clobbers in PPCCTRLoops
We don't need to reject all inline asm as using the counter register (most does
not). Only those that explicitly clobber the counter register need to prevent
the transformation.

llvm-svn: 182191
2013-05-18 09:20:39 +00:00
Tim Northover fd2639f784 AArch64: add CMake dependency to fix very parallel builds
llvm-svn: 182190
2013-05-18 08:17:47 +00:00
David Majnemer 5ba473afb0 X86: Bad peephole interaction between adc, MOV32r0
The peephole tries to reorder MOV32r0 instructions such that they are
before the instruction that modifies EFLAGS.

The problem is that the peephole does not consider the case where the
instruction that modifies EFLAGS also depends on the previous state of
EFLAGS.

Instead, walk backwards until we find an instruction that has a def for
EFLAGS but does not have a use.
If we find such an instruction, insert the MOV32r0 before it.
If it cannot find such an instruction, skip the optimization.

llvm-svn: 182184
2013-05-18 01:02:03 +00:00
Matt Arsenault 75865923c9 Add LLVMContext argument to getSetCCResultType
llvm-svn: 182180
2013-05-18 00:21:46 +00:00
JF Bastien 97b08c404c Support unaligned load/store on more ARM targets
This patch matches GCC behavior: the code used to only allow unaligned
load/store on ARM for v6+ Darwin, it will now allow unaligned load/store
for v6+ Darwin as well as for v7+ on Linux and NaCl.

The distinction is made because v6 doesn't guarantee support (but LLVM
assumes that Apple controls hardware+kernel and therefore have
conformant v6 CPUs), whereas v7 does provide this guarantee (and
Linux/NaCl behave sanely).

The patch keeps the -arm-strict-align command line option, and adds
-arm-no-strict-align. They behave similarly to GCC's -mstrict-align and
-mnostrict-align.

I originally encountered this discrepancy in FastIsel tests which expect
unaligned load/store generation. Overall this should slightly improve
performance in most cases because of reduced I$ pressure.

llvm-svn: 182175
2013-05-17 23:49:01 +00:00
Rafael Espindola 5986ce0e5d Fix the build in c++11 mode.
The errors were:

non-constant-expression cannot be narrowed from type 'int64_t' (aka 'long') to 'uint32_t' (aka 'unsigned int') in initializer list

and

non-constant-expression cannot be narrowed from type 'long' to 'uint32_t' (aka 'unsigned int') in initializer list

llvm-svn: 182168
2013-05-17 22:45:52 +00:00
Vincent Lejeune d3fcb5016c R600: Lower int_load_input to copyFromReg instead of Register node
It solves a bug uncovered by dot4 patch where the register class of
int_load_input use was ignored.

llvm-svn: 182130
2013-05-17 16:51:06 +00:00
Vincent Lejeune 3d5118ca40 R600: Use bottom up scheduling algorithm
llvm-svn: 182129
2013-05-17 16:50:56 +00:00
Vincent Lejeune 4c81d4da6f R600: Use depth first scheduling algorithm
It should increase PV substitution opportunities and lower gpr
usage (pending computations path are "flushed" sooner)

llvm-svn: 182128
2013-05-17 16:50:44 +00:00
Vincent Lejeune e958c8e0d8 R600: Replace big texture opcode switch in scheduler by usesTC/usesVC
llvm-svn: 182127
2013-05-17 16:50:37 +00:00
Vincent Lejeune 519f21eed3 R600: Relax some vector constraints on Dot4.
Dot4 now uses 8 scalar operands instead of 2 vectors one which allows register
coalescer to remove some unneeded COPY.
This patch also defines some structures/functions that can be used to handle
every vector instructions (CUBE, Cayman special instructions...) in a similar
fashion.

llvm-svn: 182126
2013-05-17 16:50:32 +00:00
Vincent Lejeune d3eed66e8c R600: Improve texture handling
llvm-svn: 182125
2013-05-17 16:50:20 +00:00
Vincent Lejeune 4ebef18ab5 R600: Rename 128 bit registers.
Almost all instructions that takes a 128 bits reg as input (fetch, export...)
have the abilities to swizzle their argument and output. Instead of printing
default swizzle for each 128 bits reg, rename T*.XYZW to T* and let instructions
print potentially optimized swizzles themselves.

llvm-svn: 182124
2013-05-17 16:50:09 +00:00
Vincent Lejeune 0fca91d52e R600: Some factorization
llvm-svn: 182123
2013-05-17 16:50:02 +00:00
Vincent Lejeune f9f4e1e7db R600: Factorize Fetch size limit inside AMDGPUSubTarget
llvm-svn: 182122
2013-05-17 16:49:55 +00:00
Vincent Lejeune 709e01688d R600: prettier dump of clamp
llvm-svn: 182121
2013-05-17 16:49:49 +00:00
Tom Stellard ecc2ad1cd4 R600: Fix encoding for R600 family GPUs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>

https://bugs.freedesktop.org/show_bug.cgi?id=64193
https://bugs.freedesktop.org/show_bug.cgi?id=64257
https://bugs.freedesktop.org/show_bug.cgi?id=64320

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 182113
2013-05-17 15:23:21 +00:00
Tom Stellard edade94bbc R600: Pass MCSubtargetInfo reference to R600CodeEmitter
llvm-svn: 182112
2013-05-17 15:23:12 +00:00
Venkatraman Govindaraju 641b0b5a21 [Sparc] Implements hasReservedCallFrame and hasFP.
This is to generate correct framesetup code when the function
 has variable sized allocas.

llvm-svn: 182108
2013-05-17 15:14:34 +00:00
Benjamin Kramer fc33e1d99b X86: Make shuffle -> shift conversion more aggressive about undefs.
Shuffles that only move an element into position 0 of the vector are common in
the output of the loop vectorizer and often generate suboptimal code when SSSE3
is not available. Lower them to vector shifts if possible.

We still prefer palignr over psrldq because it has higher throughput on
sandybridge.

llvm-svn: 182102
2013-05-17 14:48:34 +00:00
Ulrich Weigand 2dbe06a987 [PowerPC] Fix hi/lo encoding in old-style code emitter
This patch implements the equivalent change to r182091/r182092
in the old-style code emitter.  Instead of having two separate
16-bit immediate encoding routines depending on the instruction,
this patch introduces a single encoder that checks the machine
operand flags to decide whether the low or high half of a
symbol address is required.

Since now both encoders make no further distinction between
"symbolLo" and "symbolHi", the .td operand can now use a
single getS16ImmEncoding method.

Tested by running the old-style JIT tests on 32-bit Linux.

llvm-svn: 182097
2013-05-17 14:14:12 +00:00
Ulrich Weigand 6e23ac606e [PowerPC] Merge/rename PPC fixup types
Now that fixup_ppc_ha16 and fixup_ppc_lo16 are being treated exactly
the same everywhere, it no longer makes sense to have two fixup types.

This patch merges them both into a single type fixup_ppc_half16,
and renames fixup_ppc_lo16_ds to fixup_ppc_half16ds for consistency.
(The half16 and half16ds names are taken from the description of
relocation types in the PowerPC ABI.)

No change in code generation expected.

llvm-svn: 182092
2013-05-17 12:37:21 +00:00
Ulrich Weigand 994f49ed79 [PowerPC] Fix processing of ha16/lo16 fixups
The current PowerPC MC back end distinguishes between fixup_ppc_ha16
and fixup_ppc_lo16, which are determined by the instruction the fixup
applies to, and uses this distinction to decide whether a fixup ought
to resolve to the high or the low part of a symbol address.

This isn't quite correct, however.  It is valid -if unusual- assembler
to use, e.g.
  li 1, symbol@ha
or
  lis 1, symbol@l
Whether the high or the low part of the address is used depends solely
on the @ suffix, not on the instruction.

In addition, both
  li 1, symbol
and
  lis 1, symbol
are valid, assuming the symbol address fits into 16 bits; again, both
will then refer to the actual symbol value (so li will load the value
itself, while lis will load the value shifted by 16).


To fix this, two places need to be adapted.  If the fixup cannot be
resolved at assembler time, a relocation needs to be emitted via
PPCELFObjectWriter::getRelocType.  This routine already looks at
the VK_ type to determine the relocation.  The only problem is that
will reject any _LO modifier in a ha16 fixup and vice versa.  This
is simply incorrect; any of those modifiers ought to be accepted
for either fixup type.

If the fixup *can* be resolved at assembler time, adjustFixupValue
currently selects the high bits of the symbol value if the fixup
type is ha16.  Again, this is incorrect; see the above example
  lis 1, symbol

Now, in theory we'd have to respect a VK_ modifier here.  However,
in fact common code never even attempts to resolve symbol references
using any nontrivial VK_ modifier at assembler time; it will always
fall back to emitting a reloc and letting the linker handle it.

If this ever changes, presumably there'd have to be a target callback
to resolve VK_ modifiers.  We'd then have to handle @ha etc. there.

llvm-svn: 182091
2013-05-17 12:36:29 +00:00
Benjamin Kramer 2057a2b86f Don't cast away constness.
llvm-svn: 182086
2013-05-17 11:39:41 +00:00
Christian Konig b7be72df5b R600/SI: return undef instead of null for skipped arguments
This is a candidate for the stable branch.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=64694

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182084
2013-05-17 09:46:48 +00:00
Venkatraman Govindaraju 54bf611c79 [Sparc] Prevent instructions that defines or uses %o7 to be in call's delay slot.
llvm-svn: 182063
2013-05-16 23:53:29 +00:00
Akira Hatanaka 252f54f769 [mips] Improve instruction selection for pattern (store (fp_to_sint $src), $ptr).
Previously, three instructions were needed:

trunc.w.s $f0, $f2
mfc1 $4, $f0
sw $4, 0($2)

Now we need only two:

trunc.w.s $f0, $f2
swc1 $f0, 0($2)

llvm-svn: 182053
2013-05-16 21:17:15 +00:00
Rafael Espindola b08d2c2db0 Remove addFrameMove.
Now that we have good testing, remove addFrameMove and create cfi
instructions directly.

llvm-svn: 182052
2013-05-16 21:02:15 +00:00
Akira Hatanaka d82ee940c3 [mips] Factor out unaligned store lowering code.
llvm-svn: 182050
2013-05-16 20:45:17 +00:00
Jack Carter 03f0fd37a9 Mips assembler: Add TwoOperandConstraint definitions
This patch removes alias definition for addiu $rs,$imm 
and instead uses the TwoOperandAliasConstraint field in 
the ArithLogicI instruction class. 

This way all instructions that inherit ArithLogicI class 
have the same macro defined. 

The usage examples are added to test files.

Patch by Vladimir Medic

llvm-svn: 182048
2013-05-16 20:24:27 +00:00
Jack Carter 59817110ff Mips td file formatting: white space and long lines
llvm-svn: 182047
2013-05-16 20:08:49 +00:00
Hal Finkel 5f587c59a5 Create an new preheader in PPCCTRLoops to avoid counter register clobbers
Some IR-level instructions (such as FP <-> i64 conversions) are not chained
w.r.t. the mtctr intrinsic and yet may become function calls that clobber the
counter register. At the selection-DAG level, these might be reordered with the
mtctr intrinsic causing miscompiles. To avoid this situation, if an existing
preheader has instructions that might use the counter register, create a new
preheader for the mtctr intrinsic. This extra block will be remerged with the
old preheader at the MI level, but will prevent unwanted reordering at the
selection-DAG level.

llvm-svn: 182045
2013-05-16 19:58:38 +00:00
Akira Hatanaka fce4dd7974 [mips] Test case for r182042. Add comment.
llvm-svn: 182044
2013-05-16 19:57:23 +00:00
Akira Hatanaka 39d40f7baf [mips] Fix instruction selection pattern for sint_to_fp node to avoid emitting an
invalid instruction sequence.

Rather than emitting an int-to-FP move instruction and an int-to-FP conversion
instruction during instruction selection, we emit a pseudo instruction which gets
expanded post-RA. Without this change, register allocation can possibly insert a
floating point register move instruction between the two instructions, which is not
valid according to the ISA manual.

mtc1 $f4, $4         # int-to-fp move instruction.
mov.s $f2, $f4       # move contents of $f4 to $f2.
cvt.s.w $f0, $f2     # int-to-fp conversion.

llvm-svn: 182042
2013-05-16 19:48:37 +00:00
Jack Carter 51785c4715 Mips assembler: Add branch macro definitions
This patch adds bnez and beqz instructions which represent alias definitions for bne and beq instructions as follows:
bnez $rs,$imm => bne $rs,$zero,$imm
beqz $rs,$imm => beq $rs,$zero,$imm

The corresponding test cases are added.

Patch by Vladimir Medic

llvm-svn: 182040
2013-05-16 19:40:19 +00:00
Akira Hatanaka 21bab5badc [mips] Fix indentation.
llvm-svn: 182036
2013-05-16 18:42:42 +00:00
Akira Hatanaka 7b6e4f1366 [mips] Delete unused enum value.
llvm-svn: 182035
2013-05-16 18:40:12 +00:00
Ulrich Weigand 9d980cbdb9 [PowerPC] Use true offset value in "memrix" machine operands
This is the second part of the change to always return "true"
offset values from getPreIndexedAddressParts, tackling the
case of "memrix" type operands.

This is about instructions like LD/STD that only have a 14-bit
field to encode immediate offsets, which are implicitly extended
by two zero bits by the machine, so that in effect we can access
16-bit offsets as long as they are a multiple of 4.

The PowerPC back end currently handles such instructions by
carrying the 14-bit value (as it will get encoded into the
actual machine instructions) in the machine operand fields
for such instructions.  This means that those values are
in fact not the true offset, but rather the offset divided
by 4 (and then truncated to an unsigned 14-bit value).

Like in the case fixed in r182012, this makes common code
operations on such offset values not work as expected.
Furthermore, there doesn't really appear to be any strong
reason why we should encode machine operands this way.

This patch therefore changes the encoding of "memrix" type
machine operands to simply contain the "true" offset value
as a signed immediate value, while enforcing the rules that
it must fit in a 16-bit signed value and must also be a
multiple of 4.

This change must be made simultaneously in all places that
access machine operands of this type.  However, just about
all those changes make the code simpler; in many cases we
can now just share the same code for memri and memrix
operands.

llvm-svn: 182032
2013-05-16 17:58:02 +00:00
Hal Finkel 47db66d43f PPC32 cannot form counter loops around i64 FP conversions
On PPC32, i64 FP conversions are implemented using runtime calls (which clobber
the counter register). These must be excluded.

llvm-svn: 182023
2013-05-16 16:52:41 +00:00
Aaron Ballman b4284e6cb6 Fixing a 64-bit conversion warning in MSVC.
llvm-svn: 182018
2013-05-16 16:03:36 +00:00
Rafael Espindola 63d2e0ad9a Remove dead calls to addFrameMove.
Without a PROLOG_LABEL present, the cfi instructions are never printed.

llvm-svn: 182016
2013-05-16 15:08:37 +00:00
Ulrich Weigand 7aa76b6a07 [PowerPC] Report true displacement value from getPreIndexedAddressParts
DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to
decompose a memory address into a base/offset pair.  It expects the
offset (if constant) to be the true displacement value in order to
perform optional additional optimizations; in particular, to convert
other uses of the original pointer into uses of the new base pointer
after pre-increment.

The PowerPC implementation of getPreIndexedAddressParts, however,
simply calls SelectAddressRegImm, which returns a TargetConstant.
This value is appropriate for encoding into the instruction, but
it is not always usable as true displacement value:

- Its type is always MVT::i32, even on 64-bit, where addresses
  ought to be i64 ... this causes the optimization to simply
  always fail on 64-bit due to this line in DAGCombiner:

      // FIXME: In some cases, we can be smarter about this.
      if (Op1.getValueType() != Offset.getValueType()) {

- Its value is truncated to an unsigned 16-bit value if negative.
  This causes the above opimization to generate wrong code.

This patch fixes both problems by simply returning the true
displacement value (in its original type).  This doesn't
affect any other user of the displacement.

llvm-svn: 182012
2013-05-16 14:53:05 +00:00
Richard Sandiford 7fdd268b68 [SystemZ] Tweak register array comment
llvm-svn: 182007
2013-05-16 13:39:02 +00:00
Patrik Hagglund b3391b58f7 Removed unused variable, detected by gcc
-Wunused-but-set-variable. Leftover from r181979.

llvm-svn: 181993
2013-05-16 08:37:22 +00:00
Rafael Espindola 7242186b10 Delete dead code.
llvm-svn: 181982
2013-05-16 04:59:17 +00:00
Rafael Espindola e3d5e5354e Don't call addFrameMove on XCore.
getExceptionHandlingType is not ExceptionHandling::DwarfCFI on xcore, so
etFrameInstructions is never called. There is no point creating cfi
instructions if they are never used.

llvm-svn: 181979
2013-05-16 04:16:25 +00:00
Rafael Espindola 6e8c0d94f8 Removed dead code.
llvm-svn: 181975
2013-05-16 03:34:58 +00:00
Reed Kotler 515e937685 Patch number 2 for mips16/32 floating point interoperability stubs.
This creates stubs that help Mips32 functions call Mips16 
functions which have floating point parameters that are normally passed
in floating point registers.
 

llvm-svn: 181972
2013-05-16 02:17:42 +00:00
Derek Schuff 36f00d9f02 Revert "Support unaligned load/store on more ARM targets"
This reverts r181898.

llvm-svn: 181944
2013-05-15 23:07:43 +00:00
Rafael Espindola 84ee6c40a8 Delete dead code.
llvm-svn: 181941
2013-05-15 22:27:35 +00:00
Hal Finkel 80267a0a37 undef setjmp in PPCCTRLoops
Trying to unbreak the VS build by copying some undef code from
Utils/LowerInvoke.cpp.

llvm-svn: 181938
2013-05-15 22:20:24 +00:00
David Majnemer 8f16974273 X86: Remove redundant test instructions
Increase the number of instructions LLVM recognizes as setting the ZF
flag. This allows us to remove test instructions that redundantly
recalculate the flag.

llvm-svn: 181937
2013-05-15 22:03:08 +00:00
Hal Finkel 25c1992bc7 Implement PPC counter loops as a late IR-level pass
The old PPCCTRLoops pass, like the Hexagon pass version from which it was
derived, could only handle some simple loops in canonical form. We cannot
directly adapt the new Hexagon hardware loops pass, however, because the
Hexagon pass contains a fundamental assumption that non-constant-trip-count
loops will contain a guard, and this is not always true (the result being that
incorrect negative counts can be generated). With this commit, we replace the
pass with a late IR-level pass which makes use of SE to calculate the
backedge-taken counts and safely generate the loop-count expressions (including
any necessary max() parts). This IR level pass inserts custom intrinsics that
are lowered into the desired decrement-and-branch instructions.

The most fragile part of this new implementation is that interfering uses of
the counter register must be detected on the IR level (and, on PPC, this also
includes any indirect branches in addition to function calls). Also, to make
all of this work, we need a variant of the mtctr instruction that is marked
as having side effects. Without this, machine-code level CSE, DCE, etc.
illegally transform the resulting code. Hopefully, this can be improved
in the future.

This new pass is smaller than the original (and much smaller than the new
Hexagon hardware loops pass), and can handle many additional cases correctly.
In addition, the preheader-creation code has been copied from LoopSimplify, and
after we decide on where it belongs, this code will be refactored so that it
can be explicitly shared (making this implementation even smaller).

The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for
the new Hexagon pass. There are a few classes of loops that this pass does not
transform (noted by FIXMEs in the files), but these deficiencies can be
addressed within the SE infrastructure (thus helping many other passes as well).

llvm-svn: 181927
2013-05-15 21:37:41 +00:00
Rafael Espindola 0f2a6fe613 Cleanup relocation sorting for ELF.
We want the order to be deterministic on all platforms. NAKAMURA Takumi
fixed that in r181864. This patch is just two small cleanups:

* Move the function to the cpp file. It is only passed to array_pod_sort.
* Remove the ppc implementation which is now redundant

llvm-svn: 181910
2013-05-15 18:22:01 +00:00
NAKAMURA Takumi dc9f013a5d PPCISelLowering.h: Escape \@ in comments. [-Wdocumentation]
llvm-svn: 181907
2013-05-15 18:01:35 +00:00
NAKAMURA Takumi dcc66456cc Whitespace.
llvm-svn: 181906
2013-05-15 18:01:28 +00:00
Derek Schuff 72ddaba785 Support unaligned load/store on more ARM targets
This patch matches GCC behavior: the code used to only allow unaligned
load/store on ARM for v6+ Darwin, it will now allow unaligned load/store for
v6+ Darwin as well as for v7+ on other targets.

The distinction is made because v6 doesn't guarantee support (but LLVM assumes
that Apple controls hardware+kernel and therefore have conformant v6 CPUs),
whereas v7 does provide this guarantee (and Linux behaves sanely).

Overall this should slightly improve performance in most cases because of
reduced I$ pressure.

Patch by JF Bastien

llvm-svn: 181897
2013-05-15 16:08:30 +00:00
Ulrich Weigand 2fb140ef31 [PowerPC] Remove need for adjustFixupOffst hack
Now that applyFixup understands differently-sized fixups, we can define
fixup_ppc_lo16/fixup_ppc_lo16_ds/fixup_ppc_ha16 to properly be 2-byte
fixups, applied at an offset of 2 relative to the start of the 
instruction text.

This has the benefit that if we actually need to generate a real
relocation record, its address will come out correctly automatically,
without having to fiddle with the offset in adjustFixupOffset.

Tested on both 64-bit and 32-bit PowerPC, using external and
integrated assembler.

llvm-svn: 181894
2013-05-15 15:07:06 +00:00
Richard Sandiford ffd144174d [SystemZ] Make use of SUBTRACT HALFWORD
Thanks to Ulrich Weigand for noticing that this instruction was missing.

llvm-svn: 181893
2013-05-15 15:05:29 +00:00
Ulrich Weigand 56f5b28d2e [PowerPC] Correctly handle fixups of other than 4 byte size
The PPCAsmBackend::applyFixup routine handles the case where a
fixup can be resolved within the same object file.  However,
this routine is currently hard-coded to assume the size of
any fixup is always exactly 4 bytes.

This is sort-of correct for fixups on instruction text; even
though it only works because several of what really would be
2-byte fixups are presented as 4-byte fixups instead (requiring
another hack in PPCELFObjectWriter::adjustFixupOffset to clean
it up).

However, this assumption breaks down completely for fixups
on data, which legitimately can be of any size (1, 2, 4, or 8).

This patch makes applyFixup aware of fixups of varying sizes,
introducing a new helper routine getFixupKindNumBytes (along
the lines of what the ARM back end does).  Note that in order
to handle fixups of size 8, we also need to fix the return type
of adjustFixupValue to uint64_t to avoid truncation.

Tested on both 64-bit and 32-bit PowerPC, using external and
integrated assembler.

llvm-svn: 181891
2013-05-15 15:01:46 +00:00
Richard Sandiford 619859f42e [SystemZ] Add more future work items to the README
Based on an analysis by Ulrich Weigand.

llvm-svn: 181882
2013-05-15 12:53:31 +00:00
Arnold Schwaighofer af85f6083a ARM ISel: Don't create illegal types during LowerMUL
The transformation happening here is that we want to turn a
"mul(ext(X), ext(X))" into a "vmull(X, X)", stripping off the extension. We have
to make sure that X still has a valid vector type - possibly recreate an
extension to a smaller type. In case of a extload of a memory type smaller than
64 bit we used create a ext(load()). The problem with doing this - instead of
recreating an extload - is that an illegal type is exposed.

This patch fixes this by creating extloads instead of ext(load()) sequences.

Fixes PR15970.

radar://13871383

llvm-svn: 181842
2013-05-14 22:33:24 +00:00
Bill Schmidt a87a7e2620 Implement the PowerPC system call (sc) instruction.
Instruction added at request of Roman Divacky.  Tested via asm-parser.

llvm-svn: 181821
2013-05-14 19:35:45 +00:00
Jyotsna Verma 803e506fec Hexagon: Pass to replace tranfer/copy instructions into combine instruction
where possible.

llvm-svn: 181817
2013-05-14 18:54:06 +00:00
Eric Christopher b27cd8bea6 Reapply "Subtract isn't commutative, fix this for MMX psub." with
a somewhat randomly chosen cpu that will minimize cpu specific
differences on bots.

llvm-svn: 181814
2013-05-14 18:33:40 +00:00
Eric Christopher 3eee7454cf Temporarily revert "Subtract isn't commutative, fix this for MMX psub."
It's causing failures on the atom bot.

llvm-svn: 181812
2013-05-14 18:20:42 +00:00
Eric Christopher 0344f495f9 Subtract isn't commutative, fix this for MMX psub.
Patch by Andrea DiBiagio.

llvm-svn: 181809
2013-05-14 17:52:05 +00:00
Jyotsna Verma 2dca82ad1c Hexagon: Add patterns to generate 'combine' instructions.
llvm-svn: 181805
2013-05-14 17:16:38 +00:00
Jyotsna Verma 11bd54afd6 Hexagon: ArePredicatesComplement should not restrict itself to TFRs.
llvm-svn: 181803
2013-05-14 16:36:34 +00:00
Bill Schmidt ef3d1a24ed PPC32: Fix stack collision between FP and CR save areas.
The changes to CR spill handling missed a case for 32-bit PowerPC.
The code in PPCFrameLowering::processFunctionBeforeFrameFinalized()
checks whether CR spill has occurred using a flag in the function
info.  This flag is only set by storeRegToStackSlot and
loadRegFromStackSlot.  spillCalleeSavedRegisters does not call
storeRegToStackSlot, but instead produces MI directly.  Thus we don't
see the CR is spilled when assigning frame offsets, and the CR spill
ends up colliding with some other location (generally the FP slot).

This patch sets the flag in spillCalleeSavedRegisters for PPC32 so
that the CR spill is properly detected and gets its own slot in the
stack frame.

llvm-svn: 181800
2013-05-14 16:08:32 +00:00
Jyotsna Verma c61e350a7d Hexagon: Remove dead-code after unconditional return from addPreSched2.
llvm-svn: 181797
2013-05-14 15:33:27 +00:00
Tom Stellard 1e21b53020 R600/SI: Add processor type for Hainan asic
Patch by: Alex Deucher

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181792
2013-05-14 14:42:56 +00:00
Richard Sandiford eb9af29426 [SystemZ] Add disassembler support
llvm-svn: 181777
2013-05-14 10:17:52 +00:00
Richard Sandiford 1fb5883d77 [SystemZ] Rework handling of constant PC-relative operands
The GNU assembler treats things like:

        brasl   %r14, 100

in the same way as:

        brasl   %r14, .+100

rather than as a branch to absolute address 100.  We implemented this in
LLVM by creating an immediate operand rather than the usual expr operand,
and by handling immediate operands specially in the code emitter.
This was undesirable for (at least) three reasons:

- the specialness of immediate operands was exposed to the backend MC code,
  rather than being limited to the assembler parser.

- in disassembly, an immediate operand really is an absolute address.
  (Note that this means reassembling printed disassembly can't recreate
  the original code.)

- it would interfere with any assembly manipulation that we might
  try in future.  E.g. operations like branch shortening can change
  the relative position of instructions, but any code that updates
  sym+offset addresses wouldn't update an immediate "100" operand
  in the same way as an explicit ".+100" operand.

This patch changes the implementation so that the assembler creates
a "." label for immediate PC-relative operands, so that the operand
to the MCInst is always the absolute address.  The patch also adds
some error checking of the offset.

llvm-svn: 181773
2013-05-14 09:47:26 +00:00
Richard Sandiford 6a808f986b [SystemZ] Remove bogus isAsmParserOnly
Marking instructions as isAsmParserOnly stops them from being disassembled.
However, in cases where separate asm and codegen versions exist, we actually
want to disassemble to the asm ones.

No functional change intended.

llvm-svn: 181772
2013-05-14 09:38:07 +00:00
Richard Sandiford 7d37cd26c6 [SystemZ] Match operands to fields by name rather than by order
The SystemZ port currently relies on the order of the instruction operands
matching the order of the instruction field lists.  This isn't desirable
for disassembly, where the two are matched only by name.  E.g. the R1 and R2
fields of an RR instruction should have corresponding R1 and R2 operands.

The main complication is that addresses are compound operands,
and as far as I know there is no mechanism to allow individual
suboperands to be selected by name in "let Inst{...} = ..." assignments.
Luckily it doesn't really matter though.  The SystemZ instruction
encoding groups all address fields together in a predictable order,
so it's just as valid to see the entire compound address operand as
a single field.  That's the approach taken in this patch.

Matching by name in turn means that the operands to COPY SIGN and
CONVERT TO FIXED instructions can be given in natural order.
(It was easier to do this at the same time as the rename,
since otherwise the intermediate step was too confusing.)

No functional change intended.

llvm-svn: 181771
2013-05-14 09:36:44 +00:00
Richard Sandiford d454ec0c31 [SystemZ] Match operands to fields by name rather than by order
The SystemZ port currently relies on the order of the instruction operands
matching the order of the instruction field lists.  This isn't desirable
for disassembly, where the two are matched only by name.  E.g. the R1 and R2
fields of an RR instruction should have corresponding R1 and R2 operands.

The main complication is that addresses are compound operands,
and as far as I know there is no mechanism to allow individual
suboperands to be selected by name in "let Inst{...} = ..." assignments.
Luckily it doesn't really matter though.  The SystemZ instruction
encoding groups all address fields together in a predictable order,
so it's just as valid to see the entire compound address operand as
a single field.  That's the approach taken in this patch.

Matching by name in turn means that the operands to COPY SIGN and
CONVERT TO FIXED instructions can be given in natural order.
(It was easier to do this at the same time as the rename,
since otherwise the intermediate step was too confusing.)

No functional change intended.

llvm-svn: 181769
2013-05-14 09:28:21 +00:00
Reed Kotler 821e86f021 Fix typo.
llvm-svn: 181759
2013-05-14 06:00:01 +00:00
Reed Kotler cad47f0297 Removed an unnamed namespace and forgot to make two of the functions inside
"static".

llvm-svn: 181754
2013-05-14 02:13:45 +00:00
Reed Kotler 2c4657d9b7 This is the first of three patches which creates stubs used for
Mips16/32 floating point interoperability.

When Mips16 code calls external functions that would normally have some
of its parameters or return values passed in floating point registers,
it needs (Mips32) helper functions to do this because while in Mips16 mode
there is no ability to access the floating point registers.

In Pic mode, this is done with a set of predefined functions in libc.
This case is already handled in llvm for Mips16.

In static relocation mode, for efficiency reasons, the compiler generates
stubs that the linker will use if it turns out that the external function
is a Mips32 function. (If it's Mips16, then it does not need the helper
stubs).

These stubs are identically named and the linker knows about these tricks
and will not create multiple copies and will delete them if they are not
needed.

llvm-svn: 181753
2013-05-14 02:00:24 +00:00
Jack Carter f5f48d8ff7 Mips assembler: Assembler macro ADDIU $rs,imm
This patch adds alias for addiu instruction which enables following syntax:

    addiu $rs,imm

The macro is translated as:

    addiu $rs,$rs,imm


Contributer: Vladimir Medic
llvm-svn: 181729
2013-05-13 20:26:46 +00:00
Bill Schmidt 6cda22a3b4 Fix goofy commentary in PPCTargetObjectFile.cpp.
llvm-svn: 181725
2013-05-13 19:40:36 +00:00
Bill Schmidt 22d40dcfe9 PPC64: Constant initializers with dynamic relocations go in .data.rel.ro.
This fixes warning messages observed in the oggenc application test in
projects/test-suite.  Special handling is needed for the 64-bit
PowerPC SVR4 ABI when a constant is initialized with a pointer to a
function in a shared library.  Because a function address is
implemented as the address of a function descriptor, the use of copy
relocations can lead to problems with initialization.  GNU ld
therefore replaces copy relocations with dynamic relocations to be
resolved by the dynamic linker.  This means the constant cannot reside
in the read-only data section, but instead belongs in .data.rel.ro,
which is designed for constants containing dynamic relocations.

The implementation creates a class PPC64LinuxTargetObjectFile
inheriting from TargetLoweringObjectFileELF, which behaves like its
parent except to place constants of this sort into .data.rel.ro.

The test case is reduced from the oggenc application.

llvm-svn: 181723
2013-05-13 19:34:37 +00:00
Akira Hatanaka 9edae02db8 [mips] Add option -mno-ldc1-sdc1.
This option is used when the user wants to avoid emitting double precision FP
loads and stores. Double precision FP loads and stores are expanded to single
precision instructions after register allocation.

llvm-svn: 181718
2013-05-13 18:23:35 +00:00
Akira Hatanaka 310e26a832 [mips] Define a helper function which creates an instruction with the same
operands as the prototype instruction but with a different opcode.

llvm-svn: 181714
2013-05-13 17:57:42 +00:00
Akira Hatanaka 067d8152f0 [mips] Rename functions. No functionality changes.
llvm-svn: 181713
2013-05-13 17:43:19 +00:00
Rafael Espindola b84cde5219 Remove unused fields and arguments.
llvm-svn: 181706
2013-05-13 14:34:48 +00:00
Mihai Popa dc1764c5a4 The purpose of the patch is to fix the syntax of ARM mrc and mrc2 instructions when they are used to write to the APSR. In this case, the destination operand should be APSR_nzcv, and the encoding of the target should be 0b1111 (same as for PC). In pre-UAL syntax, this form used the PC register as a textual target. This is still allowed for backward compatibility.
llvm-svn: 181705
2013-05-13 14:10:04 +00:00
Lang Hames 67c09b3f88 Correctly preserve the input chain for potential tailcall nodes whose
return values are bitcasts.

The chain had previously been being clobbered with the entry node to
the dag, which sometimes caused other code in the function to be
erroneously deleted when tailcall optimization kicked in.

<rdar://problem/13827621>

llvm-svn: 181696
2013-05-13 10:21:19 +00:00
Duncan Sands 0480b9b54e Suppress GCC compiler warnings in release builds about variables that are only
read in asserts.

llvm-svn: 181689
2013-05-13 07:50:47 +00:00
Rafael Espindola 227144c23c Remove the MachineMove class.
It was just a less powerful and more confusing version of
MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
dwarf register numbers, calls to getDwarfRegNum are pushed out, which
should allow further simplifications.

I left the MachineModuleInfo::addFrameMove interface unchanged since
this patch was already fairly big.

llvm-svn: 181680
2013-05-13 01:16:13 +00:00
Rafael Espindola 1b09836bc3 Change getFrameMoves to return a const reference.
To add a frame now there is a dedicated addFrameMove which also takes
care of constructing the move itself.

llvm-svn: 181657
2013-05-11 02:38:11 +00:00
Reed Kotler 783c79446b Checkin in of first of several patches to finish implementation of
mips16/mips32 floating point interoperability. 

This patch fixes returns from mips16 functions so that if the function
was in fact called by a mips32 hard float routine, then values
that would have been returned in floating point registers are so returned.

Mips16 mode has no floating point instructions so there is no way to
load values into floating point registers.

This is needed when returning float, double, single complex, double complex
in the Mips ABI.

Helper functions in libc for mips16 are available to do this.

For efficiency purposes, these helper functions have a different calling
convention from normal Mips calls.

Registers v0,v1,a0,a1 are used to pass parameters instead of
a0,a1,a2,a3.

This is because v0,v1,a0,a1 are the natural registers used to return
floating point values in soft float. These values can then be moved
to the appropriate floating point registers with no extra cost.

The only register that is modified is ra in this call.

The helper functions make sure that the return values are in the floating
point registers that they would be in if soft float was not in effect
(which it is for mips16, though the soft float is implemented using a mips32
library that uses hard float).
 

llvm-svn: 181641
2013-05-10 22:25:39 +00:00
Jyotsna Verma bf0bd1f4ab Fix unused variable error.
Earlier, this variable was used in an assert and was causing failure on
darwin.

llvm-svn: 181630
2013-05-10 21:44:02 +00:00
Jyotsna Verma 438cec566b Hexagon: Fix switch statements in GetDotOldOp and IsNewifyStore.
No functionality change.

llvm-svn: 181628
2013-05-10 20:58:11 +00:00
Jyotsna Verma 300f0b966c Hexagon: Fix switch cases in HexagonVLIWPacketizer.cpp.
llvm-svn: 181624
2013-05-10 20:27:34 +00:00
Rafael Espindola 86067ad6a9 Fix the R600 build.
llvm-svn: 181621
2013-05-10 18:31:42 +00:00
Chad Rosier c8569cba93 [ms-inline asm] Fix a crasher when we fail on a direct match.
The issue was that the MatchingInlineAsm and VariantID args to the
MatchInstructionImpl function weren't being set properly.  Specifically, when
parsing intel syntax, the parser thought it was parsing inline assembly in the
at&t dialect; that will never be the case.  

The crash was caused when the emitter tried to emit the instruction, but the
operands weren't set.  When parsing inline assembly we only set the opcode, not
the operands, which is used to lookup the instruction descriptor.
rdar://13854391 and PR15945

Also, this commit reverts r176036.  Now that we're correctly parsing the intel
syntax the pushad/popad don't match properly.  I've reimplemented that fix using
a MnemonicAlias.

llvm-svn: 181620
2013-05-10 18:24:17 +00:00
Rafael Espindola 140a837acd Remove unused argument.
llvm-svn: 181618
2013-05-10 18:16:59 +00:00
Rafael Espindola 7501a81a50 Remove unused function.
llvm-svn: 181606
2013-05-10 16:53:12 +00:00
Logan Chien 4ea23b56c5 Implement AsmParser for ARM unwind directives.
This commit implements the AsmParser for fnstart, fnend,
cantunwind, personality, handlerdata, pad, setfp, save, and
vsave directives.

This commit fixes some minor issue in the ARMELFStreamer:

* The switch back to corresponding section after the .fnend
  directive.

* Emit the unwind opcode while processing .fnend directive
  if there is no .handlerdata directive.

* Emit the unwind opcode to .ARM.extab while processing
  .handlerdata even if .personality directive does not exist.

llvm-svn: 181603
2013-05-10 16:17:24 +00:00
Tom Stellard 2b971eb0d0 R600: Remove AMDILPeeopholeOptimizer and replace optimizations with tablegen patterns
The BFE optimization was the only one we were actually using, and it was
emitting an intrinsic that we don't support.

https://bugs.freedesktop.org/show_bug.cgi?id=64201

Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181580
2013-05-10 02:09:45 +00:00
Tom Stellard 3a7c34c778 R600: Expand SUB for v2i32/v4i32
Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181579
2013-05-10 02:09:39 +00:00
Tom Stellard 3deddc5079 R600: Expand MUL for v4i32/v2i32
Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181578
2013-05-10 02:09:34 +00:00
Tom Stellard 7fb3963498 R600: Expand SRA for v4i32/v2i32
v2: Add v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181577
2013-05-10 02:09:29 +00:00
Tom Stellard a99c6ae47a R600: Expand vselect for v4i32 and v2i32
v2: Add vselect v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181576
2013-05-10 02:09:24 +00:00
Chad Rosier edb1dc8498 [x86AsmParser] It's valid to stop parsing an operand at an immediate.
rdar://13854369 and PR15944

llvm-svn: 181564
2013-05-09 23:48:53 +00:00
Bill Wendling 07fe235e2b Generate a compact unwind encoding in the face of a stack alignment push.
We generate a `push' of a random register (%rax) if the stack needs to be
aligned by the size of that register. However, this could mess up compact unwind
generation. In particular, we want to still generate compact unwind in the
presence of this monstrosity.

Check if the push of of the %rax/%eax register. If it is and it's marked with
the `FrameSetup' flag, then we can generate a compact unwind encoding for the
function only if the push is the last FrameSetup instruction.

llvm-svn: 181540
2013-05-09 20:10:38 +00:00
Jyotsna Verma 00681dc1f0 Hexagon: Remove switch cases from GetDotNewPredOp and isPostIncrement functions.
No functionality change.

llvm-svn: 181535
2013-05-09 19:16:07 +00:00
Jyotsna Verma 978e972ff9 Hexagon: Use relation map for getMatchingCondBranchOpcode() and
getInvertedPredicatedOpcode() functions instead of switch cases.

llvm-svn: 181530
2013-05-09 18:25:44 +00:00
Bill Wendling 98d5c52d2e Simplify the code a bit.
The compact unwind registers were defined in two different
places. It's better just to place them in the function that uses them
and specify that this is a 64-bit or 32-bit machine.

No functionality change.

llvm-svn: 181529
2013-05-09 18:21:45 +00:00
Richard Osborne 1333fa3d68 [XCore] Fix handling of functions where only the LR is spilled.
Previously we only checked if the LR required saving if the frame size was
non zero. However because the caller reserves 1 word for the callee to use
that doesn't count towards our frame size it is possible for the LR to need
saving and for the frame size to be 0.

We didn't hit when the LR needed saving because of a function calls because
the 1 word of stack we must allocate for our callee means the frame size
is always non zero in this case. However we can hit this case if the LR is
clobbered in inline asm.

llvm-svn: 181520
2013-05-09 16:43:42 +00:00
Akira Hatanaka b4526ea132 [mips] Add instruction selection pattern for (seteq $LHS, 0).
llvm-svn: 181459
2013-05-08 19:38:04 +00:00
Roman Divacky 2d26e8e56b Remove unused isLegalAddressImmediate() method.
llvm-svn: 181452
2013-05-08 17:51:39 +00:00
Ulrich Weigand e462053f64 [PowerPC] Fix regression in generating @ha/@l relocs
The patch I committed as revision 167864 introduced a regression that
causes LLVM to no longer generate appropriate relocs for @ha/@l symbol
references (but fail an assertion instead).

This is fixed here by re-enabling support for the VK_PPC_GAS_HA16/
VK_PPC_GAS_LO16 variant kinds (and their Darwin variants) in
PPCELFObjectWriter.cpp.

Tested by running projects/test-suite in -m32 mode with the integrated
assembler forced on.  A standalone test case will be committed shortly
as well.

llvm-svn: 181450
2013-05-08 17:50:07 +00:00
Bill Schmidt 38b6cb51bc Fix handling of anonymous aggregate parameters for powerpc*-apple-darwin8.
This fixes bug 15821 similarly to the powerpc64-linux fix for bug 14779.

Patch by David Fang.

llvm-svn: 181449
2013-05-08 17:22:33 +00:00
Stepan Dyatkovskiy 2703bcaad3 For r181148: fixed warning 'enumeral and non-enumeral type in conditional expression'.
llvm-svn: 181437
2013-05-08 14:51:27 +00:00
Hal Finkel 08e53ee551 PPCInstrInfo::optimizeCompareInstr should not optimize FP compares
The floating-point record forms on PPC don't set the condition register bits
based on a comparison with zero (like the integer record forms do), but rather
based on the exception status bits.

llvm-svn: 181423
2013-05-08 12:16:14 +00:00
Preston Gurd 9264c95400 Corrected Atom latencies for SSE SQRT instructions.
llvm-svn: 181346
2013-05-07 19:57:34 +00:00
Jyotsna Verma 5eb598001c Hexagon: Fix Small Data support to handle -G 0 correctly.
llvm-svn: 181344
2013-05-07 19:53:00 +00:00
Hal Finkel c363245ff2 Cleanup PPCInstrInfo::optimizeCompareInstr
Implement suggestions by Bill Schmidt in post-commit review. No functionality
change intended.

llvm-svn: 181338
2013-05-07 17:49:55 +00:00
Jyotsna Verma 03c6ca905c Reverting r181331.
Missing file, HexagonSplitConst32AndConst64.cpp, from lib/Target/Hexagon/CMakeLists.txt.

llvm-svn: 181334
2013-05-07 17:12:35 +00:00
Jyotsna Verma 19f0b40dcf Hexagon: Fix Small Data support to handle -G 0 correctly.
llvm-svn: 181331
2013-05-07 16:42:15 +00:00
Jyotsna Verma a03eb9b5d5 Hexagon: Set accessSize and addrMode on all load/store instructions.
llvm-svn: 181324
2013-05-07 15:06:29 +00:00
Michael Kuperstein 1a0c91f73b Re-enable AVX detection on x64 platforms.
llvm-svn: 181313
2013-05-07 14:05:33 +00:00
Richard Sandiford 9589691ca7 [SystemZ] Fix InitMCCodeGenInfo call
createSystemZMCCodeGenInfo was not passing the optimization level to
InitMCCodeGenInfo(), so -O0 would be ignored.  Fixes DebugInfo/namespace.ll
after the changes in r181271.

llvm-svn: 181312
2013-05-07 12:56:31 +00:00
Tom Stellard f787ef1d96 R600/SI: Add intrinsic for MIMG IMAGE_GET_RESINFO opcode
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181269
2013-05-06 23:02:19 +00:00
Tom Stellard e363dbf7eb R600/SI: Handle arbitrary destination type in SITargetLowering::adjustWritemask
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181268
2013-05-06 23:02:15 +00:00
Tom Stellard 353b336e8c R600/SI: Add intrinsic for texture image loading
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181267
2013-05-06 23:02:12 +00:00
Tom Stellard c932d7329c R600/SI: Add pattern for uint_to_fp
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181266
2013-05-06 23:02:07 +00:00
Tom Stellard cf6452c7d4 R600/SI: Add patterns for integer maxima / minima
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181265
2013-05-06 23:02:04 +00:00
Tom Stellard 9b3d2535bf R600/SI: Add pattern for AMDGPU.trunc intrinsic
Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 181263
2013-05-06 23:02:00 +00:00
Krzysztof Parzyszek 18ee1193bf Print IR from Hexagon MI passes with -print-before/after-all.
llvm-svn: 181255
2013-05-06 21:58:00 +00:00
Krzysztof Parzyszek 59df52c585 Cleanup of the HexagonTargetMachine setup.
llvm-svn: 181250
2013-05-06 21:25:45 +00:00
Jyotsna Verma 84c471029b Hexagon: Add multiclass/encoding bits for the New-Value Jump instructions.
llvm-svn: 181235
2013-05-06 18:49:23 +00:00
Krzysztof Parzyszek d50074712f Make references to HexagonTargetMachine "const".
llvm-svn: 181233
2013-05-06 18:38:37 +00:00
Tom Stellard d93cede8e4 R600: Remove dead code from the CodeEmitter v2
v2:
  - Replace switch statement with TSFlags query

Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181229
2013-05-06 17:50:57 +00:00
Tom Stellard 043de4c5af R600: Emit config values in register / value pairs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181228
2013-05-06 17:50:51 +00:00