MSVC always places the implicit sret parameter after the implicit this
parameter of instance methods. We used to handle this for
x86_thiscallcc by allocating the sret parameter on the stack and leaving
the this pointer in ecx, but that doesn't handle alternative calling
conventions like cdecl, stdcall, fastcall, or the win64 convention.
Instead, change the verifier to allow sret on the second parameter.
This also requires changing the Mips and X86 backends to return the
argument with the sret parameter, instead of assuming that the sret
parameter comes first.
The Sparc backend also returns sret parameters in a register, but I
wasn't able to update it to handle secondary sret parameters. It
currently calls report_fatal_error if you feed it an sret in the second
parameter.
Reviewers: rafael.espindola, majnemer
Differential Revision: http://reviews.llvm.org/D3617
llvm-svn: 208453
Summary:
Adds MIPS32r6/MIPS64r6 and checks the compatibility requirements for these
processors.
I've also included comments to describe removed and re-encoded instructions,
along with placeholder def's for the new instructions but there are no
functional changes to codegen at this point.
Reviewers: jkolek, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3622
llvm-svn: 208399
This creates a lot of core infrastructure in which to add, with little
effort, quite a bit more to mips fast-isel
Test Plan: simplestore.ll
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3527
llvm-svn: 207790
target cannot be determined accurately. This is the case for NaCl where the
sandboxing instructions are added in MC layer, after the MipsLongBranch pass.
It is also the case when the code has inline assembly. Instead of calculating
offset in the MipsLongBranch pass, use %hi(sym1 - sym2) and %lo(sym1 - sym2)
expressions that are resolved during the fixup.
This patch also deletes microMIPS test file test/CodeGen/Mips/micromips-long-branch.ll
and implements microMIPS CHECKs in a much simpler way in a file
test/CodeGen/Mips/longbranch.ll, together with MIPS32 and MIPS64.
llvm-svn: 207656
Summary:
This isn't supported directly so we rotate the vector by the desired number of
elements, insert to element zero, then rotate back.
The i64 case generates rather poor code on MIPS32. There is an obvious
optimisation to be made in future (do both insert.w's inside a shared
rotate/unrotate sequence) but for now it's sufficient to select valid code
instead of aborting.
Depends on D3536
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3537
llvm-svn: 207640
Summary:
This isn't supported directly so we splat the vector element and extract
the most convenient copy.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3530
llvm-svn: 207524
This is so that EF_MIPS_NAN2008 is set if we are using IEEE 754-2008
NaN encoding (-mnan=2008). This patch also adds support for parsing
'.nan legacy' and '.nan 2008' assembly directives. The handling of
these directives should match GAS' behaviour i.e., the last directive
in use sets the ELF header bit (EF_MIPS_NAN2008).
Differential Revision: http://reviews.llvm.org/D3346
llvm-svn: 206396
Summary: This was a case of incorrect usage of hasMips64() vs isABI_N64()
Reviewers: matheusalmeida, dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3398
llvm-svn: 206388
This should fix the ninja-x64-msvc-RA-centos6 builder.
I suspect the check in MipsSubtarget.cpp is incorrect and is really trying to
check for a bare-metal target rather and anything other than linux. I'll
investigate this.
llvm-svn: 206385
if not in micromips mode.
The test (elf_st_other.ll) was renamed as the name and description didn't
make sense as the test wasn't checking any symbol table entry.
Differential Revision: http://reviews.llvm.org/D3346
llvm-svn: 206377
Summary:
I had difficulty finding tests for the N32 and N64 ABI so I've added a
collection of calling convention tests based on the document MIPS ABIs
Described (MD00305), the MIPSpro N32 Handbook, and the SYSV ABI. Where the
documents/implementations disagree, I've used GCC to resolve the conflict.
A few interesting details:
* For N32, LLVM uses 64-bit pointers when saving $ra despite pointers being
32-bit. I've yet to find a supporting statement in the ABI documentation but
the current behaviour matches GCC.
* For O32, the non-variable portion of a varargs argument list is also subject
to the rule that floating-point is passed via GPR's (on N32/N64 only the
variable portion is subject to this rule). This agrees with GCC's behaviour
and the SYSV ABI but contradicts part of the MIPSpro N32 Handbook which talks about O32's behaviour.
* The N32 implementation has the wrong callee-saved register list.
(I already have a fix for this but will commit it as a follow-up).
I've left RUN-TODO lines in for O32 on MIPS64. I don't plan to support this case
for now but we should revisit it.
Reviewers: matheusalmeida, vmedic
Reviewed By: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3339
llvm-svn: 206370
Summary:
This was another incorrect use of hasMips64() vs isGP64bit().
Depends on D3344
Reviewers: matheusalmeida, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3347
llvm-svn: 206187
Summary:
Two exceptions to this:
test/CodeGen/Mips/octeon.ll
test/CodeGen/Mips/octeon_popcnt.ll
these test extensions to MIPS64
One test is altered for MIPS-IV:
test/CodeGen/Mips/mips64countleading.ll
Tests dclo/dclz which were added in MIPS64. The MIPS-IV version tests
that dclo/dclz are not emitted.
Four tests fail and are not in this patch:
test/CodeGen/Mips/abicalls.ll
test/CodeGen/Mips/fcopysign-f32-f64.ll
test/CodeGen/Mips/fcopysign.ll
test/CodeGen/Mips/stack-alignment.ll
Depends on D3343
Reviewers: matheusalmeida, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3344
llvm-svn: 206185
Summary:
- Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64
- ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
I've added additional testcases to cover as much of the codegen changes
affecting MIPS-IV as I can. Where I've been unable to find an existing
MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering
ISD::GlobalAddress and similar), I at least agree that MIPS-IV should
behave like MIPS64. Further testcases that are fixed by this patch will follow
in my next commit. The testcases from that commit that fail for MIPS-IV without
this patch are:
LLVM :: CodeGen/Mips/2010-07-20-Switch.ll
LLVM :: CodeGen/Mips/cmov.ll
LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll
LLVM :: CodeGen/Mips/largeimmprinting.ll
LLVM :: CodeGen/Mips/longbranch.ll
LLVM :: CodeGen/Mips/mips64-f128.ll
LLVM :: CodeGen/Mips/mips64directive.ll
LLVM :: CodeGen/Mips/mips64ext.ll
LLVM :: CodeGen/Mips/mips64fpldst.ll
LLVM :: CodeGen/Mips/mips64intldst.ll
LLVM :: CodeGen/Mips/mips64load-store-left-right.ll
LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll
Reviewers: dsanders
Reviewed By: dsanders
CC: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3343
llvm-svn: 206183
Summary:
They behave in accordance with the Has2008 and ABS2008 configuration bits of the processor which are used to select between the 1985 and 2008 versions of IEEE 754. In 1985 mode, these instructions are arithmetic (i.e. they raise invalid operation exceptions when given NaN), in 2008 mode they are non-arithmetic (i.e. they are copies).
nmadd.[ds], and nmsub.[ds] are still subject to -enable-no-nans-fp-math because the ISA spec does not explicitly state that they obey Has2008 and ABS2008.
Fixed the issue with the previous version of this patch (r205628). A pre-existing 'let Predicate =' statement was removing some predicates that were necessary for FP64 to behave correctly.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3274
llvm-svn: 205844
Summary:
They behave in accordance with the Has2008 and ABS2008 configuration bits of the
processor which are used to select between the 1985 and 2008 versions of IEEE
754. In 1985 mode, these instructions are arithmetic (i.e. they raise invalid
operation exceptions when given NaN), in 2008 mode they are non-arithmetic
(i.e. they are copies).
nmadd.[ds], and nmsub.[ds] are still subject to -enable-no-nans-fp-math because
the ISA spec does not explicitly state that they obey Has2008 and ABS2008.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3274
llvm-svn: 205628
Adds the instructions ext/ext32/cins/cins32.
It also changes pop/dpop to accept the two operand version and
adds a simple pattern to generate baddu.
Tests for the two operand versions (including baddu/dmul/dpop/pop)
and the code generation pattern for baddu are included.
Reviewed by: Daniel.Sanders@imgtec.com
llvm-svn: 205449
While reviewing r204163, I noticed that the MIPS16 test only checked for a .ent
directive and didn't actually check the code emitted. Fixed this and added a
check for llvm.bswap.i32 on MIPS64 at the same time.
llvm-svn: 205177
Implementing the LLVM part of the call to __builtin___clear_cache
which translates into an intrinsic @llvm.clear_cache and is lowered
by each target, either to a call to __clear_cache or nothing at all
incase the caches are unified.
Updating LangRef and adding some tests for the implemented architectures.
Other archs will have to implement the method in case this builtin
has to be compiled for it, since the default behaviour is to bail
unimplemented.
A Clang patch is required for the builtin to be lowered into the
llvm intrinsic. This will be done next.
llvm-svn: 204802
Summary:
VECTOR_SHUFFLE concatenates the vectors in an vectorwise fashion.
<0b00, 0b01> + <0b10, 0b11> -> <0b00, 0b01, 0b10, 0b11>
VSHF concatenates the vectors in a bitwise fashion:
<0b00, 0b01> + <0b10, 0b11> ->
0b0100 + 0b1110 -> 0b01001110
<0b10, 0b11, 0b00, 0b01>
We must therefore swap the operands to get the correct result.
The test case that discovered the issue was MultiSource/Benchmarks/nbench.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3142
llvm-svn: 204480
The Octeon cpu from Cavium Networks is mips64r2 based and has an extended
instruction set. In order to utilize this with LLVM, a new cpu feature "octeon"
and a subtarget feature "cnmips" is added. A small set of new instructions
(baddu, dmul, pop, dpop, seq, sne) is also added. LLVM generates dmul, pop and
dpop instructions with option -mcpu=octeon or -mattr=+cnmips.
llvm-svn: 204337
Summary:
SLP Vectorization of intrinsics (r203707) has exposed cases where the
expansion of vector bswap is failing (PR19151).
Reviewers: hfinkel
CC: chandlerc
Differential Revision: http://llvm-reviews.chandlerc.com/D3104
llvm-svn: 204163
Summary:
Correct the match patterns and the lowerings that made the CodeGen tests pass despite the mistakes.
The original testcase that discovered the problem was SingleSource/UnitTests/SignlessType/factor.c in test-suite.
During review, we also found that some of the existing CodeGen tests were incorrect and fixed them:
* bitwise.ll: In bsel_v16i8 the IfSet/IfClear were reversed because bsel and bmnz have different operand orders and the test didn't correctly account for this. bmnz goes 'IfClear, IfSet, CondMask', while bsel goes 'CondMask, IfClear, IfSet'.
* vec.ll: In the cases where a bsel is emitted as a bmnz (they are the same operation with a different input tied to the result) the operands were in the wrong order.
* compare.ll and compare_float.ll: The bsel operand order was correct for a greater-than comparison, but a greater-than comparison instruction doesn't exist. Lowering this operation inverts the condition so the IfSet/IfClear need to be swapped to match.
The differences between BSEL, BMNZ, and BMZ and how they map to/from vselect are rather confusing. I've therefore added a note to MSA.txt to explain this in a single place in addition to the comments that explain each case.
Reviewers: matheusalmeida, jacksprat
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3028
llvm-svn: 203657
The syntax for "cmpxchg" should now look something like:
cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic
where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).
rdar://problem/15996804
llvm-svn: 203559
* Add masking instructions before loads and stores (in MC layer).
* Add masking instructions after SP changes (in MC layer).
* Forbid loads, stores and SP changes in delay slots (in MI layer).
Differential Revision: http://llvm-reviews.chandlerc.com/D2904
llvm-svn: 203484
Summary:
Previously, attempting to extract lanes 2 and 3 would actually extract lane 1.
The MSA CodeGen tests only covered lanes 0 and 1.
Differential Revision: http://llvm-reviews.chandlerc.com/D2935
llvm-svn: 202848
Summary:
Parts of the compiler still believed MSA load/stores have a 16-bit offset when
it is actually 10-bit. Corrected this, and fixed a closely related issue this
uncovered where load/stores with 10-bit and 12-bit offsets (MSA and microMIPS
respectively) could not load/store using offsets from the stack/frame pointer.
They accepted frameindex+offset, but not frameindex by itself.
Reviewers: jacksprat, matheusalmeida
Reviewed By: jacksprat
Differential Revision: http://llvm-reviews.chandlerc.com/D2888
llvm-svn: 202717
Summary:
This removes the need to coerce UnknownABI to the default ABI (O32 for
MIPS32, N64 for MIPS64 [*]) in both MipsSubtarget and MipsAsmParser.
Clang has been updated to disable both possible default ABI's before enabling
the ABI it intends to use.
[*] N64 being the default for MIPS64 is not actually correct.
However N32 is not fully implemented/tested yet.
Depends on: D2830
Reviewers: jacksprat, matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D2832
Differential Revision: http://llvm-reviews.chandlerc.com/D2846
llvm-svn: 201792
Summary:
This is consistent with the integrated assembler.
All mips64 codegen tests previously passed -mcpu. Removed -mcpu from
blez_bgez.ll and const-mult.ll to cover the default case.
Ideally, the two implementations of selectMipsCPU() will be merged but it's
proven difficult to find a home for the function that doesn't cause link errors.
For now, we'll hoist the common functionality into a function and mark it with
FIXME's.
Reviewers: jacksprat, matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D2830
llvm-svn: 201782
1) Fix a specific bug when certain conversion functions are called in a program compiled as mips16 with hard float and
the program is linked as c++. There are two libraries that are reversed in the link order with gcc/g++ and clang/clang++ for
mips16 in this case and the proper stubs will then not be called. These stubs are normally handled in the Mips16HardFloat pass
but in this case we don't know at that time that we need to generate the stubs. This must all be handled later in code generation
and we have moved this functionality to MipsAsmPrinter. When linked as C (gcc or clang) the proper stubs are linked in from libc.
2) Set up the infrastructure to handle 90% of what is in the Mips16HardFloat pass in this new area of MipsAsmPrinter. This is a more
logical place to handle this and we have known for some time that we needed to move the code later and not implement it using
inline asm as we do now but it was not clear exactly where to do this and what mechanism should be used. Now it's clear to us
how to do this and this patch contains the infrastructure to move most of this to MipsAsmPrinter but the actual moving will be done
in a follow on patch. The same infrastructure is used to fix this current bug as described in #1. This change was requested by the list
during the original putback of the Mips16HardFloat pass but was not practical for us do at that time.
llvm-svn: 201426
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for
targets with mature MC support. Such targets will always parse the inline
assembly (even when emitting assembly). Targets without mature MC support
continue to use EmitRawText() for assembly output.
The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced
with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler
to parse inline assembly (even when emitting assembly output). UseIntegratedAs
is set to true for targets that consider any failure to parse valid assembly
to be a bug. Target specific subclasses generally enable the integrated
assembler in their constructor. The default value can be overridden with
-no-integrated-as.
All tests that rely on inline assembly supporting invalid assembly (for example,
those that use mnemonics such as 'foo' or 'hello world') have been updated to
disable the integrated assembler.
Changes since review (and last commit attempt):
- Fixed test failures that were missed due to configuration of local build.
(fixes crash.ll and a couple others).
- Fixed tests that happened to pass because the local build was on X86
(should fix 2007-12-17-InvokeAsm.ll)
- mature-mc-support.ll's should no longer require all targets to be compiled.
(should fix ARM and PPC buildbots)
- Object output (-filetype=obj and similar) now forces the integrated assembler
to be enabled regardless of default setting or -no-integrated-as.
(should fix SystemZ buildbots)
Reviewers: rafael
Reviewed By: rafael
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2686
llvm-svn: 201333
This patch adds NaCl target for Mips. It also forbids indexed loads and
stores if the target is NaCl.
Patch by Sasa Stankovic.
Differential Revision: http://llvm-reviews.chandlerc.com/D2690
llvm-svn: 200855
This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when
the operand in input is a build vector of constants (or UNDEFs).
The inability to fold a sext/zext of a constant build_vector was the root
cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support.
Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a
ConstantSDNode.
llvm-svn: 200234
These were:
* noreorder handling on the target object streamer and asm parser.
* setting the initial flag bits based on the enabled features.
* setting the elf header flag for micromips
It is *really* depressing I am the one doing this instead of someone at
mips actually taking the time to understand the infrastructure.
llvm-svn: 200138
r200064 depends on r200051.
r200051 is broken: I tries to replace .mips_hack_elf_flags, which is a good
thing, but what it replaces it with is even worse.
The new emitMipsELFFlags it adds corresponds to no assembly directive, is not
marked as a hack and is not even printed to the .s file.
The patch also introduces more uses of hasRawTextSupport.
The correct way to remove .mips_hack_elf_flags is to have the mips target
streamer handle the default flags (and command line options). That way the
same code path is used for asm and obj. The streamer interface should *really*
correspond to what is printed in the .s file.
llvm-svn: 200078
Summary:
$rs and $rt were the wrong way round in the .td and the testcase wasn't
strict enough to detect the mistake.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D2554
llvm-svn: 199498
than it needs to be by 1 bit but I need to finish some other things so
that all the boundary cases will work in that situation. constpool.c
in test-suite will fail to assemble under our new internal test-suite sync
without this change.
llvm-svn: 199343
This also fixes the placement of the function label comment. It was being
placed next to the mips16 directive instead of next to the label.
llvm-svn: 199245
tail call optimization. Some more work may be needed for indirect
calls but this patch fixes the current regression in Prolangc++/trees.
S2 optimization as part of the general cleanup and optimization
of prolog and epilog was not saving S2 in this case and needed to.
llvm-svn: 197630
Some tiny cosmetic code changes to follow. Because of the wide
ranging nature of the patch a full 24 test cycle was needed to
check against regression. This was the smallest patch I could
make to progress from the earlier ones in the series.
llvm-svn: 197350
Save S2(reg 18) only when we are calling floating point stubs that
have a return value of float or complex. Some more work to make this
better but this is the first step.
llvm-svn: 196921
Summary:
The MSA ld.[bhwd] and st.[bhwd] instructions scale the immediate by the
element size before use as an offset. The offset must therefore be a
multiple of the element size to be valid in these instructions. However,
an unaligned base address is valid in MSA.
This commit causes the compiler to emit valid code when the calculated
offset is not a multiple of the element size by accounting for the offset
using addiu and using a zero offset in the load/store.
Depends on D2338
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D2339
llvm-svn: 196777
Summary:
The immediate in these instructions is scaled before use as an offset.
They therefore have a wider reach than ld.b/st.b.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D2338
llvm-svn: 196775
in case the operands are constants and its difference is |1|.
It should be possible in those cases to rematerialize the result using
MIPS's slt and similar instructions.
The small update to some of the tests in cmov.ll, sel1c.ll and sel2c.ll was needed
otherwise the optimization implemented in this patch would have been triggered
(difference between the operands was 1) and that would have changed the semantic
of the tests.
llvm-svn: 196498
this completes the basic port of ARM constant islands to Mips16.
More testing, code review, cleanup is in order but basically everything
seems to be working. A bug in gas is preventing some of the runtime
testing but I hope to resolve this soon.
llvm-svn: 196331
This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s
when the stack frame is between 512 and 32,768 bytes in size.
llvm-svn: 195973
in constant islands for Mips16. We introdcuce JalB16 as a synomnym
for Jal16. It makes it easier to read and is also necessary because
Jal16 is a call instruction but JalB16 is being used as a branch.
Various parts of LLVM will not work properly even in this late stage of
the backend if we use what was declared as a call instruction to function
as a branch. For one, basic block labels may not get emitted in some
situations.
llvm-svn: 195968
conditional branches for very large targets. That will be the next small
patch. Everything now should in principle work as good (functionality
wise) as without constant islands so we decided at Mips/Imagination to
make constant islands the default for Mips16 now so that it will get
excercised a lot and this port is still experimentatl though hopefully soon
we will change the status. Some more cleanup and code review is in order
but things are converging fast.
llvm-svn: 195902
make PIC calls a little more efficient:
1. Remove instructions setting up $gp if it is known that a function has been
called at least once.
2. Save the address of a called function in a register instead of loading
it from the GOT at every call site.
llvm-svn: 195892
Summary:
Moved the requirement for SelectionDAG::getConstant() to return legally
typed nodes slightly earlier. There were two optional DAGCombine passes
that were missed out and were required to produce type-legal DAGs.
Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant().
This provides support for both promoted and expanded vector types whereas the
previous code only supported promoted vector types.
Fixes a "Type for zero vector elements is not legal" assertion detected by
an llvm-stress generated test.
Reviewers: resistor
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2251
llvm-svn: 195635
to what is needed for constant islands. The prescan method for Mips16 constant
islands will eventually go away. It is only temporary and should be done
earlier when the instructions are first created or from the DAG. If we keep
it here we need to handle better the situation where constant islands
is called multiple times since don't want to prescan more than once.
llvm-svn: 195569
I had to move some code and I moved a declaration forward past it's first use
in the function but by nutty coincidence there was another variable of the same
name and type and with completely unrelated function that was declared globally
in the class so no compilation error ensued.
It required some unusual conditions for it to even matter. Caused test
case casts.c in test-suite to fail during compilation with a duplicate
symbol error. I would have noticed it during final code review for this port.
llvm-svn: 195565
Mask == ~InvMask asserts if the width of Mask and InvMask differ.
The combine isn't valid (with two exceptions, see below) if the widths differ
so test for this before testing Mask == ~InvMask.
In the specific cases of Mask=~0 and InvMask=0, as well as Mask=0 and
InvMask=~0, the combine is still valid. However, there are more appropriate
combines that could be used in these cases such as folding x & 0 to 0, or
x & ~0 to x.
llvm-svn: 195364
Summary:
LegalizeSetCCCondCode can now legalize SETEQ and SETNE by returning the inverse
condition and requesting that the caller invert the result of the condition.
The caller of LegalizeSetCCCondCode must handle the inverted CC, and they do
so as follows:
SETCC, BR_CC:
Invert the result of the SETCC with SelectionDAG::getNOT()
SELECT_CC:
Swap the true/false operands.
This is necessary for MSA which lacks an integer SETNE instruction.
Reviewers: resistor
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2229
llvm-svn: 195355
Hard float for mips16 means essentially to compile as soft float but to
use a runtime library for soft float that is written with native mips32
floating point instructions (those runtime routines run in mips32 hard
float mode).
The patch reviewed by Reed Kotler.
llvm-svn: 195123
Fixed an inappropriate use of BuildPairF64 when compiling for MIPS32 with FP64
which resulted in an impossible constraint on the register allocation. It now
uses BuildPairF64_64.
llvm-svn: 195007
Now that FileCheck supports multiple check prefixes, we don't need to keep the
little and big endian versions of this test separate anymore. Merge them back
together.
llvm-svn: 194826
Summary:
When getConstant() is called for an expanded vector type, it is split into
multiple scalar constants which are then combined using appropriate build_vector
and bitcast operations.
In addition to the usual big/little endian differences, the case where the
element-order of the vector does not have the same endianness as the elements
themselves is also accounted for. For example, for v4i32 on big-endian MIPS,
the byte-order of the vector is <3210,7654,BA98,FEDC>. For little-endian, it is
<0123,4567,89AB,CDEF>.
Handling this case turns out to be a nop since getConstant() returns a splatted
vector (so reversing the element order doesn't change the value)
This fixes a number of cases in MIPS MSA where calling getConstant() during
operation legalization introduces illegal types (e.g. to legalize v2i64 UNDEF
into a v2i64 BUILD_VECTOR of illegal i64 zeros). It should also handle bigger
differences between illegal and legal types such as legalizing v2i64 into v8i16.
lowerMSASplatImm() in the MIPS backend no longer needs to avoid calling
getConstant() so this function has been updated in the same patch.
For the sake of transparency, the steps I've taken since the review are:
* Added 'virtual' to isVectorEltOrderLittleEndian() as requested. This revealed
that the MIPS tests were falsely passing because a polymorphic function was
not actually polymorphic in the reviewed patch.
* Fixed the tests that were now failing. This involved deleting the code to
handle the MIPS MSA element-order (which was previously doing an byte-order
swap instead of an element-order swap). This left
isVectorEltOrderLittleEndian() unused and it was deleted.
* Fixed build failures caused by rebasing beyond r194467-r194472. These build
failures involved the bset, bneg, and bclr instructions added in these commits
using lowerMSASplatImm() in a way that was no longer valid after this patch.
Some of these were fixed by calling SelectionDAG::getConstant() instead,
others were fixed by a new function getBuildVectorSplat() that provided the
removed functionality of lowerMSASplatImm() in a more sensible way.
Reviewers: bkramer
Reviewed By: bkramer
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1973
llvm-svn: 194811
Summary:
This patch (correctly) breaks some MSA tests by exposing the cases when
SelectionDAG::getConstant() produces illegal types. These have been temporarily
marked XFAIL and the XFAIL flag will be removed when
SelectionDAG::getConstant() is fixed.
There are three categories of failure:
* Immediate instructions are not selected in one endian mode.
* Immediates used in ldi.[bhwd] must be different according to endianness.
(this only affects cases where the 'wrong' ldi is used to load the correct
bitpattern. E.g. (bitcast:v2i64 (build_vector:v4i32 ...)))
* Non-immediate instructions that rely on immediates affected by the
previous two categories as part of their match pattern.
For example, the bset match pattern is the vector equivalent of
'ws | (1 << wt)'.
One test needed correcting to expect different output depending on whether big
or little endian was in use. This test was
test/CodeGen/Mips/msa/basic_operations.ll and experiences the second category
of failure shown above. The little endian version of this test is named
basic_operations_little.ll and will be merged back into basic_operations.ll in
a follow up commit now that FileCheck supports multiple check prefixes.
Reviewers: bkramer, jacksprat, dsanders
Reviewed By: dsanders
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1972
llvm-svn: 194806
short form. Constant islands will expand them if they are out of range.
Since there is not direct object emitter at this time, it does not
have any material affect because the assembler sorts this out. But we
need to know for the actual constant island work. We track the difference
by putting # 16 inst in the comments.
llvm-svn: 194766
specifically about the .space directive. This allows us to force large
blocks of code to appear in test cases for things like constant islands
without having to make giant test cases to force things like long
branches to take effect.
llvm-svn: 194555
Like GCC, this re-uses the 'f' constraint and a new 'w' print-modifier:
asm ("ldi.w %w0, 1", "=f"(result));
Unlike GCC, the 'w' print-modifer is not _required_ to produce the intended
output. This is a consequence of differences in the internal handling of
the registers in each compiler. To be source-compatible between the
compilers, users must use the 'w' print-modifier.
MSA registers (including control registers) are supported in clobber lists.
llvm-svn: 194476
Upcoming commit(s) are going to add support for bseti and bnegi. This would
cause some existing tests to (correctly) change behaviour and emit a different
instruction. This patch prevents this by changing the constant used in ori and
xori tests so that they will not be matchable by the bseti and bnegi patterns
when these instructions are matchable from normal IR.
llvm-svn: 194467
This has no material effect at this time since we don't have a direct
object emitter for mips16 and the assembler can't tell them apart. I
place a comment "16 bit inst" for those so that I can tell them apart in the
output. The constant island pass has only been minimally changed to allow
this. More complete branch work is forthcoming but this is the first
step.
llvm-svn: 194442
formal arguments on the stack and stores created afterwards. We need this to
ensure tail call optimized function calls do not write over the argument area
of the stack before it is read out.
llvm-svn: 194309
Submit the basic port of the rest of ARM constant islands code to Mips.
Two test cases are added which reflect the next level of functionality:
constants getting moved to water areas that are out of range from the
initial placement at the end of the function and basic blocks being split to
create water when none exists that can be used. There is a bunch of this
code that is not complete and has been marked with IN_PROGRESS. I will
finish cleaning this all up during the next week or two and submit the
rest of the test cases. I have elminated some code for dealing with
inline assembly because to me it unecessarily complicates things and
some of the newer features of llvm like function attributies and builtin
assembler give me better tools to solve the alignment issues created
there. Also, for Mips16 I even have the option of not doing constant
islands in the present of inline assembler if I chose. When everything
has been completed I will summarize the port and notify people that
are knowledgable regarding the ARM Constant Islands code so they can
review it in it's entirety if they wish.
llvm-svn: 194053
Also corrected the definition of the intrinsics for these instructions (the
result register is also the first operand), and added intrinsics for bsel and
bseli to clang (they already existed in the backend).
These four operations are mostly equivalent to bsel, and bseli (the difference
is which operand is tied to the result). As a result some of the tests changed
as described below.
bitwise.ll:
- bsel.v test adapted so that the mask is unknown at compile-time. This stops
it emitting bmnzi.b instead of the intended bsel.v.
- The bseli.b test now tests the right thing. Namely the case when one of the
values is an uimm8, rather than when the condition is a uimm8 (which is
covered by bmnzi.b)
compare.ll:
- bsel.v tests now (correctly) emits bmnz.v instead of bsel.v because this
is the same operation (see MSA.txt).
i8.ll
- CHECK-DAG-ized test.
- bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands
because this is the same operation (see MSA.txt).
- bseli.b still emits bseli.b though because the immediate makes it
distinguishable from bmnzi.b.
vec.ll:
- CHECK-DAG-ized test.
- bmz.v tests now (correctly) emits bmnz.v with swapped operands (see
MSA.txt).
- bsel.v tests now (correctly) emits bmnz.v with swapped operands (see
MSA.txt).
llvm-svn: 193693
This required correcting the definition of the bins[lr]i intrinsics because
the result is also the first operand.
It also required removing the (arbitrary) check for 32-bit immediates in
MipsSEDAGToDAGISel::selectVSplat().
Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d
because the constant is legalized into a ConstantPool. Similar things can
happen with binsri.d with more than 10 bits set in the mask. The resulting
code when this happens is correct but not optimal.
llvm-svn: 193687
(or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b).
where $mask is a constant splat. This allows bitwise operations to make use
of bsel.
It's also a stepping stone towards matching bins[lr], and bins[lr]i from
normal IR.
Two sets of similar tests have been added in this commit. The bsel_* functions
test the case where binsri cannot be used. The binsr_*_i functions will
start to use the binsri instruction in the next commit.
llvm-svn: 193682
splat.d is implemented but this subtest is currently disabled. This is because
it is difficult to match the appropriate IR on MIPS32. There is a patch under
review that should help with this so I hope to enable the subtest soon.
llvm-svn: 193680
Before I just ported the shell of the pass. I've tried to keep everything
nearly identical to the ARM version. I think it will be very easy to eventually
merge these two and create a new more general pass that other targets can
use. I have some improvements I would like to make to allow pools to
be shared across functions and some other things. When I'm all done we
can think about making a more general pass. More to be ported but the
basic mechanism works now almost as good as gcc mips16.
llvm-svn: 193509
of loops.
Previously, two consecutive calls to function "func" would result in the
following sequence of instructions:
1. load $16, %got(func)($gp) // load address of lazy-binding stub.
2. move $25, $16
3. jalr $25 // jump to lazy-binding stub.
4. nop
5. move $25, $16
6. jalr $25 // jump to lazy-binding stub again.
With this patch, the second call directly jumps to func's address, bypassing
the lazy-binding resolution routine:
1. load $25, %got(func)($gp) // load address of lazy-binding stub.
2. jalr $25 // jump to lazy-binding stub.
3. nop
4. load $25, %got(func)($gp) // load resolved address of func.
5. jalr $25 // directly jump to func.
llvm-svn: 191591
For v4f32 and v2f64, EXTRACT_VECTOR_ELT is matched by a pseudo-insn which may
be expanded to subregister copies and/or instructions as appropriate.
llvm-svn: 191514
Most constant BUILD_VECTOR's are matched using ComplexPatterns which cover
bitcasted as well as normal vectors. However, it doesn't seem to be possible to
match ldi.[bhwd] in a type-agnostic manner (e.g. to support the widest range of
immediates, it should be possible to use ldi.b to load v2i64) using TableGen so
ldi.[bhwd] is matched using custom code in MipsSEISelDAGToDAG.cpp
This made the majority of the constant splat BUILD_VECTOR lowering redundant.
The only transformation remaining for constant splats is when an (up-to) 32-bit
constant splat is possible but the value does not fit into a 10-bit signed
integer. In this case, the BUILD_VECTOR is transformed into a bitcasted
BUILD_VECTOR so that fill.[bhw] can be used to splat the vector from a GPR32
register (which is initialized using the usual lui/addui sequence).
There are no additional tests since this is a re-implementation of previous
functionality. The change is intended to make it easier to implement some of
the upcoming instruction selection patches since they can rely on existing
support for BUILD_VECTOR's in the DAGCombiner.
compare_float.ll changed slightly because a BITCAST is no longer
introduced during legalization.
llvm-svn: 191299
Changes to MIPS SelectionDAG:
* Added nodes VEXTRACT_[SZ]EXT_ELT to represent extract and extend in a single
operation and implemented the DAG combines necessary to fold sign/zero
extends into the extract.
llvm-svn: 191199
Note: There's a later patch on my branch that re-implements this to select
build_vector without the custom SelectionDAG nodes. The future patch avoids
the constant-folding problems stemming from the custom node (i.e. it doesn't
need to re-implement all the DAG combines related to BUILD_VECTOR).
Changes to MIPS specific SelectionDAG nodes:
* Added VSPLAT
This is a special case of BUILD_VECTOR that covers the case the
BUILD_VECTOR is a splat operation.
* Added VSPLATD
This is a special case of VSPLAT that handles the cases when v2i64 is legal
llvm-svn: 191191
1) make sure that the first two instructions of the sequence cannot
separate from each other. The linker requires that they be sequential.
If they get separated, it can still work but it will not work in all
cases because the first of the instructions mostly involves the hi part
of the pc relative offset and that part changes slowly. You would have
to be at the right boundary for this to matter.
2) make sure that this sequence begins on a longword boundary.
There appears to be a bug in binutils which makes some of these calculations
get messed up if the instruction sequence does not begin on a longword
boundary. This is being investigated with the appropriate binutils folks.
llvm-svn: 190966
precision loads and stores as well as reg+imm double precision loads and stores.
Previously, expansion of loads and stores was done after register allocation,
but now it takes place during legalization. As a result, users will see double
precision stores and loads being emitted to spill and restore 64-bit FP registers.
llvm-svn: 190235
don't exist in libc. This is really not the right way to solve this problem;
but it's not clear to me at this time exactly what is the right way.
If we create stubs here, they will cause link errors because these functions
do not exist in libc.
llvm-svn: 189727
has hard float, when you compile the mips32 code you have to make sure
that it knows to compile any mips32 routines as hard float. I need to clean
up the way mips16 hard float is specified but I need to first think through
all the details. Mips16 always has a form of soft float, the difference being
whether the underlying hardware has floating point. So it's not really
necessary to pass the -soft-float to llvm since soft-float is always true
for mips16 by virtue of the fact that it will not register floating point
registers. By using this fact, I can simplify the way this is all handled.
llvm-svn: 189690
These intrinsics are legalized to V(ALL|ANY)_(NON)?ZERO nodes,
are matched as SN?Z_[BHWDV]_PSEUDO pseudo's, and emitted as
a branch/mov sequence to evaluate to 0 or 1.
Note: The resulting code is sub-optimal since it doesnt seem to be possible
to feed the result of an intrinsic directly into a brcond. At the moment
it uses (SETCC (VALL_ZERO $ws), 0, SETEQ) and similar which unnecessarily
evaluates the boolean twice.
llvm-svn: 189478
The MSA control registers have been added as reserved registers,
and are only used via ISD::Copy(To|From)Reg. The intrinsics are lowered
into these nodes.
llvm-svn: 189468
Note that all of these tests use ld.b and st.b for the loads and stores
regardless of the data size. This is because the definition of bitcast is
equivalent to a store/load sequence and DAG combiner accordingly folds bitcasts
to/from v16i8 into the load/store nodes to product load/store nodes with
type v16i8.
llvm-svn: 189333
I need to add the rest of these to the list or else to delay putting
out the actual stub until later in code generation when I know if
the external function ever got emitted
Resubmit this patch. The target triple needs to be added to the test so that
clang does not tell the backend the wrong target when the host is BSD. There
is a clang bug in here somewhere that I need to track down. At Mips this
has been filed internally as a bug.
llvm-svn: 189186
I need to add the rest of these to the list or else to delay putting
out the actual stub until later in code generation when I know if
the external function ever got emitted.
llvm-svn: 189161
functions be compiled as mips32, without having to add attributes. This
is useful in certain situations where you don't want to have to edit the
function attributes in the source. For now it's only an option used for
the compiler developers when debugging the mips16 port.
llvm-svn: 188826
This regards how mips16 is viewed. It's not really a target type but
there has always been a target for it in the td files. It's more properly
-mcpu=mips32 -mattr=+mips16 . This is how clang treats it but we have
always had the -mcpu=mips16 which I probably should delete now but it will
require updating all the .ll test cases for mips16. In this case it changed
how we decide if we have a count bits instruction and whether instruction
lowering should then expand ctlz. Now that we have dual mode compilation,
-mattr=+mips16 really just indicates the inital processor mode that
we are compiling for. (It is also possible to have -mcpu=64 -mattr=+mips16
but as far as I know, nobody has even built such a processor, though there
is an architecture manual for this).
llvm-svn: 188586
- Instead of setting the suffixes in a bunch of places, just set one master
list in the top-level config. We now only modify the suffix list in a few
suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).
- Aside from removing the need for a bunch of lit.local.cfg files, this enables
4 tests that were inadvertently being skipped (one in
Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
XFAILED).
- This commit also fixes a bunch of config files to use config.root instead of
older copy-pasted code.
llvm-svn: 188513
is actually an instrinsic that will not occur in libc. This list here
is not exhaustive but fixes the one places in test-suite where this occurs.
I have filed a bug against myself to research the full list and add them
to the array of such cases. In the future, actual stub generation will occur
in a later phase and we won't need this code because we will know at that time
during the compilation that in fact no helper function was even needed.
llvm-svn: 188149
I need to go through all the runtime routine list and see if there
are any more I need to add for mips16 floating point. Prototypes must
be correct or else I don't know to add a helper function call.
llvm-svn: 188106
helper functions. This can be optimized out later when the remaining
parts of the helper function work is moved into the Mips16HardFloat pass.
For now it forces us to use the 32 bit save/restore instructions instead
of the 16 bit ones.
llvm-svn: 187712
This is actually an LLVM bug in the way it generates signatures for these
when soft float is enabled. For example, floor ends up having the signature
of int64(int64). The signature part is not the same as where the actual
parameter types are recorded, and those ARE of course int64(int64) when
soft float is enabled. (Yes, Mips16 hard float uses soft float but with
different runtime rounes but then has to interoperate with Mips32 using
normal floating point). This logic will eventually be moved to the
Mips16HardFloat pass so it's not worth sorting out these issues in LLVM
since nobody but Mips16 cares about these signatures, as far as I know,
and even I won't eventually either.
llvm-svn: 187613
Also avoid locals evicting locals just because they want a cheaper register.
Problem: MI Sched knows exactly how many registers we have and assumes
they can be colored. In cases where we have large blocks, usually from
unrolled loops, greedy coloring fails. This is a source of
"regressions" from the MI Scheduler on x86. I noticed this issue on
x86 where we have long chains of two-address defs in the same live
range. It's easy to see this in matrix multiplication benchmarks like
IRSmk and even the unit test misched-matmul.ll.
A fundamental difference between the LLVM register allocator and
conventional graph coloring is that in our model a live range can't
discover its neighbors, it can only verify its neighbors. That's why
we initially went for greedy coloring and added eviction to deal with
the hard cases. However, for singly defined and two-address live
ranges, we can optimally color without visiting neighbors simply by
processing the live ranges in instruction order.
Other beneficial side effects:
It is much easier to understand and debug regalloc for large blocks
when the live ranges are allocated in order. Yes, global allocation is
still very confusing, but it's nice to be able to comprehend what
happened locally.
Heuristics could be added to bias register assignment based on
instruction locality (think late register pairing, banks...).
Intuituvely this will make some test cases that are on the threshold
of register pressure more stable.
llvm-svn: 187139
This update was done with the following bash script:
find test/CodeGen -name "*.ll" | \
while read NAME; do
echo "$NAME"
if ! grep -q "^; *RUN: *llc.*debug" $NAME; then
TEMP=`mktemp -t temp`
cp $NAME $TEMP
sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
while read FUNC; do
sed -i '' "s/;\(.*\)\([A-Za-z0-9_-]*\):\( *\)$FUNC: *\$/;\1\2-LABEL:\3$FUNC:/g" $TEMP
done
sed -i '' "s/;\(.*\)-LABEL-LABEL:/;\1-LABEL:/" $TEMP
sed -i '' "s/;\(.*\)-NEXT-LABEL:/;\1-NEXT:/" $TEMP
sed -i '' "s/;\(.*\)-NOT-LABEL:/;\1-NOT:/" $TEMP
sed -i '' "s/;\(.*\)-DAG-LABEL:/;\1-DAG:/" $TEMP
mv $TEMP $NAME
fi
done
llvm-svn: 186280
This was done with the following sed invocation to catch label lines demarking function boundaries:
sed -i '' "s/^;\( *\)\([A-Z0-9_]*\):\( *\)test\([A-Za-z0-9_-]*\):\( *\)$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen/*/*.ll
which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct.
llvm-svn: 186258
The pass emits a call to sqrt that has attribute "read-none". This call will be
converted to an ISD::FSQRT node during DAG construction, which will turn into
a mips native sqrt instruction.
llvm-svn: 183802
the Mips16 port. A few of the psuedos could either take signed
or unsigned arguments and I did not distinguish the case and improperly
rejected some valid cases that the assembler had previously accepted
when they were pure pseudos that expanded as assembly instructions.
llvm-svn: 183633
Fix an assertion when the compiler encounters big constants whose bit width is
not a multiple of 64-bits.
Although clang would never generate something like this, the backend should be
able to handle any legal IR.
<rdar://problem/13363576>
llvm-svn: 183544
pic calls. These need to be there so we don't try and use helper
functions when we call those.
As part of this, make sure that we properly exclude helper functions in pic
mode when indirect calls are involved.
llvm-svn: 182343
By default, a teq instruction is inserted after integer divide. No divide-by-zero
checks are performed if option "-mnocheck-zero-division" is used.
llvm-svn: 182306
Previously, three instructions were needed:
trunc.w.s $f0, $f2
mfc1 $4, $f0
sw $4, 0($2)
Now we need only two:
trunc.w.s $f0, $f2
swc1 $f0, 0($2)
llvm-svn: 182053
This creates stubs that help Mips32 functions call Mips16
functions which have floating point parameters that are normally passed
in floating point registers.
llvm-svn: 181972
Mips16/32 floating point interoperability.
When Mips16 code calls external functions that would normally have some
of its parameters or return values passed in floating point registers,
it needs (Mips32) helper functions to do this because while in Mips16 mode
there is no ability to access the floating point registers.
In Pic mode, this is done with a set of predefined functions in libc.
This case is already handled in llvm for Mips16.
In static relocation mode, for efficiency reasons, the compiler generates
stubs that the linker will use if it turns out that the external function
is a Mips32 function. (If it's Mips16, then it does not need the helper
stubs).
These stubs are identically named and the linker knows about these tricks
and will not create multiple copies and will delete them if they are not
needed.
llvm-svn: 181753
This option is used when the user wants to avoid emitting double precision FP
loads and stores. Double precision FP loads and stores are expanded to single
precision instructions after register allocation.
llvm-svn: 181718
mips16/mips32 floating point interoperability.
This patch fixes returns from mips16 functions so that if the function
was in fact called by a mips32 hard float routine, then values
that would have been returned in floating point registers are so returned.
Mips16 mode has no floating point instructions so there is no way to
load values into floating point registers.
This is needed when returning float, double, single complex, double complex
in the Mips ABI.
Helper functions in libc for mips16 are available to do this.
For efficiency purposes, these helper functions have a different calling
convention from normal Mips calls.
Registers v0,v1,a0,a1 are used to pass parameters instead of
a0,a1,a2,a3.
This is because v0,v1,a0,a1 are the natural registers used to return
floating point values in soft float. These values can then be moved
to the appropriate floating point registers with no extra cost.
The only register that is modified is ra in this call.
The helper functions make sure that the return values are in the floating
point registers that they would be in if soft float was not in effect
(which it is for mips16, though the soft float is implemented using a mips32
library that uses hard float).
llvm-svn: 181641
its fields.
This removes false dependencies between DSP instructions which access different
fields of the the control register. Implicit register operands are added to
instructions RDDSP and WRDSP after instruction selection, depending on the
value of the mask operand.
llvm-svn: 181041
register.
- Define pseudo instructions which store or load ccond field of the DSP
control register.
- Emit the pseudos in MipsSEInstrInfo::storeRegToStack and loadRegFromStack.
- Expand the pseudos before callee-scan save.
- Emit instructions RDDSP or WRDSP to copy between ccond field and GPRs.
llvm-svn: 180969
Expand copy instructions between two accumulator registers before callee-saved
scan is done. Handle copies between integer GPR and hi/lo registers in
MipsSEInstrInfo::copyPhysReg. Delete pseudo-copy instructions that are not
needed.
llvm-svn: 180827
Mips32 code as Mips16 unless it can't be compiled as Mips 16. For now this
would happen as long as floating point instructions are not needed.
Probably it would also make sense to compile as mips32 if atomic operations
are needed too. There may be other cases too.
A module pass prescans the IR and adds the mips16 or nomips16 attribute
to functions depending on the functions needs.
Mips 16 mode can result in a 40% code compression by utililizing 16 bit
encoding of many instructions.
The hope is for this to replace the traditional gcc way of dealing with
Mips16 code using floating point which involves essentially using soft float
but with a library implemented using mips32 floating point. This gcc
method also requires creating stubs so that Mips32 code can interact with
these Mips 16 functions that have floating point needs. My conjecture is
that in reality this traditional gcc method would never win over this
new method.
I will be implementing the traditional gcc method also. Some of it is already
done but I needed to do the stubs to finish the work and those required
this mips16/32 mixed mode capability.
I have more ideas for to make this new method much better and I think the old
method will just live in llvm for anyone that needs the backward compatibility
but I don't for what reason that would be needed.
llvm-svn: 179185
Modifier 'D' is to use the second word of a double integer.
We had previously implemented the pure register varient of
the modifier and this patch implements the memory reference.
#include "stdio.h"
int b[8] = {0,1,2,3,4,5,6,7};
void main()
{
int i;
// The first word. Notice, no 'D'
{asm (
"lw %0,%1;"
: "=r" (i)
: "m" (*(b+4))
);}
printf("%d\n",i);
// The second word
{asm (
"lw %0,%D1;"
: "=r" (i)
: "m" (*(b+4))
);}
printf("%d\n",i);
}
llvm-svn: 179135
and mips16 on a per function basis.
Because this patch is somewhat involved I have provide an overview of the
key pieces of it.
The patch is written so as to not change the behavior of the non mixed
mode. We have tested this a lot but it is something new to switch subtargets
so we don't want any chance of regression in the mainline compiler until
we have more confidence in this.
Mips32/64 are very different from Mip16 as is the case of ARM vs Thumb1.
For that reason there are derived versions of the register info, frame info,
instruction info and instruction selection classes.
Now we register three separate passes for instruction selection.
One which is used to switch subtargets (MipsModuleISelDAGToDAG.cpp) and then
one for each of the current subtargets (Mips16ISelDAGToDAG.cpp and
MipsSEISelDAGToDAG.cpp).
When the ModuleISel pass runs, it determines if there is a need to switch
subtargets and if so, the owning pointers in MipsTargetMachine are
appropriately changed.
When 16Isel or SEIsel is run, they will return immediately without doing
any work if the current subtarget mode does not apply to them.
In addition, MipsAsmPrinter needs to be reset on a function basis.
The pass BasicTargetTransformInfo is substituted with a null pass since the
pass is immutable and really needs to be a function pass for it to be
used with changing subtargets. This will be fixed in a follow on patch.
llvm-svn: 179118
This patch initializes t9 to the handler address, but only if the relocation
model is pic. This handles the case where handler to which eh.return jumps
points to the start of the function.
Patch by Sasa Stankovic.
llvm-svn: 178588
derived class MipsSETargetLowering.
We shouldn't be generating madd/msub nodes if target is Mips16, since Mips16
doesn't have support for multipy-add/sub instructions.
llvm-svn: 178404
Apparently my final cleanup to use a relevant suffix for these tests before
committing r176831 caused them to stop running since lit wasn't configured to
run tests with that suffix in those directories (why don't we just have a
global suffix list?). So, add the suffix to the relevant directories & fix the
test that has bitrotted over the last week due to my debug info schema changes.
llvm-svn: 177315
This calling convention was added just to handle functions which return vector
of floats. The fix committed in r165585 solves the problem.
llvm-svn: 176530
This patch eliminates the need to emit a constant move instruction when this
pattern is matched:
(select (setgt a, Constant), T, F)
The pattern above effectively turns into this:
(conditional-move (setlt a, Constant + 1), F, T)
llvm-svn: 176384
SltCCRxRy16, SltiCCRxImmX16, SltiuCCRxImmX16, SltuCCRxRy16
$T8 shows up as register $24 when emitted from C++ code so we had
to change some tests that were already there for this functionality.
llvm-svn: 175593
This expansion will be moved to expandISelPseudos as soon as I can figure
out how to do that. There are other instructions which use this
ExpandFEXT_T8I816_ins and as soon as I have finished expanding them all,
I will delete the macro asm string text so it has no way to be used
in the future.
llvm-svn: 175413
not matter but makes it more gcc compatible which avoids possible subtle
problems. Also, turned back on a disabled check in helloworld.ll.
llvm-svn: 175237
if the offset fits in 11 bits. This makes use of the fact that the abi
requires sp to be 8 byte aligned so the actual offset can fit in 8
bits. It will be shifted left and sign extended before being actually used.
The assembler or direct object emitter will shift right the 11 bit
signed field by 3 bits. We don't need to deal with that here.
llvm-svn: 175073
same so we put in the comment field an indicator when we think we are
emitting the 16 bit version. For the direct object emitter, the difference is
important as well as for other passes which need an accurate count of
program size. There will be other similar putbacks to this for various
instructions.
llvm-svn: 174747
allowed size for the instruction. This code uses RegScavenger to fix this.
We sometimes need 2 registers for Mips16 so we must handle things
differently than how register scavenger is normally used.
llvm-svn: 174696
is a vararg function.
The original code was examining flag OutputArg::IsFixed to determine whether
CC_MipsN_VarArg or CC_MipsN should be called. This is not correct, since this
flag is often set to false when the function being analyzed is a non-variadic
function.
llvm-svn: 174442
and enables the instruction printer to print aliased
instructions.
Due to usage of RegisterOperands a change in common
code (utils/TableGen/AsmWriterEmitter.cpp) is required
to get the correct register value if it is a RegisterOperand.
Contributer: Vladimir Medic
llvm-svn: 174358
Allow Mips16 routines to call Mips32 routines that have abi requirements
that either arguments or return values are passed in floating point
registers. This handles only the pic case. We have not done non pic
for Mips16 yet in any form.
The libm functions are Mips32, so with this addition we have a complete
Mips16 hard float implementation.
We still are not able to complete mix Mip16 and Mips32 with hard float.
That will be the next phase which will have several steps. For Mips32
to freely call Mips16 some stub functions must be created.
llvm-svn: 173320
these patches are tested a lot by test-suite but
make check tests are forthcoming once the next
few patches that complete this are committed.
with the next few patches the pass rate for mips16 is
near 100%
llvm-svn: 170656
physical register $r1 to $r0.
GNU disassembler recognizes an "or" instruction as a "move", and this change
makes the disassembled code easier to read.
Original patch by Reed Kotler.
llvm-svn: 170655
Mips16 is really a processor decoding mode (ala thumb 1) and in the same
program, mips16 and mips32 functions can exist and can call each other.
If a jal type instruction encounters an address with the lower bit set, then
the processor switches to mips16 mode (if it is not already in it). If the
lower bit is not set, then it switches to mips32 mode.
The linker knows which functions are mips16 and which are mips32.
When relocation is performed on code labels, this lower order bit is
set if the code label is a mips16 code label.
In general this works just fine, however when creating exception handling
tables and dwarf, there are cases where you don't want this lower order
bit added in.
This has been traditionally distinguished in gas assembly source by using a
different syntax for the label.
lab1: ; this will cause the lower order bit to be added
lab2=. ; this will not cause the lower order bit to be added
In some cases, it does not matter because in dwarf and debug tables
the difference of two labels is used and in that case the lower order
bits subtract each other out.
To fix this, I have added to mcstreamer the notion of a debuglabel.
The default is for label and debug label to be the same. So calling
EmitLabel and EmitDebugLabel produce the same result.
For various reasons, there is only one set of labels that needs to be
modified for the mips exceptions to work. These are the "$eh_func_beginXXX"
labels.
Mips overrides the debug label suffix from ":" to "=." .
This initial patch fixes exceptions. More changes most likely
will be needed to DwarfCFException to make all of this work
for actual debugging. These changes will be to emit debug labels in some
places where a simple label is emitted now.
Some historical discussion on this from gcc can be found at:
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.htmlhttp://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html
llvm-svn: 170279
In this case, essentially it is soft float with different library routines.
The next step will be to make this fully interoperational with mips32 floating
point and that requires creating stubs for functions with signatures that
contain floating point types.
I have a more sophisticated design for mips16 hardfloat which I hope to
implement at a later time that directly does floating point without the need
for function calls.
The mips16 encoding has no floating point instructions so one needs to
switch to mips32 mode to execute floating point instructions.
llvm-svn: 170259
We will make them delay slot forms if there is something that can be
placed in the delay slot during a separate pass. Mips16 extended instructions
cannot be placed in delay slots.
llvm-svn: 166990
Previously mips16 was sharing the pattern addr which is used for mips32
and mips64. This had a number of problems:
1) Storing and loading byte and halfword quantities for mips16 has particular
problems due to the primarily non mips16 nature of SP. When we must
load/store byte/halfword stack objects in a function, we must create a mips16
alias register for SP. This functionality is tested in stchar.ll.
2) We need to have an FP register under certain conditions (such as
dynamically sized alloca). We use mips16 register S0 for this purpose.
In this case, we also use this register when accessing frame objects so this
issue also affects the complex pattern addr16. This functionality is
tested in alloca16.ll.
The Mips16InstrInfo.td has been updated to use addr16 instead of addr.
The complex pattern C++ function for addr has been copied to addr16 and
updated to reflect the above issues.
llvm-svn: 166897
Instructions emitted to compute branch offsets now use immediate operands
instead of symbolic labels. This change was needed because there were problems
when R_MIPS_HI16/LO16 relocations were used to make shared objects.
llvm-svn: 162731
In SelectionDAGLegalize::ExpandLegalINT_TO_FP, expand INT_TO_FP nodes without
using any f64 operations if f64 is not a legal type.
Patch by Stefan Kristiansson.
llvm-svn: 162728